Model Inference Platform on Kubernetes
for Trusted AI


  • KServe is a standard Model Inference Platform on Kubernetes, built for highly scalable use cases.
  • Supports modern serverless inference workloads with autoscaling, including scale-to-zero on GPU.
  • Provides high scalability, high-density packing, and intelligent routing using ModelMesh.
  • Simple and pluggable production serving for ML, including prediction, pre/post-processing, monitoring, and explainability.
  • Advanced deployments with canary rollout, experiments, ensembles and transformers.

Provides serverless deployment of single-model inference on CPU/GPU for common ML frameworks such as Scikit-Learn and TensorFlow, as well as pluggable custom model runtimes.
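As a sketch of what this looks like in practice, a single model is deployed by applying an InferenceService manifest, and setting minReplicas: 0 on the predictor enables scale-to-zero. The service name and storageUri below are illustrative placeholders.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                  # hypothetical service name
spec:
  predictor:
    minReplicas: 0                    # serverless: scale to zero when idle
    model:
      modelFormat:
        name: sklearn                 # select a runtime for a Scikit-Learn model
      storageUri: gs://example-bucket/models/iris   # hypothetical model location
```

Applying this manifest with kubectl creates a serverless endpoint that autoscales with request load.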

ModelMesh Serving


ModelMesh is designed for high-scale, high-density, and frequently-changing model use cases. It intelligently loads and unloads AI models to and from memory to strike a trade-off between responsiveness to users and computational footprint.
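For high-density multi-model serving, an InferenceService can be routed to ModelMesh instead of the default serverless deployment via an annotation; a minimal sketch (name and bucket path are hypothetical):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: mnist-svm                     # hypothetical service name
  annotations:
    serving.kserve.io/deploymentMode: ModelMesh   # route to ModelMesh Serving
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://example-bucket/models/mnist-svm   # hypothetical model location
```

ModelMesh then decides when to load the model into a shared serving-runtime pod and when to evict it, rather than dedicating pods to each model.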

Model Explainability

Provides ML model inspection and interpretation. KServe integrates explainer frameworks such as Alibi and AI Explainability 360 to help explain predictions and gauge the confidence of those predictions.
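As a sketch, an explainer can be attached alongside the predictor in the same InferenceService; here an Alibi AnchorTabular explainer from the v1beta1 API (names and URIs are hypothetical):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: income-model                  # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://example-bucket/income/model       # hypothetical
  explainer:
    alibi:
      type: AnchorTabular             # Alibi anchor explanations for tabular data
      storageUri: gs://example-bucket/income/explainer   # hypothetical
```

Clients can then call the service's :explain endpoint to receive an explanation alongside a prediction.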

Model Monitoring

Enables payload logging and outlier, adversarial, and drift detection. KServe integrates frameworks such as Alibi Detect and AI Fairness 360 to help monitor ML models in production.
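Payload logging is enabled per component with a logger spec that forwards requests and responses to a sink, for example an eventing broker or a drift detector; the sink URL below is a hypothetical placeholder:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                  # hypothetical service name
spec:
  predictor:
    logger:
      mode: all                       # log both requests and responses
      url: http://message-dumper.default.svc.cluster.local   # hypothetical sink
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://example-bucket/models/iris   # hypothetical
```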

Advanced Deployments


Supports canary rollouts, model experiments/ensembles, and feature transformers, as well as custom pre/post-processing.
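A canary rollout, as a sketch, is driven by canaryTrafficPercent on the predictor: updating the storageUri to a new model version while setting this field splits traffic between the previous revision and the new one (name and paths are hypothetical):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model                      # hypothetical service name
spec:
  predictor:
    canaryTrafficPercent: 10          # send 10% of traffic to the new revision
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://example-bucket/models/v2   # hypothetical new model version
```

Once the canary looks healthy, raising canaryTrafficPercent to 100 promotes the new revision to serve all traffic.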

