Kubeflow
The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.
Read the Kubeflow overview for an introduction to the Kubeflow architecture and to see how you can use Kubeflow to manage your ML workflow.
Follow the getting-started guide to set up your environment and install Kubeflow.
What is Kubeflow?
Kubeflow is the machine learning toolkit for Kubernetes. To use Kubeflow, the basic workflow is:
- Download and run the Kubeflow deployment binary.
- Customize the resulting configuration files.
- Run the specified script to deploy your containers to your specific environment.
You can adapt the configuration to choose the platforms and services that you want to use for each stage of the ML workflow: data preparation, model training, prediction serving, and service management.
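As a rough illustration of the model-training stage, the sketch below submits a distributed TensorFlow training job as a Kubernetes custom resource. It is illustrative only: it assumes the TFJob custom resource from Kubeflow's training operator is installed and that a "kubeflow" namespace exists, and the job name, image, and training script are placeholders.

```python
# Illustrative sketch: submit a TFJob (Kubeflow's TensorFlow training custom
# resource) with the official Kubernetes Python client. The job name, image,
# and command below are placeholders, not values from the Kubeflow docs.
from kubernetes import client, config

config.load_kube_config()  # talk to whichever cluster your kubeconfig points at

tfjob = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "TFJob",
    "metadata": {"name": "mnist-train", "namespace": "kubeflow"},
    "spec": {
        "tfReplicaSpecs": {
            "Worker": {
                "replicas": 2,
                "restartPolicy": "OnFailure",
                "template": {
                    "spec": {
                        "containers": [
                            {
                                "name": "tensorflow",
                                "image": "example.com/mnist-train:latest",  # placeholder image
                                "command": ["python", "/opt/train.py"],     # placeholder script
                            }
                        ]
                    }
                },
            }
        }
    },
}

# TFJob is a custom resource, so it goes through the custom objects API.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="kubeflow.org",
    version="v1",
    namespace="kubeflow",
    plural="tfjobs",
    body=tfjob,
)
```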
You can choose to deploy your Kubernetes workloads locally, on-premises, or to a cloud environment.
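Because the deployed components are ordinary Kubernetes workloads, the same tooling applies wherever the cluster runs. As a minimal sketch (assuming Kubeflow was installed into the default "kubeflow" namespace), you can inspect the deployment with the Kubernetes Python client against whichever cluster your kubeconfig points at:

```python
# Minimal sketch: list the pods of a Kubeflow deployment. The same code works
# for a local, on-premises, or cloud cluster; only the kubeconfig changes.
# Assumes the default "kubeflow" namespace.
from kubernetes import client, config

config.load_kube_config()
for pod in client.CoreV1Api().list_namespaced_pod("kubeflow").items:
    print(pod.metadata.name, pod.status.phase)
```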
Read the Kubeflow overview for more details.
Our goal is to make scaling machine learning (ML) models and deploying them to production as simple as possible, by letting Kubernetes do what it’s great at:
- Easy, repeatable, portable deployments on a diverse infrastructure (for example, experimenting on a laptop, then moving to an on-premises cluster or to the cloud)
- Deploying and managing loosely-coupled microservices
Because ML practitioners use a diverse set of tools, one of the key goals is to customize the stack based on user requirements (within reason) and let the system take care of the “boring stuff”. While we have started with a narrow set of technologies, we are working with many different projects to include additional tooling.
History
Kubeflow started as an open-source version of the way Google ran TensorFlow internally, based on a pipeline called TensorFlow Extended (TFX). It began as just a simpler way to run TensorFlow jobs on Kubernetes, but has since expanded to be a multi-architecture, multi-cloud framework for running entire machine learning pipelines.
To see what’s coming up in future versions of Kubeflow, refer to the Kubeflow roadmap.
Individual Kubeflow components also maintain their own roadmaps.
Getting involved
There are many ways to contribute to Kubeflow, and we welcome contributions! Read the contributing guide to get started on the code, and get to know the community in the community guide.