Istio Usage in Kubeflow

Out of date

This guide contains outdated information pertaining to Kubeflow 1.0. This guide needs to be updated for Kubeflow 1.1.

Kubeflow v0.6 onwards deploys Istio along with configuration to enable end-to-end authentication and access control. This setup is the foundation of multi-tenancy support in Kubeflow. A Kubeflow deployment without Istio is not possible.

Most modern applications are built using a distributed microservices architecture. This keeps each individual service simple and gives it a well-defined responsibility. Complex systems and platforms are generally built by combining many such microservices. Each microservice defines its own APIs, and the services interact with each other through these APIs to serve end-user requests.

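A service mesh is an infrastructure layer that manages, secures, and observes the communication between such microservices. Istio is an open source service mesh implementation originally developed by Google, IBM, and Lyft. For further details, see the Istio documentation.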

Kubeflow is a collection of tools, frameworks and services that are deployed together into a single Kubernetes cluster to enable end-to-end ML workflows. Most of these components or services are developed independently and help with different parts of the workflow. Developing a complete ML workflow or an ML development environment requires combining multiple services and components. Kubeflow provides the underlying infrastructure that makes it possible to put such disparate components together.

Kubeflow uses Istio as a uniform way to secure, connect, and monitor microservices. Specifically:

  • It secures service-to-service communication in a Kubeflow deployment with strong identity-based authentication and authorization.
  • It provides a policy layer for supporting access controls and quotas.

For example, a user request to create a notebook is handled as follows:

  1. The user request is intercepted by an identity proxy, which talks to an SSO service provider such as IAM on a cloud service provider or Active Directory/LDAP on-premises.
  2. Once the user is authenticated, the Istio Gateway modifies the request to include a JWT header token containing the identity of the user. All requests throughout the service mesh carry this token along.
  3. The Istio RBAC policies are applied to the incoming request to validate access to the service and to the requested namespace. If either is inaccessible to the user, an error response is sent back.
  4. The Notebooks Controller validates authorization with Kubernetes RBAC and creates the notebook pod in the namespace that the user requested.
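
The steps above can be approximated with a small in-memory model. This is only an illustrative sketch, not Kubeflow or Istio code: the header name `x-demo-jwt`, the `NAMESPACE_ACCESS` table, and every function name are hypothetical, and the token is an unsigned JWT-shaped string whose signature is never verified.

```python
import base64
import json

# Hypothetical policy table: user identity -> namespaces the user may access.
# In a real deployment this role is played by the Istio RBAC policies.
NAMESPACE_ACCESS = {
    "alice@example.com": {"alice"},
    "bob@example.com": {"bob", "team-shared"},
}


def make_unsigned_token(email: str) -> str:
    """Build an unsigned JWT-shaped token (header.payload.signature) for the demo."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    # The signature part is left empty because this demo never verifies it.
    return f"{b64({'alg': 'none'})}.{b64({'email': email})}."


def identity_from_token(token: str) -> str:
    """Read the identity claim from the token payload (no signature check)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))["email"]


def handle_create_notebook(headers: dict, namespace: str, name: str) -> tuple:
    """Return (status_code, message) for a notebook creation request."""
    token = headers.get("x-demo-jwt")  # hypothetical header name
    if token is None:
        # Steps 1-2: no identity was attached to the request.
        return 401, "unauthenticated"
    user = identity_from_token(token)
    if namespace not in NAMESPACE_ACCESS.get(user, set()):
        # Step 3: the policy check rejects the requested namespace.
        return 403, f"{user} may not access namespace {namespace}"
    # Step 4: stand-in for the controller creating the notebook pod.
    return 201, f"notebook {name} created in {namespace}"


if __name__ == "__main__":
    headers = {"x-demo-jwt": make_unsigned_token("alice@example.com")}
    print(handle_create_notebook(headers, "alice", "nb-1"))
    print(handle_create_notebook(headers, "bob", "nb-1"))
```

The model keeps authentication (is an identity attached?) separate from authorization (may this identity use this namespace?), which mirrors the split between the identity proxy and the RBAC policies described in the steps.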

Further actions that the user takes from the notebook, such as creating training jobs or other resources in the namespace, go through a similar process. The Profiles Controller manages the creation of profiles, and creates and applies the appropriate Istio policies. For more details, see multi-user isolation.
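
As a rough idea of what per-namespace Istio policies can look like, the sketch below builds a `ServiceRole` and a `ServiceRoleBinding` as plain Python dictionaries. It assumes the Istio RBAC `rbac.istio.io/v1alpha1` API, which newer Istio releases replace with `AuthorizationPolicy`. The resource names and the `x-demo-userid` header are hypothetical, and the manifests actually generated by the Profiles Controller may differ.

```python
import json


def namespace_access_policies(namespace: str, owner: str) -> list:
    """Return ServiceRole and ServiceRoleBinding manifests for one namespace."""
    role_name = "ns-access-demo"  # hypothetical resource name
    service_role = {
        "apiVersion": "rbac.istio.io/v1alpha1",
        "kind": "ServiceRole",
        "metadata": {"name": role_name, "namespace": namespace},
        # Grant access to every service in the namespace.
        "spec": {"rules": [{"services": ["*"]}]},
    }
    binding = {
        "apiVersion": "rbac.istio.io/v1alpha1",
        "kind": "ServiceRoleBinding",
        "metadata": {"name": "owner-binding-demo", "namespace": namespace},
        "spec": {
            # Match requests whose identity header equals the owner.
            # The header name is an assumption made for this example.
            "subjects": [
                {"properties": {"request.headers[x-demo-userid]": owner}}
            ],
            "roleRef": {"kind": "ServiceRole", "name": role_name},
        },
    }
    return [service_role, binding]


if __name__ == "__main__":
    for manifest in namespace_access_policies("alice", "alice@example.com"):
        print(json.dumps(manifest, indent=2))
```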

Currently it is not possible to deploy Kubeflow without Istio. Kubeflow needs the Istio Custom Resource Definitions (CRDs) to define the route from the Gateway to each newly created notebook.
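
A route of this kind is typically expressed as an Istio `VirtualService`. The sketch below builds one as a Python dictionary to show the general shape. The gateway reference `kubeflow/kubeflow-gateway`, the `/notebook/<namespace>/<name>/` path prefix, the service host, and port 80 are assumptions chosen for illustration, not values confirmed by this guide.

```python
import json


def notebook_virtual_service(namespace: str, name: str) -> dict:
    """Build a VirtualService manifest that routes a URL prefix to a notebook."""
    prefix = f"/notebook/{namespace}/{name}/"  # assumed URL layout
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {
            "name": f"notebook-{namespace}-{name}",
            "namespace": namespace,
        },
        "spec": {
            "hosts": ["*"],
            # Assumed gateway reference in <namespace>/<name> form.
            "gateways": ["kubeflow/kubeflow-gateway"],
            "http": [
                {
                    # Requests whose path starts with the prefix...
                    "match": [{"uri": {"prefix": prefix}}],
                    # ...are forwarded to the notebook's Service.
                    "route": [
                        {
                            "destination": {
                                "host": f"{name}.{namespace}.svc.cluster.local",
                                "port": {"number": 80},
                            }
                        }
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(notebook_virtual_service("alice", "nb-1"), indent=2))
```

Because `VirtualService` is a custom resource, a manifest like this can only be applied to a cluster where the Istio CRDs are installed, which is the dependency described above.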