PyTorch Serving

This guide walks you through serving a PyTorch-trained model in Kubeflow.

We use the seldon-core component, deployed following its installation instructions, to serve the model.

See also the example module, which contains the code to wrap the model with Seldon.

We will wrap this class into a seldon-core microservice, which we can then deploy as a REST or GRPC API server; a sketch of such a class follows.
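As a rough sketch (not the exact code from the example module), a Seldon-wrappable PyTorch model class could look like the following. The class name `MnistModel` and the checkpoint path `model.pt` are hypothetical placeholders; the `predict(self, X, feature_names)` method is the contract the Seldon Python wrapper expects:

```python
# Hypothetical sketch of a Seldon-wrappable PyTorch model class.
# The class name and checkpoint path are placeholders, not taken
# from the example module referenced above.
import numpy as np
import torch


class MnistModel(object):
    def __init__(self):
        # Load the trained model once when the microservice starts.
        self.model = torch.load("model.pt")
        self.model.eval()

    def predict(self, X, feature_names):
        # Seldon passes the request payload as a numpy array; run a
        # forward pass and return a numpy array as the prediction.
        data = torch.from_numpy(X.astype(np.float32))
        with torch.no_grad():
            output = self.model(data)
        return output.numpy()
```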

There are two options for getting a model server:

• Use the pre-built image: you can just run this image to get a pre-trained model from the shared persistent disk.
• Serve your own model using this server, exposing the predict service as a GRPC API.

To wrap your own model in a server image, you can use a command like the one below; see also the example script that calls the Seldon docker wrapper to build the server image, exposing the predict service as a GRPC API.
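The following is a hedged sketch of a Seldon Core 0.x Python wrapper invocation; the model directory, the class name `MnistModel`, the version `0.1`, and the registry name `my-registry` are all placeholders, and the wrapper image tag may differ in your setup:

```shell
# Wrap the model code in the current directory into a GRPC model
# server image (all names and versions here are illustrative).
docker run -v $(pwd):/my_model seldonio/core-python-wrapper:0.7 \
    /my_model MnistModel 0.1 my-registry --grpc
```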

You can then push the image to your container registry.
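The exact command depends on how the image was tagged; a typical push, with a placeholder image name, would be:

```shell
# Push the freshly built server image (placeholder name and tag).
docker push my-registry/mnistmodel:0.1
```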

You can find more details about wrapping a model with seldon-core in the Seldon Core documentation.

Note: this section has not yet been converted to kustomize.

Create an environment variable, ${KF_ENV}, to represent a conceptual deployment environment such as development, test, staging, or production, as defined by ksonnet. For this example, we use the default environment. You can read more about Kubeflow's use of ksonnet in the Kubeflow ksonnet component guide.
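For example:

```shell
# Use the ksonnet environment named "default" for this example.
export KF_ENV=default
```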

Then modify the ksonnet component parameters to use your specific image, as sketched below.
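A sketch, assuming the component is named `serving` (substitute your component name) and reusing the placeholder image from above:

```shell
# Point the ksonnet component at your model server image, then deploy it.
ks param set serving image my-registry/mnistmodel:0.1
ks apply ${KF_ENV} -c serving
```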

The Seldon Core component uses Ambassador to route its requests to our model server. To send requests to the model, you can port-forward the Ambassador container locally:
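A sketch, assuming `${NAMESPACE}` is the namespace where Kubeflow is deployed and the Ambassador pods carry the `service=ambassador` label:

```shell
# Forward local port 8080 to the Ambassador container's port 80.
kubectl port-forward $(kubectl get pods -n ${NAMESPACE} \
    -l service=ambassador -o jsonpath='{.items[0].metadata.name}') \
    -n ${NAMESPACE} 8080:80
```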

Then send a request. In this example, we send a payload that we know is not a torch MNIST image, so the server will return a 500 error:
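A sketch using the Seldon Core 0.x REST path through Ambassador; the deployment name `mnist-classifier` is a placeholder, and the two-element ndarray is deliberately not shaped like an MNIST image:

```shell
# Send a JSON payload that cannot be an MNIST image; expect HTTP 500.
curl -X POST -H 'Content-Type: application/json' \
    -d '{"data":{"ndarray":[[1.0, 2.0]]}}' \
    http://localhost:8080/seldon/mnist-classifier/api/v0.1/predictions
```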