Model Serving Runtimes

    • Scale to and from Zero
    • Request-based Autoscaling on CPU/GPU
    • Revision Management
    • Batching
    • Request/Response logging
    • Traffic management
    • Distributed tracing
    • Out-of-the-box metrics
    • Ingress/Egress control
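
To make the first two items concrete, here is a minimal sketch of how a request-based autoscaler might choose a replica count, including scaling to zero when idle. The list does not name a platform or an algorithm, so everything here is an illustrative assumption: the function name `desired_replicas`, its parameters, and the "in-flight requests divided by a per-replica target" rule are hypothetical and do not describe any specific runtime's API.

```python
import math


def desired_replicas(in_flight_requests: float,
                     target_per_replica: float,
                     max_replicas: int = 10) -> int:
    """Pick a replica count from observed request load (hypothetical rule).

    in_flight_requests: concurrent requests currently observed across the service.
    target_per_replica: how many concurrent requests one replica should handle.
    max_replicas: upper bound on the scale-out (assumed default of 10).
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if in_flight_requests <= 0:
        # No traffic: scale to zero so idle CPU/GPU capacity is released.
        return 0
    # Otherwise run enough replicas to keep each one at or below its target,
    # which also covers scaling from zero on the first request.
    needed = math.ceil(in_flight_requests / target_per_replica)
    return min(max_replicas, needed)


if __name__ == "__main__":
    for load in (0, 1, 12, 1000):
        print(load, "->", desired_replicas(load, target_per_replica=5))
```

A real autoscaler would typically average the load over a time window and wait out a grace period before scaling to zero, so that short gaps in traffic do not trigger repeated cold starts. Both are omitted here for brevity.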