Load balancing

    Activator pods are scaled horizontally, so a deployment may contain multiple Activators. In general, the system performs best when the number of revision pods is larger than the number of Activator pods and is evenly divisible by it.

    The Activator load balancing algorithm works as follows:

    • If concurrency is set to a value less than or equal to 3, the Activator sends the request to the first pod that has capacity. Otherwise, requests are balanced in a round-robin fashion, with respect to container concurrency.
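
    The selection rule above can be illustrated with a minimal Python sketch. This is not the Activator's actual implementation. The names Pod, ActivatorBalancer, and pick are hypothetical, and the sketch assumes a positive container concurrency and that "has capacity" means a pod is serving fewer in-flight requests than the container concurrency allows.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    """Hypothetical stand-in for a revision pod."""
    name: str
    in_flight: int = 0  # requests currently being served by this pod


class ActivatorBalancer:
    """Illustrative model of the pod selection rule, not the real Activator."""

    FIRST_AVAILABLE_MAX = 3  # threshold taken from the text above

    def __init__(self, pods, container_concurrency):
        # Assumption: container_concurrency is a positive integer.
        self.pods = list(pods)
        self.container_concurrency = container_concurrency
        self._next = 0  # round-robin cursor

    def pick(self):
        """Return the pod that receives the next request, or None if all are full."""
        n = len(self.pods)
        round_robin = self.container_concurrency > self.FIRST_AVAILABLE_MAX
        # First-available always scans from pod 0; round robin resumes
        # from the pod after the one chosen last time.
        start = self._next if round_robin else 0
        for i in range(n):
            idx = (start + i) % n
            pod = self.pods[idx]
            if pod.in_flight < self.container_concurrency:
                if round_robin:
                    self._next = (idx + 1) % n
                pod.in_flight += 1
                return pod
        return None  # no pod has spare capacity


# Low concurrency: each pod is filled before the next one is used.
low = ActivatorBalancer([Pod("a"), Pod("b"), Pod("c")], container_concurrency=2)
low_order = [low.pick().name for _ in range(6)]
print(low_order)  # ['a', 'a', 'b', 'b', 'c', 'c']

# Higher concurrency: requests rotate across the pods.
high = ActivatorBalancer([Pod("a"), Pod("b"), Pod("c")], container_concurrency=5)
high_order = [high.pick().name for _ in range(4)]
print(high_order)  # ['a', 'b', 'c', 'a']
```

    In this sketch, a request that finds every pod full gets None back. How the real Activator handles that case is not described here.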

    Configuring target burst capacity

    Target burst capacity primarily determines whether the Activator is in the request path outside of scale-from-zero scenarios.

    • Setting the targeted concurrency limits for the revision. See .
    • Setting the target burst capacity. You can configure target burst capacity using the key in the ConfigMap. See Setting the target burst capacity.
    • Setting the Activator capacity by using the ConfigMap. See .