Locality Weighted Load Balancing

    Sandbox environment

    Set up your sandbox environment with Docker and Docker Compose, and clone the Envoy repository with Git.

    curl

    Used to make requests.

    This example demonstrates Envoy's locality weighted load balancing feature. The demo simulates a scenario in which a backend service resides in two local zones and one remote zone.

    The components used in this demo are as follows:

    • A client container: runs Envoy proxy

    • Backend container in the same locality as the client, with priority set to 0, referred to as local-1.

    • Backend container in the same locality as the client, with priority set to 1, referred to as local-2.

    • Backend container in the remote locality, with priority set to 1, referred to as remote-1.

    • Backend container in the remote locality, with priority set to 2, referred to as remote-2.

    The client Envoy proxy configures the 4 backend containers in the same Envoy cluster, so that Envoy handles load balancing to those backend servers. As shown above, the localities have 3 different priorities:

    • priority 0: local-1

    • priority 1: local-2 and remote-1

    • priority 2: remote-2

    In Envoy, when the healthiness of a given locality drops below a threshold (71% by default), localities at the next priority level start to share the request load. The demo below shows this behavior.
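    This spill-over follows Envoy's overprovisioning rule: each priority level can carry up to healthy% × overprovisioning factor (140% by default, which is where the 71% threshold comes from, since 100/1.4 ≈ 71.4%), and whatever it cannot carry cascades to the next priority. The following is a minimal sketch of that rule; priority_loads is an illustrative helper, not part of Envoy or this sandbox, and normalization of fully degraded deployments is omitted for brevity:

```python
def priority_loads(healthy_pcts, factor=1.4):
    """Approximate Envoy's per-priority traffic split.

    healthy_pcts: percentage of healthy hosts at each priority level,
    ordered from priority 0 upward. Returns the rounded percentage of
    traffic routed to each priority.
    """
    loads, remaining = [], 100.0
    for pct in healthy_pcts:
        # A level can carry up to healthy% * factor, capped at 100%.
        capacity = min(100.0, pct * factor)
        share = min(capacity, remaining)
        loads.append(share)
        remaining -= share  # leftover cascades to the next priority
    return [round(share) for share in loads]

# Healthy local-1 takes everything at priority 0.
print(priority_loads([100, 100, 100]))  # [100, 0, 0]
# local-1 down: priority 1 (local-2 + remote-1) absorbs all traffic.
print(priority_loads([0, 100, 100]))    # [0, 100, 0]
# local-1 and local-2 down: priority 1 is 50% healthy, so it keeps
# 50 * 1.4 = 70% and the remaining 30% spills to priority 2.
print(priority_loads([0, 50, 100]))     # [0, 70, 30]
```

    The demo steps below exercise exactly these cases against the running sandbox.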

    In a terminal, change to the examples/locality_load_balancing directory.

    To build this sandbox example and start the example services, run docker-compose up --build -d from that directory.

    The locality configuration is set in the client container via a static Envoy configuration file. Please refer to the cluster section of the proxy configuration file.
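    For orientation, the relevant pieces of such a cluster configuration look roughly like the abridged sketch below. The field names are from the Envoy v3 API, but the zone names and addresses here are illustrative, not copied from the sandbox's actual file:

```yaml
clusters:
- name: backend
  common_lb_config:
    # Turn on locality weighted load balancing for this cluster.
    locality_weighted_lb_config: {}
  load_assignment:
    cluster_name: backend
    endpoints:
    # One locality per priority level; the lowest priority is preferred.
    - locality: {zone: local-zone-1}   # illustrative zone name
      priority: 0
      load_balancing_weight: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address: {address: backend-local-1, port_value: 8000}
    - locality: {zone: local-zone-2}   # illustrative zone name
      priority: 1
      load_balancing_weight: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address: {address: backend-local-2, port_value: 8000}
```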

    # all requests to local-1
    $ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
    Hello from backend-local-1!: 100, 100.0%
    Failed: 0

    If locality local-1 becomes unhealthy (i.e. fails the Envoy health check), requests should be load balanced among the next priority localities, local-2 and remote-1, which both have priority 1. We then send 100 requests to the backend cluster and check the responders.

    # bring down local-1
    $ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_1:8000/unhealthy
    [backend-local-1] Set to unhealthy
    # local-2 and remote-1 localities split the traffic 50:50
    $ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
    Hello from backend-remote-1!: 51, 51.0%
    Hello from backend-local-2!: 49, 49.0%
    Failed: 0

    Now if local-2 also becomes unhealthy (brought down the same way, via its /unhealthy endpoint), the priority 1 localities are only 50% healthy, so the priority 2 locality starts to share the request load: requests are sent to both remote-1 and remote-2.
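    Assuming Envoy's default overprovisioning factor of 1.4 (consistent with the 71% threshold mentioned above), the split can be predicted with a quick back-of-the-envelope computation:

```python
factor = 1.4  # Envoy's default overprovisioning factor (assumed)

# Priority 1: local-2 is down, remote-1 is up -> 50% healthy.
p1_share = round(min(100, 50 * factor))  # 70% stays at priority 1 (remote-1)
p2_share = 100 - p1_share                # 30% spills over to remote-2
print(p1_share, p2_share)  # 70 30
```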

    Before moving on, we need to recover local-1 and local-2 first.

    # recover local-1 and local-2 after the demo
    $ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_1:8000/healthy
    $ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8000/healthy

    To demonstrate how locality based load balancing works in a multi-replica setup, let’s now scale the local-1 locality up to 5 replicas.

    $ docker-compose up --scale backend-local-1=5 -d

    We are going to show the scenario in which local-1 is only partially healthy, so let’s bring down 4 of the 5 replicas in local-1 by calling the /unhealthy endpoint on each of them, as we did above.

    Then we check the endpoints again:

    $ docker-compose exec -T client-envoy curl -s localhost:8001/clusters | grep health_flags
    backend::172.28.0.4:8000::health_flags::/failed_active_hc
    backend::172.28.0.2:8000::health_flags::/failed_active_hc
    backend::172.28.0.5:8000::health_flags::/failed_active_hc
    backend::172.28.0.6:8000::health_flags::/failed_active_hc
    backend::172.28.0.7:8000::health_flags::healthy
    backend::172.28.0.8:8000::health_flags::healthy
    backend::172.28.0.3:8000::health_flags::healthy

    We can confirm that 4 backend endpoints have become unhealthy.

    Now we send 100 requests again.

    # watch traffic change
    $ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
    Hello from backend-remote-1!: 37, 37.0%
    Hello from backend-local-2!: 36, 36.0%
    Hello from backend-local-1!: 27, 27.0%
    Failed: 0

    As local-1 does not have enough healthy endpoints, the request load is partially shared by the lower-priority localities.
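    The observed split is consistent with the same overprovisioning arithmetic (default factor 1.4 assumed): priority 0 is only 20% healthy, so it keeps about 28% of the traffic, and the rest is divided evenly between the two priority 1 localities:

```python
factor = 1.4  # Envoy's default overprovisioning factor (assumed)

# Priority 0: 1 of 5 local-1 replicas healthy -> 20% healthy.
p0_share = round(min(100, 20 * factor))  # 28% stays on local-1
p1_share = 100 - p0_share                # 72% moves to priority 1
per_locality = p1_share // 2             # split between local-2 and remote-1
print(p0_share, per_locality)  # 28 36
```

    This matches the roughly 27/36/37 distribution reported by the client above.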