Debugging Envoy and Istiod

    If you want to try the commands described below, you can either:

    • Have a Kubernetes cluster with Istio and the Bookinfo sample application installed.

    OR

    • Use similar commands against your own application running in a Kubernetes cluster.

    The proxy-status command allows you to get an overview of your mesh. If you suspect one of your sidecars isn’t receiving configuration or is out of sync, proxy-status will tell you.
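
    For example, running proxy-status with no arguments lists every proxy in the mesh along with its sync status for each xDS type. The output below is an illustrative sketch using the Bookinfo pods from this guide (the istiod pod name is elided, and the exact columns vary by Istio version):

    $ istioctl proxy-status
    NAME                                          CDS        LDS        EDS        RDS        ISTIOD        VERSION
    details-v1-6dcc6fbb9d-wsjz4.default           SYNCED     SYNCED     SYNCED     SYNCED     istiod-...    1.7.0
    productpage-v1-6c886ff494-7vxhs.default       SYNCED     SYNCED     SYNCED     SYNCED     istiod-...    1.7.0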

    If a proxy is missing from this list it means that it is not currently connected to an Istiod instance, so it will not receive any configuration.

    • SYNCED means that Envoy has acknowledged the last configuration Istiod has sent to it.
    • NOT SENT means that Istiod hasn’t sent anything to Envoy. This is usually because Istiod has nothing to send.
    • STALE means that Istiod has sent an update to Envoy but has not received an acknowledgement. This usually indicates a networking issue between Envoy and Istiod or a bug with Istio itself.

    Retrieve diffs between Envoy and Istiod

    The proxy-status command can also be used to retrieve a diff between the configuration Envoy has loaded and the configuration Istiod would send, by providing a proxy ID. This can help you determine exactly what is out of sync and where the issue may lie.

    $ istioctl proxy-status details-v1-6dcc6fbb9d-wsjz4.default
    --- Istiod Clusters
    +++ Envoy Clusters
    @@ -374,36 +374,14 @@
                 "edsClusterConfig": {
                    "edsConfig": {
                       "ads": {
                       }
                    },
                    "serviceName": "outbound|443||public-cr0bdc785ce3f14722918080a97e1f26be-alb1.kube-system.svc.cluster.local"
    -            },
    -            "connectTimeout": "1.000s",
    -            "circuitBreakers": {
    -               "thresholds": [
    -                  {
    -
    -                  }
    -               ]
    -            }
    -         }
    -      },
    -      {
    -         "cluster": {
    -            "name": "outbound|53||kube-dns.kube-system.svc.cluster.local",
    -            "type": "EDS",
    -            "edsClusterConfig": {
    -               "edsConfig": {
    -                  "ads": {
    -
    -                  }
    -               },
    -               "serviceName": "outbound|53||kube-dns.kube-system.svc.cluster.local"
                 },
                 "connectTimeout": "1.000s",
                 "circuitBreakers": {
                    "thresholds": [
                       {
                       }

    Listeners Match
    Routes Match (RDS last loaded at Tue, 04 Aug 2020 11:52:54 IST)

    Here you can see that the listeners and routes match but the clusters are out of sync.

    Deep dive into Envoy configuration

    The proxy-config command can be used to see how a given Envoy instance is configured. This can help you pinpoint issues that you cannot detect by looking at your Istio configuration alone. For example, to retrieve a summary of the clusters for a given pod:

    $ istioctl proxy-config cluster -n istio-system istio-ingressgateway-7d6874b48f-qxhn5
    SERVICE FQDN                                            PORT      SUBSET     DIRECTION     TYPE           DESTINATION RULE
    BlackHoleCluster                                        -         -          -             STATIC
    agent                                                   -         -          -             STATIC
    details.default.svc.cluster.local                       9080      -          outbound      EDS            details.default
    istio-ingressgateway.istio-system.svc.cluster.local     80        -          outbound      EDS
    istio-ingressgateway.istio-system.svc.cluster.local     443       -          outbound      EDS
    istio-ingressgateway.istio-system.svc.cluster.local     15021     -          outbound      EDS
    istio-ingressgateway.istio-system.svc.cluster.local     15443     -          outbound      EDS
    istiod.istio-system.svc.cluster.local                   443       -          outbound      EDS
    istiod.istio-system.svc.cluster.local                   853       -          outbound      EDS
    istiod.istio-system.svc.cluster.local                   15010     -          outbound      EDS
    istiod.istio-system.svc.cluster.local                   15012     -          outbound      EDS
    istiod.istio-system.svc.cluster.local                   15014     -          outbound      EDS
    kube-dns.kube-system.svc.cluster.local                  53        -          outbound      EDS
    kube-dns.kube-system.svc.cluster.local                  9153      -          outbound      EDS
    kubernetes.default.svc.cluster.local                    443       -          outbound      EDS
    ...
    productpage.default.svc.cluster.local                   9080      -          outbound      EDS
    prometheus_stats                                        -         -          -             STATIC
    ratings.default.svc.cluster.local                       9080      -          outbound      EDS
    reviews.default.svc.cluster.local                       9080      -          outbound      EDS
    sds-grpc                                                -         -          -             STATIC
    xds-grpc                                                -         -          -             STRICT_DNS
    zipkin                                                  -         -          -             STRICT_DNS

    To debug Envoy, you need to understand Envoy clusters/listeners/routes/endpoints and how they all interact. We will use the proxy-config command with the -o json and filtering flags to follow Envoy as it determines where to send a request from the productpage pod to the reviews pod at reviews:9080.

    1. If you query the listener summary on a pod, you will notice that Istio generates the following listeners:

      • A virtual listener per service IP, for each non-HTTP port, for outbound TCP/HTTPS traffic.
      • A virtual listener on the pod IP for each exposed port for inbound traffic.
      • A virtual listener on 0.0.0.0 per each HTTP port for outbound HTTP traffic.
      $ istioctl proxy-config listeners productpage-v1-6c886ff494-7vxhs
      ADDRESS          PORT     MATCH                                                 DESTINATION
      10.96.0.10       53       ALL                                                   Cluster: outbound|53||kube-dns.kube-system.svc.cluster.local
      0.0.0.0          80       App: HTTP                                             Route: 80
      0.0.0.0          80       ALL                                                   PassthroughCluster
      10.100.93.102    443      ALL                                                   Cluster: outbound|443||istiod.istio-system.svc.cluster.local
      10.111.121.13    443      ALL                                                   Cluster: outbound|443||istio-ingressgateway.istio-system.svc.cluster.local
      10.96.0.1        443      ALL                                                   Cluster: outbound|443||kubernetes.default.svc.cluster.local
      10.100.93.102    853      App: HTTP                                             Route: istiod.istio-system.svc.cluster.local:853
      10.100.93.102    853      ALL                                                   Cluster: outbound|853||istiod.istio-system.svc.cluster.local
      0.0.0.0          9080     App: HTTP                                             Route: 9080
      0.0.0.0          9080     ALL                                                   PassthroughCluster
      0.0.0.0          9090     App: HTTP                                             Route: 9090
      0.0.0.0          9090     ALL                                                   PassthroughCluster
      10.96.0.10       9153     App: HTTP                                             Route: kube-dns.kube-system.svc.cluster.local:9153
      10.96.0.10       9153     ALL                                                   Cluster: outbound|9153||kube-dns.kube-system.svc.cluster.local
      0.0.0.0          15001    ALL                                                   PassthroughCluster
      0.0.0.0          15006    Addr: 10.244.0.22/32:15021                            inbound|15021|mgmt-15021|mgmtCluster
      0.0.0.0          15006    Addr: 10.244.0.22/32:9080                             Inline Route: /*
      0.0.0.0          15006    Trans: tls; App: HTTP TLS; Addr: 0.0.0.0/0            Inline Route: /*
      0.0.0.0          15006    App: HTTP; Addr: 0.0.0.0/0                            Inline Route: /*
      0.0.0.0          15006    App: Istio HTTP Plain; Addr: 10.244.0.22/32:9080      Inline Route: /*
      0.0.0.0          15006    Addr: 0.0.0.0/0                                       InboundPassthroughClusterIpv4
      0.0.0.0          15006    Trans: tls; App: TCP TLS; Addr: 0.0.0.0/0             InboundPassthroughClusterIpv4
      0.0.0.0          15010    App: HTTP                                             Route: 15010
      0.0.0.0          15010    ALL                                                   PassthroughCluster
      10.100.93.102    15012    ALL                                                   Cluster: outbound|15012||istiod.istio-system.svc.cluster.local
      0.0.0.0          15014    App: HTTP                                             Route: 15014
      0.0.0.0          15014    ALL                                                   PassthroughCluster
      0.0.0.0          15021    ALL                                                   Inline Route: /healthz/ready*
      10.111.121.13    15021    App: HTTP                                             Route: istio-ingressgateway.istio-system.svc.cluster.local:15021
      10.111.121.13    15021    ALL                                                   Cluster: outbound|15021||istio-ingressgateway.istio-system.svc.cluster.local
      0.0.0.0          15090    ALL                                                   Inline Route: /stats/prometheus*
      10.111.121.13    15443    ALL                                                   Cluster: outbound|15443||istio-ingressgateway.istio-system.svc.cluster.local
    2. From the above summary you can see that every sidecar has a listener bound to 0.0.0.0:15006, which is where iptables routes all inbound pod traffic, and a listener bound to 0.0.0.0:15001, which is where iptables routes all outbound pod traffic. The 0.0.0.0:15001 listener hands the request over to the virtual listener that best matches the original destination of the request, if it can find a matching one. Otherwise, it sends the request to the PassthroughCluster, which connects to the destination directly. You can inspect this virtual outbound listener directly, as shown below.
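
      One way to inspect it is to dump the port 15001 listener in full. The JSON below is an abridged, illustrative sketch: depending on your Istio and Envoy versions, the listener is named virtualOutbound and the original-destination hand-off appears either as "useOriginalDst": true or as an original_dst listener filter:

      $ istioctl proxy-config listeners productpage-v1-6c886ff494-7vxhs --port 15001 -o json
      [
          {
              "name": "virtualOutbound",
              "address": {
                  "socketAddress": {
                      "address": "0.0.0.0",
                      "portValue": 15001
                  }
              },
              ...
              "useOriginalDst": true
          }
      ]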

    3. Our request is an outbound HTTP request to port 9080; this means it gets handed off to the 0.0.0.0:9080 virtual listener. This listener then looks up the route configuration in its configured RDS. In this case it will look up route 9080 in the RDS configuration supplied by Istiod (via ADS).

      $ istioctl proxy-config listeners productpage-v1-6c886ff494-7vxhs -o json --address 0.0.0.0 --port 9080
      ...
      "rds": {
          "configSource": {
              "ads": {},
              "resourceApiVersion": "V3"
          },
          "routeConfigName": "9080"
      }
      ...
    4. The 9080 route configuration only has a virtual host for each service. Our request is heading to the reviews service, so Envoy will select the virtual host whose domains match the request. Once matched on domain, Envoy looks for the first route that matches the request. In this case we don’t have any advanced routing configured, so there is only one route and it matches everything. This route tells Envoy to send the request to the outbound|9080||reviews.default.svc.cluster.local cluster.

      $ istioctl proxy-config routes productpage-v1-6c886ff494-7vxhs --name 9080 -o json
      [
          {
              "name": "9080",
              "virtualHosts": [
                  {
                      "name": "reviews.default.svc.cluster.local:9080",
                      "domains": [
                          "reviews.default.svc.cluster.local",
                          "reviews.default.svc.cluster.local:9080",
                          "reviews",
                          "reviews:9080",
                          "reviews.default.svc.cluster",
                          "reviews.default.svc.cluster:9080",
                          "reviews.default.svc",
                          "reviews.default.svc:9080",
                          "reviews.default",
                          "reviews.default:9080",
                          "10.98.88.0",
                          "10.98.88.0:9080"
                      ],
                      "routes": [
                          {
                              "name": "default",
                              "match": {
                                  "prefix": "/"
                              },
                              "route": {
                                  "cluster": "outbound|9080||reviews.default.svc.cluster.local",
                                  "timeout": "0s",
                              }
                          }
                      ]
      ...
    5. This cluster is configured to retrieve the associated endpoints from Istiod (via ADS). Envoy will then use the serviceName field as a key to look up the list of endpoints and proxy the request to one of them (see the endpoints listing after the cluster dump below).

      $ istioctl proxy-config cluster productpage-v1-6c886ff494-7vxhs --fqdn reviews.default.svc.cluster.local -o json
      [
          {
              "name": "outbound|9080||reviews.default.svc.cluster.local",
              "type": "EDS",
              "edsClusterConfig": {
                  "edsConfig": {
                      "ads": {},
                      "resourceApiVersion": "V3"
                  },
                  "serviceName": "outbound|9080||reviews.default.svc.cluster.local"
              },
              "connectTimeout": "10s",
              "circuitBreakers": {
                  "thresholds": [
                      {
                          "maxConnections": 4294967295,
                          "maxPendingRequests": 4294967295,
                          "maxRequests": 4294967295,
                          "maxRetries": 4294967295
                      }
                  ]
              },
          }
      ]
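
    To see the endpoints Envoy currently knows for this cluster, you can use the proxy-config endpoints command with the --cluster filter. The endpoint addresses below are illustrative; the pod IPs will differ in your cluster:

      $ istioctl proxy-config endpoints productpage-v1-6c886ff494-7vxhs --cluster "outbound|9080||reviews.default.svc.cluster.local"
      ENDPOINT             STATUS      OUTLIER CHECK     CLUSTER
      172.17.0.7:9080      HEALTHY     OK                outbound|9080||reviews.default.svc.cluster.local
      172.17.0.8:9080      HEALTHY     OK                outbound|9080||reviews.default.svc.cluster.local
      172.17.0.9:9080      HEALTHY     OK                outbound|9080||reviews.default.svc.cluster.local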

    Inspecting bootstrap configuration

    So far we have looked at configuration retrieved (mostly) from Istiod; however, Envoy requires some bootstrap configuration that includes information such as where Istiod can be found. To view this, use the following command:

    $ istioctl proxy-config bootstrap -n istio-system istio-ingressgateway-7d6874b48f-qxhn5
    {
        "bootstrap": {
            "node": {
                "id": "router~172.30.86.14~istio-ingressgateway-7d6874b48f-qxhn5.istio-system~istio-system.svc.cluster.local",
                "cluster": "istio-ingressgateway",
                "metadata": {
                    "CLUSTER_ID": "Kubernetes",
                    "EXCHANGE_KEYS": "NAME,NAMESPACE,INSTANCE_IPS,LABELS,OWNER,PLATFORM_METADATA,WORKLOAD_NAME,MESH_ID,SERVICE_ACCOUNT,CLUSTER_ID",
                    "INSTANCE_IPS": "10.244.0.7",
                    "ISTIO_PROXY_SHA": "istio-proxy:f98b7e538920abc408fbc91c22a3b32bc854d9dc",
                    "ISTIO_VERSION": "1.7.0",
                    "LABELS": {
                        "app": "istio-ingressgateway",
                        "chart": "gateways",
                        "heritage": "Tiller",
                        "istio": "ingressgateway",
                        "pod-template-hash": "68bf7d7f94",
                        "release": "istio",
                        "service.istio.io/canonical-name": "istio-ingressgateway",
                        "service.istio.io/canonical-revision": "latest"
                    },
                    "MESH_ID": "cluster.local",
                    "NAME": "istio-ingressgateway-68bf7d7f94-sp226",
                    "NAMESPACE": "istio-system",
                    "OWNER": "kubernetes://apis/apps/v1/namespaces/istio-system/deployments/istio-ingressgateway",
                    "ROUTER_MODE": "sni-dnat",
                    "SDS": "true",
                    "SERVICE_ACCOUNT": "istio-ingressgateway-service-account",
                    "WORKLOAD_NAME": "istio-ingressgateway"
                },
                "userAgentBuildVersion": {
                    "version": {
                        "majorNumber": 1,
                        "minorNumber": 15
                    },
                    "metadata": {
                        "build.type": "RELEASE",
                        "revision.sha": "f98b7e538920abc408fbc91c22a3b32bc854d9dc",
                        "revision.status": "Clean",
                        "ssl.version": "BoringSSL"
                    }
                },
            },
            ...

    Verifying connectivity to Istiod

    Verifying connectivity to Istiod is a useful troubleshooting step. Every proxy container in the service mesh should be able to communicate with Istiod. This can be accomplished in a few simple steps:

    1. Create a sleep pod:

      $ kubectl create namespace foo
      $ kubectl apply -f <(istioctl kube-inject -f samples/sleep/sleep.yaml) -n foo
    2. Test connectivity to Istiod using curl. The following example queries Istiod’s debug endpoint on its default monitoring port, 15014:

      $ kubectl exec $(kubectl get pod -l app=sleep -n foo -o jsonpath={.items..metadata.name}) -c sleep -n foo -- curl -sS istiod.istio-system:15014/debug/endpointz

    You should receive a response listing the “service” and “endpoint” for each service in the mesh.

    What Envoy version is Istio using?

    To find out the Envoy version used in a deployment, you can exec into the istio-proxy container and query Envoy’s /server_info admin endpoint, as sketched below.
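
    A minimal sketch of one way to do this, assuming the Bookinfo productpage pod from earlier (any pod with an istio-proxy container works); pilot-agent forwards the request to Envoy’s local admin API, and the JSON response includes a version field:

    $ kubectl exec -it productpage-v1-6c886ff494-7vxhs -c istio-proxy -- pilot-agent request GET server_info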