Debugging 502s
Linkerd turns connection errors into HTTP 502 responses. This can make issues that previously went undetected suddenly visible, which is a good thing. Linkerd also changes the way that connections to your application are managed: it re-uses persistent connections and establishes an additional layer of connection tracking. Managing connections in this way can sometimes expose underlying application or infrastructure issues, such as misconfigured connection timeouts, which can manifest as connection errors.
The Linkerd proxy just sees its connections to the application refused or closed without explanation. This makes it nearly impossible for Linkerd to report any error message in the 502 response. However, if these errors coincide with the introduction of Linkerd, it does suggest that the problem is related to connection re-use or connection tracking. Here are some common reasons why the application may be refusing or terminating connections.
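Idle Timeouts
Some servers are configured with an idle timeout and close any connection that has not carried traffic for that period. Because Linkerd re-uses persistent connections, a request can be sent on a connection just as the server is closing it, and that request fails. To remedy this, ensure that your server’s idle timeouts are sufficiently long so that they do not close connections which are actively in use.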
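The idle-timeout failure mode can be reproduced locally. The following is a minimal sketch in Python, not Linkerd's behavior or any particular server's implementation: a toy echo server closes a connection after a hypothetical `idle_timeout`, and a client standing in for a connection-pooling proxy tries to re-use that connection after a pause. The function names and timings are illustrative assumptions.

```python
import socket
import threading
import time


def serve_with_idle_timeout(listener: socket.socket, idle_timeout: float) -> None:
    """Accept one connection and echo data until it sits idle for too long."""
    conn, _ = listener.accept()
    conn.settimeout(idle_timeout)  # hypothetical stand-in for a server's idle timeout
    try:
        while True:
            data = conn.recv(1024)
            if not data:
                break  # the client closed the connection first
            conn.sendall(data)
    except socket.timeout:
        pass  # idle for longer than idle_timeout: fall through and close
    finally:
        conn.close()


def reuse_survives_idle(idle_timeout: float, client_pause: float) -> bool:
    """Return True if a second request on the same connection still succeeds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(
        target=serve_with_idle_timeout, args=(listener, idle_timeout)
    )
    server.start()
    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)
    try:
        client.sendall(b"first")
        assert client.recv(1024) == b"first"
        time.sleep(client_pause)  # the connection sits idle, as in a connection pool
        client.sendall(b"second")
        # An empty read means the server had already closed its side.
        return client.recv(1024) == b"second"
    except ConnectionError:
        return False  # reset or broken pipe: the server closed the connection
    finally:
        client.close()
        server.join()
        listener.close()
```

With a short idle timeout and a longer pause, for example `reuse_survives_idle(0.2, 0.6)`, the second request is expected to fail; with a timeout comfortably longer than the pause it should succeed. That is the property the remedy above aims for.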
Half-closed Connection Timeouts
During the shutdown of a TCP connection, each side of the connection must be closed independently. When one side is closed but the other is not, the connection is said to be “half-closed”. It is valid for a connection to be in this state. However, the operating system’s connection tracker can lose track of connections which remain half-closed for long periods of time. This can lead to responses not being delivered and to port conflicts when establishing new connections, both of which manifest as 502 responses.
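To make the half-closed state concrete, here is a small illustrative sketch in Python using only the standard `socket` module. It is not taken from Linkerd: the client closes only its write side with `shutdown(SHUT_WR)`, and the server can still deliver a response on the same connection. The function name `half_close_demo` is hypothetical.

```python
import socket


def half_close_demo():
    """Show that data still flows one way after the other side stops writing."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        client.sendall(b"request")
        # Close only the client's write side. The connection is now half-closed.
        client.shutdown(socket.SHUT_WR)

        received = b""
        while True:
            chunk = server.recv(1024)
            if not chunk:  # empty read: the client has finished sending
                break
            received += chunk

        # The server's write side is still open, so the response is delivered.
        server.sendall(b"response")
        server.shutdown(socket.SHUT_WR)  # now both directions are closed

        reply = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            reply += chunk
        return received, reply
    finally:
        client.close()
        server.close()
        listener.close()
```

In this sketch the half-closed window lasts only an instant. The problems described above arise when an application leaves a connection in that state for much longer than the connection tracker's timeout.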
One solution would be to update your application to not leave connections half-closed for long periods of time or to stop using software that does this. Unfortunately, this is not always an option.
Another option is to increase the connection tracker’s timeout for half-closed connections. The default value of this timeout is platform dependent but is typically 1 minute or 1 hour. You can view the current value by looking at the file `/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait` in any injected container. To increase this value, you can use the `--close-wait-timeout` flag with `linkerd inject`. Note, however, that setting this flag will also set the `privileged` field of the proxy init container to true. Setting this timeout to 1 hour is usually sufficient and matches the value used by kube-proxy.
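As a rough way to check the current setting from inside a container, the sketch below reads the timeout and compares it with the 1 hour suggested above. It assumes the relevant file is the Linux netfilter setting `nf_conntrack_tcp_timeout_close_wait` under `/proc/sys/net/netfilter/` and that it holds a value in seconds; verify both on your platform. The helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the conntrack CLOSE_WAIT timeout on Linux; verify on your platform.
CLOSE_WAIT_TIMEOUT_PATH = Path(
    "/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait"
)
RECOMMENDED_SECONDS = 3600  # the 1 hour suggested in the text


def read_timeout_seconds(path: Path = CLOSE_WAIT_TIMEOUT_PATH) -> Optional[int]:
    """Return the timeout in seconds, or None if the file is absent or unparsable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def is_sufficient(seconds: Optional[int], minimum: int = RECOMMENDED_SECONDS) -> bool:
    """Treat an unknown value as insufficient so that it gets investigated."""
    return seconds is not None and seconds >= minimum


if __name__ == "__main__":
    current = read_timeout_seconds()
    print("current:", current, "sufficient:", is_sufficient(current))
```

A result of `None` only means that the file could not be read in the environment where the script ran, for example outside a Linux container or where the conntrack module is not loaded.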