Distributed tracing with Linkerd
This guide will walk you through configuring and enabling tracing for emojivoto. Jump to the end for some recommendations on the best way to make use of distributed tracing with Linkerd.
To use distributed tracing, you’ll need to:
- Add a tracing backend to explore traces.
- Modify your application to emit spans.
- Configure Linkerd’s proxies to emit spans.
In the case of emojivoto, once all these steps are complete there will be a topology that looks like:
[Figure: Topology]
- To use this guide, you’ll need to have Linkerd installed on your cluster. Follow the Installing Linkerd Guide if you haven’t already done this.
Install Trace Collector & Jaeger
The first step in setting up distributed tracing is installing a collector onto your cluster. This component consists of “receivers” that consume spans emitted from the mesh and your applications, as well as “exporters” that convert spans and forward them to a backend.
Next, you will need Jaeger in your cluster. It stores the traces, makes them searchable, and provides visualization of all the data being emitted.
Linkerd now has add-ons, which enable users to install extra components that integrate well with Linkerd. Tracing is one such add-on, and includes the OpenCensus Collector and Jaeger.
To add the Tracing Add-On to your cluster, first create a configuration file, config.yaml, that enables the Tracing Add-On.
This configuration file can also carry configuration for other add-ons, not just the Tracing Add-On. More information on the allowed configuration fields can be found in the Linkerd documentation.
The configuration can now be applied by passing the file to the CLI, or through values.yaml with Helm.
linkerd upgrade --addon-config config.yaml | kubectl apply -f -
You will now have linkerd-collector and linkerd-jaeger deployments in the linkerd namespace that are running as part of the mesh. The collector has been configured to:
- Receive spans from OpenCensus clients
- Export spans to a Jaeger backend
The collector is extremely configurable and can use the receiver or exporter of your choice.
Jaeger itself is made up of many components. The all-in-one image bundles all these components into a single container to make demos and showing off tracing a little bit easier.
Before moving on to the next step, make sure everything is up and running with kubectl:
kubectl -n linkerd rollout status deploy/linkerd-collector
kubectl -n linkerd rollout status deploy/linkerd-jaeger
Install Emojivoto
It is possible to use linkerd inject to add the proxy to emojivoto, as outlined in the Linkerd documentation. Alternatively, annotations can do the same thing. You can patch these onto the running application with:
spec:
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
        config.alpha.linkerd.io/trace-collector-service-account: linkerd-collector
'
Before moving on to the next step, make sure everything is up and running with kubectl:
kubectl -n emojivoto rollout status deploy/web
Unlike most features of a service mesh, distributed tracing requires modifying the source of your application. Tracing needs some way to tie incoming requests to your application together with outgoing requests to dependent services. To do this, some headers are added to each request that contain a unique ID for the trace. Linkerd uses the b3 propagation format to tie these things together.
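As a rough illustration of what "tying requests together" means, the sketch below copies the trace context headers from an incoming request onto an outgoing one. It assumes the multi-header variant of b3 (the `X-B3-*` header names). The `propagate_b3` helper and the sample header values are hypothetical and are not taken from emojivoto; in practice a client library does this for you.

```python
from typing import Dict, Optional

# Header names used by the multi-header variant of the b3 format (lowercased).
B3_HEADERS = (
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def propagate_b3(incoming: Dict[str, str],
                 outgoing: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of `outgoing` with any b3 headers from `incoming` added.

    Hypothetical helper for illustration only.
    """
    result = dict(outgoing or {})
    for name, value in incoming.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() in B3_HEADERS:
            result[name] = value
    return result


# Example values only; real IDs are generated by whoever starts the trace.
incoming = {
    "Host": "web-svc.emojivoto",
    "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
    "X-B3-Sampled": "1",
}
outgoing = propagate_b3(incoming, {"Content-Type": "application/grpc"})
```

Only the trace context is carried over; unrelated headers such as `Host` stay with the incoming request.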
We’ve already modified emojivoto to instrument its requests with this information; the emojivoto source shows how this was done. For most programming languages, it simply requires the addition of a client library to take care of this. Emojivoto uses the OpenCensus client, but others can be used.
To enable tracing in emojivoto, run:
This command will add an environment variable that enables the applications to propagate context and emit spans.
Explore Jaeger
With vote-bot starting traces for every request, spans should now be showing up in Jaeger. To get to the UI, start a port forward and send your browser to http://localhost:16686.
kubectl -n linkerd port-forward svc/linkerd-jaeger 16686
[Figure: Jaeger]
You can search for any service in the dropdown and click Find Traces. vote-bot is a great way to get started.
[Figure: Search]
Clicking on a specific trace will provide all the details. You’ll be able to see the spans for every proxy!
[Figure: Search]
There sure are a lot of linkerd-proxy spans in that output. Internally, the proxy has a server and client side. When a request goes through the proxy, it is received by the server and then issued by the client. For a single request that goes between two meshed pods, there will be a total of 4 spans. Two will be on the source side as the request traverses that proxy and two will be on the destination side as the request is received by the remote proxy.
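The arithmetic above can be sketched as a small enumeration: every meshed call passes through two proxies, and each proxy contributes a server-side and a client-side span. The `proxy_spans` function and the dictionary shape are hypothetical, and the pod names in the example are illustrative only.

```python
def proxy_spans(hops):
    """List the linkerd-proxy spans expected for a chain of meshed calls.

    `hops` is a list of (source_pod, destination_pod) pairs, one per request.
    Hypothetical helper for illustration only.
    """
    spans = []
    for source, destination in hops:
        # The request traverses the source proxy, then the destination proxy.
        for pod in (source, destination):
            # Each proxy receives the request (server) and re-issues it (client).
            for side in ("server", "client"):
                spans.append({"pod": pod, "side": side})
    return spans


single = proxy_spans([("vote-bot", "web")])
chain = proxy_spans([("vote-bot", "web"), ("web", "emoji")])
print(len(single), len(chain))  # prints "4 8"
```

Spans emitted by the applications themselves come on top of these proxy spans.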
[Figure: Linkerd-Jaeger]
Cleanup
To clean up, remove the tracing components along with emojivoto by running:
kubectl delete ns tracing emojivoto
The Linkerd proxy uses the b3 propagation format. Some client libraries, such as Jaeger, use different formats by default. You’ll want to configure your client library to use the b3 format to have the proxies participate in traces.
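When proxy spans are missing, one quick check is to look at which trace context headers your application actually sends. The sketch below guesses the propagation format from header names. It assumes the multi-header b3 names (`X-B3-TraceId`), the W3C Trace Context `traceparent` header, and the Jaeger client's default `uber-trace-id` header. The `detect_trace_format` helper is hypothetical.

```python
def detect_trace_format(headers):
    """Guess the propagation format from request header names.

    Returns "b3", "w3c-trace-context", "jaeger", or None if no known
    trace context header is present. Hypothetical diagnostic helper.
    """
    names = {name.lower() for name in headers}
    if "x-b3-traceid" in names:
        return "b3"
    if "traceparent" in names:
        return "w3c-trace-context"
    if "uber-trace-id" in names:
        return "jaeger"
    return None


print(detect_trace_format({"X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124"}))
print(detect_trace_format({"uber-trace-id": "example-value"}))
```

If the result is anything other than `"b3"`, the client library most likely needs to be reconfigured before the proxies will join the trace.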
Instead of requiring complex client configuration to ensure spans are encrypted in transit, Linkerd relies on its mTLS implementation. This means the collector must be part of the mesh. If you are using a service account other than default for the collector, the proxies must be configured to use it as well, with the config.alpha.linkerd.io/trace-collector-service-account annotation.
Recommendations
The ingress is an especially important component for distributed tracing because it creates the root span of each trace and is responsible for deciding if that trace should be sampled or not. Having the ingress make all sampling decisions ensures that either an entire trace is sampled or none of it is, and avoids creating “partial traces”.
Distributed tracing systems all rely on services to propagate metadata about the current trace from requests that they receive to requests that they send. This metadata, called the trace context, is usually encoded in one or more request headers. There are many different trace context header formats and, while we hope that the ecosystem will eventually converge on an open standard, we only use the b3 format today. Being one of the earliest widely used formats, it has the widest support, especially among ingresses like Nginx.
This reference architecture includes a simple Nginx config that samples 50% of traces and emits trace data to the collector (using the Zipkin protocol). Any ingress controller can be used here in place of Nginx as long as it:
- Supports probabilistic sampling
- Encodes trace context in the b3 format
- Emits spans in a protocol supported by the OpenCensus collector
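To make the first two requirements concrete, here is a minimal sketch of head-based probabilistic sampling: the ingress decides once per trace and encodes that decision in the b3 headers, and every downstream service honors it rather than sampling again. The `start_trace` and `should_record` functions are hypothetical. This is not how Nginx implements sampling, only an illustration of the idea.

```python
import random
import secrets


def start_trace(sample_rate=0.5, rng=random):
    """Create b3 headers for a root span, deciding once whether to sample.

    Hypothetical helper; `sample_rate` is the fraction of traces to keep.
    """
    sampled = rng.random() < sample_rate
    return {
        "X-B3-TraceId": secrets.token_hex(16),  # 128-bit ID, 32 hex characters
        "X-B3-SpanId": secrets.token_hex(8),    # 64-bit ID, 16 hex characters
        "X-B3-Sampled": "1" if sampled else "0",
    }


def should_record(headers):
    """Downstream services reuse the ingress decision instead of re-sampling."""
    return headers.get("X-B3-Sampled") == "1"


root = start_trace(sample_rate=0.5)
print(root["X-B3-Sampled"], should_record(root))
```

Because the decision travels with the request, either every service records its spans for a trace or none does, which is what prevents partial traces.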
If using helm to install ingress-nginx, you can configure tracing by using:
While it is possible for services to propagate trace context headers manually, it’s usually much easier to use a library that does three things:
- Propagates the trace context from incoming request headers to outgoing request headers
- Modifies the trace context (i.e. starts a new span)
- Transmits this data to a trace collector
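The three responsibilities listed above can be sketched in a few lines. This is a simplified illustration, not the OpenCensus API: `InMemoryExporter` and `handle_request` are hypothetical names, the exporter only appends to a list instead of talking to a collector, and a missing `X-B3-Sampled` header is treated as "not sampled" to keep the example short.

```python
import secrets


class InMemoryExporter:
    """Stand-in for a real exporter; it only collects spans in a list."""

    def __init__(self):
        self.spans = []

    def export(self, span):
        self.spans.append(span)


def handle_request(incoming, exporter):
    """Build the trace headers for an outgoing call made while serving `incoming`."""
    # 1. Propagate: the trace ID and sampling decision are carried over unchanged.
    trace_id = incoming["X-B3-TraceId"]
    sampled = incoming.get("X-B3-Sampled", "0")

    # 2. Modify: start a new span whose parent is the caller's span.
    parent_id = incoming["X-B3-SpanId"]
    span_id = secrets.token_hex(8)

    # 3. Transmit: hand the span to the exporter, but only for sampled traces.
    if sampled == "1":
        exporter.export(
            {"trace_id": trace_id, "span_id": span_id, "parent_id": parent_id}
        )

    return {
        "X-B3-TraceId": trace_id,
        "X-B3-SpanId": span_id,
        "X-B3-ParentSpanId": parent_id,
        "X-B3-Sampled": sampled,
    }


exporter = InMemoryExporter()
outgoing = handle_request(
    {
        "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
        "X-B3-SpanId": "a2fb4a1d1a96d312",
        "X-B3-Sampled": "1",
    },
    exporter,
)
```

A real client library also records timing and metadata on each span and batches exports, which is the main reason to prefer one over hand-written propagation.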
We recommend using OpenCensus in your service and configuring it with:
- b3 trace propagation (this is the default)
- the OpenCensus agent exporter
The OpenCensus agent exporter will export trace data to the OpenCensus collector over a gRPC API. The details of how to configure OpenCensus vary from language to language. You can see an end-to-end example of this in Go in our example application, Emojivoto.
You may notice that the OpenCensus project is in maintenance mode and will become part of OpenTelemetry. Unfortunately, OpenTelemetry is not yet production ready and so OpenCensus remains our recommendation for the moment.
It is possible to use many other tracing client libraries as well. Just make sure the b3 propagation format is being used and the client library can export its spans in a format the collector has been configured to receive.
Collector: OpenCensus
The OpenCensus collector receives trace data from the OpenCensus agent exporter and potentially does translation and filtering before sending that data to Jaeger. Having the OpenCensus exporter send to the OpenCensus collector gives us a lot of flexibility: we can switch to any backend that OpenCensus supports without needing to interrupt the application.
Jaeger is one of the most widely used tracing backends and for good reason: it is easy to use and does a great job of visualizing traces. However, any backend that OpenCensus supports can be used instead.
Linkerd
- Set the annotation on the namespace or pod specs that you want to participate in traces. This should be set to the address of the OpenCensus collector service.
- Set the config.alpha.linkerd.io/trace-collector-service-account annotation on the namespace or pod specs that you want to participate in traces. This should be set to the name of the collector’s service account and is used to ensure secure communication between the proxy and the collector. It can be omitted if the collector is running as the default service account.
While Linkerd can only actively participate in traces that use the b3 propagation format, Linkerd will always forward unknown request headers transparently, which means it will never interfere with traces that use other propagation formats.