Tracing Ceph With Blkin

    In general, Blkin implements the Dapper tracing semantics in order to show the causal relationships between the different processing phases that an IO request may trigger. The goal is an end-to-end visualisation of the request’s route through the system, accompanied by latency information for each processing phase. Thanks to LTTng, this can happen with minimal overhead and in real time. The LTTng traces can then be visualized with Twitter’s Zipkin.

    You can install Markos Kogias’ upstream Blkin by hand:

    cd blkin/
    make && make install

    or build distribution packages using , which also comes with pkgconfig support. If you choose the latter, then you must generate the configure and make files first:

    cd blkin
    autoreconf -i

    Configuring Ceph with Blkin

    If you built and installed Blkin by hand, rather than building and installing packages, then set these variables before configuring Ceph:

    export BLKIN_CFLAGS=-Iblkin/

    Blkin support in Ceph is disabled by default, so you may want to configure with something like:

    ./do_cmake.sh -DWITH_BLKIN=ON

    The Blkin config option must be set to true in ceph.conf to get traces from rbd through OSDC and OSD:

    rbd_blkin_trace_all = true

    It’s easy to test Ceph’s Blkin tracing. Let’s assume you don’t have Ceph already running, and you compiled Ceph with Blkin support but you didn’t install it. Then launch Ceph with the vstart.sh script in Ceph’s src directory so you can see the possible tracepoints:

    UST events:

    PID: 8987 - Name: ./ceph-osd
        zipkin:timestamp (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_integer (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_string (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)

    PID: 8407 - Name: ./ceph-mon
        zipkin:timestamp (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_integer (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_string (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)
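
    A listing like the one above is what `lttng list --userspace` prints while the daemons are running. As a small sketch, the hypothetical helper below (the name `zipkin_tracepoints` is an assumption, not part of LTTng or Ceph) reduces such a listing to the distinct Blkin tracepoint names, which is a quick way to confirm that the build really has Blkin support:

```shell
# Hypothetical helper: read an `lttng list --userspace` listing on stdin
# and print each distinct zipkin tracepoint name once.
zipkin_tracepoints() {
    grep -o 'zipkin:[a-z_]*' | sort -u
}

# Usage (requires the Ceph daemons to be running):
#   lttng list --userspace | zipkin_tracepoints
```

    If nothing is printed, the running daemons expose no zipkin tracepoints.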

    Next, stop Ceph so that the tracepoints can be enabled:

    ./stop.sh

    Start up an LTTng session and enable the tracepoints:

    lttng create blkin-test
    lttng enable-event --userspace zipkin:timestamp
    lttng enable-event --userspace zipkin:keyval_integer
    lttng enable-event --userspace zipkin:keyval_string
    lttng start
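
    The session setup above can also be wrapped in a single function so that it is easy to repeat between test runs. The following is only a sketch: the function name `start_blkin_session` and its default session name are assumptions, and it uses only the lttng subcommands already shown.

```shell
# Hypothetical wrapper around the session setup commands shown above.
start_blkin_session() {
    session="${1:-blkin-test}"
    lttng create "$session" || return 1
    # Enable each of the three zipkin tracepoints in turn.
    for event in timestamp keyval_integer keyval_string; do
        lttng enable-event --userspace "zipkin:$event" || return 1
    done
    lttng start
}

# Usage: start_blkin_session blkin-test
```

    The function stops at the first failing lttng call, so a half-configured session is not started by accident.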

    Then start up Ceph again:

    OSD=3 MON=3 RGW=1 ./vstart.sh -n

    You may want to check that Ceph is up:

    ./ceph status

    You could also use the example in or .

    Then stop the LTTng session and see what was collected:

    lttng stop
    lttng view

    You’ll see something like:

    [15:33:08.884275486] (+0.000225472) ubuntu zipkin:timestamp: { cpu_id = 53 }, { trace_name = "op", service_name = "Objecter", port_no = 0, ip = "0.0.0.0", trace_id = 5485970765435202833, span_id = 5485970765435202833, parent_span_id = 0, event = "osd op reply" }
    [15:33:08.884614135] (+0.000002839) ubuntu zipkin:keyval_integer: { cpu_id = 10 }, { trace_name = "", service_name = "Messenger", port_no = 6805, ip = "0.0.0.0", trace_id = 7381732770245808782, span_id = 7387710183742669839, parent_span_id = 1205040135881905799, key = "tid", val = 2 }
    [15:33:08.884616431] (+0.000002296) ubuntu zipkin:keyval_string: { cpu_id = 10 }, { trace_name = "", service_name = "Messenger", port_no = 6805, ip = "0.0.0.0", trace_id = 7381732770245808782, span_id = 7387710183742669839, parent_span_id = 1205040135881905799, key = "entity type", val = "client" }
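
    Raw `lttng view` output gets long quickly. As a rough sketch, the hypothetical helper below (its name, `events_per_service`, is an assumption) counts events per `service_name`, relying only on the field layout visible in the sample lines above:

```shell
# Hypothetical helper: count `lttng view` events per service_name field.
events_per_service() {
    awk '
        match($0, /service_name = "[^"]*"/) {
            # Skip the 16-character prefix (service_name = ") and the closing quote.
            name = substr($0, RSTART + 16, RLENGTH - 17)
            count[name]++
        }
        END { for (name in count) print name, count[name] }
    ' | sort
}

# Usage: lttng view | events_per_service
```

    For the three sample lines above this would report `Messenger 2` and `Objecter 1`, which gives a first impression of which Ceph components emitted trace events.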

    One of the points of using Blkin is that you can look at the traces using Zipkin. Run Zipkin as both a tracepoint collector and a web service. The executable jar runs a collector on port 9410 and the web interface on port 9411.

    Download Zipkin Package:

    git clone https://github.com/openzipkin/zipkin && cd zipkin
    wget -O zipkin.jar 'https://search.maven.org/remote_content?g=io.zipkin.java&a=zipkin-server&v=LATEST&c=exec'
    java -jar zipkin.jar

    Show Ceph’s Blkin Traces in Zipkin-web

    Download the babeltrace-zipkin project. It takes the traces generated with Blkin and sends them to a Zipkin collector using scribe:

    git clone https://github.com/vears91/babeltrace-zipkin
    cd babeltrace-zipkin

    Send the LTTng data to Zipkin:

    python3 babeltrace_zipkin.py ${lttng-traces-dir}/${blkin-test}/ust/uid/0/64-bit/ -p ${zipkin-collector-port(9410 by default)} -s ${zipkin-collector-ip}
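
    The `${...}` items in the command above are placeholders, not literal shell variables (typed as-is, `${lttng-traces-dir}` would normally expand to the text `traces-dir`). As a sketch of filling in the trace directory, the hypothetical helper below picks the newest trace directory for a session. It assumes LTTng’s default layout of `~/lttng-traces/<session>-<timestamp>/`; the helper name `latest_trace_dir` is an assumption for illustration.

```shell
# Hypothetical helper: print the newest trace directory for a session,
# assuming LTTng's default <base>/<session>-<YYYYMMDD>-<HHMMSS>/ naming.
latest_trace_dir() {
    session="${1:-blkin-test}"
    base="${2:-$HOME/lttng-traces}"
    latest=""
    # Glob results are sorted, so the last match carries the newest timestamp.
    for dir in "$base/$session"-*/; do
        if [ -d "$dir" ]; then latest="$dir"; fi
    done
    [ -n "$latest" ] && printf '%s\n' "${latest%/}"
}

# Usage (assumes a collector on 127.0.0.1:9410 and the uid/0 path used above):
#   python3 babeltrace_zipkin.py "$(latest_trace_dir)/ust/uid/0/64-bit/" -p 9410 -s 127.0.0.1
```

    The `uid/0` component matches the example above; if the traced daemons ran as a different user, that directory name will differ.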

    Check the Ceph traces in the web interface:

    Browse to http://${zipkin-collector-ip}:9411