Monitoring ArangoDB using collectd

    Solution

    Collectd is an excellent tool for gathering all kinds of metrics from a system and delivering them to a central monitoring system such as Graphite.

    For this recipe you need to install collectd itself and kcollectd, which is used at the end to view the collected values.

    Configuring collectd

    To aggregate the values we will use the cURL-JSON plug-in. We will store the values in a Round-Robin-Database (RRD), which kcollectd can later present to you.

    We assume your collectd comes from your distribution and reads its config from /etc/collectd/collectd.conf. Since this file tends to become pretty unreadable quickly, we use the include mechanism to split the configuration into separate files.

    This way we can keep each metric group in its own compact set of config files. Each set consists of three components:
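
    As a minimal sketch of the include mechanism, the following fragment in /etc/collectd/collectd.conf pulls in every file ending in .conf from a separate directory. The directory name /etc/collectd/collectd.conf.d is an assumption; use whatever location your distribution provides.

    ```conf
    # Assumed directory; adjust to your distribution's layout.
    <Include "/etc/collectd/collectd.conf.d">
      # Only read files whose names end in .conf
      Filter "*.conf"
    </Include>
    ```

    Older collectd versions may not support the block form; there, a single line such as Include "/etc/collectd/collectd.conf.d/*.conf" serves the same purpose.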

    • loading the plug-in
    • adding metrics to the TypesDB
    • the configuration for the plug-in itself

    We will use the Round-Robin-Database as the storage backend for now. It creates its own database files of fixed size for each specific time range. Later you may choose more advanced writer plug-ins, which can distribute your metrics over the network or integrate with the above-mentioned Graphite or your already established monitoring.

        # Load the plug-in:
        LoadPlugin rrdtool
        <Plugin rrdtool>
          DataDir "/var/lib/collectd/rrd"
          # CacheTimeout 120
          # CacheFlush 900
          # WritesPerSecond 30
          # CreateFilesAsync false
          # RandomTimeout 0
          #
          # The following settings are rather advanced
          # and should usually not be touched:
          # StepSize 10
          # HeartBeat 20
          # RRARows 1200
          # RRATimespan 158112000
          # XFF 0.1
        </Plugin>
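
    As a hedged sketch of the more advanced writer plug-ins mentioned above, the following fragment sends the metrics to Graphite through collectd's write_graphite plug-in, either instead of or in addition to rrdtool. The node name, host, port, and prefix are assumptions for illustration; adjust them to your Graphite installation.

    ```conf
    # Load the plug-in:
    LoadPlugin write_graphite
    <Plugin write_graphite>
      # "graphite" is an arbitrary, hypothetical node name.
      <Node "graphite">
        # Assumed address of the Graphite (carbon) listener:
        Host "localhost"
        Port "2003"
        Protocol "tcp"
        # Prepended to every metric name sent to Graphite:
        Prefix "collectd."
      </Node>
    </Plugin>
    ```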

    cURL JSON

    Collectd comes with a wide range of metric aggregation plug-ins. Many tools today use JSON as their data format; so does ArangoDB.

    Therefore a plug-in that fetches JSON documents via HTTP is the perfect match for querying ArangoDB's administrative statistics interface:

        # Load the plug-in:
        LoadPlugin curl_json
        # we need to use our own types to generate individual names for our gauges:
        TypesDB "/etc/collectd/arangodb_types.db"
        <Plugin curl_json>
          # Adjust the URL so collectd can reach your arangod:
          <URL "http://localhost:8529/_db/_system/_admin/statistics">
            # Set your authentication to Aardvark here:
            User "root"
            # Password "bar"
            <Key "http/requestsTotal">
              Type "gauge"
            </Key>
            <Key "http/requestsPatch">
              Type "gauge"
            </Key>
            <Key "http/requestsPut">
              Type "gauge"
            </Key>
            <Key "http/requestsOther">
              Type "gauge"
            </Key>
            <Key "http/requestsAsync">
              Type "gauge"
            </Key>
            <Key "http/requestsPost">
              Type "gauge"
            </Key>
            <Key "http/requestsOptions">
              Type "gauge"
            </Key>
            <Key "http/requestsHead">
              Type "gauge"
            </Key>
            <Key "http/requestsGet">
              Type "gauge"
            </Key>
            <Key "http/requestsDelete">
              Type "gauge"
            </Key>
            <Key "system/minorPageFaults">
              Type "gauge"
            </Key>
            <Key "system/majorPageFaults">
              Type "gauge"
            </Key>
            <Key "system/userTime">
              Type "gauge"
            </Key>
            <Key "system/systemTime">
              Type "gauge"
            </Key>
            <Key "system/numberOfThreads">
              Type "gauge"
            </Key>
            <Key "system/virtualSize">
              Type "gauge"
            </Key>
            <Key "system/residentSize">
              Type "gauge"
            </Key>
            <Key "system/residentSizePercent">
              Type "gauge"
            </Key>
            <Key "server/threads/running">
              Type "gauge"
            </Key>
            <Key "server/threads/queued">
              Type "gauge"
            </Key>
            <Key "server/threads/working">
              Type "gauge"
            </Key>
            <Key "server/threads/blocked">
              Type "gauge"
            </Key>
            <Key "server/uptime">
              Type "gauge"
            </Key>
            <Key "server/physicalMemory">
              Type "gauge"
            </Key>
            <Key "server/v8Context/available">
              Type "gauge"
            </Key>
            <Key "server/v8Context/max">
              Type "gauge"
            </Key>
            <Key "server/v8Context/busy">
              Type "gauge"
            </Key>
            <Key "server/v8Context/dirty">
              Type "gauge"
            </Key>
            <Key "server/v8Context/free">
              Type "gauge"
            </Key>
            <Key "client/totalTime/count">
              Type "client_totalTime_count"
            </Key>
            <Key "client/totalTime/sum">
              Type "client_totalTime_sum"
            </Key>
            <Key "client/totalTime/counts/0">
              Type "client_totalTime_counts0"
            </Key>
            <Key "client/bytesReceived/count">
              Type "client_bytesReceived_count"
            </Key>
            <Key "client/bytesReceived/sum">
              Type "client_bytesReceived_sum"
            </Key>
            <Key "client/bytesReceived/counts/0">
              Type "client_bytesReceived_counts0"
            </Key>
            <Key "client/requestTime/count">
              Type "client_requestTime_count"
            </Key>
            <Key "client/requestTime/sum">
              Type "client_requestTime_sum"
            </Key>
            <Key "client/requestTime/counts/0">
              Type "client_requestTime_counts0"
            </Key>
            <Key "client/connectionTime/count">
              Type "client_connectionTime_count"
            </Key>
            <Key "client/connectionTime/sum">
              Type "client_connectionTime_sum"
            </Key>
            <Key "client/connectionTime/counts/0">
              Type "client_connectionTime_counts0"
            </Key>
            <Key "client/queueTime/count">
              Type "client_queueTime_count"
            </Key>
            <Key "client/queueTime/sum">
              Type "client_queueTime_sum"
            </Key>
            <Key "client/queueTime/counts/0">
              Type "client_queueTime_counts0"
            </Key>
            <Key "client/bytesSent/count">
              Type "client_bytesSent_count"
            </Key>
            <Key "client/bytesSent/sum">
              Type "client_bytesSent_sum"
            </Key>
            <Key "client/bytesSent/counts/0">
              Type "client_bytesSent_counts0"
            </Key>
            <Key "client/ioTime/count">
              Type "client_ioTime_count"
            </Key>
            <Key "client/ioTime/sum">
              Type "client_ioTime_sum"
            </Key>
            <Key "client/ioTime/counts/0">
              Type "client_ioTime_counts0"
            </Key>
            <Key "client/httpConnections">
              Type "gauge"
            </Key>
          </URL>
        </Plugin>

    The curl_json plug-in only uses the last path element as the name of a metric. To work around this shortcoming, we give the metrics distinct names using our own types.db file, /etc/collectd/arangodb_types.db.
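
    A minimal sketch of what /etc/collectd/arangodb_types.db could contain is shown here. Each line holds a type name followed by a data source specification of the form name:type:min:max. The type names are the ones referenced in the curl_json configuration above; the GAUGE data source type mirrors the "gauge" type used for the other keys, and the limits (minimum 0, unbounded maximum written as U) are assumptions rather than values taken from the original recipe.

    ```conf
    # type name                    data source (name:type:min:max)
    client_totalTime_count         value:GAUGE:0:U
    client_totalTime_sum           value:GAUGE:0:U
    client_totalTime_counts0       value:GAUGE:0:U
    client_bytesReceived_count     value:GAUGE:0:U
    client_bytesReceived_sum       value:GAUGE:0:U
    client_bytesReceived_counts0   value:GAUGE:0:U
    client_requestTime_count       value:GAUGE:0:U
    client_requestTime_sum         value:GAUGE:0:U
    client_requestTime_counts0     value:GAUGE:0:U
    client_connectionTime_count    value:GAUGE:0:U
    client_connectionTime_sum      value:GAUGE:0:U
    client_connectionTime_counts0  value:GAUGE:0:U
    client_queueTime_count         value:GAUGE:0:U
    client_queueTime_sum           value:GAUGE:0:U
    client_queueTime_counts0       value:GAUGE:0:U
    client_bytesSent_count         value:GAUGE:0:U
    client_bytesSent_sum           value:GAUGE:0:U
    client_bytesSent_counts0       value:GAUGE:0:U
    client_ioTime_count            value:GAUGE:0:U
    client_ioTime_sum              value:GAUGE:0:U
    client_ioTime_counts0          value:GAUGE:0:U
    ```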

    Please note that once any TypesDB directive is given, collectd no longer reads its default types file automatically. You therefore probably need to uncomment this line in the main collectd.conf so that the main types definition file is still loaded:

        # TypesDB "/usr/share/collectd/types.db" "/etc/collectd/my_types.db"

    You may want to monitor your own metrics from ArangoDB. Here is a simple example of a JSON document whose values could be collected with this kind of configuration:

        {
          "testArray": [1, 2],
          "testArrayInbetween": [{"blarg": 3}, {"blub": 4}],
          "testDirectHit": 5,
          "testSubLevelHit": {"oneMoreLevel": 6}
        }
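
    For a JSON document like the one above, a matching curl_json configuration could look like the following sketch. Each Key is the slash-separated path to one value, and array elements are addressed by their index. The URL is a hypothetical placeholder for wherever your own metrics are served; it is not an endpoint that ArangoDB provides out of the box.

    ```conf
    <Plugin curl_json>
      # Hypothetical URL; replace with the route that serves your metrics.
      <URL "http://localhost:8529/_db/_system/my-metrics">
        # First and second element of "testArray" (values 1 and 2):
        <Key "testArray/0">
          Type "gauge"
        </Key>
        <Key "testArray/1">
          Type "gauge"
        </Key>
        # Objects nested inside an array (values 3 and 4):
        <Key "testArrayInbetween/0/blarg">
          Type "gauge"
        </Key>
        <Key "testArrayInbetween/1/blub">
          Type "gauge"
        </Key>
        # Top-level value (5):
        <Key "testDirectHit">
          Type "gauge"
        </Key>
        # Value one level down (6):
        <Key "testSubLevelHit/oneMoreLevel">
          Type "gauge"
        </Key>
      </URL>
    </Plugin>
    ```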

    Get it served

    Now we will (re)start collectd so it picks up our configuration:

        /etc/init.d/collectd restart

    We will inspect the syslog to verify that nothing went wrong:

        Mar 3 13:59:52 localhost collectd[11276]: Starting statistics collection and monitoring daemon: collectd.
        Mar 3 13:59:52 localhost systemd[1]: Started LSB: manage the statistics collection daemon.

    Collectd adds the hostname to the directory path, so the RRD files should now appear in a per-host subdirectory of the DataDir configured above (/var/lib/collectd/rrd).

    Now we start kcollectd to view the values in the RRD files.

    Since we started putting values in just now, we need to choose ‘last hour’ and zoom in a little more to inspect the values.