Metrics Quick Start

    Getting Started with metrics in DC/OS

    Prerequisites:

    • You must have the DC/OS CLI installed and be logged in as a superuser via the dcos auth login command.
    1. Optional: Deploy a sample Marathon app for use in this quick start guide. If you already have tasks running on DC/OS, you can skip this setup step.

      1. Create the following Marathon app definition and save as test-metrics.json.

      2. Deploy the app with the following CLI command:

        dcos marathon app add test-metrics.json
    2. To get the Mesos ID of the node that is running your app, run dcos task followed by dcos node. For example:

      1. Running dcos task shows that host 10.0.0.193 is running the Marathon task test-metrics.93fffc0c-fddf-11e6-9080-f60c51db292b.

        dcos task
        NAME          HOST        USER  STATE  ID
        test-metrics  10.0.0.193  root  R      test-metrics.93fffc0c-fddf-11e6-9080-f60c51db292b
      2. Running dcos node shows that host 10.0.0.193 has the Mesos ID 7749eada-4974-44f3-aad9-42e2fc6aedaf-S1.

        dcos node
        HOSTNAME    IP          ID
        10.0.0.193  10.0.0.193  7749eada-4974-44f3-aad9-42e2fc6aedaf-S1
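The lookup in this step (task name to host, then host to Mesos ID) can also be scripted. The following is a minimal Python sketch, not part of the DC/OS tooling: the helper names parse_table and mesos_id_for_task are hypothetical, and it assumes the tabular output looks like the samples above, with whitespace-separated columns that contain no embedded spaces.

```python
def parse_table(text):
    """Split whitespace-separated CLI table output into a list of row dicts."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def mesos_id_for_task(task_output, node_output, task_name):
    """Return the Mesos ID of the node hosting task_name, or None if not found."""
    tasks = parse_table(task_output)
    nodes = parse_table(node_output)
    # Step 1: find the host that runs the task (from `dcos task`).
    host = next((t["HOST"] for t in tasks if t["NAME"] == task_name), None)
    # Step 2: find the node whose IP matches that host (from `dcos node`).
    return next((n["ID"] for n in nodes if n["IP"] == host), None)


# Sample output copied from this guide.
TASK_OUTPUT = """
NAME          HOST        USER  STATE  ID
test-metrics  10.0.0.193  root  R      test-metrics.93fffc0c-fddf-11e6-9080-f60c51db292b
"""
NODE_OUTPUT = """
HOSTNAME    IP          ID
10.0.0.193  10.0.0.193  7749eada-4974-44f3-aad9-42e2fc6aedaf-S1
"""

print(mesos_id_for_task(TASK_OUTPUT, NODE_OUTPUT, "test-metrics"))
# prints 7749eada-4974-44f3-aad9-42e2fc6aedaf-S1
```

If your CLI version prints additional columns, the header-based lookup should still work as long as the NAME, HOST, IP, and ID columns are present.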
      • Container metrics for a specific task

        For an overview of the resource consumption for a specific container, execute the following command:

        dcos task metrics summary <task-id>

        The output should resemble:

        The metrics summary command displays the raw and percentage utilization of CPU, memory, and disk resources, using the metrics documented in the metrics reference summary.

        In particular, the following metrics and formulas are used to compute the displayed values:

        1. CPU usage:
          1. cpus.system_time_secs + cpus.user_time_secs (raw)
          2. (cpus.system_time_secs + cpus.user_time_secs) / cpus.throttled_time_secs (percentage)
        2. Memory usage:
          1. mem.total_bytes (raw)
          2. mem.total_bytes / mem.limit_bytes (percentage)
        3. Disk usage:
          1. disk.used_bytes (raw)
          2. disk.used_bytes / disk.total_bytes (percentage)
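To make the formulas above concrete, here is a minimal Python sketch that applies them verbatim to a dictionary of metric values. It is an illustration only, not the CLI's actual implementation: the function names ratio and summarize are hypothetical, the sample numbers are made up, and scaling the ratios by 100 and returning None for a zero denominator are assumptions.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator as a percentage, or None if undefined."""
    if not denominator:
        return None  # assumption: treat a zero denominator as "no value"
    return 100.0 * numerator / denominator


def summarize(m):
    """Compute (raw, percentage) pairs from a dict of metric name -> number."""
    cpu_raw = m["cpus.system_time_secs"] + m["cpus.user_time_secs"]
    return {
        # Formulas follow the list in this guide term by term.
        "cpu": (cpu_raw, ratio(cpu_raw, m["cpus.throttled_time_secs"])),
        "mem": (m["mem.total_bytes"],
                ratio(m["mem.total_bytes"], m["mem.limit_bytes"])),
        "disk": (m["disk.used_bytes"],
                 ratio(m["disk.used_bytes"], m["disk.total_bytes"])),
    }


# Made-up values for illustration; not taken from a real cluster.
sample = {
    "cpus.system_time_secs": 1.5,
    "cpus.user_time_secs": 2.5,
    "cpus.throttled_time_secs": 0,
    "mem.total_bytes": 64 * 1024 * 1024,
    "mem.limit_bytes": 256 * 1024 * 1024,
    "disk.used_bytes": 1048576,
    "disk.total_bytes": 10485760,
}

for name, (raw, percent) in summarize(sample).items():
    print(name, raw, percent)
```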
      • All metrics for a specific task

        To get a detailed list of all metrics related to a task, execute the following command:

        dcos task metrics details <task-id>

        The CPU, disk, and memory statistics come from container data supplied by Mesos. The statsd_tester.time.uptime statistic comes from the application itself.

      • Host level metrics

        As with task data, host-level metrics are available as a summary or as a detailed table. To view the detailed host-level metrics, execute the following command:

        dcos node metrics details <mesos-id>

        The output displays the statistics about available resources on the node and their utilization. For example:

        cpu.idle                   99.56%
        cpu.system                 0.09%
        cpu.user                   0.25%
        cpu.wait                   0.01%
        filesystem.capacity.free   134.75GiB  path: /
        filesystem.capacity.total  143.02GiB  path: /
        filesystem.capacity.used   2.33GiB    path: /
        filesystem.inode.free      38425263   path: /
        filesystem.inode.total     38504832   path: /
        filesystem.inode.used      79569      path: /
        load.15min                 0
        load.1min                  0
        load.5min                  0
        memory.buffers             0.08GiB
        memory.cached              2.41GiB
        memory.free                12.63GiB
        memory.total               15.67GiB
        process.count              175
        swap.free                  0.00GiB
        swap.total                 0.00GiB
        swap.used                  0.00GiB
        system.uptime              28627
      • Programmatic use of metrics

        All dcos-cli metrics commands can be executed with the --json option for use in scripts. For example:

        dcos node metrics summary <mesos-id> --json

        The output displays the same data, but in JSON format, for convenient parsing:

        [
          {
            "name": "cpu.total",
            "timestamp": "2018-04-09T23:46:16.834008315Z",
            "value": 0.32,
            "unit": "percent"
          },
          {
            "name": "memory.total",
            "timestamp": "2018-04-09T23:46:16.834650407Z",
            "value": 16830304256,
            "unit": "bytes"
          },
          {
            "name": "memory.free",
            "timestamp": "2018-04-09T23:46:16.834650407Z",
            "value": 13553008640,
            "unit": "bytes"
          },
          {
            "name": "filesystem.capacity.total",
            "timestamp": "2018-04-09T23:46:16.834373702Z",
            "value": 153567944704,
            "tags": {
              "path": "/"
            },
            "unit": "bytes"
          },
          {
            "name": "filesystem.capacity.used",
            "timestamp": "2018-04-09T23:46:16.834373702Z",
            "value": 2498990080,
            "tags": {
              "path": "/"
            },
            "unit": "bytes"
          }
        ]
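
As an illustration of consuming this JSON from a script, the following Python sketch parses a node metrics summary and derives two values from it. It is a sketch under stated assumptions rather than a reference implementation: the helper names fetch_node_metrics, index_by_name, and used_percent are hypothetical, the output is assumed to be a JSON array of objects with "name" and "value" fields as in the sample above, and fetch_node_metrics requires a working, logged-in dcos CLI, so it is defined here but not called.

```python
import json
import subprocess


def fetch_node_metrics(mesos_id):
    """Run the CLI command from this section and return its raw JSON text."""
    result = subprocess.run(
        ["dcos", "node", "metrics", "summary", mesos_id, "--json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def index_by_name(raw_json):
    """Map each metric name to its datapoint dict.

    If several datapoints share a name (for example, one per filesystem
    path), the last one wins; a real script may need to key on tags too.
    """
    return {point["name"]: point for point in json.loads(raw_json)}


def used_percent(metrics, used_key, total_key):
    """Return used / total as a percentage."""
    return 100.0 * metrics[used_key]["value"] / metrics[total_key]["value"]


# Trimmed copy of the sample output shown in this guide.
SAMPLE = """
[
  {"name": "memory.total", "value": 16830304256, "unit": "bytes"},
  {"name": "memory.free", "value": 13553008640, "unit": "bytes"},
  {"name": "filesystem.capacity.total", "value": 153567944704,
   "tags": {"path": "/"}, "unit": "bytes"},
  {"name": "filesystem.capacity.used", "value": 2498990080,
   "tags": {"path": "/"}, "unit": "bytes"}
]
"""

metrics = index_by_name(SAMPLE)
memory_in_use = metrics["memory.total"]["value"] - metrics["memory.free"]["value"]
print("memory in use (bytes):", memory_in_use)
print("filesystem used (%):", round(
    used_percent(metrics, "filesystem.capacity.used", "filesystem.capacity.total"), 2))
```

To run against a live cluster, replace SAMPLE with fetch_node_metrics("<mesos-id>"), using the Mesos ID obtained earlier in this guide.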