Viewing cluster dashboards

    The OpenShift Logging dashboard contains charts that show details about your Elasticsearch instance at a cluster level, including cluster resources, garbage collection, shards in the cluster, and Fluentd statistics.

    The Logging/Elasticsearch Nodes dashboard contains charts that show details about your Elasticsearch instance, many at node level, including details on indexing, shards, resources, and so forth.

    You can view the Logging/Elasticsearch Nodes and OpenShift Logging dashboards in the OKD web console.

    Procedure

    To launch the dashboards:

    1. On the Dashboards page, select Logging/Elasticsearch Nodes or OpenShift Logging from the Dashboard menu.

      For the Logging/Elasticsearch Nodes dashboard, you can select the Elasticsearch node you want to view and set the data resolution.

      The appropriate dashboard is displayed, showing multiple charts of data.

    2. Optional: Select a different time range to display or refresh rate for the data from the Time Range and Refresh Interval menus.

    For information on the dashboard charts, see and About the Logging/Elastisearch Nodes dashboard.

    The OpenShift Logging dashboard contains charts that show details about your Elasticsearch instance at a cluster-level that you can use to diagnose and anticipate problems.

    The Logging/Elasticsearch Nodes dashboard contains charts that show details about your Elasticsearch instance, many at node-level, for further diagnostics.

    Elasticsearch status

    The Logging/Elasticsearch Nodes dashboard contains the following charts about the status of your Elasticsearch instance.

    Table 2. Elasticsearch status fields
    MetricDescription

    Cluster status

    The cluster health status during the selected time period, using the Elasticsearch green, yellow, and red statuses:

    • 0 - Indicates that the Elasticsearch instance is in green status, which means that all shards are allocated.

    • 2 - Indicates that the Elasticsearch instance is in red status, which means that at least one primary shard and its replicas are not allocated.

    Cluster nodes

    The total number of Elasticsearch nodes in the cluster.

    Cluster data nodes

    The number of Elasticsearch data nodes in the cluster.

    Cluster pending tasks

    The number of cluster state changes that are not finished and are waiting in a cluster queue, for example, index creation, index deletion, or shard allocation. A growing trend indicates that the cluster is not able to keep up with changes.

    Elasticsearch cluster index shard status

    Each Elasticsearch index is a logical group of one or more shards, which are basic units of persisted data. There are two types of index shards: primary shards, and replica shards. When a document is indexed into an index, it is stored in one of its primary shards and copied into every replica of that shard. The number of primary shards is specified when the index is created, and the number cannot change during index lifetime. You can change the number of replica shards at any time.

    The index shard can be in several states depending on its lifecycle phase or events occurring in the cluster. When the shard is able to perform search and indexing requests, the shard is active. If the shard cannot perform these requests, the shard is non–active. A shard might be non-active if the shard is initializing, reallocating, unassigned, and so forth.

    Index shards consist of a number of smaller internal blocks, called index segments, which are physical representations of the data. An index segment is a relatively small, immutable Lucene index that is created when Lucene commits newly-indexed data. Lucene, a search library used by Elasticsearch, merges index segments into larger segments in the background to keep the total number of segments low. If the process of merging segments is slower than the speed at which new segments are created, it could indicate a problem.

    When Lucene performs data operations, such as a search operation, Lucene performs the operation against the index segments in the relevant index. For that purpose, each segment contains specific data structures that are loaded in the memory and mapped. Index mapping can have a significant impact on the memory used by segment data structures.

    The Logging/Elasticsearch Nodes dashboard contains the following charts about the Elasticsearch index shards.

    Elasticsearch node metrics

    Each Elasticsearch node has a finite amount of resources that can be used to process tasks. When all the resources are being used and Elasticsearch attempts to perform a new task, Elasticsearch put the tasks into a queue until some resources become available.

    The Logging/Elasticsearch Nodes dashboard contains the following charts about resource usage for a selected node and the number of tasks waiting in the Elasticsearch queue.

    Table 4. Elasticsearch node metric charts
    MetricDescription

    ThreadPool tasks

    The number of waiting tasks in individual queues, shown by task type. A long–term accumulation of tasks in any queue could indicate node resource shortages or some other problem.

    CPU usage

    The amount of CPU being used by the selected Elasticsearch node as a percentage of the total CPU allocated to the host container.

    Memory usage

    The amount of memory being used by the selected Elasticsearch node.

    Disk usage

    The total disk space being used for index data and metadata on the selected Elasticsearch node.

    The rate that documents are indexed on the selected Elasticsearch node.

    Indexing latency

    The time taken to index the documents on the selected Elasticsearch node. Indexing latency can be affected by many factors, such as JVM Heap memory and overall load. A growing latency indicates a resource capacity shortage in the instance.

    Search rate

    The number of search requests run on the selected Elasticsearch node.

    Search latency

    The time taken to complete search requests on the selected Elasticsearch node. Search latency can be affected by many factors. A growing latency indicates a resource capacity shortage in the instance.

    Documents count (with replicas)

    The number of Elasticsearch documents stored on the selected Elasticsearch node, including documents stored in both the primary shards and replica shards that are allocated on the node.

    Documents deleting rate

    The number of Elasticsearch documents being deleted from any of the index shards that are allocated to the selected Elasticsearch node.

    Documents merging rate

    The number of Elasticsearch documents being merged in any of index shards that are allocated to the selected Elasticsearch node.

    Elasticsearch node fielddata

    is an Elasticsearch data structure that holds lists of terms in an index and is kept in the JVM Heap. Because fielddata building is an expensive operation, Elasticsearch caches the fielddata structures. Elasticsearch can evict a fielddata cache when the underlying index segment is deleted or merged, or if there is not enough JVM HEAP memory for all the fielddata caches.

    The Logging/Elasticsearch Nodes dashboard contains the following charts about Elasticsearch fielddata.

    Elasticsearch node query cache

    If the data stored in the index does not change, search query results are cached in a node-level query cache for reuse by Elasticsearch.

    The Logging/Elasticsearch Nodes dashboard contains the following charts about the Elasticsearch node query cache.

    Table 6. Elasticsearch node query charts
    MetricDescription

    Query cache size

    The total amount of memory used for the query cache for all the shards allocated to the selected Elasticsearch node.

    Query cache evictions

    The number of query cache evictions on the selected Elasticsearch node.

    Query cache hits

    The number of query cache hits on the selected Elasticsearch node.

    Query cache misses

    The number of query cache misses on the selected Elasticsearch node.

    Elasticsearch index throttling

    When indexing documents, Elasticsearch stores the documents in index segments, which are physical representations of the data. At the same time, Elasticsearch periodically merges smaller segments into a larger segment as a way to optimize resource use. If the indexing is faster then the ability to merge segments, the merge process does not complete quickly enough, which can lead to issues with searches and performance. To prevent this situation, Elasticsearch throttles indexing, typically by reducing the number of threads allocated to indexing down to a single thread.

    The Logging/Elasticsearch Nodes dashboard contains the following charts about Elasticsearch index throttling.

    Node JVM Heap statistics

    The Logging/Elasticsearch Nodes dashboard contains the following charts about JVM Heap operations.

    Table 8. JVM Heap statistic charts
    MetricDescription

    Heap used

    The amount of the total allocated JVM Heap space that is used on the selected Elasticsearch node.

    GC count

    The number of garbage collection operations that have been run on the selected Elasticsearch node, by old and young garbage collection.

    GC time

    The amount of time that the JVM spent running garbage collection operations on the selected Elasticsearch node, by old and young garbage collection.