Troubleshooting
These ports will not open until a namespace and placement have been created and the nodes have bootstrapped.
Double-check your configuration against the bootstrapping guide. The nodes log which bootstrapper they are using and which time range they are using it for.
If you’re using the commitlog bootstrapper and it seems slow, ensure that snapshotting is enabled for your namespace. Enabling snapshotting requires a node restart to take effect.
If an M3DB node hasn’t been able to snapshot for a while, or is stuck in the commitlog bootstrapping phase for a long time because it has accumulated a large number of commitlogs, consider using the peers bootstrapper. When a large number of commitlogs need to be read, the peers bootstrapper outperforms the commitlog bootstrapper (it is faster and uses less memory) because it receives already-compressed data from its peers. Keep in mind that this only works with a replication factor of 3 or larger, and only if the node’s peers are healthy and bootstrapped. Review the bootstrapping guide for more information.
Ensure you’ve set the required kernel parameter to something like 262,144 using sysctl. Find out more in the Clustering the Hard Way document.
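As a rough sketch of how this check could be scripted on Linux, the helper below reads a kernel parameter from /proc/sys, where sysctl settings are exposed, and compares it with a minimum. The function names and the PROC_SYS_ROOT override are hypothetical, and the parameter name is left as a placeholder for the setting named in the Clustering the Hard Way document.

```shell
# Read a kernel parameter the way sysctl does: dots in the name map to
# path separators under /proc/sys.
# PROC_SYS_ROOT is a hypothetical override, useful for dry runs.
sysctl_value() {
    # $1: parameter name in dotted form
    path="${PROC_SYS_ROOT:-/proc/sys}/$(printf '%s' "$1" | tr '.' '/')"
    cat "$path"
}

# Succeed if the parameter is at least the required minimum.
# Returns 2 if the parameter cannot be read.
sysctl_at_least() {
    # $1: parameter name, $2: minimum value (digits only, no comma)
    current="$(sysctl_value "$1")" || return 2
    [ "$current" -ge "$2" ]
}
```

For example, `sysctl_at_least <parameter> 262144 || echo "value is too low"`. The number is written without the thousands separator. If the value is too low, `sysctl -w <parameter>=262144` raises it until the next reboot.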
- Ensure that you are not co-locating coordinator, etcd, or query nodes with your M3DB nodes. Co-location or embedded mode is fine for a development environment, but highly discouraged in production.
- Keep memory utilization at or below 50-60% in the normal running state. This leaves enough headroom to handle bursts of metrics, especially ones with new IDs, as those take more memory initially.
- High-cardinality metrics can also lead to OOMs, especially if you are not adequately provisioned. If you have many unique time series, such as ones containing UUIDs or timestamps as tag values, you should consider reducing their cardinality.
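The 50-60% guideline above can be spot-checked on a Linux host with a small script. This is only a sketch: it looks at the whole host rather than only the M3DB process, it relies on the MemAvailable field of /proc/meminfo (present on reasonably recent kernels), and both function names are hypothetical.

```shell
# Print used memory as a whole-number percentage, computed from the
# MemTotal and MemAvailable fields of a meminfo-style file.
mem_used_percent() {
    # $1: optional path to the file (defaults to /proc/meminfo)
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Succeed while utilization is at or below the threshold.
mem_within_budget() {
    # $1: threshold in percent (defaults to 60), $2: optional meminfo path
    [ "$(mem_used_percent "${2:-/proc/meminfo}")" -le "${1:-60}" ]
}
```

A periodic check such as `mem_within_budget 60 || echo "memory headroom is low"` gives an early warning, although proper monitoring of the node's memory metrics is the better long-term answer.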
- CPU profile: determines where a program spends its time while actively consuming CPU cycles (as opposed to while sleeping or waiting for I/O). Currently set to take a 5-second profile.
- Heap profile: reports memory allocation samples; used to monitor current and historical memory usage, and to check for memory leaks.
- Goroutines profile: reports the stack traces of all current goroutines.
- Host profile: returns data about the underlying host such as PID, working directory, etc.
- Placement: returns information about the placement setup in M3DB.
This endpoint can be used on both the DB nodes and the coordinator/query nodes. However, namespace and placement info are only available on the coordinator debug endpoint.
To use this, send a request to either the M3DB debug listen port or the regular port on M3Coordinator.
You may need to include the following headers:
- Cluster-Environment-Name: This header is used to specify the cluster environment name. If not set, the default is used.
- Cluster-Zone-Name: This header is used to specify the cluster zone name. If not set, the default embedded is used.
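The request described above could look roughly like the following sketch, which downloads the debug archive with curl and adds the two headers only when values are supplied. The /debug/dump path is an assumption, since the text above does not name it. The function name, the CURL override, and the example address are also hypothetical.

```shell
# Download the debug archive from an M3DB or M3Coordinator node.
# Assumption: the archive is served at /debug/dump.
# CURL is a hypothetical override, so the command can be swapped for a dry run.
fetch_debug_dump() {
    # $1: host:port, $2: output file,
    # $3: optional environment name, $4: optional zone name
    addr="$1"; out="$2"; env_name="${3:-}"; zone_name="${4:-}"

    # Build the curl argument list, adding headers only when set.
    set -- -fsS -o "$out"
    if [ -n "$env_name" ]; then
        set -- "$@" -H "Cluster-Environment-Name: $env_name"
    fi
    if [ -n "$zone_name" ]; then
        set -- "$@" -H "Cluster-Zone-Name: $zone_name"
    fi

    "${CURL:-curl}" "$@" "http://$addr/debug/dump"
}
```

For example, `fetch_debug_dump localhost:7201 dump.zip "" embedded` would request the archive from a coordinator assumed to listen on localhost:7201, sending only the zone header.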
- cpuSource
- heapSource
- goroutineProfile: less goroutine.prof
- hostSource
- namespaceSource: less namespace.json | jq .
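Before opening individual files, it can help to confirm that the extracted archive contains what you expect. The helper below is a minimal sketch with a hypothetical name. It takes the file names as arguments rather than hard-coding them, because only goroutine.prof and namespace.json are named explicitly above.

```shell
# Report which of the expected files exist in an extracted dump directory.
# Returns non-zero if any of them is missing.
check_dump_files() {
    # $1: directory the archive was extracted into; remaining args: file names
    dir="$1"; shift
    missing=0
    for name in "$@"; do
        if [ -e "$dir/$name" ]; then
            printf 'present %s\n' "$name"
        else
            printf 'missing %s\n' "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, after `unzip dump.zip -d dump` (the archive name is an assumption), run `check_dump_files dump goroutine.prof namespace.json`.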