Deployment checklist
Basics
- YugabyteDB works on a variety of operating systems. For production workloads, the recommended operating systems are CentOS 7.x and RHEL 7.x.
- Set the appropriate ulimits on each node running a YugabyteDB server (see the sketch after this list).
- Use NTP to synchronize time among the machines.
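For reference, here is a minimal sketch of checking and raising the open file and process limits for the user that runs the YugabyteDB processes. The values shown are illustrative placeholders, not official recommendations; use the limits appropriate for your workload.

```sh
# Check the current limits for the user running the YugabyteDB processes.
ulimit -n   # open files
ulimit -u   # max user processes

# Illustrative example: raise the limits persistently via /etc/security/limits.conf.
# The values below are placeholders, not official recommendations.
cat <<'EOF' | sudo tee -a /etc/security/limits.conf
*    soft    nofile    1048576
*    hard    nofile    1048576
*    soft    nproc     12000
*    hard    nproc     12000
EOF
```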
YugabyteDB internally replicates data in a strongly consistent manner using the Raft consensus protocol in order to survive node failure without compromising data correctness. The number of copies of the data represents the replication factor.
First, choose a replication factor. You need at least as many machines as the replication factor. YugabyteDB works with both hostnames and IP addresses; IP addresses are currently preferred because they are more extensively tested. Below are some recommendations relating to the replication factor.
- The replication factor should be an odd number.
- The number of YB-Master servers running in a cluster should match the replication factor. Run each server on a separate machine to prevent losing data on failures.
- The number of YB-TServer servers running in the cluster should not be less than the replication factor. Run each server on a separate machine to prevent losing data on failures.
- Specify the replication factor using the `--replication_factor` flag when bringing up the YB-Master servers (see the sketch after this list).
See the yb-master command reference for more information.
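As an illustration, the following is a minimal sketch of bringing up one of three YB-Master servers for a replication factor of 3. The IP addresses and data directory are hypothetical placeholders, and the additional flags shown (`--master_addresses`, `--rpc_bind_addresses`) are the standard yb-master addressing flags; adapt all values to your own deployment.

```sh
# Minimal sketch: start one YB-Master of a 3-node, RF=3 cluster.
# 172.16.0.1-3 and /home/yugabyte/disk1 are placeholder values.
./bin/yb-master \
  --master_addresses 172.16.0.1:7100,172.16.0.2:7100,172.16.0.3:7100 \
  --rpc_bind_addresses 172.16.0.1:7100 \
  --fs_data_dirs /home/yugabyte/disk1 \
  --replication_factor 3 \
  >& /home/yugabyte/yb-master.out &
```

Repeat the same command on the other two nodes, changing `--rpc_bind_addresses` to each node's own address.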
Hardware requirements
YugabyteDB is designed to run well on bare-metal machines, virtual machines (VMs), or containers.
Allocate adequate CPU and RAM. YugabyteDB has good defaults for running on a wide range of machines, and has been tested on machines ranging from 2 to 64 cores and up to 200GB of RAM.
- Minimum configuration: 2 cores and 2GB RAM
- For higher performance:
- 8 cores or more
- Add more CPU (compared to adding more RAM) to improve performance.
Memory depends on your application query pattern. Writes require memory, but only up to a certain point (4GB, though a write-heavy workload may need a little more). Beyond that, more memory generally helps improve read throughput and latencies by caching data in the internal cache. If you do not have enough memory to fit the read working set, you will typically experience higher read latencies because data has to be read from disk. Having a faster disk can help in some of these cases.
YugabyteDB explicitly manages a block cache and does not need the entire data set to fit in memory. We do not rely on the OS to keep data in its buffers. If you give YugabyteDB sufficient memory, data that is accessed and present in the block cache will stay in memory.
Disks
- Use SSDs (solid state disks) for good performance.
- Both local and remote attached storage work with YugabyteDB. Since YugabyteDB internally replicates data for fault tolerance, remote attached storage that does its own additional replication is not a requirement. Local disks often offer better performance at a lower cost.
- Multi-disk nodes
- Do not use RAID across multiple disks. YugabyteDB can natively handle multi-disk nodes (JBOD).
- Create a data directory on each of the data disks and specify a comma-separated list of those directories to the yb-master and yb-tserver servers via the `--fs_data_dirs` flag (see the sketch after this list).
- Mount settings
- XFS is the recommended filesystem.
- Use the `noatime` setting when mounting the data drives.
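As an illustration, here is a minimal sketch of preparing two data disks with XFS and `noatime` and pointing the servers at them. The device names and mount points are hypothetical placeholders.

```sh
# Hypothetical device names and mount points; adjust for your hardware.
sudo mkfs.xfs /dev/nvme0n1
sudo mkfs.xfs /dev/nvme1n1
sudo mkdir -p /mnt/d0 /mnt/d1
sudo mount -o noatime /dev/nvme0n1 /mnt/d0
sudo mount -o noatime /dev/nvme1n1 /mnt/d1
# Add matching entries with the noatime option to /etc/fstab so the mounts persist.

# Then pass both data directories to yb-master and yb-tserver as a comma-separated list:
#   --fs_data_dirs=/mnt/d0,/mnt/d1
```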
YugabyteDB does not require any form of RAID, but runs optimally on a JBOD (just a bunch of disks) setup. It can also leverage multiple disks per node and has been tested beyond 10 TB of storage per node.
Write-heavy applications usually require more disk IOPS (especially if the size of each record is larger), therefore in this case the total IOPS that a disk can support matters. On the read side, if the data does not fit into the cache and needs to be read from the disk in order to satisfy queries, the disk performance (latency and IOPS) will start to matter.
YugabyteDB uses per-tablet size-tiered compaction. Therefore, the typical space amplification in YugabyteDB tends to be in the 10-20% range.
- In order to view the cluster dashboard, you need to be able to navigate to the following ports on the nodes:
- 7000 (Cluster dashboard viewable from any of the YB-Master servers)
- To use the database from the app, the following ports need to be accessible from the app (or command line interface); a firewall sketch follows this list:
- 9042 (which supports YCQL, YugabyteDB’s Cassandra-compatible API)
- 6379 (which supports YEDIS, YugabyteDB’s Redis-compatible API)
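As an illustration, a minimal sketch of opening these ports with firewalld is shown below. It assumes a firewalld-based host; substitute the equivalent rules for your firewall or cloud security groups.

```sh
# Assumes firewalld; adjust for your firewall or cloud security groups.
sudo firewall-cmd --permanent --add-port=7000/tcp   # YB-Master web UI (cluster dashboard)
sudo firewall-cmd --permanent --add-port=9042/tcp   # YCQL (Cassandra-compatible API)
sudo firewall-cmd --permanent --add-port=6379/tcp   # YEDIS (Redis-compatible API)
sudo firewall-cmd --reload
```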
Default ports reference
The above deployment uses the various default ports listed below.
Note: In our Enterprise installs, we change the SSH port for added security.
For YugabyteDB to preserve data consistency, the clock drift and clock skew across different nodes must be bounded. This can be achieved by running clock synchronization software, such as NTP. Below are some recommendations on how to configure clock synchronization.
Set a safe value for the maximum clock skew parameter (`--max_clock_skew_usec`) when starting the YugabyteDB servers. We recommend setting this parameter to twice the expected maximum clock skew between any two nodes in your deployment.
For example, if the maximum clock skew across nodes is expected to be no more than 250ms, then set the parameter to 500ms (`--max_clock_skew_usec=500000`).
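A minimal sketch of verifying clock synchronization on a node is shown below; it assumes the node runs ntpd (use the equivalent chrony commands, such as `chronyc tracking`, if the node runs chrony).

```sh
# Verify that the clock is synchronized on each node (assumes ntpd).
ntpstat
ntpq -p

# Then pass the maximum clock skew flag when starting each YugabyteDB server, for example:
#   --max_clock_skew_usec=500000
```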
Clock drift
Note: In practice, the clock drift would have to be orders of magnitude higher in order to cause correctness issues.
Running on public clouds
AWS
- Use the c5 or i3 instance families. Recommended types include i3.2xlarge, c5.2xlarge, and c5.4xlarge.
- For the c5 instance family, use gp2 EBS (SSD) disks that are at least 250GB in size, larger if more IOPS are needed (see the sketch after this list).
- The number of IOPS is proportional to the size of the disk.
- In our testing, gp2 EBS SSDs provide the best performance for a given cost among the various EBS disk options.
- Avoid running on t2 instance types. These are burstable instance types whose baseline performance and ability to burst are governed by CPU Credits, which makes it hard to get steady performance.
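As an illustration, a hypothetical AWS CLI sketch for provisioning a gp2 data volume is shown below; the size and availability zone are placeholder values.

```sh
# Hypothetical example: create a 250 GB gp2 EBS volume to use as a data disk.
# The availability zone is a placeholder; attach the volume to your instance afterwards.
aws ec2 create-volume \
  --volume-type gp2 \
  --size 250 \
  --availability-zone us-west-2a
```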
Google Cloud
- Use the n1-highcpu instance family. As a second choice, the n1-standard instance family works too.
- Recommended instance types are n1-highcpu-8 and n1-highcpu-16 (see the sketch after this list).
- Local SSDs are the preferred storage option.
- Each local SSD is 375 GB in size, but you can attach up to eight local SSD devices for 3 TB of total local SSD storage space per instance.
- As a second choice, remote persistent SSDs work well. Make sure these SSDs are at least 250GB in size, larger if more IOPS are needed.
- Avoid running on the f1 or g1 machine families. These are shared-core machine types that may not deliver steady performance.
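As an illustration, a hypothetical gcloud sketch for creating an instance with local SSDs is shown below; the instance name and zone are placeholder values, and the flags should be adapted to your project defaults.

```sh
# Hypothetical example: create an n1-highcpu-16 instance with two local SSDs.
# Instance name and zone are placeholders.
gcloud compute instances create yb-node-1 \
  --zone us-central1-b \
  --machine-type n1-highcpu-16 \
  --local-ssd interface=NVME \
  --local-ssd interface=NVME
```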