Reconfiguration requests can only be processed when a majority of cluster members are functioning. It is highly recommended to always have a cluster size greater than two in production. It is unsafe to remove a member from a two-member cluster: the majority of a two-member cluster is also two, so if there is a failure during the removal process, the cluster might not be able to make progress and will need to restart from majority failure.

To better understand the design behind runtime reconfiguration, please read the runtime reconfiguration document.

This section will walk through some common reasons for reconfiguring a cluster. Most of these reasons just involve combinations of adding or removing a member, which are explained below under Cluster reconfiguration operations.

If multiple cluster members need to move due to planned maintenance (hardware upgrades, network downtime, etc.), it is recommended to modify members one at a time.

It is safe to remove the leader; however, there is a brief period of downtime while the election process takes place. If the cluster holds more than 50MB of v2 data, it is recommended to migrate the member’s data directory.

Change the cluster size

Increasing the cluster size can enhance failure tolerance and provide better read performance. Since clients can read from any member, increasing the number of members increases the overall serialized read throughput.

Decreasing the cluster size can improve the write performance of a cluster, with a trade-off of decreased resilience. Writes into the cluster are replicated to a majority of members before being considered committed. Decreasing the cluster size lowers the required majority, so each write is committed more quickly.
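For example, a write to a five-member cluster must be acknowledged by three members before it is committed, while a write to a three-member cluster needs only two acknowledgements, so each commit waits on fewer responses.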

Replace a failed machine

If a machine fails due to hardware failure, data directory corruption, or some other fatal situation, it should be replaced as soon as possible. Machines that have failed but haven’t been removed adversely affect the quorum and reduce the tolerance for an additional failure.

To replace the machine, follow the instructions for removing the member from the cluster, and then add a new member in its place. If the cluster holds more than 50MB, it is recommended to migrate the failed member’s data directory if it is still accessible.

Restart cluster from majority failure

If the majority of the cluster is lost or all of the nodes have changed IP addresses, then manual action is necessary to recover safely. The basic steps in the recovery process include creating a new cluster using the old data, forcing a single member to act as the leader, and finally using runtime configuration to add new members to this new cluster one at a time.
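As a rough illustration only (not a complete recovery procedure), forcing a surviving member to form a new single-member cluster typically uses the --force-new-cluster flag against its existing data directory; the data directory path below is a placeholder:

  # form a new single-member cluster from this member's existing data
  $ etcd --force-new-cluster --data-dir %data_dir%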

Cluster reconfiguration operations

Before making any change, a simple majority (quorum) of etcd members must be available. This is essentially the same requirement for any kind of write to etcd.

All changes to the cluster must be done sequentially:

  • To replace a healthy single member, remove the old member then add a new member
  • To increase from 3 to 5 members, issue two add operations
  • To decrease from 5 to 3, issue two remove operations

All of these examples use the etcdctl command line tool that ships with etcd. To change membership without etcdctl, use the v2 HTTP members API or the v3 gRPC members API.
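For example, growing a three-member cluster to five means issuing two separate add operations, starting and verifying each new member before issuing the next; the member names and peer URLs below are illustrative:

  $ etcdctl member add infra3 --peer-urls=http://10.0.1.13:2380
  # start infra3 and confirm it is healthy before continuing
  $ etcdctl member add infra4 --peer-urls=http://10.0.1.14:2380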

Update advertise client URLs

To update the advertise client URLs of a member, simply restart that member with an updated --advertise-client-urls flag or ETCD_ADVERTISE_CLIENT_URLS environment variable. The restarted member will self-publish the updated URLs. A wrongly updated client URL will not affect the health of the etcd cluster.
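For example, a minimal sketch, assuming the member is named infra0 and its new client URL is http://10.0.1.10:2379 (both illustrative); the data directory path is a placeholder:

  # restart the member with the new advertise client URL
  $ etcd --name infra0 \
    --advertise-client-urls http://10.0.1.10:2379 \
    --listen-client-urls http://10.0.1.10:2379 \
    --data-dir %data_dir%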

Update advertise peer URLs

To update the advertise peer URLs of a member, first update the URLs explicitly via the member update command and then restart the member. The additional step is required because updating peer URLs changes the cluster-wide configuration and can affect the health of the etcd cluster.

To update the advertise peer URLs, first find the target member’s ID by listing all members with etcdctl.
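A sketch of the listing step; the member IDs, names, and URLs shown are illustrative, and the exact output format depends on the etcdctl version in use:

  $ etcdctl member list
  6e3bd23ae5f1eae0: name=node2 peerURLs=http://localhost:23802 clientURLs=http://127.0.0.1:23792
  924e2e83e93f2560: name=node3 peerURLs=http://localhost:23803 clientURLs=http://127.0.0.1:23793
  a8266ecf031671f3: name=node1 peerURLs=http://localhost:23801 clientURLs=http://127.0.0.1:23791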

This example updates the member with ID a8266ecf031671f3, changing its peer URLs to http://10.0.1.10:2380:

  $ etcdctl member update a8266ecf031671f3 --peer-urls=http://10.0.1.10:2380
  Updated member with ID a8266ecf031671f3 in cluster

Remove a member

Suppose the member ID to remove is a8266ecf031671f3. Use the remove command to perform the removal:

  $ etcdctl member remove a8266ecf031671f3

The target member will stop itself at this point and print out the removal in the log:
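The exact wording varies between etcd versions, but the removed member’s log will contain a line similar to:

  etcd: this member has been permanently removed from the cluster. Exiting.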

It is safe to remove the leader; however, the cluster will be inactive while a new leader is elected. This duration is normally the election timeout plus the time taken by the voting process.

Add a new member

Adding a member is a two step process:

  • Add the new member to the cluster via the HTTP members API, the gRPC members API, or the etcdctl member add command.
  • Start the new member with the new cluster configuration, including a list of the updated members (existing members + the new member).

etcdctl adds a new member to the cluster by specifying the member’s name and advertised peer URLs:

  $ etcdctl member add infra3 --peer-urls=http://10.0.1.13:2380
  added member 9bf1b35fc7761a23 to cluster

  ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
  ETCD_INITIAL_CLUSTER_STATE=existing

etcdctl has informed the cluster about the new member and printed out the environment variables needed to successfully start it. Now start the new etcd process with the relevant flags for the new member:

  $ export ETCD_NAME="infra3"
  $ export ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
  # mark the member as joining an existing cluster (value printed by etcdctl above)
  $ export ETCD_INITIAL_CLUSTER_STATE="existing"
  $ etcd --listen-client-urls http://10.0.1.13:2379 --advertise-client-urls http://10.0.1.13:2379 --listen-peer-urls http://10.0.1.13:2380 --initial-advertise-peer-urls http://10.0.1.13:2380 --data-dir %data_dir%

The new member will run as a part of the cluster and immediately begin catching up with the rest of the cluster.

If adding multiple members, the best practice is to configure a single member at a time and verify it starts correctly before adding more new members. If adding a new member to a one-node cluster, the cluster cannot make progress until the new member starts, because it needs two members as a majority to agree on consensus. This behavior only occurs between the time etcdctl member add informs the cluster about the new member and the time the new member successfully establishes a connection to the existing one.

Error cases when adding members

In the following case a new host is not included in the list of enumerated nodes. If this is a new cluster, the node must be added to the list of initial cluster members.
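A sketch of this failure, reusing the infra3 member from the earlier example but omitting it from --initial-cluster; the exact error output may vary between etcd versions:

  $ etcd --name infra3 \
    --initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
    --initial-cluster-state existing
  etcdserver: assign ids error: the member count is unequal
  exit 1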

In the next case, the address given when starting the member (10.0.1.14:2380) differs from the one used to join the cluster (10.0.1.13:2380):

  $ etcd --name infra4 \
    --initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra4=http://10.0.1.14:2380 \
    --initial-cluster-state existing
  etcdserver: assign ids error: unmatched member while checking PeerURLs
  exit 1

If etcd starts using the data directory of a removed member, etcd automatically exits if it connects to any active member in the cluster:

  $ etcd
  exit 1

Strict reconfiguration check mode (-strict-reconfig-check)

As described above, the best practice for adding new members is to configure a single member at a time and verify it starts correctly before adding more new members. This step-by-step approach is very important because if a newly added member is not configured correctly (for example, its peer URLs are incorrect), the cluster can lose quorum. The quorum loss happens because the newly added member is counted in the quorum even if it is not reachable from the other existing members. Quorum loss can also happen if there is a connectivity issue or other operational issues.

To avoid this problem, etcd provides the -strict-reconfig-check option. If this option is passed to etcd, etcd rejects reconfiguration requests if the number of started members would be less than a quorum of the reconfigured cluster.

It is enabled by default.
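The flag takes a boolean value; since it defaults to enabled, it only needs to be passed explicitly to turn the check off, which is generally not recommended. The remaining flags are omitted for brevity:

  $ etcd --strict-reconfig-check=false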