In an RKE installation, the cluster data is replicated on each of three etcd nodes in the cluster, providing redundancy and data duplication in case one of the nodes fails.

Cluster Data within an RKE Kubernetes Cluster Running the Rancher Management Server

Requirements

The commands for taking etcd snapshots are only available in RKE v0.1.7 and later.

You’ll need rancher-cluster.yml, the RKE config file that you used for the Rancher install. You created this file during your initial install. Place it in the same directory as the RKE binary.

Backup Outline

Backing up your high-availability Rancher cluster is a process that involves completing multiple tasks.

  1. Take Snapshots of the etcd Database

    Take snapshots of your current etcd database using Rancher Kubernetes Engine (RKE).

  2. Back up Local Snapshots to a Safe Location

    After taking your snapshots, export them to a safe location that won’t be affected if your cluster encounters issues.

1. Take Snapshots of the etcd Database

Take snapshots of your etcd database. You can use these snapshots later to recover from a disaster scenario. There are two ways to take snapshots: on a recurring schedule, or as a one-off. Each option is better suited to a specific use case, as described below.

  • Recurring snapshots: After you stand up a high-availability Rancher install, we recommend configuring RKE to automatically take recurring snapshots so that you always have a safe restore point available.

  • One-time snapshots: We advise taking one-time snapshots before events like upgrades or restoring another snapshot.

For all high-availability Rancher installs, we recommend taking recurring snapshots so that you always have a safe restore point available.

To take recurring snapshots, enable the etcd-snapshot service, which is included with RKE. This service runs in a service container alongside the etcd container. You enable it by adding configuration to rancher-cluster.yml.

To Enable Recurring Snapshots:

The steps to enable recurring snapshots differ based on the version of RKE.

RKE v0.2.0 and later:

  1. Open rancher-cluster.yml with your favorite text editor.

  2. Edit the code for the etcd service to enable recurring snapshots. Snapshots can also be saved to an S3-compatible backend.

  3. Save and close rancher-cluster.yml.

  4. Open Terminal and change directory to the location of the RKE binary. Your rancher-cluster.yml file must reside in the same directory.

  5. Run the following command:

    rke up --config rancher-cluster.yml

Result: RKE is configured to take recurring snapshots of etcd on all nodes running the etcd role. Snapshots are saved locally to the following directory: /opt/rke/etcd-snapshots/. If configured, the snapshots are also uploaded to your S3-compatible backend.
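The edit to the etcd service described in the steps above is not shown on this page. As a rough sketch only, assuming the backup_config block used by RKE v0.2.0 and later, the relevant part of rancher-cluster.yml might look like the following. All values are illustrative placeholders, and the S3 block is optional; check the RKE documentation for your version for the exact meaning and units of each setting.

```yaml
services:
  etcd:
    backup_config:
      enabled: true        # turn on recurring etcd snapshots
      interval_hours: 6    # illustrative: hours between snapshots
      retention: 60        # illustrative: retention setting for old snapshots
      # Optional: also upload snapshots to an S3-compatible backend
      s3backupconfig:
        access_key: S3_ACCESS_KEY      # placeholder
        secret_key: S3_SECRET_KEY      # placeholder
        bucket_name: s3-bucket-name    # placeholder
        endpoint: s3.amazonaws.com
```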

RKE v0.1.x:

  1. Open rancher-cluster.yml with your favorite text editor.

  2. Edit the code for the etcd service to enable recurring snapshots.

  3. Save and close rancher-cluster.yml.

  4. Open Terminal and change directory to the location of the RKE binary. Your rancher-cluster.yml file must reside in the same directory.

  5. Run the following command:

    rke up --config rancher-cluster.yml

Result: RKE is configured to take recurring snapshots of etcd on all nodes running the etcd role. Snapshots are saved locally to the following directory: /opt/rke/etcd-snapshots/.
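For RKE versions before v0.2.0, the etcd service edit is likewise not shown here. A minimal sketch, assuming the snapshot, creation, and retention keys used by RKE v0.1.x, with illustrative durations:

```yaml
services:
  etcd:
    snapshot: true    # turn on recurring etcd snapshots
    creation: 6h0s    # illustrative: time between snapshots
    retention: 24h    # illustrative: how long a snapshot is kept before it is purged
```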

When you’re about to upgrade Rancher or restore it to a previous snapshot, you should snapshot your live image so that you have a backup of etcd in its last known state.

To Take a One-Time Local Snapshot:

  1. Enter the following command. Replace <SNAPSHOT.db> with any name that you want to use for the snapshot (e.g. upgrade.db).

Result: RKE takes a snapshot of etcd running on each etcd node. The file is saved to .
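The command for the one-time local snapshot does not appear in the steps above. A minimal sketch, assuming it is the same rke etcd snapshot-save subcommand shown for the S3 variant below, minus the S3 flags. The function name take_etcd_snapshot is hypothetical and exists only to make the snapshot name a parameter.

```shell
# Hypothetical helper: take a one-time local etcd snapshot with RKE.
# Run it from the directory that holds the RKE binary and rancher-cluster.yml.
take_etcd_snapshot() {
    snapshot_name="$1"   # e.g. upgrade.db
    rke etcd snapshot-save --config rancher-cluster.yml --name "$snapshot_name"
}
```

For instance, running take_etcd_snapshot upgrade.db before an upgrade would request a snapshot named upgrade.db.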

To Take a One-Time S3 Snapshot:

Available as of RKE v0.2.0

  1. Open Terminal and change directory to the location of the RKE binary. Your rancher-cluster.yml file must reside in the same directory.

  2. Enter the following command. Replace <SNAPSHOT.db> with any name that you want to use for the snapshot (e.g. upgrade.db), and replace the access key, secret key, bucket name, and endpoint with the values for your S3-compatible backend.

    rke etcd snapshot-save \
      --config rancher-cluster.yml \
      --name <SNAPSHOT.db> \
      --s3 \
      --access-key S3_ACCESS_KEY \
      --secret-key S3_SECRET_KEY \
      --bucket-name s3-bucket-name \
      --s3-endpoint s3.amazonaws.com

Result: RKE takes a snapshot of etcd running on each etcd node. The file is saved to /opt/rke/etcd-snapshots. It is also uploaded to the S3 compatible backend.

2. Back up Local Snapshots to a Safe Location

After taking the snapshots, save them to a safe location so that they’re unaffected if your cluster experiences a disaster scenario. This location should be persistent.

Example:
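One possible approach, offered as a sketch rather than the original example, is to copy the local snapshot directory to storage that lives outside the cluster. The function name copy_snapshots and the destination path in the usage note are hypothetical.

```shell
# Hypothetical helper: copy every snapshot file from SRC_DIR into DEST_DIR.
# DEST_DIR should be on storage that survives the loss of the cluster nodes,
# for example a mounted network share.
copy_snapshots() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    # -p preserves modification times, so snapshots remain sortable by age.
    cp -p "$src_dir"/* "$dest_dir"/
}
```

On an etcd node this could be called as copy_snapshots /opt/rke/etcd-snapshots /mnt/backups/etcd, where /mnt/backups/etcd is an assumed mount point for off-cluster storage.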