Backing Up Data

    This page covers backups for YugabyteDB Community Edition (CE). The backup approach differs between the two editions:

    • Export-based backups (YugabyteDB CE)

      • Single-row ACID backups
      • Back up the schema and data separately
      • Multi-threaded parallelism
    • Distributed backups (YugabyteDB EE)

      • Single-tablet ACID backups
      • Backup solution integrated with object stores such as AWS S3
      • Massively parallel, efficient for very large data sets

    To create a backup of the data in YugabyteDB, dump the keyspace schema (optional) and the data in the tables.

    • Backing up schema for one keyspace

    To back up the schema for a particular keyspace, run the cqlsh DESCRIBE KEYSPACE command for that keyspace and redirect its output to a file.

    • Backing up schema for entire cluster

    To back up the schema for all tables across all keyspaces, run the cqlsh DESCRIBE SCHEMA command and redirect its output to a file.
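As a minimal sketch of the two schema backups described above, the functions below wrap the cqlsh DESC KEYSPACE and DESC SCHEMA commands and redirect their output to a file. The function names and file names are illustrative and not part of YugabyteDB. The sketch assumes cqlsh is on the PATH and connects to the default host and port.

```shell
# Hypothetical helpers (names are illustrative, not part of YugabyteDB).

# Dump the schema of one keyspace to a file.
#   $1 = keyspace name, $2 = output file
backup_keyspace_schema() {
  cqlsh -e "DESC KEYSPACE $1" > "$2"
}

# Dump the schema of every keyspace in the cluster to a file.
#   $1 = output file
backup_cluster_schema() {
  cqlsh -e "DESC SCHEMA" > "$1"
}
```

For example, `backup_keyspace_schema myapp myapp_schema.cql` would write the schema of the keyspace myapp to myapp_schema.cql in the current directory.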

    The following command exports all the data from a table in CSV (comma-separated values) format to the specified output file. Each row in the table is written to a separate line in the file, with the column values separated by the delimiter.

    • Backing up all columns of the table

    cqlsh -e "COPY <keyspace>.<table> TO 'data.csv' WITH HEADER = TRUE;"

    • Backing up select columns of the table

    To back up selected columns of the table, specify the column names in a list.

    cqlsh -e "COPY <keyspace>.<table> (<column 1 name>, <column 2 name>, ...) TO 'data.csv' WITH HEADER = TRUE;"

    • Connecting to a remote host and port

    The default host is 127.0.0.1 and the default port is 9042. You can override these values as shown below.

    cqlsh -e <command> <host> [<port>]
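The host and port override can be sketched as a small wrapper around the table export. The function name is hypothetical, and the defaults simply mirror the 127.0.0.1 and 9042 values stated above. Note that cqlsh writes the CSV file on the machine where cqlsh runs, not on the remote node.

```shell
# Hypothetical helper: export one table to CSV from a given node.
#   $1 = keyspace.table, $2 = output CSV file
#   $3 = host (defaults to 127.0.0.1), $4 = port (defaults to 9042)
backup_table_from() {
  cqlsh -e "COPY $1 TO '$2' WITH HEADER = TRUE;" "${3:-127.0.0.1}" "${4:-9042}"
}
```

For example, `backup_table_from myapp.stock_market myapp_data.csv 10.0.0.5` would connect to port 9042 on 10.0.0.5, an address made up for this illustration.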

    The syntax to specify options in the COPY TO command is shown below.

    COPY table_name [( column_list )]
    TO 'file_name'[, 'file2_name', ...] | STDOUT
    [WITH option = 'value' [AND ...]]

    The COPY TO command accepts a number of options that are useful when performing a backup.
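As a sketch of combining options with AND, the function below exports a table with a header row and uses the pipe character instead of a comma as the field separator. HEADER and DELIMITER are standard cqlsh COPY options. The function name is hypothetical.

```shell
# Hypothetical helper: export a table as pipe-delimited text with a header row.
#   $1 = keyspace.table, $2 = output file
backup_table_pipe_delimited() {
  cqlsh -e "COPY $1 TO '$2' WITH HEADER = TRUE AND DELIMITER = '|';"
}
```

A non-comma delimiter can be useful when column values themselves contain commas, although cqlsh also quotes such values when writing CSV.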

    The following example, taken from the quick start section, demonstrates how to perform backups.

    This section assumes you already have a YugabyteDB cluster. If you do not, you can install a local cluster on your laptop by following the quick start section.

    Create a keyspace and a table for the stock ticker app.

    cqlsh> CREATE KEYSPACE myapp;
    cqlsh> CREATE TABLE myapp.stock_market (
             stock_symbol text,
             ts text,
             current_price float,
             PRIMARY KEY (stock_symbol, ts)
           );

    Insert some sample data.

    INSERT INTO myapp.stock_market (stock_symbol,ts,current_price) VALUES ('AAPL','2017-10-26 09:00:00',157.41);
    INSERT INTO myapp.stock_market (stock_symbol,ts,current_price) VALUES ('AAPL','2017-10-26 10:00:00',157);
    INSERT INTO myapp.stock_market (stock_symbol,ts,current_price) VALUES ('FB','2017-10-26 09:00:00',170.63);
    INSERT INTO myapp.stock_market (stock_symbol,ts,current_price) VALUES ('FB','2017-10-26 10:00:00',170.1);
    INSERT INTO myapp.stock_market (stock_symbol,ts,current_price) VALUES ('GOOG','2017-10-26 09:00:00',972.56);
    INSERT INTO myapp.stock_market (stock_symbol,ts,current_price) VALUES ('GOOG','2017-10-26 10:00:00',971.91);

    You can query all 6 inserted rows by running the following command in cqlsh.

    cqlsh> SELECT * FROM myapp.stock_market;

     stock_symbol | ts                  | current_price
    --------------+---------------------+---------------
             GOOG | 2017-10-26 09:00:00 |        972.56
             GOOG | 2017-10-26 10:00:00 |     971.90997
             AAPL | 2017-10-26 09:00:00 |        157.41
             AAPL | 2017-10-26 10:00:00 |           157
               FB | 2017-10-26 09:00:00 |        170.63
               FB | 2017-10-26 10:00:00 |     170.10001

    (6 rows)

    Back up the schema of the keyspace myapp by running the cqlsh DESCRIBE KEYSPACE command for myapp and redirecting its output to the file myapp_schema.cql.

      The schema of the keyspace myapp, along with the tables in it, is saved to the file myapp_schema.cql.

      CREATE KEYSPACE myapp WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
      CREATE TABLE myapp.stock_market (
          stock_symbol text,
          ts text,
          current_price float,
          PRIMARY KEY (stock_symbol, ts)
      ) WITH CLUSTERING ORDER BY (ts ASC)

      Run the following command to back up the data in the table myapp.stock_market.

      $ cqlsh -e "COPY myapp.stock_market TO 'myapp_data.csv' WITH HEADER = TRUE ;"

      All columns of the rows in the table myapp.stock_market are saved to the file myapp_data.csv.

      $ cat myapp_data.csv
      stock_symbol,ts,current_price
      AAPL,2017-10-26 09:00:00,157.41
      AAPL,2017-10-26 10:00:00,157
      FB,2017-10-26 09:00:00,170.63
      FB,2017-10-26 10:00:00,170.10001
      GOOG,2017-10-26 09:00:00,972.56
      GOOG,2017-10-26 10:00:00,971.90997
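A quick sanity check after an export is to compare the number of data rows in the CSV file with the row count reported by the SELECT query. The function below is a minimal sketch of that check using awk. Its name is hypothetical, and it assumes the file was written with HEADER = TRUE and that no column value contains an embedded newline.

```shell
# Hypothetical helper: count the data rows in a CSV backup, skipping the header line.
#   $1 = CSV file written by COPY ... TO ... WITH HEADER = TRUE
# Assumes no column value contains an embedded newline.
count_backup_rows() {
  awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}
```

For the myapp_data.csv file shown above, `count_backup_rows myapp_data.csv` prints 6, which matches the 6 rows returned by the query.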

      To back up a subset of columns, specify them in the backup command. In the example below, the stock_symbol and ts columns are backed up, while the current_price column is not.

      $ cqlsh -e "COPY myapp.stock_market (stock_symbol, ts) TO 'myapp_data_partial.csv' WITH HEADER = TRUE ;"

      $ cat myapp_data_partial.csv
      stock_symbol,ts
      AAPL,2017-10-26 09:00:00
      AAPL,2017-10-26 10:00:00
      FB,2017-10-26 09:00:00
      FB,2017-10-26 10:00:00
      GOOG,2017-10-26 09:00:00