Altering tables

    and DROP COLUMN actions are not currently supported. See SQL compatibility.

    The parameters of a table can be modified using the ALTER TABLE clause:

    In order to set a parameter to its default value use reset:

    1. cr> alter table my_table reset (number_of_replicas);
    2. ALTER OK, -1 rows affected (... sec)

    Changing the number of shards in general works in the following steps.

    1. A new target table is created but with more/less number of primary shards.

    2. The segments from the source table (the underling Lucene index to be precise) are hard-linked into the target table at file system level.

    3. The source table is dropped while the new table is renamed into the source and then recovered in the cluster.

    Note

    Segment hard-linking makes this operation relevantly cheap as it involves no data copying. If the file system, however, does not support hard-linking, then all segments will be copied into the new table, resulting in much more time and resource consuming operation.

    To change the number of primary shards of a table, it is necessary to first satisfy certain conditions.

    Decreasing the number of shards

    To decrease the number of shards, it is necessary to ensure the following two conditions:

    First, a (primary or replica) copy of every shard of the table must be present on the same node. The user can choose the most suitable node for this operation and then restrict table on that node using the shard allocation filtering.

    1. cr> alter table my_table set ("blocks.write" = true);

    Afterwards the number of shards can be decreased:

    1. cr> alter table my_table set (number_of_shards = 1);
    2. ALTER OK, 0 rows affected (... sec)

    The user should then revert the restrictions applied on the table, for instance

    1. cr> alter table my_table reset ("routing.allocation.require._name", "blocks.write");
    2. ALTER OK, -1 rows affected (... sec)

    It is necessary to use a factor of the current number of primary shards as the target number of shards. For example, a table with 8 shards can be shrunk into 4, 2 or 1 primary shards.

    Increase the number of shards

    Increasing the number of shards is limited to tables which have been created with a number_of_routing_shards setting. For such tables the shards can be increased by a factor that depends on this setting. For example, a table with 5 shards, with number_of_routing_shards set to 20 can be changed to have either 10 or 20 shards. (5 x 2 (x 2)) = 20 or (5 x 4) = 20.

    The only condition required for increasing the number of shards is to block operations to the table:

    Afterwards, the table shards can be increased:

    1. cr> alter table my_table set (number_of_shards = 2);
    2. ALTER OK, 0 rows affected (... sec)

    Similarly, the user should revert the restrictions applied on the table, for instance:

    1. cr> alter table my_table set ("blocks.write" = false);

    Read to see how to alter parameters of partitioned tables.

    In order to add a column to an existing table use ALTER TABLE with the ADD COLUMN clause:

    1. cr> alter table my_table add column new_column_name text;
    2. ALTER OK, -1 rows affected (... sec)

    The inner schema of object columns can also be extended, as shown in the following example.

    First a column of type object is added:

    1. cr> alter table my_table add column obj_column object as (age int);
    2. ALTER OK, -1 rows affected (... sec)

    And now a nested column named name is added to the obj_column:

    1. cr> select column_name, data_type from information_schema.columns
    2. ... where table_name = 'my_table' and column_name like 'obj_%';
    3. | column_name | data_type |
    4. +--------------------+-----------+
    5. | obj_column | object |
    6. | obj_column['age'] | integer |
    7. | obj_column['name'] | text |
    8. SELECT 3 rows in set (... sec)
    1. cr> alter table my_table close;
    2. ALTER OK, -1 rows affected (... sec)

    Closing a table will cause all operations beside ALTER TABLE ... OPEN to fail.

    A table can be reopened again by using ALTER TABLE with the OPEN clause:

    1. cr> alter table my_table open;
    2. ALTER OK, -1 rows affected (... sec)

    Note

    This setting is not the same as blocks.read_only. Closing and opening a table will preserve these settings if they are already set.

    A table can be renamed by using ALTER TABLE with the RENAME TO clause:

    1. cr> alter table my_table rename to my_new_table;
    2. ALTER OK, -1 rows affected (... sec)

    During the rename operation the shards of the table become temporarily unavailable.

    With the REROUTE command it is possible to control the of shards. This gives you the ability to re-balance the cluster state manually. The supported reroute options are listed in the reference documentation of ALTER TABLE REROUTE.

    Shard rerouting can help solve several problems:

    This command takes these into account. Once an allocation occurs CrateDB tries (by default) to re-balance shards to an even state. CrateDB can be set to disable shard re-balancing with the setting cluster.routing.rebalance.enable=None to perform only the explicit triggered allocations. .

    Note

    In those two cases it may be necessary to move shards manually to another node or force the retry of the allocation process.