Lexical structure

    The syntax of a command defines its set of valid tokens. A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol.

    Table of contents

    String literals are defined as an arbitrary sequence of characters that are delimited with single quotes ' as defined in ANSI SQL, for example 'This is a string'.

    The escape character in CrateDB is the single-quote '. A character gets escaped when adding a single-quote before it. For example a single quote character within a string literal can be included by writing two adjacent single quotes, e.g., 'Jack''s car'.

    Note

    Two adjacent single quotes are not equivalent to the double-quote character ".

    In addition to the escaped character ', CrateDB supports C-Style escaped string sequences. Such a sequence is constructed by prefixing the string literal with the letter E or e, for example, e'hello\nWorld'. The following escaped sequences are supported:

    For instance, the escape string literal is equivalent to the 'aaa' string literal.

    Any other character following a backslash is taken literally. Thus, to include a backslash character \, two adjacent backslashes need to be used (i.e. \\).

    1. +--------+
    2. | col1 |
    3. +--------+
    4. | aa\nbb |
    5. +--------+
    6. SELECT 1 row in set (... sec)

    Finally, a single quote can be included in an escape string literal by also using the escape backslash character: \', in addition to the single-quote described in the string literals section.

    1. cr> select e'aa\'bb' as col1;
    2. +-------+
    3. | col1 |
    4. +-------+
    5. | aa'bb |
    6. +-------+

    The table below lists all reserved key words in CrateDB. These need to be quoted if used as identifiers:

    Tokens such as my_table, id, name, or data in the example below are identifiers, which identify names of tables, columns, and other database objects.

    Example:

    1. CREATE TABLE my_table (
    2. id INTEGER,
    3. data OBJECT
    4. ) WITH (number_of_replicas = 0);

    Key words and unquoted identifiers are case insensitive.

    This means that:

    1. select foo from t;

    is equivalent to:

    or:

    1. select FOO from t;

    A widely used convention is to write key words in uppercase and identifiers in lowercase, such as

    1. ALTER TABLE foo ADD COLUMN new_column INTEGER;

    Quoted identifiers can contain an arbitrary sequence of characters enclosed by double quotes ("). Quoted identifiers are never keywords, so you can use "update" as a table or column name.

    Some non-alphanumeric characters do have a special meaning. For their usage please refer to the sections where the respective syntax elements are described.

    Semicolon

    The semicolon (;) terminates an SQL statement. It cannot appear anywhere else within the command, except within a string or quoted identifier.

    Comma

    The comma (,) is used in various syntactical elements to separate elements of a list.

    Brackets

    Square brackets ([]) are used to select elements of arrays and objects, e.g. arr[1] or obj['key'].

    Asterisk

    The asterisk (*) is used in some contexts to denote all columns of a table. As an argument in global it has the meaning of any field, e.g. COUNT(*).

    Period

    The period (.) is used for numeric values and to separate schema and table names, e.g. blob.my_blob_table.

    An SQL statement can contain comments. Single line comments start with a double dash (--) and end at the end of that line. Multi line comments start with /* and end with */.

    1. /*
    2. * Retrieve information about all tables in the 'doc' schema.
    3. */
    4. SELECT *