Lexical structure
The syntax of a command defines its set of valid tokens. A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol.
Table of contents
String literals are defined as an arbitrary sequence of characters that are delimited with single quotes '
as defined in ANSI SQL, for example 'This is a string'
.
The escape character in CrateDB is the single-quote '
. A character gets escaped when adding a single-quote before it. For example a single quote character within a string literal can be included by writing two adjacent single quotes, e.g., 'Jack''s car'
.
Note
Two adjacent single quotes are not equivalent to the double-quote character "
.
In addition to the escaped character '
, CrateDB supports C-Style escaped string sequences. Such a sequence is constructed by prefixing the string literal with the letter E
or e
, for example, e'hello\nWorld'
. The following escaped sequences are supported:
For instance, the escape string literal is equivalent to the 'aaa'
string literal.
Any other character following a backslash is taken literally. Thus, to include a backslash character \
, two adjacent backslashes need to be used (i.e. \\
).
+--------+
| col1 |
+--------+
| aa\nbb |
+--------+
SELECT 1 row in set (... sec)
Finally, a single quote can be included in an escape string literal by also using the escape backslash character: \'
, in addition to the single-quote described in the string literals section.
cr> select e'aa\'bb' as col1;
+-------+
| col1 |
+-------+
| aa'bb |
+-------+
The table below lists all reserved key words in CrateDB. These need to be quoted if used as identifiers:
Tokens such as my_table
, id
, name
, or data
in the example below are identifiers, which identify names of tables, columns, and other database objects.
Example:
CREATE TABLE my_table (
id INTEGER,
data OBJECT
) WITH (number_of_replicas = 0);
Key words and unquoted identifiers are case insensitive.
This means that:
select foo from t;
is equivalent to:
or:
select FOO from t;
A widely used convention is to write key words in uppercase and identifiers in lowercase, such as
ALTER TABLE foo ADD COLUMN new_column INTEGER;
Quoted identifiers can contain an arbitrary sequence of characters enclosed by double quotes ("
). Quoted identifiers are never keywords, so you can use "update"
as a table or column name.
Some non-alphanumeric characters do have a special meaning. For their usage please refer to the sections where the respective syntax elements are described.
Semicolon
The semicolon (;
) terminates an SQL statement. It cannot appear anywhere else within the command, except within a string or quoted identifier.
Comma
The comma (,
) is used in various syntactical elements to separate elements of a list.
Brackets
Square brackets ([]
) are used to select elements of arrays and objects, e.g. arr[1]
or obj['key']
.
Asterisk
The asterisk (*
) is used in some contexts to denote all columns of a table. As an argument in global it has the meaning of any field, e.g. COUNT(*)
.
Period
The period (.
) is used for numeric values and to separate schema and table names, e.g. blob.my_blob_table
.
An SQL statement can contain comments. Single line comments start with a double dash (--
) and end at the end of that line. Multi line comments start with /*
and end with */
.
/*
* Retrieve information about all tables in the 'doc' schema.
*/
SELECT *