7.3. Cassandra Connector
The connector is compatible with all Cassandra versions starting from 2.1.5.
Configuration
To configure the Cassandra connector, create a catalog properties file with the following contents, replacing host1,host2 with a comma-separated list of the Cassandra nodes used to discover the cluster topology:
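A minimal sketch of such a file follows. The host names host1 and host2 are placeholders, and the file name etc/catalog/cassandra.properties is an assumption, chosen so that the resulting catalog is named cassandra:

```properties
# etc/catalog/cassandra.properties (file name is an assumption)
connector.name=cassandra
# Placeholder contact points; replace with real Cassandra node addresses
cassandra.contact-points=host1,host2
```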
You will also need to set cassandra.native-protocol-port if your Cassandra nodes are not using the default port (9042).
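For instance, if the nodes listened on port 9142 (a hypothetical value used only for illustration), the catalog file might additionally contain:

```properties
# Hypothetical non-default native protocol port
cassandra.native-protocol-port=9142
```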
You can have as many catalogs as you need, so if you have additional Cassandra clusters, simply add another properties file to etc/catalog with a different name (making sure it ends in .properties). For example, if you name the property file sales.properties, Presto will create a catalog named sales using the configured connector.
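As a sketch, assuming a second cluster reachable at the placeholder hosts sales-host1 and sales-host2, the file etc/catalog/sales.properties could look like this:

```properties
# etc/catalog/sales.properties
connector.name=cassandra
# Placeholder contact points for the second (hypothetical) cluster
cassandra.contact-points=sales-host1,sales-host2
```

Once Presto has loaded the file, the catalog should be addressable by its name, for example with SHOW SCHEMAS FROM sales.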
Note
If authorization is enabled, the user specified in cassandra.username must have permission to perform SELECT queries on the system.size_estimates table.
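One way to grant this permission from cqlsh is sketched below. The name presto_user is hypothetical and stands for whatever user is configured in cassandra.username; the exact GRANT syntax may differ slightly between Cassandra versions:

```sql
-- Run in cqlsh as a superuser; presto_user is a hypothetical user name
GRANT SELECT ON TABLE system.size_estimates TO presto_user;
```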
The following advanced configuration properties are available:
Querying Cassandra Tables
The users table is an example Cassandra table from the Cassandra Getting Started guide. It can be created along with the mykeyspace keyspace using Cassandra’s cqlsh (CQL interactive terminal):
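A minimal sketch of the CQL statements, assuming the column layout used in the Getting Started guide (user_id as the primary key, plus fname and lname). The replication settings and the sample row are illustrative and suit only a single-node test cluster:

```sql
-- Keyspace with a single replica, suitable only for a local test cluster
CREATE KEYSPACE mykeyspace
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

USE mykeyspace;

-- user_id is the primary key and therefore also the partition key
CREATE TABLE users (
  user_id int PRIMARY KEY,
  fname text,
  lname text
);

-- Illustrative sample row
INSERT INTO users (user_id, fname, lname) VALUES (1745, 'john', 'smith');
```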
This table can then be queried in Presto:
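For example, assuming the catalog file is named cassandra.properties so that the catalog is called cassandra:

```sql
-- catalog.schema.table: the Cassandra keyspace appears as a Presto schema
SELECT * FROM cassandra.mykeyspace.users;
```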
The data type mappings are as follows:
Any collection (LIST/MAP/SET) can be designated as FROZEN, and the value is mapped to VARCHAR. Additionally, blobs have the limitation that they cannot be empty.
Types not mentioned in the table above are not supported (e.g. tuple or UDT).
Limitations
- Queries without a filter on the partition key fetch all partitions. This causes a full scan of the entire data set and is therefore much slower than a similar query that filters on the partition key.
- Range (< or > and BETWEEN) filters can be applied only to the partition keys.
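The queries below illustrate these limitations. They are a sketch that assumes a catalog named cassandra and a users table in mykeyspace whose partition key is user_id and which has an lname column; the literal values are made up:

```sql
-- Equality filter on the partition key: avoids fetching all partitions
SELECT * FROM cassandra.mykeyspace.users WHERE user_id = 1745;

-- No filter on the partition key: all partitions are fetched (full scan)
SELECT * FROM cassandra.mykeyspace.users WHERE lname = 'smith';

-- Range filter on the partition key, the only kind of column it applies to
SELECT * FROM cassandra.mykeyspace.users WHERE user_id BETWEEN 1000 AND 2000;
```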