Ingesting data
Rather than inserting data one row or metric at a time, users often build data ingest pipelines with existing third-party tools that increase ingest rates by performing batch writes into TimescaleDB. At a high level, TimescaleDB looks just like PostgreSQL, so any tool that can read from or write to PostgreSQL also works with TimescaleDB.
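For example, a minimal batching client might look like the following sketch using Python and psycopg2. The connection string and the conditions hypertable with its columns are placeholders for illustration:

```python
# A minimal sketch of batched ingest with psycopg2. The table and column
# names (conditions, time, device_id, temperature) are illustrative.
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("postgres://user:password@localhost:5432/tsdb")

rows = [
    ("2023-01-01 00:00:00+00", "device_1", 20.1),
    ("2023-01-01 00:00:10+00", "device_1", 20.3),
    ("2023-01-01 00:00:20+00", "device_2", 19.8),
]

with conn, conn.cursor() as cur:
    # execute_values sends all rows in a single INSERT statement,
    # which is far faster than issuing one INSERT per row.
    execute_values(
        cur,
        "INSERT INTO conditions (time, device_id, temperature) VALUES %s",
        rows,
    )
conn.close()
```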
Below, we discuss some popular frameworks and systems used in conjunction with TimescaleDB.
Prometheus

Prometheus is a popular tool for monitoring infrastructure metrics. It can scrape any endpoint that exposes metrics in a Prometheus-compatible format. The metrics are stored in Prometheus and can be queried using PromQL. Prometheus itself is not built for long-term metrics storage; instead, it supports a variety of remote storage solutions.
We developed Promscale, which allows Prometheus to use TimescaleDB as a remote store for long-term metrics. Promscale supports both PromQL and SQL: PromQL queries can be directed to the Promscale endpoint or to the Prometheus instance, and the data can be accessed with full SQL by connecting to TimescaleDB directly. Promscale also offers native time-series capabilities, such as automatically compressing your data, retention policies, continuous aggregate views, downsampling, data gap-filling, and interpolation. It is already natively supported by Grafana via the Prometheus and PostgreSQL/TimescaleDB data sources.
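As a rough sketch, pointing Prometheus at Promscale only requires remote-write and remote-read entries in prometheus.yml. The promscale hostname and port 9201 below are assumptions for illustration; check your deployment's actual endpoint:

```yaml
# prometheus.yml (fragment): scrape targets as usual, then forward
# samples to Promscale, which stores them in TimescaleDB.
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]   # e.g. a node_exporter instance

remote_write:
  - url: "http://promscale:9201/write"
remote_read:
  - url: "http://promscale:9201/read"
    read_recent: true
```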
Telegraf

Telegraf is an agent that collects, processes, aggregates, and writes metrics. Since it is plugin-driven for both the collection and the output of data, it is easily extendable. In fact, it already contains over 200 plugins for gathering and writing different types of data.
We wrote the PostgreSQL output plugin for Telegraf, which can also send data to a TimescaleDB hypertable. Telegraf handles batching, processing, and aggregating the collected data before inserting it into TimescaleDB.
The PostgreSQL plugin extends Telegraf's ease of use by handling schema generation and modification: as Telegraf collects metrics, the plugin creates the target table if it doesn't exist and alters it if the schema has changed. By default, the plugin uses a wide model, which is the schema model TimescaleDB users most often choose when storing metrics. However, you can instead store metrics in a narrow model with a separate metadata table and foreign keys, or store them as JSONB.
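As an illustration, a minimal telegraf.conf output section might look like the sketch below. The exact option names have varied across plugin versions, so treat them as assumptions and consult the plugin's README for your release:

```toml
# telegraf.conf (fragment): collect CPU metrics and write them to
# TimescaleDB through the PostgreSQL output plugin.
[[inputs.cpu]]

[[outputs.postgresql]]
  ## libpq-style connection string; host and credentials are placeholders.
  connection = "host=localhost user=telegraf password=secret dbname=metrics"

  ## Uncomment to store tags in a separate metadata table joined by
  ## foreign keys (the narrow layout) instead of the default wide model.
  # tags_as_foreign_keys = true
```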
PostgreSQL Kafka connector

Another popular method of ingesting data into TimescaleDB is the PostgreSQL connector used with Kafka Connect. The connector is designed to work with Kafka and to be deployed to a Kafka Connect runtime service. Its purpose is to ingest change events from PostgreSQL databases (including TimescaleDB).
The deployed connector monitors one or more schemas within a TimescaleDB server and writes all change events to Kafka topics, which can be independently consumed by one or more clients. Kafka Connect can be run in distributed mode for fault tolerance, ensuring that the connectors are running and continually keeping up with changes in the database.
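For illustration, such a connector is typically registered by POSTing a JSON configuration to the Kafka Connect worker's /connectors REST endpoint. The sketch below assumes the Debezium PostgreSQL connector and placeholder connection details; adjust the class name and properties to the connector and version you actually deploy:

```json
{
  "name": "timescaledb-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "localhost",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "password",
    "database.dbname": "tsdb",
    "database.server.name": "tsdb_server",
    "schema.include.list": "public"
  }
}
```

Once registered, change events appear on Kafka topics conventionally named after the server, schema, and table, and any number of clients can consume them independently.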
TIP: The PostgreSQL connector can also be used as a library without Kafka or Kafka Connect, enabling applications and services to connect directly to TimescaleDB and obtain the ordered change events. This approach requires the application to record the connector's progress so that, upon restart, the connector can continue where it left off. This may be useful for less critical use cases; for production use cases, however, we recommend using the connector with Kafka and Kafka Connect.
To start using the PostgreSQL connector, visit its GitHub page. If you are interested in an alternative method of ingesting data from Kafka into TimescaleDB, see this tutorial.