Overview
To handle the diversity of data sources, InLong Agent abstracts multiple data sources into a unified source concept and abstracts the write side into sinks. To onboard a new data source, you only need to configure its format and reading parameters to achieve efficient reading.
An InLong Agent task serves as a data collection framework built on a channel + plug-in architecture. Reading from and writing to a data source are abstracted into Reader and Writer plug-ins, which are then integrated into the framework.
- Reader: Reader is the data collection module, responsible for collecting data from the data source and sending the data to the channel.
- Writer: Writer is the data writing module, which continuously fetches data from the channel and writes it to the destination.
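The Reader, channel, and Writer relationship described above can be sketched as a bounded queue shared by two threads. This is a minimal illustration in Python, not InLong Agent's actual code: `ListReader`, `ListWriter`, `run_task`, and the `_END` sentinel are hypothetical names introduced only for this example.

```python
import queue
import threading

# Sentinel marking the end of the stream (an assumption of this sketch).
_END = object()


class ListReader:
    """Hypothetical reader: collects records from an in-memory 'data source'."""

    def __init__(self, records):
        self.records = records

    def run(self, channel):
        for record in self.records:
            channel.put(record)  # blocks when the channel is full
        channel.put(_END)


class ListWriter:
    """Hypothetical writer: drains the channel into an in-memory 'destination'."""

    def __init__(self):
        self.written = []

    def run(self, channel):
        while True:
            item = channel.get()  # blocks until the reader sends something
            if item is _END:
                break
            self.written.append(item)


def run_task(reader, writer, capacity=2):
    """Connect one reader and one writer through a bounded channel."""
    channel = queue.Queue(maxsize=capacity)
    reader_thread = threading.Thread(target=reader.run, args=(channel,))
    writer_thread = threading.Thread(target=writer.run, args=(channel,))
    reader_thread.start()
    writer_thread.start()
    reader_thread.join()
    writer_thread.join()
    return writer.written
```

Because the two sides only share the channel, a new data source in this sketch needs only a new reader class; the writer and the task wiring stay unchanged.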
File collection supports the following features:
- User-configured path monitoring: the Agent monitors the configured paths and detects newly created files.
- Directory regular-expression filtering: path configuration supports YYYYMMDD combined with regular expressions.
- Breakpoint retransmission: when InLong Agent restarts, it automatically resumes from the last read position, so that no data is read twice or missed.
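Resuming from the last read position can be illustrated by persisting a byte offset per file. The following is a minimal sketch under stated assumptions, not InLong Agent's implementation: the function name `read_new_lines` and the JSON offset file are hypothetical choices made for this example.

```python
import json
import os


def read_new_lines(path, offset_file):
    """Return complete lines appended to `path` since the last call.

    The byte offset of the last fully read line is stored in `offset_file`,
    so a restarted process continues where the previous one stopped.
    """
    offsets = {}
    if os.path.exists(offset_file):
        with open(offset_file, "r", encoding="utf-8") as f:
            offsets = json.load(f)

    position = offsets.get(path, 0)
    lines = []
    with open(path, "rb") as f:
        f.seek(position)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                # End of file or a partially written line: stop without
                # advancing, so the partial line is re-read next time.
                break
            lines.append(line.rstrip(b"\n").decode("utf-8"))
            position = f.tell()

    # Persist the new position only after the lines were collected.
    offsets[path] = position
    with open(offset_file, "w", encoding="utf-8") as f:
        json.dump(offsets, f)
    return lines
```

Calling the function twice without new data returns an empty list on the second call, which is the "no reread" half of the guarantee; advancing the offset only past complete lines covers the "no missed reading" half for this simplified case.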
Binlog collection reads the binlog and restores the data by configuring a MySQL slave.
- The binlog is parsed by multiple threads, so the parsed data must be labeled with its order.
- The code is based on the old version of dbsync; the main modification is that data is no longer sent through tdbus-sender but pushed to agent-channel for integration.
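The ordering concern with multi-threaded parsing can be sketched by tagging each raw event with a sequence number before handing it to worker threads, then sorting by that number afterwards. This is an illustrative sketch only: `parse_event` is a hypothetical stand-in for real binlog parsing, and the approach shown is not taken from the dbsync code.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def parse_event(seq, raw):
    """Hypothetical stand-in for parsing one binlog event.

    The sequence number travels with the result so order can be restored.
    """
    return seq, raw.upper()


def parse_in_order(raw_events, workers=4):
    """Parse events in parallel and return results in original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(parse_event, seq, raw)
            for seq, raw in enumerate(raw_events)
        ]
        # Results may complete in any order across threads.
        results = [future.result() for future in as_completed(futures)]

    # Restore the original binlog order using the sequence labels.
    results.sort(key=lambda pair: pair[0])
    return [parsed for _, parsed in results]
```

A streaming implementation would more likely buffer out-of-order results and release them as soon as the next expected sequence number arrives; sorting a finished batch is used here only to keep the example short.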