Pulsar Example
Please refer to the official installation guidelines.
Install Hive
Hive is a required component. If Hive is not installed on your machine, we recommend installing it with Docker. Details can be found here.
- Install InLong with Docker according to the instructions (recommended).
- Install the InLong binaries according to the instructions here.
Create a data ingestion
When creating a data ingestion, select Pulsar as the message middleware for the data stream group. The other Pulsar-related configuration items are:
- Write quorum: number of copies to store for each message
- Ack quorum: number of guaranteed copies (acks to wait for before a write is complete)
- Retention time: how long consumed messages are retained
- TTL: the default time to live for messages
- Retention size: how much consumed message data is retained
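The two quorum settings depend on each other: a write cannot wait for more acknowledgements than there are copies, so the ack quorum should not exceed the write quorum. A minimal sketch of that sanity check follows. The function name `check_quorums` is hypothetical and is not part of InLong or Pulsar.

```python
def check_quorums(write_quorum: int, ack_quorum: int) -> list[str]:
    """Return a list of problems with the quorum settings (empty if they look sane).

    Hypothetical helper for illustration only; InLong and Pulsar perform
    their own validation when the data stream group is submitted.
    """
    problems = []
    if ack_quorum < 1:
        problems.append("ack quorum must be at least 1")
    if write_quorum < ack_quorum:
        # Cannot wait for more acks than the number of copies written.
        problems.append("write quorum must be >= ack quorum")
    return problems
```

For example, `check_quorums(3, 2)` returns an empty list, while `check_quorums(2, 3)` reports that the write quorum is too small.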
Go to the Approval page, click My Approval, and approve the data ingestion application. Once the approval completes, the topics and subscriptions required for the data stream are created synchronously in the Pulsar cluster. You can use the command-line tool in the Pulsar cluster to check whether the topic was created successfully.
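One way to script the topic check is to wrap `pulsar-admin topics list`, as sketched below. The topic name is the one from the Troubleshooting example on this page, and the `bin/pulsar-admin` path is an assumption about where the tool lives in your Pulsar installation. The helper names are hypothetical. The match also accepts `-partition-N` entries, in case the topic was created as a partitioned topic.

```python
import subprocess


def namespace_of(topic: str) -> str:
    """Return 'tenant/namespace' from 'persistent://tenant/namespace/name'."""
    _scheme, sep, rest = topic.partition("://")
    parts = rest.split("/")
    if not sep or len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected topic name: {topic!r}")
    return f"{parts[0]}/{parts[1]}"


def list_topics_command(topic: str, admin: str = "bin/pulsar-admin") -> list[str]:
    """Build the pulsar-admin call that lists topics in the topic's namespace."""
    return [admin, "topics", "list", namespace_of(topic)]


def topic_in_listing(topic: str, listing: str) -> bool:
    """Check pulsar-admin output for the topic or one of its partitions."""
    for raw in listing.splitlines():
        name = raw.strip().strip('"')  # some versions print quoted names
        if name == topic or name.startswith(topic + "-partition-"):
            return True
    return False


def topic_exists(topic: str, admin: str = "bin/pulsar-admin") -> bool:
    """Run pulsar-admin (path is an assumption) and look for the topic."""
    result = subprocess.run(
        list_topics_command(topic, admin),
        capture_output=True, text=True, check=True,
    )
    return topic_in_listing(topic, result.stdout)
```

Calling `topic_exists("persistent://public/b_test_group/test_stream")` from the Pulsar installation directory should return `True` once the approval has gone through.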
Configure File Agent
Then create a new file, /data/collect-data/test.log, and add content to it to trigger the agent to send data to the DataProxy.
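If you would rather generate the test content from a script, the sketch below appends lines to the log file, creating the directory if needed. The helper name `append_test_lines` and the sample line contents are hypothetical. Use whatever record format your data stream expects.

```python
from pathlib import Path


def append_test_lines(path: str, lines) -> int:
    """Append each line to the file at `path` and return how many were written."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    # Open in append mode so existing content is kept.
    with target.open("a", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line.rstrip("\n") + "\n")
            written += 1
    return written
```

For example, `append_test_lines("/data/collect-data/test.log", ["hello inlong", "second line"])` adds two lines to the file the agent is watching.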
You can then open the Audit Data page and confirm that the data has been collected and sent successfully.
Finally, log in to the Hive cluster and use Hive SQL commands to check whether the data was successfully inserted into the table.
Troubleshooting
- Check whether the topic information corresponding to the data stream is correctly written in the conf/topics.properties file, for example:
b_test_group/test_stream=persistent://public/b_test_group/test_stream
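This check can also be scripted. The sketch below parses the contents of the properties file and compares the entry for a group and stream against the naming pattern visible in the example line above. The pattern `persistent://public/<group>/<stream>` is inferred from that single example and may not hold for every deployment. All function names are hypothetical.

```python
def parse_topics(text: str) -> dict[str, str]:
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            mapping[key.strip()] = value.strip()
    return mapping


def expected_topic(group_id: str, stream_id: str, tenant: str = "public") -> str:
    """Topic name following the pattern of the example line (an assumption)."""
    return f"persistent://{tenant}/{group_id}/{stream_id}"


def check_entry(text: str, group_id: str, stream_id: str) -> bool:
    """True if the file maps 'group/stream' to the expected topic name."""
    entry = parse_topics(text).get(f"{group_id}/{stream_id}")
    return entry == expected_topic(group_id, stream_id)
```

Read the file with `open("conf/topics.properties").read()` and pass the text to `check_entry(text, "b_test_group", "test_stream")`. A `False` result means the entry is missing or points at a different topic.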