Installation and Deployment

Before you continue, you might want to compile Doris by following the instructions in the Compile topic.

Doris, as an open source OLAP database with an MPP architecture, can run on most mainstream commercial servers. To take full advantage of the high concurrency and high availability of Doris, we recommend that your machines meet the following requirements:

Linux Operating System Version Requirements

Software Requirements

OS Installation Requirements

Set the maximum number of open file descriptors in the system

Clock synchronization

Doris metadata requires clocks to agree to within 5000 ms, so every machine in every cluster must synchronize its clock to avoid service exceptions caused by metadata inconsistencies.
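As a rough pre-flight check, the 5000 ms bound can be tested by comparing epoch-millisecond timestamps taken on two hosts. The sketch below is only an illustration and not a Doris tool: the function name `clock_skew_ok` and the host name `be_host` are hypothetical, `date +%s%3N` assumes GNU date, and the SSH round trip inflates the measured difference.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Doris): succeed when two
# epoch-millisecond timestamps differ by less than 5000 ms.
clock_skew_ok() {
  local diff=$(( $1 - $2 ))
  diff=${diff#-}            # strip a leading minus sign to get the absolute value
  (( diff < 5000 ))
}

# Example usage (assumes GNU date and SSH access to a placeholder host be_host):
#   clock_skew_ok "$(date +%s%3N)" "$(ssh be_host date +%s%3N)" && echo "clock skew OK"
```

In practice, running an NTP client on every node is the usual way to keep clocks within this bound; a check like this only confirms the result.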

Close the swap partition

The Linux swap partition can cause serious performance problems for Doris, so you need to disable the swap partition before installation.
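A minimal sketch for verifying that swap is off, assuming a Linux host where active swap areas are listed in /proc/swaps. `swap_disabled` is a hypothetical helper name; `swapoff -a` is the standard command that turns swap off until the next reboot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when no swap area is active.
# Reads /proc/swaps by default; the first line of that file is a header,
# so any further line means an active swap area.
swap_disabled() {
  local file=${1:-/proc/swaps}
  awk 'NR > 1 { active = 1 } END { exit active + 0 }' "$file"
}

# Example usage (disabling swap requires root):
#   swap_disabled || sudo swapoff -a
```

To keep swap off across reboots, the swap entries in /etc/fstab also need to be commented out.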

Linux file system

When installing the operating system, we recommend that you select the ext4 file system.

Development Test Environment

Production Environment

We usually recommend 10 to 100 machines to make full use of Doris’ performance (deploy FE on 3 of them for high availability and BE on the rest).
The performance of Doris is positively correlated with the number of nodes and their configuration. With a minimum of four machines (one FE and three BEs, with one Observer FE co-located on one of the BEs to provide metadata backup) and relatively low configuration, Doris can still run smoothly.
In a hybrid deployment of FE and BE, watch for resource competition and make sure that the metadata directory and the data directory are on different disks.

Broker Deployment

Broker is a process for accessing external data sources, such as HDFS. Deploying one Broker instance on each machine is usually enough.

Network Requirements

Doris instances communicate directly over the network. The following table shows all required ports.

IP Binding

A host may have several IP addresses, for example because it has multiple network cards or because Docker or a similar environment created virtual network cards. Doris currently does not identify the usable IP automatically, so when the deployment host has multiple IPs, you must specify the correct one through the priority_networks configuration item.

priority_networks is a configuration item of both FE and BE, written in fe.conf and be.conf respectively. It tells the process which IP to bind when FE or BE starts. For example:

priority_networks=10.1.3.0/24

The value is written in CIDR notation. FE or BE uses the local IP that matches this range as its own IP.
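To see which address a given priority_networks value would select, the CIDR match can be reproduced by hand. The following is a minimal IPv4-only sketch; `ip_to_int` and `ip_in_cidr` are hypothetical helper names, and the code illustrates CIDR matching in general rather than the selection logic inside Doris.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed when IPv4 address $1 falls inside CIDR block $2 (for example 10.1.3.0/24).
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  # Build the network mask from the prefix length.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}

# Example usage:
#   ip_in_cidr 10.1.3.15 10.1.3.0/24 && echo "would match priority_networks"
```

Running this check against each address of the host before starting FE or BE shows whether exactly one address falls in the configured range.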

Note: Configuring priority_networks and starting FE or BE only ensure the correct IP binding of FE or BE. You also need to specify the same IP in ADD BACKEND or ADD FRONTEND statements, otherwise the cluster cannot be created. For example:

BE is configured with priority_networks=10.1.3.0/24.

If you use the following IP in the ADD BACKEND statement: ALTER SYSTEM ADD BACKEND "192.168.0.1:9050";

Then FE and BE will not be able to communicate properly.

At this point, you must DROP the wrong BE configuration and use the correct IP to perform ADD BACKEND.

The same applies to FE.

Broker currently does not have the priority_networks configuration item, nor does it need one. Broker’s services are bound to 0.0.0.0 by default, so you only need to specify a reachable Broker IP in the ADD BROKER statement.

Table Name Case Sensitivity

By default, table names in Doris are case-sensitive. If you need to change this, you must do so before cluster initialization; table name case sensitivity cannot be changed after initialization is complete.

See the lower_case_table_names section in Variables for details.

Deploy FE

Copy the FE deployment file into the specified node

Find the fe folder in the output directory generated by source code compilation and copy it to the deployment path on each FE node.

Configure FE

1. The configuration file is conf/fe.conf. Note: meta_dir indicates the metadata storage location. The default value is ${DORIS_HOME}/doris-meta. The directory must be created manually.

   Note: For production environments, it is better to place this directory on a separate disk (ideally an SSD) rather than under the Doris installation directory; for test and development environments, you may use the default configuration.
2. The default maximum Java heap size set by JAVA_OPTS in fe.conf is 4 GB. For production environments, we recommend raising it to 8 GB or more.

Start FE

bin/start_fe.sh --daemon

The FE process starts and runs in the background. Logs are stored in the log/ directory by default. If startup fails, check log/fe.log or log/fe.out for error messages.

For details about deploying multiple FEs, see the FE scaling section.

Deploy BE

Modify all BE configurations

Modify be/conf/be.conf. The main item to configure is storage_root_path, the data storage directory. The default is be/storage, and the directory must be created manually. Use ; to separate multiple paths (do not add ; after the last directory).

You may specify the storage medium of each directory in the path: HDD or SSD. You may also append a capacity limit to each path, separated by a comma. Unless you use a mix of SSD and HDD disks, you do not need to follow the configuration methods in Example 1 and Example 2 below; you only need to specify the storage directory, and you do not need to modify the default storage medium configuration of FE either.

Example 1:

Note: For SSD disks, append .SSD to the directory name; for HDD disks, append .HDD.

`storage_root_path=/home/disk1/doris.HDD;/home/disk2/doris.SSD;/home/disk2/doris`

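Description: /home/disk1/doris.HDD is stored on HDD; /home/disk2/doris.SSD is stored on SSD; /home/disk2/doris has no suffix, so it uses the default storage medium (HDD).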

Example 2:

Note: You can omit the .SSD or .HDD suffix and instead specify the medium in the storage_root_path parameter.

`storage_root_path=/home/disk1/doris,medium:HDD;/home/disk2/doris,medium:SSD`

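Description: /home/disk1/doris is stored on HDD and /home/disk2/doris is stored on SSD, as given by the medium attribute of each path.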
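Because a mistyped separator or suffix in storage_root_path is easy to miss, it can help to split the value and inspect each entry before starting BE. The sketch below covers only the two forms shown in the examples above (suffix form and medium attribute form). `parse_storage_root_path` is a hypothetical helper, the label `default` simply means that no medium was given, and the code is not the parser used by BE.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one "path medium" line per storage_root_path entry.
# Handles the suffix form (/path.SSD) and the attribute form (/path,medium:SSD).
parse_storage_root_path() {
  local entry path medium
  local -a entries
  IFS=';' read -r -a entries <<< "$1"
  for entry in "${entries[@]}"; do
    path=${entry%%,*}          # drop everything after the first comma
    medium=default             # no medium specified for this entry
    case $entry in
      *,medium:SSD*) medium=SSD ;;
      *,medium:HDD*) medium=HDD ;;
      *) case $path in
           *.SSD) medium=SSD ;;
           *.HDD) medium=HDD ;;
         esac ;;
    esac
    printf '%s %s\n' "$path" "$medium"
  done
}

# Example usage:
#   parse_storage_root_path "/home/disk1/doris.HDD;/home/disk2/doris.SSD;/home/disk2/doris"
```

Since the storage directories must be created manually, the printed paths can also be fed to `mkdir -p`, for example with `parse_storage_root_path "$value" | while read -r p _; do mkdir -p "$p"; done`.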

BE webserver_port configuration

If BE is installed on a node of a Hadoop cluster, change the webserver_port=8040 setting to an unused port to avoid a port conflict.

Set JAVA_HOME environment variable

Java UDF is supported since version 1.2.0, so BEs depend on a Java environment. You must set the `JAVA_HOME` environment variable before starting BE. You can also do this by adding `export JAVA_HOME=your_java_home_path` to the first line of the `start_be.sh` startup script.

Install Java UDF

Because Java UDF is supported since version 1.2.0, you need to download the Java UDF JAR package from the official website and put it under the lib directory of BE; otherwise BE may fail to start.

Add all BE nodes to FE

BE nodes need to be added in FE before they can join the cluster. You can use mysql-client (Download MySQL 5.7) to connect to FE:

./mysql-client -h fe_host -P query_port -uroot

fe_host is the IP of the node where FE is located; query_port is set in fe/conf/fe.conf; the root account is used by default and no password is required to log in.

After login, execute the following command to add all the BEs:

ALTER SYSTEM ADD BACKEND "be_host:heartbeat_service_port";

be_host is the IP of the node where BE is located; heartbeat_service_port is set in be/conf/be.conf.
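With more than a few BEs, generating the statements from a host list avoids typing mistakes. This is a minimal sketch: `add_backend_sql` is a hypothetical helper, and the host addresses and the `FE_HOST` and `QUERY_PORT` variables in the usage comment are placeholders to be replaced with your own values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit one ALTER SYSTEM ADD BACKEND statement per host.
# $1 is heartbeat_service_port; the remaining arguments are BE hosts.
add_backend_sql() {
  local port=$1 host
  shift
  for host in "$@"; do
    printf 'ALTER SYSTEM ADD BACKEND "%s:%s";\n' "$host" "$port"
  done
}

# Example usage (placeholder hosts; the MySQL client reads statements from stdin):
#   add_backend_sql 9050 10.1.3.11 10.1.3.12 | mysql -h "$FE_HOST" -P "$QUERY_PORT" -uroot
```

Printing the statements first, and piping them to the client only after reviewing them, makes it easier to confirm that each address matches the IP selected by priority_networks on that BE.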

Start BE

bin/start_be.sh --daemon

The BE process starts and runs in the background. Logs are stored in the be/log/ directory by default. If startup fails, check be/log/be.log or be/log/be.out for error messages.

View BE status

Connect to FE using mysql-client and execute SHOW PROC '/backends'; to view the BE status. If everything is working, the isAlive column should be true.

(Optional) FS_Broker Deployment

Broker is deployed as a plug-in, which is independent of Doris. If you need to import data from a third-party storage system, you need to deploy the corresponding Broker. By default, Doris provides fs_broker for HDFS reading and object storage (supporting S3 protocol). fs_broker is stateless and we recommend that you deploy a Broker for each FE and BE node.

Copy the Broker directory from the fs_broker source output directory to every node where a Broker is to be deployed. We recommend keeping the Broker directory at the same level as the BE or FE directory.

Modify the corresponding Broker configuration

You can modify the configuration files in the corresponding broker/conf/ directory.

Start Broker

bin/start_broker.sh --daemon

Add Broker

To let Doris FE and BE know which nodes the Brokers are on, add the list of Broker nodes with an SQL command.

Use mysql-client to connect to the started FE and execute the following command:

ALTER SYSTEM ADD BROKER broker_name "broker_host1:broker_ipc_port1","broker_host2:broker_ipc_port2",...;

broker_host is the Broker node IP; broker_ipc_port is set in conf/apache_hdfs_broker.conf under the Broker directory.

View Broker status

Connect to any started FE using mysql-client and execute the following command to view Broker status: SHOW PROC '/brokers';

Note: In production environments, all instances should be started under a process supervisor, such as Supervisor, so that processes are restarted automatically after they exit. When starting under a supervisor, in Doris 0.9.0 and earlier versions you need to remove the trailing & symbol from the start_xx.sh scripts. In Doris 0.10.0 and later versions, you can simply call sh start_xx.sh.

How can we know whether the FE process startup succeeds?

After the FE process starts, it first loads metadata. Depending on the role of the FE, the log shows a transfer from UNKNOWN to MASTER/FOLLOWER/OBSERVER. Eventually the log shows `thrift server started`, and you can connect to FE through a MySQL client, which indicates that FE started successfully.

You can also check whether the startup was successful by requesting the following URL:

http://fe_host:fe_http_port/api/bootstrap

If it returns:

{"status":"OK","msg":"Success"}

The startup is successful; otherwise, there may be problems.
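The check can be scripted so that a deployment step waits until FE reports OK. The sketch below assumes `curl` is installed; `status_ok` and `wait_for_fe` are hypothetical helper names, and the attempt count and the one-second pause are arbitrary choices.

```shell
#!/usr/bin/env bash
# Succeed when a response body contains "status":"OK" (whitespace after the colon allowed).
status_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"OK"'
}

# Poll the FE bootstrap endpoint until it reports OK or the attempts run out.
# $1 = fe_host, $2 = fe_http_port, $3 = maximum attempts (default 30)
wait_for_fe() {
  local url="http://$1:$2/api/bootstrap" tries=${3:-30} i body
  for (( i = 0; i < tries; i++ )); do
    body=$(curl -s --max-time 2 "$url") && status_ok "$body" && return 0
    sleep 1
  done
  return 1
}

# Example usage (placeholders from the URL above):
#   wait_for_fe fe_host fe_http_port && echo "FE is up"
```

The same `status_ok` matcher also accepts the response format of the BE /api/health endpoint described in the next question, so only the URL needs to change for a BE check.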

How can we know whether the BE process startup succeeds?

After the BE process starts, if it already holds data, loading the data index may take several minutes.

If BE is started for the first time or has not joined any cluster, the BE log periodically prints `waiting to receive first heartbeat from frontend`, meaning that BE has not yet received the Master’s address through FE’s heartbeat and is waiting passively. This message stops once the BE is added with ADD BACKEND in FE and the heartbeat arrives. If `master client, get client from cache failed. host:, port: 0, code: 7` appears after the heartbeat is received, FE has connected to BE successfully but BE cannot actively connect to FE. You may need to check the connectivity of rpc_port from BE to FE.

If BE has been added to the cluster, the heartbeat log from FE is printed every five seconds: `get heartbeat, host:xx.xx.xx.xx, port:9020, cluster id:xxxxxxx`, indicating that the heartbeat is normal.

In addition, if `finish report task success. return code: 0` is printed every 10 seconds in the log, the BE-to-FE communication is normal.

Meanwhile, if there is a data query, you will see the logs keep rolling, indicating that BE started successfully and the query is normal.

You can also check whether the startup was successful by connecting as follows:

http://be_host:be_http_port/api/health

If it returns:

{"status": "OK","msg": "To Be Added"}

That means the startup is successful; otherwise, there may be problems.

How can we confirm that the connectivity of FE and BE is normal after building the system?

First, confirm that the FE and BE processes have each started and are working normally. Then confirm that all nodes have been added through ADD BACKEND or ADD FOLLOWER/OBSERVER statements.

If the heartbeat is normal, BE logs will show `get heartbeat, host:xx.xx.xx.xx, port:9020, cluster id:xxxxx`; if the heartbeat fails, you will see `backend [10001] got Exception: org.apache.thrift.transport.TTransportException` in FE’s log, or another abnormal thrift communication log, indicating that the heartbeat from FE to BE 10001 fails. In that case, check the connectivity from FE to the heartbeat port of the BE host.

If the BE-to-FE communication is normal, the BE log will display `finish report task success. return code: 0`. Otherwise, `master client, get client from cache failed` will appear. In this case, you need to check the connectivity from BE to the rpc_port of FE.
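The BE-side messages quoted above can be checked mechanically. The following sketch classifies the most recent lines of a BE log; `be_log_status` is a hypothetical helper, the 200-line window is an arbitrary choice, and the order of the checks (a failure message wins over a success message) is an assumption about what is most useful to report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify BE connectivity from the tail of a BE log file.
# $1 = log file, $2 = number of recent lines to inspect (default 200)
# Prints one of: be_to_fe_failed, ok, waiting, unknown
be_log_status() {
  local recent
  recent=$(tail -n "${2:-200}" "$1") || return 2
  if grep -q 'get client from cache failed' <<< "$recent"; then
    echo be_to_fe_failed      # BE cannot reach the rpc_port of FE
  elif grep -q 'finish report task success' <<< "$recent"; then
    echo ok                   # BE-to-FE reporting works
  elif grep -q 'waiting to receive first heartbeat from frontend' <<< "$recent"; then
    echo waiting              # BE has not been added or no heartbeat has arrived
  else
    echo unknown
  fi
}

# Example usage:
#   be_log_status be/log/be.log
```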

What is the Doris node authentication mechanism?

Except for the Master FE, nodes of every other role (Follower FE, Observer FE, Backend) must register with the cluster through the ALTER SYSTEM ADD statement before joining it.

When Master FE is started for the first time, a cluster_id is generated in the doris-meta/image/VERSION file.

When FE joins the cluster for the first time, it first retrieves the file from Master FE. Each subsequent reconnection between FEs (FE reboot) checks whether its cluster ID is the same as that of other existing FEs. If it is not the same, the FE will exit automatically.

When BE first receives the heartbeat of Master FE, it gets the cluster ID from the heartbeat and records it in the cluster_id file of the data directory. Each later heartbeat is compared against that recorded cluster ID. If the cluster IDs do not match, BE refuses to respond to FE’s heartbeat.

The heartbeat also contains Master FE’s IP. If the Master FE changes, the new Master FE will send the heartbeat to BE together with its own IP, and BE will update the Master FE IP it saved.

What is the number of file descriptors of a BE process?

The allowed number of file descriptors of a BE process is bounded by two parameters: min_file_descriptor_number and max_file_descriptor_number.

If the limit is not in the range [min_file_descriptor_number, max_file_descriptor_number], an error occurs when the BE process starts. You can use the ulimit command to adjust the limit.

The default value of min_file_descriptor_number is 65536.

The default value of max_file_descriptor_number is 131072.

For example, the command ulimit -n 65536 sets the number of file descriptors to 65536.

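After starting a BE process, you can use cat /proc/$pid/limits to check the actual file descriptor limit of the process.

If BE is started by supervisord and hits a file descriptor error, raise the minfds parameter in the supervisord configuration: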
```
vim /etc/supervisord.conf
minfds=65536                 ; (min. avail startup file descriptors;default 1024)
```
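The range check described above can be scripted against a running process. This is a minimal sketch for Linux: `fd_limit_ok` and `fd_limit_of_pid` are hypothetical helpers, the two constants are the default values quoted above, and the field position assumes the usual layout of /proc/<pid>/limits (soft limit in the fourth column of the "Max open files" row).

```shell
#!/usr/bin/env bash
# Defaults quoted in the text above; adjust them if be.conf overrides them.
MIN_FD=65536
MAX_FD=131072

# Succeed when the given soft limit lies within [MIN_FD, MAX_FD].
fd_limit_ok() {
  local n=$1
  [ "$n" != "unlimited" ] && (( n >= MIN_FD && n <= MAX_FD ))
}

# Print the soft "Max open files" limit of a running process.
fd_limit_of_pid() {
  awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example usage ($pid is the process ID of a running BE):
#   fd_limit_ok "$(fd_limit_of_pid "$pid")" || echo "file descriptor limit out of range"
```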