Configuration Settings

    An Alluxio cluster can be configured by setting the values of Alluxio configuration properties within ${ALLUXIO_HOME}/conf/alluxio-site.properties.

    The two major components to configure are:

    • Alluxio servers, consisting of masters and workers
    • Alluxio clients, which are typically a part of compute applications.

    Customizing how an application interacts with Alluxio is specific to each application. The following are recommendations for some common applications.

    Note that only client-side properties, such as those prefixed with alluxio.user, take effect when set for an application. Setting server-side properties on a compute application, such as those prefixed with alluxio.master or alluxio.worker, has no effect.

    Alluxio shell users can put JVM system properties -Dproperty=value after the fs command and before the subcommand to specify Alluxio user properties from the command line. For example, the following Alluxio shell command sets the write type to CACHE_THROUGH when copying files to Alluxio:
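    The command described above might look like the following sketch. It assumes the shell is started from ${ALLUXIO_HOME}; copyFromLocal is used as the copy subcommand, and README.md and /README.md are hypothetical source and destination paths. The guard only skips execution when the Alluxio launcher is not present.

```shell
# Assumption: run from ${ALLUXIO_HOME}; README.md and /README.md are
# hypothetical local and Alluxio paths.
WRITE_TYPE_OPT="-Dalluxio.user.file.writetype.default=CACHE_THROUGH"

# The -D option goes after "fs" and before the subcommand (copyFromLocal).
if [ -x ./bin/alluxio ]; then
  ./bin/alluxio fs "${WRITE_TYPE_OPT}" copyFromLocal README.md /README.md
else
  echo "bin/alluxio not found; run this from the Alluxio installation directory"
fi
```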

    Note that, as part of an Alluxio deployment, the Alluxio shell also reads the configuration in ${ALLUXIO_HOME}/conf/alluxio-site.properties when it is run from the Alluxio installation at ${ALLUXIO_HOME}.

    Spark

    To customize Alluxio client-side properties in Spark applications, Spark users can pass Alluxio properties as JVM system properties. See examples for the entire Spark service or for individual Spark jobs.
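    For a single Spark job, passing an Alluxio property as a JVM system property could be sketched as below. The two Spark configuration keys are standard Spark settings for extra JVM options; com.example.MyApp and my-app.jar are hypothetical placeholders, and the guard only runs the job when spark-submit and the jar are available.

```shell
# Assumptions: spark-submit is on the PATH; com.example.MyApp and my-app.jar
# are hypothetical placeholders for a real application.
ALLUXIO_OPT="-Dalluxio.user.file.writetype.default=CACHE_THROUGH"
DRIVER_CONF="spark.driver.extraJavaOptions=${ALLUXIO_OPT}"
EXECUTOR_CONF="spark.executor.extraJavaOptions=${ALLUXIO_OPT}"

# Pass the property to both the driver JVM and the executor JVMs.
if command -v spark-submit >/dev/null 2>&1 && [ -f my-app.jar ]; then
  spark-submit \
    --conf "${DRIVER_CONF}" \
    --conf "${EXECUTOR_CONF}" \
    --class com.example.MyApp \
    my-app.jar
fi
```

    Setting the option on both the driver and the executors matters because each JVM creates its own Alluxio client.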

    Hadoop MapReduce

    See examples to configure Alluxio properties for all MapReduce jobs or for individual MapReduce jobs.

    Hive

    Hive can be configured to use customized Alluxio client-side properties for the entire service. See the Hive integration documentation for examples.

    Presto

    Presto can be configured to use customized Alluxio client-side properties for the entire service. See the Presto integration documentation for examples.

    alluxio-site.properties Files (Recommended)

    Alluxio admins can create and customize the properties file alluxio-site.properties to configure Alluxio masters or workers. If this file does not exist, it can be created from the template file under ${ALLUXIO_HOME}/conf:

    $ cp conf/alluxio-site.properties.template conf/alluxio-site.properties

    Alluxio supports defining a few frequently used configuration settings through environment variables.

    For example, to set up the following:

    • an Alluxio master at localhost
    • the root mount point as an HDFS cluster with a namenode also running at localhost
    • Java remote debugging enabled at port 7001

    run the following commands before starting the master process:
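    A minimal sketch of those commands follows. The variable names are assumptions based on Alluxio's ALLUXIO_-prefixed environment variable convention, since the list of supported variables is not reproduced in this section; port 9000 for the namenode is likewise an assumed value.

```shell
# Alluxio master hostname.
export ALLUXIO_MASTER_HOSTNAME="localhost"
# Root mount point; port 9000 is an assumed namenode RPC port.
export ALLUXIO_MASTER_MOUNT_TABLE_ROOT_UFS="hdfs://localhost:9000"
# Append a JDWP agent so a remote debugger can attach on port 7001.
export ALLUXIO_MASTER_JAVA_OPTS="${ALLUXIO_JAVA_OPTS:-} -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=7001"
```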

    Users can either set these variables through the shell or in conf/alluxio-env.sh. If this file does not exist yet, create one by copying the template:
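    Copying the template can be sketched as below. The template name alluxio-env.sh.template is an assumption that mirrors the alluxio-site.properties.template convention, and ensure_env_file is a hypothetical helper, not an Alluxio command.

```shell
# Copy the env template into place unless alluxio-env.sh already exists.
# Hypothetical helper; the template file name is an assumption.
ensure_env_file() {
  conf_dir="$1"
  if [ ! -f "${conf_dir}/alluxio-env.sh" ]; then
    cp "${conf_dir}/alluxio-env.sh.template" "${conf_dir}/alluxio-env.sh"
  fi
}
```

    Calling ensure_env_file "${ALLUXIO_HOME}/conf" leaves an existing alluxio-env.sh untouched, so local edits are not overwritten.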

      Cluster Defaults

      When different client applications (Alluxio Shell CLI, Spark jobs, MapReduce jobs) or Alluxio workers connect to an Alluxio master, they will initialize their own Alluxio configuration properties with the default values supplied by the masters based on the master-side ${ALLUXIO_HOME}/conf/alluxio-site.properties files. As a result, cluster admins can set client-side settings (e.g., alluxio.user.*), or network transport settings (e.g., alluxio.security.authentication.type), or worker settings (e.g., alluxio.worker.*) in ${ALLUXIO_HOME}/conf/alluxio-site.properties on all the masters, which will be distributed and become cluster-wide default values when clients and workers connect.

      For example, the property alluxio.user.file.writetype.default defaults to ASYNC_THROUGH, which first writes to Alluxio and then asynchronously writes to the UFS. In an Alluxio cluster where data persistence is preferred and all jobs need to write to both the UFS and Alluxio, the administrator can add alluxio.user.file.writetype.default=CACHE_THROUGH in each master’s alluxio-site.properties file. After restarting the cluster, all jobs will automatically set alluxio.user.file.writetype.default to CACHE_THROUGH.
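      Adding that line on each master could be scripted roughly as follows. add_site_property is a hypothetical helper written for illustration; it appends the line only when the exact same line is not already in the file.

```shell
# Append a key=value line to a site properties file unless that exact line
# is already present. Illustrative helper, not part of Alluxio.
add_site_property() {
  site_file="$1"
  line="$2"
  grep -qxF -e "${line}" "${site_file}" 2>/dev/null \
    || printf '%s\n' "${line}" >> "${site_file}"
}
```

      On each master, this might be invoked as add_site_property "${ALLUXIO_HOME}/conf/alluxio-site.properties" "alluxio.user.file.writetype.default=CACHE_THROUGH". The helper does not replace an existing entry that sets the same key to a different value; such an entry would need to be edited by hand.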

      Clients can ignore or override the cluster-wide default values by setting the same properties with the approaches described in Configure Alluxio for Applications.

      Note that, before version 1.8, the ${ALLUXIO_HOME}/conf/alluxio-site.properties file was only loaded by Alluxio server processes and was ignored by applications interacting with the Alluxio service through the Alluxio client, unless ${ALLUXIO_HOME}/conf was on the applications’ classpath.

      Path Defaults

      Since version 2.0, Alluxio administrators can set default client-side configurations for Alluxio paths. FileSystem client operations have options, and these options are derived from client-side configuration properties. Only these configuration properties can be set as path defaults.

      For example, createFile has an option to specify the write type. By default, the write type is the value of the configuration key alluxio.user.file.writetype.default. The administrator can set the default value of alluxio.user.file.writetype.default to MUST_CACHE for all paths with prefix /tmp by running bin/alluxio fsadmin pathConf add --property alluxio.user.file.writetype.default=MUST_CACHE /tmp. After that, any createFile on a path with prefix /tmp uses write type MUST_CACHE by default.

      Path defaults are automatically propagated to long-running clients when they are updated. If the administrator updates the path default by running bin/alluxio fsadmin pathConf add --property alluxio.user.file.writetype.default=THROUGH /tmp, all subsequent createFile operations on paths with prefix /tmp use write type THROUGH by default.

      An Alluxio property may be configured in multiple sources. In this case, its final value is determined by the following priority list, from highest priority to lowest:

      1. JVM system properties (i.e., -Dproperty=value)
      2. Property files: When an Alluxio cluster starts, each server process, including masters and workers, searches for alluxio-site.properties within the following directories in the given order, stopping when a match is found: ${CLASSPATH}, ${HOME}/.alluxio/, /etc/alluxio/, and ${ALLUXIO_HOME}/conf
      3. Cluster default values: An Alluxio client may initialize its configuration based on the cluster-wide default configuration served by the masters.

      If no user-specified configuration is found for a property, the Alluxio runtime will fall back to its default property value.
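      The first-match search for the property file can be illustrated with a short sketch. find_site_properties is a hypothetical helper that mimics the lookup order described above; it is not how Alluxio itself is implemented, and it takes plain directories rather than parsing ${CLASSPATH} entries.

```shell
# Print the first alluxio-site.properties found in the given directories,
# checked in order; return non-zero if no directory contains the file.
# Hypothetical helper for illustration only.
find_site_properties() {
  for dir in "$@"; do
    if [ -f "${dir}/alluxio-site.properties" ]; then
      printf '%s\n' "${dir}/alluxio-site.properties"
      return 0
    fi
  done
  return 1
}
```

      For instance, find_site_properties "${HOME}/.alluxio" /etc/alluxio "${ALLUXIO_HOME}/conf" prints the file under ${HOME}/.alluxio when one exists there, even if /etc/alluxio also holds a copy, because the search stops at the first match.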

      To check the value of a specific configuration property and the source of its value, users can run the following command:
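      A sketch of such a check is shown below. It uses the getConf subcommand and the --source option that appear elsewhere in this section; alluxio.worker.rpc.port is only an example key, and passing a key together with --source is assumed to report where that key's value came from.

```shell
# Example property to inspect; any Alluxio property key can be used.
PROPERTY="alluxio.worker.rpc.port"
ALLUXIO_BIN="${ALLUXIO_HOME:-.}/bin/alluxio"

if [ -x "${ALLUXIO_BIN}" ]; then
  # Value of the property as resolved locally.
  "${ALLUXIO_BIN}" getConf "${PROPERTY}"
  # Source of that value (for example DEFAULT or SYSTEM_PROPERTY).
  "${ALLUXIO_BIN}" getConf --source "${PROPERTY}"
fi
```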

      To list all of the configuration properties with sources:

      $ ./bin/alluxio getConf --source
      alluxio.conf.dir=/Users/bob/alluxio/conf (SYSTEM_PROPERTY)
      alluxio.debug=false (DEFAULT)
      ...

      Users can also specify the --master option to list all of the cluster-wide configuration properties served by the masters. Note that with the --master option, getConf queries the master, which requires the master process to be running. Without the --master option, the command only checks the local configuration.
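      One use of the --master option is to spot a local override of a cluster-wide default. The sketch below assumes that getConf accepts a property key together with --master and prints only the value; this should be checked against the CLI help of the installed version. The key is just an example.

```shell
# Compare the locally resolved value with the one served by the master.
KEY="alluxio.user.file.writetype.default"
ALLUXIO_BIN="${ALLUXIO_HOME:-.}/bin/alluxio"

if [ -x "${ALLUXIO_BIN}" ]; then
  local_value="$("${ALLUXIO_BIN}" getConf "${KEY}")"
  # Requires a running master process.
  master_value="$("${ALLUXIO_BIN}" getConf --master "${KEY}")"
  if [ "${local_value}" != "${master_value}" ]; then
    echo "${KEY}: local=${local_value} master=${master_value}"
  fi
fi
```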

      The server-side configuration checker helps discover configuration errors and warnings. Suspected configuration errors are reported through the web UI, doctor CLI, and master logs.

      The web UI shows the result of the server configuration check.

      Users can also run the fsadmin doctor command to get the same results.

        Configuration warnings can also be found in the master logs.
