2.4. Benchmark Driver

    Download presto-benchmark-driver-0.245.1-executable.jar, rename it to presto-benchmark-driver, then make it executable with chmod +x.

    Create a suite.json file:

    The SQL files are contained in a directory named sql and must have the .sql file extension. The name of the query is the name of the file without the extension.
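
    The naming rule can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not the driver's own code; the helper name query_name and the example paths are hypothetical.

```python
from pathlib import Path

def query_name(path):
    """Return the query name for a .sql file, or None for any other file."""
    p = Path(path)
    # Only files with the .sql extension count as queries.
    if p.suffix != ".sql":
        return None
    # The query name is the file name without its extension.
    return p.stem

print(query_name("sql/tpch_q1.sql"))
print(query_name("sql/README.txt"))
```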

    The benchmark driver measures the wall time, the total CPU time used by all Presto processes, and the CPU time used by the query. For each timing, the driver reports the median, mean, and standard deviation across the query runs. The difference between the process and query CPU times is the query overhead, which normally comes from garbage collection. The following is the output from the file_formats suite above:
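
    The statistics described above can be reproduced by hand to sanity-check a report. The sketch below is not the driver's implementation: the sample values are invented, the helper name summarize is hypothetical, and the use of the population standard deviation is an assumption (the driver may use the sample form).

```python
import statistics

def summarize(samples):
    """Return (median, mean, standard deviation) for one timing across runs."""
    return (
        statistics.median(samples),
        statistics.mean(samples),
        statistics.pstdev(samples),  # assumption: population standard deviation
    )

# Hypothetical measurements in milliseconds, one entry per measurement run.
process_cpu_ms = [1200.0, 1300.0, 1250.0]
query_cpu_ms = [1000.0, 1100.0, 1050.0]

process_summary = summarize(process_cpu_ms)
query_summary = summarize(query_cpu_ms)

# Query overhead per run: process CPU time minus query CPU time.
overhead_ms = [p - q for p, q in zip(process_cpu_ms, query_cpu_ms)]

print(process_summary, query_summary, overhead_ms)
```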

    The driver can add columns to the output by extracting values from the schema name or the SQL files. In the suite file above, the schema names contain named regular expression capturing groups, including compression and format, so if we ran the queries in a catalog containing the schemas tpch_sf100_orc_none, tpch_sf100_orc_snappy, and tpch_sf100_orc_zlib, we get the above output.
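
    To see how named capturing groups turn a schema name into column values, here is a minimal sketch. The pattern is hypothetical and is not copied from the suite file; in particular the group name scale is an assumption. Note also that Python writes named groups as (?P<name>...), while Java regular expressions write them as (?<name>...).

```python
import re

# Hypothetical schema pattern, written in Python's named-group syntax.
# The group name "scale" is an assumption made for this illustration.
SCHEMA_PATTERN = re.compile(
    r"tpch_sf(?P<scale>[^_]+)_(?P<format>[^_]+)_(?P<compression>[^_]+)"
)

def schema_columns(schema):
    """Return the named groups for a matching schema, or None if no match."""
    match = SCHEMA_PATTERN.fullmatch(schema)
    return match.groupdict() if match else None

schemas = [
    "tpch_sf100_orc_none",
    "tpch_sf100_orc_snappy",
    "tpch_sf100_orc_zlib",
    "default",
]
for schema in schemas:
    print(schema, schema_columns(schema))
```

    Each key of the returned dictionary corresponds to one extra output column, and schemas that do not match the pattern are skipped.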

    Another way to create additional output columns is by adding tags to the SQL files. For example, the following SQL file declares two tags, projection and filter:
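
    A sketch of how a tagged file might be split into tags and SQL follows. The layout is an assumption: key=value lines, then a line made only of = characters, then the query text. The helper name split_tags and the example query are hypothetical, and the driver's real parsing rules may differ.

```python
import re

def split_tags(contents):
    """Split a query file into (tags, sql).

    Assumed layout: optional key=value lines, a line of '=' characters,
    then the SQL text. A file without a separator line has no tags.
    """
    parts = re.split(r"\n\s*=+\s*\n", contents, maxsplit=1)
    if len(parts) == 1:
        return {}, contents.strip()
    header, sql = parts
    tags = {}
    for line in header.splitlines():
        if line.strip():
            # Split each header line at its first '=' into key and value.
            key, _, value = line.partition("=")
            tags[key.strip()] = value.strip()
    return tags, sql.strip()

# Hypothetical file contents declaring the tags projection and filter.
example = (
    "projection=true\n"
    "filter=false\n"
    "=================\n"
    "SELECT SUM(LENGTH(comment))\n"
    "FROM lineitem\n"
)
tags, sql = split_tags(example)
print(tags)
print(sql)
```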

    The presto-benchmark-driver program accepts many CLI arguments that control which suites and queries to run, the number of warm-up runs, and the number of measurement runs. All of the command line arguments can be listed with the --help option.