2.4. Benchmark Driver
Download presto-benchmark-driver-0.245.1-executable.jar, rename it (the program is referred to below as presto-benchmark-driver), then make it executable with chmod +x.
Create a suite.json file:
The SQL files are contained in a directory named sql and must have the .sql file extension. The name of the query is the name of the file without the extension.
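The naming rule above can be sketched in a few lines of Python. This is only an illustration of the rule, not the driver's own code: the helper name query_names and the sample file names are hypothetical, and skipping files without the .sql extension is an assumption based on the requirement stated above.

```python
import tempfile
from pathlib import Path


def query_names(sql_dir):
    """Map each query name to its file; the name is the file name minus .sql."""
    return {path.stem: path for path in sorted(Path(sql_dir).glob("*.sql"))}


# Build a throwaway sql directory with hypothetical files to show the rule.
with tempfile.TemporaryDirectory() as tmp:
    sql_dir = Path(tmp) / "sql"
    sql_dir.mkdir()
    (sql_dir / "single_sum.sql").write_text("SELECT 1")
    (sql_dir / "tpch_q1.sql").write_text("SELECT 2")
    (sql_dir / "notes.txt").write_text("not a query")  # wrong extension, skipped
    names = sorted(query_names(sql_dir))

print(names)
```

Here the file single_sum.sql yields the query name single_sum, while notes.txt is not picked up at all.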
The benchmark driver measures the wall time, the total CPU time used by all Presto processes, and the CPU time used by the query. For each timing, the driver reports the median, mean, and standard deviation across the query runs. The difference between the process and query CPU times is the query overhead, which normally comes from garbage collection. The following is the output from the file_formats suite above:
The driver can add extra columns to the output by extracting values from the schema name or the SQL files. In the suite file above, the schema names contain named regular expression capturing groups, including compression and format, so if we ran the queries in a catalog containing the schemas tpch_sf100_orc_none, tpch_sf100_orc_snappy, and tpch_sf100_orc_zlib, we would get the above output.
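As a rough illustration of how one output row could be assembled, the Python sketch below pulls extra columns out of a schema name with named capturing groups and summarizes a set of timings. Everything in it is an assumption made for the example: the pattern is a simplified stand-in with only two groups, the timings are made up, and the use of the sample standard deviation is a guess, since this section does not say which variant the driver reports. Python writes named groups as (?P<name>...), whereas Java regular expressions write them as (?<name>...).

```python
import re
from statistics import mean, median, stdev

# Hypothetical schema pattern with two named groups (Python syntax).
SCHEMA_PATTERN = re.compile(
    r"tpch_sf100_(?P<format>[a-z]+)_(?P<compression>[a-z]+)"
)


def extra_columns(schema):
    """Return the named-group values for a matching schema, else None."""
    match = SCHEMA_PATTERN.fullmatch(schema)
    return match.groupdict() if match else None


def summarize(samples):
    """Return (median, mean, sample standard deviation) of one timing."""
    return median(samples), mean(samples), stdev(samples)


# Made-up timings in seconds for three measurement runs of one query.
wall = [10.0, 12.0, 14.0]
process_cpu = [40.0, 44.0, 48.0]
query_cpu = [36.0, 39.0, 42.0]

# Query overhead per run: process CPU time minus query CPU time.
overhead = [p - q for p, q in zip(process_cpu, query_cpu)]

# One result row: extra columns from the schema name plus timing summaries.
row = dict(extra_columns("tpch_sf100_orc_snappy"))
row["wall"] = summarize(wall)
row["overhead"] = summarize(overhead)
print(row)
```

A schema that does not match the pattern, such as information_schema, yields None here, so it would contribute no extra columns.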
Another way to create additional output columns is by adding tags to the SQL files. For example, the following SQL file declares two tags, projection and filter:
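A minimal Python sketch of reading such a tagged file follows. It assumes a layout in which the tags are written as key=value lines at the top of the file and are separated from the SQL text by a line made up only of = characters. That layout, the tag values true and false, the sample query, and the helper name parse_query_file are assumptions for illustration and are not confirmed by this section.

```python
import re

# A separator line consists only of '=' characters (assumed file layout).
SEPARATOR = re.compile(r"^[ \t]*=+[ \t]*$", re.MULTILINE)


def parse_query_file(text):
    """Split a query file into (tags, sql); tags is empty if no separator."""
    parts = SEPARATOR.split(text, maxsplit=1)
    if len(parts) == 1:
        return {}, text.strip()
    header, sql = parts
    tags = {}
    for line in header.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        tags[key.strip()] = value.strip()
    return tags, sql.strip()


# Hypothetical file content declaring the two tags named in the text.
sample = """projection=true
filter=false
=================
SELECT SUM(LENGTH(comment))
FROM lineitem
"""

tags, sql = parse_query_file(sample)
print(tags)
print(sql)
```

Each tag would then become one extra output column, with the tag name as the column header and the tag value as the cell.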
The presto-benchmark-driver program has many command-line arguments that control which suites and queries to run, the number of warm-up runs, and the number of measurement runs. All of the command-line arguments can be listed with the --help option.