1. Overview
The DBSTAT virtual table is available on all when SQLite is built using the SQLITE_ENABLE_DBSTAT_VTAB compile-time option.
The DBSTAT virtual table is an , meaning that is not necessary to run CREATE VIRTUAL TABLE to create an instance of the dbstat virtual table before using it. The “dbstat” module name can be used as if it were a table name to query the dbstat virtual table directly. For example:
If a named virtual table that uses the dbstat module is desired, then the recommended way to create an instance of the dbstat virtual table is as follows:
- CREATE VIRTUAL TABLE temp.stat USING dbstat(main);
Note the “temp.” qualifier before the virtual table name (“stat”). This qualifier causes the virtual table to be temporary - to only exist for the duration of the current database connection. This is the recommended approach.
The schema for the DBSTAT virtual table looks like this:
- CREATE TABLE dbstat(
- name TEXT, -- Name of table or index
- path TEXT, -- Path to page from root
- pageno INTEGER, -- Page number, or page count
- ncell INTEGER, -- Cells on page (0 for overflow pages)
- payload INTEGER, -- Bytes of payload on this page or btree
- unused INTEGER, -- Bytes of unused space on this page or btree
- mx_payload INTEGER, -- Largest payload size of all cells on this row
- pgoffset INTEGER, -- Byte offset of the page in the database file
- pgsize INTEGER, -- Size of the page, in bytes
- schema TEXT HIDDEN, -- Database schema being analyzed
- aggregate BOOL HIDDEN -- True to enable aggregate mode
The DBSTAT table only reports on the content of btrees within the database file. Freelist pages, pointer-map pages, and the lock page are omitted from the analysis.
By default, there is a single row in the DBSTAT table for each btree page the database file. Each row provides information about the space utilization of that one page of the database. However, if the hidden column “aggregate” is TRUE, then results are aggregated and there is a single row in the DBSTAT table for each btree in the database, providing information about space utilization across the entire btree.
2. The “path” column of the dbstat virtual table
The “path” column describes the path taken from the root node of the btree structure to each page. The “path” of the root node itself is ‘/‘. The “path” is NULL when “aggregate” is TRUE. The “path” for the left-most child page of the root of a btree page is ‘/000/‘. (Btrees store content ordered from left to right so the pages to the left have smaller keys than the pages to the right.) The next to left-most child of the root page is ‘/001’, and so on, each sibling page identified by a 3-digit hex value. The children of the 451st left-most sibling have paths such as ‘/1c2/000/, ‘/1c2/001/‘ etc. Overflow pages are specified by appending a ‘+’ character and a six-digit hexadecimal value to the path to the cell they are linked from. For example, the three overflow pages in a chain linked from the left-most cell of the 450th child of the root page are identified by the paths:
- '/1c2/000/' // Left-most child of 451st child of root
3. Aggregated Data
Beginning with SQLite version 3.31.0 (2020-01-22), the DBSTAT table has a new named “aggregate”, which if constrained to be TRUE will cause DBSTAT to generate one row per btree in the database, rather than one row per page. When running in aggregated mode, the “path”, “pagetype”, and “pgoffset” columns are always NULL and the “pageno” column holds the number of pages in the entire btree, rather than the number of the page that corresponds to the row.
The following table shows the meanings of the (non-hidden) columns of DBSTAT in both normal and aggregated mode:
4. Example uses of the dbstat virtual table
To find the total number of pages used to store table “xyz” in schema “aux1”, use either of the following two queries (the first is the traditional way, and the second shows the use of the aggregated feature):
- SELECT count(*) FROM dbstat('aux1') WHERE name='xyz';
- SELECT pageno FROM dbstat('aux1',1) WHERE name='xyz';
To see how efficiently the content of a table is stored on disk, compute the amount of space used to hold actual content divided by the total amount of disk space used. The closer this number is to 100%, the more efficient the packing. (In this example, the ‘xyz’ table is assumed to be in the ‘main’ schema. Again, there are two different versions that show the use of DBSTAT both without and with the new aggregated feature, respectively.)
- SELECT avg(ncell) FROM dbstat WHERE name='xyz' AND pagetype='internal';
Modern filesystems operate faster when disk accesses are sequential. Hence, SQLite will run faster if the content of the database file is on sequential pages. To find out what fraction of the pages in a database are sequential (and thus obtain a measurement that might be useful in determining when to ), run a query like the following:
- CREATE TEMP TABLE s(rowid INTEGER PRIMARY KEY, pageno INT);
- INSERT INTO s(pageno) SELECT pageno FROM dbstat ORDER BY path;
- SELECT sum(s1.pageno+1==s2.pageno)*1.0/count(*)
- FROM s AS s1, s AS s2
- DROP TABLE s;