T-Digest Functions
T-digest was developed by Ted Dunning.
A T-digest is a data sketch which stores approximate percentile information. The Presto type for this data structure is called tdigest, and it accepts a parameter of type double which represents the set of numbers to be ingested by the tdigest. Other numeric types may be added in a future release.
T-digests may be merged without losing precision, and for storage and retrieval they may be cast to/from VARBINARY.
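As a minimal sketch of the storage round trip described above, the query below builds a digest, casts it to VARBINARY, and casts it back before querying it. The inline VALUES data and the column names are hypothetical, and the type spelling tdigest(double) in the CAST is an assumption about how the parameterized type is written in SQL.

```sql
-- Build a digest, keep it as VARBINARY, then read it back and query it.
WITH stored AS (
    SELECT CAST(tdigest_agg(CAST(x AS double)) AS varbinary) AS digest_bytes
    FROM (VALUES 1, 2, 3, 4, 5) AS t(x)  -- hypothetical sample data
)
SELECT value_at_quantile(CAST(digest_bytes AS tdigest(double)), 0.5) AS approx_median
FROM stored;
```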
Functions
merge
(tdigest<double>) → tdigest<double>
Merges all input tdigests into a single tdigest.
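One way merge might be used is to combine digests that were built per group into a single digest. The sketch below assumes hypothetical host and latency_ms columns supplied inline. It builds one digest per host and then merges them.

```sql
-- Build one digest per host, then merge them into a single digest.
WITH per_host AS (
    SELECT host, tdigest_agg(CAST(latency_ms AS double)) AS digest
    FROM (VALUES ('a', 10), ('a', 20), ('b', 30), ('b', 40)) AS t(host, latency_ms)
    GROUP BY host
)
SELECT value_at_quantile(merge(digest), 0.5) AS approx_median
FROM per_host;
```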
value_at_quantile
(tdigest<double>, quantile) → double
Returns the approximate percentile value from the T-digest given the number quantile between 0 and 1.
quantile_at_value
(tdigest<double>, value) → double
Returns the approximate quantile number between 0 and 1 from the T-digest given an input value. Null is returned if the T-digest is empty or the input value is outside of the range of the digest.
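The two lookups above are inverses of each other: value_at_quantile maps a quantile to a value, and quantile_at_value maps a value back to a quantile. A small sketch over hypothetical inline data follows. The results are approximate, so no expected output is shown.

```sql
WITH d AS (
    SELECT tdigest_agg(CAST(x AS double)) AS digest
    FROM (VALUES 10, 20, 30, 40, 50) AS t(x)  -- hypothetical sample data
)
SELECT
    value_at_quantile(digest, 0.9)  AS approx_p90,         -- quantile -> value
    quantile_at_value(digest, 30)   AS approx_rank_of_30,  -- value -> quantile
    quantile_at_value(digest, 1000) AS outside_range       -- outside the digest's range, so NULL per the description
FROM d;
```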
scale_tdigest
(tdigest<double>, scale_factor) → tdigest<double>
Returns a tdigest whose distribution has been scaled by a factor specified by scale_factor.
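A hedged usage sketch: the query below scales a digest by a factor of 2 and then queries the result. The sample data and the choice of factor are illustrative only.

```sql
WITH d AS (
    SELECT tdigest_agg(CAST(x AS double)) AS digest
    FROM (VALUES 10, 20, 30, 40, 50) AS t(x)  -- hypothetical sample data
)
-- scale_tdigest returns a new tdigest, which can be queried like any other.
SELECT value_at_quantile(scale_tdigest(digest, 2.0), 0.5) AS approx_median_after_scaling
FROM d;
```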
values_at_quantiles
(tdigest<double>, quantiles) → array<double>
Returns the approximate percentile values as an array given the input T-digest and array of values between 0 and 1 which represent the quantiles to return.
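To fetch several percentiles in one call, the quantiles can be passed as an array. The sketch below uses hypothetical inline data and returns the approximate median, 90th, and 99th percentile as a single array.

```sql
WITH d AS (
    SELECT tdigest_agg(CAST(x AS double)) AS digest
    FROM (VALUES 10, 20, 30, 40, 50) AS t(x)  -- hypothetical sample data
)
SELECT values_at_quantiles(digest, ARRAY[0.5, 0.9, 0.99]) AS approx_percentiles
FROM d;
```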
tdigest_agg
(x) → tdigest<double>
Returns the tdigest which is composed of all input values of x.
tdigest_agg
(x, w) → tdigest<double>
Returns the tdigest which is composed of all input values of x using the per-item weight w.
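A sketch of the weighted form, assuming hypothetical pre-aggregated data in which each row carries a value x and the number of times w it was observed. Integer weights are used here for illustration.

```sql
-- Each row is a value and how many times it occurred.
SELECT value_at_quantile(tdigest_agg(CAST(x AS double), w), 0.5) AS approx_weighted_median
FROM (VALUES (10, 1), (20, 1), (30, 8)) AS t(x, w);  -- hypothetical sample data
```

Because the value 30 carries most of the total weight, the weighted median is expected to land near it rather than near the unweighted middle value.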
tdigest_agg
(x, w, accuracy) → tdigest<double>
Returns the tdigest which is composed of all input values of x using the per-item weight w and maximum error of accuracy. accuracy must be a value greater than zero and less than one, and it must be constant for all input rows.
(tdigest<double>) → row<centroid_means array<double>, centroid_weights array<integer>, compression double, min double, max double, sum double, count bigint>