In the function syntax, y is interpreted as an x-dependent variable.
The linear regression aggregate functions take a pair of arguments, the dependent variable expression (y) and the independent variable expression (x), which are both numeric value expressions. Any row in which either argument evaluates to NULL
is removed from the rows that qualify. If there are no rows that qualify, then the result of REGR_COUNT
is 0
(zero), and the other linear regression aggregate functions result in NULL
.
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
Table 9.4.1.1 REGR_AVGX
Function Parameters
The function REGR_AVGX
calculates the average of the independent variable (x) of the regression line.
The function REGR_AVGX(<y>, <x>)
is equivalent to
SUM(<exprX>) / REGR_COUNT(<y>, <x>)
<exprX> :==
CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <x> END
See also, Section 9.4.3, REGR_COUNT(),
9.4.2 REGR_AVGY()
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
REGR_AVGY ( <y>, <x> )
Table 9.4.2.1 REGR_AVGY
Function Parameters
Parameter | Description |
---|---|
y | Dependent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
x | Independent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
The function REGR_AVGY
calculates the average of the dependent variable (y) of the regression line.
The function REGR_AVGY(<y>, <x>)
is equivalent to
SUM(<exprY>) / REGR_COUNT(<y>, <x>)
<exprY> :==
CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <y> END
See also, Section 9.4.3, REGR_COUNT(),
9.4.3 REGR_COUNT()
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
REGR_COUNT ( <y>, <x> )
Table 9.4.3.1 REGR_COUNT
Function Parameters
Parameter | Description |
---|---|
y | Dependent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
x | Independent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
The function REGR_COUNT
counts the number of non-empty pairs of the regression line.
SUM(<exprXY>) / REGR_COUNT(<y>, <x>)
<exprXY> :==
CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN 1 END
See also
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
Table 9.4.4.1 REGR_INTERCEPT
Function Parameters
The function REGR_INTERCEPT
calculates the point of intersection of the regression line with the y-axis.
The function REGR_INTERCEPT(<y>, <x>)
is equivalent to
REGR_AVGY(<y>, <x>) - REGR_SLOPE(<y>, <x>) * REGR_AVGX(<y>, <x>)
9.4.4.1 REGR_INTERCEPT
Examples
Forecasting sales volume
with recursive years (byyear) as (
from rdb$database
union all
select byyear + 1
from years
where byyear < 2020
),
s as (
select
extract(year from order_date) as byyear,
sum(total_value) as total_value
from sales
group by 1
),
regr as (
select
regr_intercept(total_value, byyear) as intercept,
regr_slope(total_value, byyear) as slope
from s
)
select
years.byyear as byyear,
intercept + (slope * years.byyear) as total_value
from years
cross join regr
BYYEAR TOTAL_VALUE
------ ------------
1992 414557.62
1993 710737.89
1994 1006918.16
1995 1303098.43
1996 1599278.69
1997 1895458.96
1998 2191639.23
2000 2783999.77
...
See also, Section 9.4.2, REGR_AVGY(),
9.4.5 REGR_R2()
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
REGR_R2 ( <y>, <x> )
Table 9.4.5.1 REGR_R2
Function Parameters
Parameter | Description |
---|---|
y | Dependent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
x | Independent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
The REGR_R2 function calculates the coefficient of determination, or R-squared, of the regression line.
The function REGR_R2(<y>, <x>)
is equivalent to
POWER(CORR(<y>, <x>), 2)
See also, POWER
9.4.6 REGR_SLOPE()
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
Table 9.4.6.1 REGR_SLOPE
Function Parameters
Parameter | Description |
---|---|
y | Dependent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
x | Independent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
The function REGR_SLOPE
calculates the slope of the regression line.
COVAR_POP(<y>, <x>) / VAR_POP(<exprX>)
<exprX> :==
CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <x> END
See alsoSection 9.3.2, COVAR_POP(),
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
REGR_SXX ( <y>, <x> )
Table 9.4.7.1 REGR_SXX
Function Parameters
The function REGR_SXX
calculates the sum of squares of the independent expression variable (x).
The function REGR_SXX(<y>, <x>)
is equivalent to
REGR_COUNT(<y>, <x>) * VAR_POP(<exprX>)
<exprX> :==
CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <x> END
See alsoSection 9.4.3, REGR_COUNT(),
9.4.8 REGR_SXY()
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
REGR_SXY ( <y>, <x> )
Table 9.4.8.1 REGR_SXY
Function Parameters
Parameter | Description |
---|---|
y | Dependent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
x | Independent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
The function REGR_SXY
calculates the sum of products of independent variable expression (x) times dependent variable expression (y).
The function REGR_SXY(<y>, <x>)
is equivalent to
REGR_COUNT(<y>, <x>) * COVAR_POP(<y>, <x>)
See also, Section 9.4.3, REGR_COUNT()
9.4.9 REGR_SYY()
Available inDSQL, PSQL
Result typeDOUBLE PRECISION
Syntax
Table 9.4.9.1 REGR_SYY
Function Parameters
Parameter | Description |
---|---|
y | Dependent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
x | Independent variable of the regression line. It may contain a table column, a constant, a variable, an expression, a non-aggregate function or a UDF. Aggregate functions are not allowed as expressions. |
The function REGR_SYY
calculates the sum of squares of the dependent variable (y).
The function REGR_SYY(<y>, <x>)
is equivalent to
REGR_COUNT(<y>, <x>) * VAR_POP(<exprY>)
<exprY> :==