--- 如果不进行or-expansion的改写,该查询只能使用主表访问路径
OceanBase (root@test)> explain select/*+NO_REWRITE()*/ * from t1 where t1.a = 1 or t1.b = 1;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ===================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
-----------------------------------
|0 |TABLE SCAN|t1 |4 |649 |
===================================
Outputs & filters:
-------------------------------------
0 - output([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), filter([t1.a = 1 OR t1.b = 1]),
access([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), partitions(p0)
--- 改写之后,每个子查询能使用不同的索引访问路径
OceanBase (root@test)> explain select * from t1 where t1.a = 1 or t1.b = 1;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| =========================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------
|0 |UNION ALL | |3 |190 |
|1 | TABLE SCAN|t1(idx_a)|2 |94 |
|2 | TABLE SCAN|t1(idx_b)|1 |95 |
=========================================
Outputs & filters:
-------------------------------------
0 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)], [UNION(t1.c, t1.c)], [UNION(t1.d, t1.d)], [UNION(t1.e, t1.e)]), filter(nil)
1 - output([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), filter(nil),
access([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), partitions(p0)
2 - output([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), filter([lnnvl(t1.a = 1)]),
access([t1.a], [t1.b], [t1.c], [t1.d], [t1.e]), partitions(p02
- 允许每个分支使用不同的连接算法来加速查询,避免使用笛卡尔连接。
如下例所示,Q1 会被改写成 Q2 的形式。对于 Q1 来说,它的连接方式只能是 nested loop join (笛卡尔乘积), 但是被改写之后,每个子查询都可以选择 nested loop join,hash join 或者 merge join,这样会有更多的优化空间。
OceanBase (root@test)> create table t1(a int, b int);
Query OK, 0 rows affected (0.17 sec)
OceanBase (root@test)> create table t2(a int, b int);
Query OK, 0 rows affected (0.13 sec)
OceanBase (root@test)> explain select/*+NO_REWRITE()*/ * from t1, t2 where t1.a = t2.a or t1.b = t2.b;
|ID|OPERATOR |NAME|EST. ROWS|COST |
-------------------------------------------
|0 |NESTED-LOOP JOIN| |3957 |585457|
|1 | TABLE SCAN |t1 |1000 |499 |
|2 | TABLE SCAN |t2 |4 |583 |
===========================================
Outputs & filters:
-------------------------------------
0 - output([t1.a], [t1.b], [t2.a], [t2.b]), filter(nil),
conds(nil), nl_params_([t1.a], [t1.b])
1 - output([t1.a], [t1.b]), filter(nil),
access([t1.a], [t1.b]), partitions(p0)
2 - output([t2.a], [t2.b]), filter([? = t2.a OR ? = t2.b]),
access([t2.a], [t2.b]), partitions(p0)
---被改写之后,每个子查询都使用了hash join
OceanBase (root@test)> explain select * from t1, t2 where t1.a = t2.a or t1.b = t2.b;
|ID|OPERATOR |NAME|EST. ROWS|COST|
-------------------------------------
|0 |UNION ALL | |2970 |9105|
|1 | HASH JOIN | |1980 |3997|
|2 | TABLE SCAN|t1 |1000 |499 |
|3 | TABLE SCAN|t2 |1000 |499 |
|4 | HASH JOIN | |990 |3659|
|5 | TABLE SCAN|t1 |1000 |499 |
|6 | TABLE SCAN|t2 |1000 |499 |
=====================================
Outputs & filters:
-------------------------------------
0 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)], [UNION(t2.a, t2.a)], [UNION(t2.b, t2.b)]), filter(nil)
1 - output([t1.a], [t1.b], [t2.a], [t2.b]), filter(nil),
equal_conds([t1.a = t2.a]), other_conds(nil)
2 - output([t1.a], [t1.b]), filter(nil),
access([t1.a], [t1.b]), partitions(p0)
3 - output([t2.a], [t2.b]), filter(nil),
access([t2.a], [t2.b]), partitions(p0)
4 - output([t1.a], [t1.b], [t2.a], [t2.b]), filter(nil),
equal_conds([t1.b = t2.b]), other_conds([lnnvl(t1.a = t2.a)])
5 - output([t1.a], [t1.b]), filter(nil),
access([t2.a], [t2.b]), partitions(p0)
- 允许每个分支分别消除排序,更加快速的获取top-k结果。
如下例所示,Q1 会被改写成 Q2。对于 Q1 来说,执行方式是只能把满足条件的行数找出来,然后进行排序,最终取top-10 结果。对于 Q2 来说,如果存在索引(a,b), 那么 Q2 中的两个子查询都可以使用索引把排序消除,每个子查询取 top-10 结果,然后最终对这20行数据排序一下取出最终的 top-10 行。
OceanBase (root@test)> create table t1(a int, b int, index idx_a(a, b));
Query OK, 0 rows affected (0.20 sec)
---不改写的话,需要排序最终获取top-k结果
OceanBase (root@test)> explain select/*+NO_REWRITE()*/ * from t1 where t1.a = 1 or t1.a = 2 order by b limit 10;
| ==========================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
------------------------------------------
|0 |LIMIT | |4 |77 |
|1 | TOP-N SORT | |4 |76 |
|2 | TABLE SCAN|t1(idx_a)|4 |73 |
==========================================
Outputs & filters:
-------------------------------------
0 - output([t1.a], [t1.b]), filter(nil), limit(10), offset(nil)
1 - output([t1.a], [t1.b]), filter(nil), sort_keys([t1.b, ASC]), topn(10)
2 - output([t1.a], [t1.b]), filter(nil),
access([t1.a], [t1.b]), partitions(p0)
--- 进行改写的话,排序算子可以被消除,最终获取top-k结果
OceanBase (root@test)> explain select * from t1 where t1.a = 1 or t1.a = 2 order by b limit 10;
| ===========================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-------------------------------------------
|0 |LIMIT | |3 |76 |
|1 | TOP-N SORT | |3 |76 |
|2 | UNION ALL | |3 |74 |
|3 | TABLE SCAN|t1(idx_a)|2 |37 |
|4 | TABLE SCAN|t1(idx_a)|1 |37 |
===========================================
Outputs & filters:
-------------------------------------
0 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)]), filter(nil), limit(10), offset(nil)
1 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)]), filter(nil), sort_keys([UNION(t1.b, t1.b), ASC]), topn(10)
2 - output([UNION(t1.a, t1.a)], [UNION(t1.b, t1.b)]), filter(nil)
3 - output([t1.a], [t1.b]), filter(nil),
access([t1.a], [t1.b]), partitions(p0),
limit(10), offset(nil)
4 - output([t1.a], [t1.b]), filter([lnnvl(t1.a = 1)]),