Cost Based OR-Expansion
In a nutshell, the OR-expansion (ORE) is the transformation where an OR operator is transformed to a UNION ALL set operation, like:
select /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */
* from t where n1 = 1 or n2 = 1 ;
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 5 (100)| |
| 1 | VIEW | VW_ORE_1B35BA0F | 3 | 117 | 5 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID| T | 1 | 12 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | T_IDX1 | 1 | | 1 (0)| 00:00:01 |
|* 5 | TABLE ACCESS BY INDEX ROWID| T | 2 | 24 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | T_IDX2 | 2 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("N1"=1)
5 - filter(LNNVL("N1"=1))
6 - access("N2"=1)
It wasn’t until release 12.2 that this transformation became cost based.
In this blog post I’ll be explaining how to understand the cost based OR-Expansion (CBORE) optimizer calculations and decisions by analyzing the information in the 10053 trace.
I’ll be using the following table for demonstration purposes:
create table t
as
select level n1, mod(level,5000) n2, level n3
from dual connect by level <=10000 ;
create index t_idx1 on t(n1);
create index t_idx2 on t(n2);
create index t_idx3 on t(n3);
Let’s start with generating the execution plan for the following simple query with a single OR operation:
select /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */
* from t where n1 = 1 or n2 = 1
CBO (10053) trace analysis
We can see that the optimizer decided to consider CBORE:
ORE: Predicate chain after QB validity check - SEL$1
"T"."N1"=1 OR "T"."N2"=1
Registered qb: SEL$1 0x8384b0c0 (COPY SEL$1)
Further, the optimizer built a conjunction chain for each operand:
DNF Matrix (Before sorting OR branches)
P1 P2
CNJ (#1) : 1 0
CNJ (#2) : 0 1
ORE: Predicate list
P1 : "T"."N1"=1
P2 : "T"."N2"=1
The conjunction chain matrix above is just another representation of our WHERE clause “T”.”N1″=1 OR “T”.”N2″=1. Simply put, each conjunction chain (CNJ) – represented as a row in the matrix above – is an OR operand. The value of the matrix element “1” in the first CNJ means that the query block contains the OR operand P1, that is “T”.”N1″=1. For a WHERE clause consisting of solely OR operators, each conjunction chain contains a single “1” value and the rest are zeros. At the first sight, it might look counterintuitive to represent an inherently disjunctive OR using a conjunctive chain, but I’m going to explain later why it makes sense to do so.
Anyway, for our simple single OR operator there are only two possibilities to choose from. We can either forgo CBORE and leave the query as it is or transform it to a UNION ALL query with two query blocks – one for each OR operand. The optimizer informed us about that:
ORE: Switching to two pass because of # conjunction: 2
In the first iteration, the optimizer calculated the cost for the query block containing both OR operands.
ORE: Starting iteration 1, state space = [{ 1 2 }]
ORE: Transformed query
******* UNPARSED QUERY IS *******
SELECT "T"."N1" "N1","T"."N2" "N2","T"."N3" "N3" FROM "VAR"."T" "T" WHERE "T"."N1"=1 OR "T"."N2"=1
FPD: Considering simple filter push in query block SEL$1 (#1)
"T"."N1"=1 OR "T"."N2"=1
ry to generate transitive predicate from check constraints for query block SEL$1 (#1)
finally: "T"."N1"=1 OR "T"."N2"=1
ORE: Costing transformed query.
CBQT: Looking for cost annotations for query block SEL$1, key = SEL$1_00000000_0
CBQT: Could not find stored cost annotations.
The state space numbers in the excerpt above are telling us which OR operands are contained in the query block that is currently being costed. As the state space in the first iteration is [{ 1 2 }], the query block contains both OR operands: P1 and P2. In other words, the optimizer calculated the cost for the non-transformed query:
Best:: AccessPath: TableScan
Cost: 9.120737 Degree: 1 Resp: 9.120737 Card: 2.999800 Bytes: 0.000000
Best so far: Table#: 0 cost: 9.120737 card: 2.999800 bytes: 36.000000
Final cost for query block SEL$1 (#1) - All Rows Plan:
Best join order: 1
Cost: 9.120737 Degree: 1 Card: 3.000000 Bytes: 36.000000
The optimizer stored the calculated cost before moving to the next iteration:
ORE: Updated best state, Cost = 9.120737
In the next iteration, the optimizer calculated the cost for two query blocks, each of them containing only one OR operand:
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Transformed query
******* UNPARSED QUERY IS *******
SELECT "T"."N3" "ITEM_1","T"."N2" "ITEM_2","T"."N1" "ITEM_3" FROM "VAR"."T" "T" WHERE "T"."N1"=1
FPD: Considering simple filter push in query block SEL$1 (#1)llllll
"T"."N1"=1
try to generate transitive predicate from check constraints for query block SEL$1 (#1)
finally: "T"."N1"=1
...
Best:: AccessPath: IndexRange
Index: T_IDX1
Cost: 2.000645 Degree: 1 Resp: 2.000645 Card: 1.000000 Bytes: 0.000000
...
Best so far: Table#: 0 cost: 2.000645 card: 1.000000 bytes: 12.000000
...
Cost: 2.000645 Degree: 1 Card: 1.000000 Bytes: 12.000000
...
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Transformed query
******* UNPARSED QUERY IS *******
SELECT "T"."N3" "ITEM_1","T"."N2" "ITEM_2","T"."N1" "ITEM_3" FROM "VAR"."T" "T" WHERE "T"."N2"=1 AND LNNVL("T"."N1"=1)
FPD: Considering simple filter push in query block SEL$1 (#1)
"T"."N2"=1 AND LNNVL("T"."N1"=1)
try to generate transitive predicate from check constraints for query block SEL$1 (#1)
finally: "T"."N2"=1 AND LNNVL("T"."N1"=1)
...
ORE: Costing transformed query.
...
Best so far: Table#: 0 cost: 3.000962 card: 1.999800 bytes: 24.000000
...
Final cost for query block SEL$1 (#1) - All Rows Plan:
Best join order: 1
Cost: 3.000962 Degree: 1 Card: 2.000000 Bytes: 24.000000
The cost of the transformed UNION ALL query is calculated as the sum of the both query blocks costs. Since this cost is lower than the cost of the original query, the optimizer went with CBORE in this particular case.
kkoqbc: finish optimizing query block SEL$1 (#1)
ORE: Updated best state, Cost = 5.001606
ORE: Transferring best state space to preserved query.
ORE: Transferring best state space to original query.
Multiple OR operators
Using the information above, it’s easy to extract the essential information for understanding the CBORE decision making:
egrep "state space|Updated best state|Not update best state" DB1_ora_17091_CBORE.trc
Here’s is the decision tree for our example:
ORE: Starting iteration 1, state space = [{ 1 2 }]
ORE: Updated best state, Cost = 9.120737
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000645
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 5.001606
ORE: Transferring best state space to preserved query.
ORE: Transferring best state space to original query.
Let’s extend our query with an additional OR operator:
select /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */
* from t where n1 = 1 or n3 = 1 or n2 = 1 ;
ORE: Starting iteration 1, state space = [{ 1 2 3 }]
ORE: Updated best state, Cost = 9.149879
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000645
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 5.001606
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 7.002255
ORE: Starting iteration 3, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000645
ORE: Starting iteration 3, state space = [{ 2 3 }]
ORE: Not update best state, Cost = 11.129714
ORE: Starting iteration 4, state space = [{ 1 2 }]
ORE: Not update best state, Cost = 9.120737
ORE: Transferring best state space to preserved query.
ORE: Transferring best state space to original query.
We can draw the following conclusions from the output above:
- The iteration 1 is the non-transformed query.
- The iteration 2 is the UNION ALL with three query blocks – one for each OR operand. At the end, this one had the lowest cost.
- The iteration 3 is the UNION ALL with two query blocks. The first query block contains the first OR operand P1 and the second query block has the second and the third operand (P2 OR P3).
- The iteration 4 is also the UNION ALL with two query blocks. The first query block entails the first and the second OR operand (P1 OR P2). The second query block contains the third OR operand P3. Unsurprisingly, the second query block wasn’t even costed since the first query block had already exceeded the cost of the second iteration.
- Interestingly, what wasn’t considered at all was the UNION ALL consisting of the state space = [{ 2 }] and state space = [{ 1 3 }], which, generally, could yield a good execution plan.
Further, something strange will start happening if we keep adding more OR operators. For example, this is what we’ll get after the number of OR operands exceeds 4:
ORE: Starting iteration 1, state space = [{ 1 2 3 4 5 }]
ORE: Updated best state, Cost = 17.129661
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000388
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 4.000776
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 6.001167
ORE: Starting iteration 2, state space = [{ 4 }]
ORE: Updated best state, Cost = 8.001558
ORE: Starting iteration 2, state space = [{ 5 }]
ORE: Updated best state, Cost = 10.001950
ORE: Transferring best state space to preserved query.
ORE: Transferring best state space to original query.
As you can see, the optimizer did only two iterations – one for the non-transformed query (iteration 1), and one for the UNION ALL query which has one query block for each OR-operand (iteration 2). I don’t know whether that’s an anomaly or some heuristics to limit time spent on calculating cost for different combinations.
Conjunction chains
We previously saw that the optimizer internally represents the OR operands with the conjunction chain matrix:
DNF Matrix (Before sorting OR branches)
P1 P2
CNJ (#1) : 1 0
CNJ (#2) : 0 1
ORE: Predicate list
P1 : "T"."N1"=1
P2 : "T"."N2"=1
This might indeed seem unnecessary for our simple query. But it makes perfect sense for more complex WHERE clauses consisting of both OR and AND operators, like:
select /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */
* from t where n1 = 1 and n3 = 1 or n2 = 1 ;
ORE: Predicate list
P1 : "T"."N1"=1
P2 : "T"."N3"=1
P3 : "T"."N2"=1
DNF Matrix (Before sorting OR branches)
P1 P2 P3
CNJ (#1) : 1 1 0
CNJ (#2) : 0 0 1
ORE: Predicate list
P1 : "T"."N1"=1
P2 : "T"."N3"=1
P3 : "T"."N2"=1
DNF Matrix (After OR branch sorting)
P1 P2 P3
CNJ (#1) : 0 0 1
CNJ (#2) : 1 1 0
The purpose of this internal matrix representation becomes obvious now – it helps grouping the conjunctive operands together into a single OR operand.
Bugs
Although making ORE cost based generally leads to better executions plans, you should be aware of some boundary cases when it doesn’t function properly. I described one such example in my previous blog post Cost-Based OR Expansion Transformation. There, you can also find how to fall back to pre-12.2 behavior in case you hit such problem.
Hi Nenad,
Thanks for this nice article
I think there is a typo in the Multiple OR operator section : or n3 = 1 instead of and n3 =1
select /*+ OPT_PARAM(‘_B_TREE_BITMAP_PLANS’,’FALSE’) */
* from t where n1 = 1 and n3 = 1 or n2 = 1 ;
And you are right to say that Oracle has considered 4 state space iterations instead of 5. These iterations are following the Stirling Number of the second kind which, in case of 3 ORed conjuncts should generate 5 possible OR combinations 1 + 3 + 1
And as per regards to your remark of Oracle doing only two iterations when you have used 5 Ored conjuncts this depends strongly on the type of the OR-expansion method used : two_pass, linear or Greedy. In the case of two_pass method whatever the number of ORed conjuncts is, Oracle will only use two state space iterations including the non-transformed query.
I started a blog article about this stuff a long time ago and have not yet succeed to polish it and make it publishable. Hope will do this very soon.
Cheers
Mohamed Houri
Hi Mohamed,
Thank you for your feedback!
1. Yes, it was a typo. Now corrected. Thanks for pointing that out!
2. A more general definition of the maximum number of iterations is the sum of all Stirling numbers of the second kind where k goes from 1 to n, where n is the number of disjunctive elements.
3. Do you know how does optimizer decide whether to use two_pass, linear or greedy?
With great interest I’m awaiting your article!
Cheers,
Nenad