Dissecting Cost Based OR-Expansion

Cost Based OR-Expansion

In a nutshell, the OR-expansion (ORE) is the transformation where an OR operator is transformed to a UNION ALL set operation, like:

select  /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */ 
  * from t where n1 = 1 or n2 = 1 ;

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |       |       |     5 (100)|          |
|   1 |  VIEW                         | VW_ORE_1B35BA0F |     3 |   117 |     5   (0)| 00:00:01 |
|   2 |   UNION-ALL                   |                 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| T               |     1 |    12 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T_IDX1          |     1 |       |     1   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS BY INDEX ROWID| T               |     2 |    24 |     3   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | T_IDX2          |     2 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("N1"=1)
   5 - filter(LNNVL("N1"=1))
   6 - access("N2"=1)

It wasn’t until release 12.2 that this transformation became cost based.

In this blog post I’ll be explaining how to understand the cost based OR-Expansion (CBORE) optimizer calculations and decisions by analyzing the information in the 10053 trace.

I’ll be using the following table for demonstration purposes:

create table t  
as
  select level n1, mod(level,5000) n2, level n3
    from dual connect by level <=10000 ;

create index t_idx1 on t(n1);
create index t_idx2 on t(n2);
create index t_idx3 on t(n3);

Let’s start with generating the execution plan for the following simple query with a single OR operation:

select  /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */ 
  * from t where n1 = 1 or n2 = 1 

CBO (10053) trace analysis

We can see that the optimizer decided to consider CBORE:

ORE: Predicate chain after QB validity check - SEL$1
"T"."N1"=1 OR "T"."N2"=1
Registered qb: SEL$1 0x8384b0c0 (COPY SEL$1)

Further, the optimizer built a conjunction chain for each operand:

DNF Matrix (Before sorting OR branches)
             P1  P2
CNJ (#1) :   1   0

CNJ (#2) :   0   1

ORE:  Predicate list
P1 : "T"."N1"=1
P2 : "T"."N2"=1

The conjunction chain matrix above is just another representation of our WHERE clause “T”.”N1″=1 OR “T”.”N2″=1. Simply put, each conjunction chain (CNJ) – represented as a row in the matrix above – is an OR operand. The value of the matrix element “1” in the first CNJ means that the query block contains the OR operand P1, that is “T”.”N1″=1. For a WHERE clause consisting of solely OR operators, each conjunction chain contains a single “1” value and the rest are zeros. At the first sight, it might look counterintuitive to represent an inherently disjunctive OR using a conjunctive chain, but I’m going to explain later why it makes sense to do so.

Anyway, for our simple single OR operator there are only two possibilities to choose from. We can either forgo CBORE and leave the query as it is or transform it to a UNION ALL query with two query blocks – one for each OR operand. The optimizer informed us about that:

ORE: Switching to two pass because of # conjunction: 2

In the first iteration, the optimizer calculated the cost for the query block containing both OR operands.

ORE: Starting iteration 1, state space = [{ 1 2 }]
ORE: Transformed query
******* UNPARSED QUERY IS *******
SELECT "T"."N1" "N1","T"."N2" "N2","T"."N3" "N3" FROM "VAR"."T" "T" WHERE "T"."N1"=1 OR "T"."N2"=1
FPD: Considering simple filter push in query block SEL$1 (#1)
"T"."N1"=1 OR "T"."N2"=1
ry to generate transitive predicate from check constraints for query block SEL$1 (#1)
finally: "T"."N1"=1 OR "T"."N2"=1

ORE: Costing transformed query.
CBQT: Looking for cost annotations for query block SEL$1, key = SEL$1_00000000_0
CBQT: Could not find stored cost annotations.

The state space numbers in the excerpt above are telling us which OR operands are contained in the query block that is currently being costed. As the state space in the first iteration is [{ 1 2 }], the query block contains both OR operands: P1 and P2. In other words, the optimizer calculated the cost for the non-transformed query:

Best:: AccessPath: TableScan
Cost: 9.120737  Degree: 1  Resp: 9.120737  Card: 2.999800  Bytes: 0.000000

Best so far:  Table#: 0  cost: 9.120737  card: 2.999800  bytes: 36.000000
	
Final cost for query block SEL$1 (#1) - All Rows Plan:
Best join order: 1
Cost: 9.120737  Degree: 1  Card: 3.000000  Bytes: 36.000000

The optimizer stored the calculated cost before moving to the next iteration:

ORE: Updated best state, Cost = 9.120737

In the next iteration, the optimizer calculated the cost for two query blocks, each of them containing only one OR operand:

ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Transformed query
******* UNPARSED QUERY IS *******
SELECT "T"."N3" "ITEM_1","T"."N2" "ITEM_2","T"."N1" "ITEM_3" FROM "VAR"."T" "T" WHERE "T"."N1"=1
FPD: Considering simple filter push in query block SEL$1 (#1)llllll
"T"."N1"=1
try to generate transitive predicate from check constraints for query block SEL$1 (#1)
finally: "T"."N1"=1
...
Best:: AccessPath: IndexRange
Index: T_IDX1
Cost: 2.000645  Degree: 1  Resp: 2.000645  Card: 1.000000  Bytes: 0.000000
...
Best so far:  Table#: 0  cost: 2.000645  card: 1.000000  bytes: 12.000000
...
Cost: 2.000645  Degree: 1  Card: 1.000000  Bytes: 12.000000
...   
   
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Transformed query
******* UNPARSED QUERY IS *******
SELECT "T"."N3" "ITEM_1","T"."N2" "ITEM_2","T"."N1" "ITEM_3" FROM "VAR"."T" "T" WHERE "T"."N2"=1 AND LNNVL("T"."N1"=1)
FPD: Considering simple filter push in query block SEL$1 (#1)
"T"."N2"=1 AND LNNVL("T"."N1"=1)
try to generate transitive predicate from check constraints for query block SEL$1 (#1)
finally: "T"."N2"=1 AND LNNVL("T"."N1"=1)
...   
ORE: Costing transformed query.
...   
Best so far:  Table#: 0  cost: 3.000962  card: 1.999800  bytes: 24.000000
...
Final cost for query block SEL$1 (#1) - All Rows Plan:
Best join order: 1
Cost: 3.000962  Degree: 1  Card: 2.000000  Bytes: 24.000000

The cost of the transformed UNION ALL query is calculated as the sum of the both query blocks costs. Since this cost is lower than the cost of the original query, the optimizer went with CBORE in this particular case.

kkoqbc: finish optimizing query block SEL$1 (#1)
ORE: Updated best state, Cost = 5.001606
ORE:   Transferring best state space to preserved query.
ORE:   Transferring best state space to original query.

Multiple OR operators

Using the information above, it’s easy to extract the essential information for understanding the CBORE decision making:

egrep "state space|Updated best state|Not update best state" DB1_ora_17091_CBORE.trc

Here’s is the decision tree for our example:

ORE: Starting iteration 1, state space = [{ 1 2 }]
ORE: Updated best state, Cost = 9.120737
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000645
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 5.001606
ORE:   Transferring best state space to preserved query. 
ORE:   Transferring best state space to original query.

Let’s extend our query with an additional OR operator:

select  /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */ 
  * from t where n1 = 1 or n3 = 1 or n2 = 1 ;

ORE: Starting iteration 1, state space = [{ 1 2 3 }]
ORE: Updated best state, Cost = 9.149879
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000645
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 5.001606
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 7.002255
ORE: Starting iteration 3, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000645
ORE: Starting iteration 3, state space = [{ 2 3 }]
ORE: Not update best state, Cost = 11.129714
ORE: Starting iteration 4, state space = [{ 1 2 }]
ORE: Not update best state, Cost = 9.120737
ORE:   Transferring best state space to preserved query. 
ORE:   Transferring best state space to original query.

We can draw the following conclusions from the output above:

  • The iteration 1 is the non-transformed query.
  • The iteration 2 is the UNION ALL with three query blocks – one for each OR operand. At the end, this one had the lowest cost.
  • The iteration 3 is the UNION ALL with two query blocks. The first query block contains the first OR operand P1 and the second query block has the second and the third operand (P2 OR P3).
  • The iteration 4 is also the UNION ALL with two query blocks. The first query block entails the first and the second OR operand (P1 OR P2). The second query block contains the third OR operand P3. Unsurprisingly, the second query block wasn’t even costed since the first query block had already exceeded the cost of the second iteration.
  • Interestingly, what wasn’t considered at all was the UNION ALL consisting of the state space = [{ 2 }] and state space = [{ 1 3 }], which, generally, could yield a good execution plan.

Further, something strange will start happening if we keep adding more OR operators. For example, this is what we’ll get after the number of OR operands exceeds 4:

ORE: Starting iteration 1, state space = [{ 1 2 3 4 5 }]
ORE: Updated best state, Cost = 17.129661
ORE: Starting iteration 2, state space = [{ 1 }]
ORE: Updated best state, Cost = 2.000388
ORE: Starting iteration 2, state space = [{ 2 }]
ORE: Updated best state, Cost = 4.000776
ORE: Starting iteration 2, state space = [{ 3 }]
ORE: Updated best state, Cost = 6.001167
ORE: Starting iteration 2, state space = [{ 4 }]
ORE: Updated best state, Cost = 8.001558
ORE: Starting iteration 2, state space = [{ 5 }]
ORE: Updated best state, Cost = 10.001950
ORE:   Transferring best state space to preserved query. 
ORE:   Transferring best state space to original query. 

As you can see, the optimizer did only two iterations – one for the non-transformed query (iteration 1), and one for the UNION ALL query which has one query block for each OR-operand (iteration 2). I don’t know whether that’s an anomaly or some heuristics to limit time spent on calculating cost for different combinations.

Conjunction chains

We previously saw that the optimizer internally represents the OR operands with the conjunction chain matrix:

DNF Matrix (Before sorting OR branches)
             P1  P2
CNJ (#1) :   1   0

CNJ (#2) :   0   1

ORE:  Predicate list
P1 : "T"."N1"=1
P2 : "T"."N2"=1

This might indeed seem unnecessary for our simple query. But it makes perfect sense for more complex WHERE clauses consisting of both OR and AND operators, like:

select  /*+ OPT_PARAM('_B_TREE_BITMAP_PLANS','FALSE') */ 
  * from t where n1 = 1 and n3 = 1 or n2 = 1 ;


ORE:  Predicate list
P1 : "T"."N1"=1
P2 : "T"."N3"=1
P3 : "T"."N2"=1

 DNF Matrix (Before sorting OR branches)
            P1  P2  P3
CNJ (#1) :   1   1   0

CNJ (#2) :   0   0   1

ORE:  Predicate list
P1 : "T"."N1"=1
P2 : "T"."N3"=1
P3 : "T"."N2"=1

 DNF Matrix (After OR branch sorting)
            P1  P2  P3
CNJ (#1) :   0   0   1

CNJ (#2) :   1   1   0

The purpose of this internal matrix representation becomes obvious now – it helps grouping the conjunctive operands together into a single OR operand.

Bugs

Although making ORE cost based generally leads to better executions plans, you should be aware of some boundary cases when it doesn’t function properly. I described one such example in my previous blog post Cost-Based OR Expansion Transformation. There, you can also find how to fall back to pre-12.2 behavior in case you hit such problem.

Thanks for sharing

Nenad Noveljic

3 Comments

  1. Hi Nenad,
    Thanks for this nice article
    I think there is a typo in the Multiple OR operator section : or n3 = 1 instead of and n3 =1
    select /*+ OPT_PARAM(‘_B_TREE_BITMAP_PLANS’,’FALSE’) */
    * from t where n1 = 1 and n3 = 1 or n2 = 1 ;

    And you are right to say that Oracle has considered 4 state space iterations instead of 5. These iterations are following the Stirling Number of the second kind which, in case of 3 ORed conjuncts should generate 5 possible OR combinations 1 + 3 + 1

    And as per regards to your remark of Oracle doing only two iterations when you have used 5 Ored conjuncts this depends strongly on the type of the OR-expansion method used : two_pass, linear or Greedy. In the case of two_pass method whatever the number of ORed conjuncts is, Oracle will only use two state space iterations including the non-transformed query.

    I started a blog article about this stuff a long time ago and have not yet succeed to polish it and make it publishable. Hope will do this very soon.

    Cheers
    Mohamed Houri

    • Hi Mohamed,

      Thank you for your feedback!

      1. Yes, it was a typo. Now corrected. Thanks for pointing that out!

      2. A more general definition of the maximum number of iterations is the sum of all Stirling numbers of the second kind where k goes from 1 to n, where n is the number of disjunctive elements.

      3. Do you know how does optimizer decide whether to use two_pass, linear or greedy?

      With great interest I’m awaiting your article!

      Cheers,
      Nenad

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.