Old-style (non-cost-based) Join Predicate Push-Down Transformation

Generally, join predicate push-down (JPPD) transformation is cost based. However, sometimes the optimizer falls back to the old-style (non-cost-based) JPPD (OJPPD). In such a case cost can be massively underestimated and the effect on the performance detrimental.

I’ll be using a simple data set for demonstrating this on a 18.7 Oracle database:

create table t1 (n1 integer) ;
create table t2 (n1 integer, constraint pk_t2 primary key (n1)) ;
create table t3 (n1 integer, n2 integer) ;
create table t4 (n1 integer, c1 varchar2(4000)) ;

insert into t4 select 1, lpad('A',4000,'A') 
  from dual connect by level <=100000 ;
commit ;

begin
  dbms_stats.gather_table_stats(null, 'T1' );
  dbms_stats.gather_table_stats(null, 'T2' );
  dbms_stats.gather_table_stats(null, 'T3' );
  dbms_stats.gather_table_stats(null, 'T4' );
end;
/

set pagesize 0
set linesize 200

Let’s start with a straightfoward query that produces a plan with JPPD:

select /*+ qb_name(MAIN) */ t1.n1 from 
  t1,
  ( 
    select /*+ qb_name(RECEIVER) */ t2.n1 from t2, t4
      where t4.n1 = t2.n1
  ) v1
 where t1.n1 = v1.n1(+) 
;

select * from table(dbms_xplan.display_cursor(NULL,NULL)) ;

---------------------------------------------------------------------------------
| Id  | Operation               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |       |       |       |  1615 (100)|          |
|   1 |  NESTED LOOPS OUTER     |       |     1 |    15 |  1615   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL     | T1    |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   VIEW PUSHED PREDICATE |       |     1 |     2 |  1613   (1)| 00:00:01 |
|   4 |    NESTED LOOPS         |       |   100K|  1562K|  1613   (1)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN   | PK_T2 |     1 |    13 |     0   (0)|          |
|*  6 |     TABLE ACCESS FULL   | T4    |   100K|   292K|  1613   (1)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T2"."N1"="T1"."N1")
   6 - filter("T4"."N1"="T1"."N1")

It’s worth noting that the nested loop (NL) in the query block (QB) RECEIVER is the main contributor to the total cost. That’s the QB which receives the pushed predicate.

The optimizer trace confirms the JPPD transformation:

JPPD:  Considering Cost-based predicate pushdown from query block MAIN (#1)
Cost-based predicate pushdown (JPPD)
JPPD: Checking validity of push-down in query block MAIN (#1)
JPPD:   Checking validity of push-down from query block MAIN (#1) to query block RECEIVER (#2)
JPPD:     Passed validity checks
JPPD: JPPD:   Pushdown from query block MAIN (#1) passed validity checks.
JPPD: Using search type: linear
JPPD: Considering join predicate push-down
JPPD: Starting iteration 1, state space = (2) : (0)
JPPD: Performing join predicate push-down (no transformation phase) from query block MAIN (#1) to query block RECEIVER (#2)
JPPD: Costing transformed query.
JPPD: Updated best state, Cost = 1615.054794
JPPD: Starting iteration 2, state space = (2) : (1)
JPPD: Performing join predicate push-down (candidate phase) from query block MAIN (#1) to query block RECEIVER (#2)
JPPD:   Pushing predicate "T1"."N1"="V1"."N1"(+)
JPPD: Push dest of pred 0x80c9c730 is qb 0x80cab7b0:query block RECEIVER (#2)
JPPD: Costing transformed query.
JPPD: Retrieved original view card: 1.000000
JPPD: Updated best state, Cost = 1614.996808
JPPD: Will use JPPD from MAIN (#1) to RECEIVER (#2).
JPPD: Applying transformation directives
JPPD: Checking validity of push-down in query block MAIN (#1)
JPPD: JPPD:   Pushdown from query block MAIN (#1) passed validity checks.
JPPD: Performing join predicate push-down (final phase) from query block MAIN (#1) to query block RECEIVER (#2)
JPPD:   Pushing predicate "T1"."N1"="V1"."N1"(+)
JPPD: Push dest of pred 0x80cab828 is qb 0x80cfeda0:query block RECEIVER (#2)
JPPD: Retrieved original view card: 1.000000

But the cost suddenly drops after adding a hierarchical subquery:

select /*+ qb_name(MAIN) */ t1.n1 from 
  t1,
  ( 
    select /*+ qb_name(RECEIVER) */ t2.n1 from t2, t4
      where t4.n1 = t2.n1
  ) v1,
  (
    select /*+ qb_name(hierarchical) */ t3.n1 n1 from t3 
	  connect by n2 = prior n1
  ) v2
 where t1.n1 = v1.n1(+) and t1.n1 = v2.n1 
;

select * from table(dbms_xplan.display_cursor(NULL,NULL)) ;

----------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |       |       |     4 (100)|          |
|*  1 |  HASH JOIN                     |       |     1 |    39 |     4   (0)| 00:00:01 |
|   2 |   NESTED LOOPS OUTER           |       |     1 |    26 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL           | T1    |     1 |    13 |     2   (0)| 00:00:01 |
|   4 |    VIEW PUSHED PREDICATE       |       |     1 |    13 |     0   (0)|          |
|   5 |     NESTED LOOPS               |       |     1 |    16 |  1613   (1)| 00:00:01 |
|*  6 |      INDEX UNIQUE SCAN         | PK_T2 |     1 |    13 |     0   (0)|          |
|*  7 |      TABLE ACCESS FULL         | T4    |     1 |     3 |  1613   (1)| 00:00:01 |
|   8 |   VIEW                         |       |     1 |    13 |     2   (0)| 00:00:01 |
|*  9 |    CONNECT BY WITHOUT FILTERING|       |       |       |            |          |
|  10 |     TABLE ACCESS FULL          | T3    |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="V2"."N1")
   6 - access("T2"."N1"="T1"."N1")
   7 - filter(("T4"."N1"="T2"."N1" AND "T4"."N1"="T1"."N1"))
   9 - access("N2"=PRIOR NULL)

So instead of slightly increasing, the cost slumped. As you can see, the NL cost in the step 5 wasn’t propagated to its parent. The magnitude of the error is proportional to the following two values: QB RECEIVER cost and outer row source size. That’s not surprising at all, because the total cost is calculated as QB_RECEIVER_cost * num_outer_rows.

The 10053 trace tells us that the optimizer, surprisingly, switched from JPPD to OJPPD:

JPPD - join predicate push-down
OJPPD - old-style (non-cost-based) JPPD
OJPPD: Promoting index PK_T2
OJPPD: Performing join predicate push-down  to query block RECEIVER (#0)
OJPPD: Pushing predicate "T1"."N1"="V1"."N1"(+)
OJPPD: Used promoted index: PK_T2
OJPPD: Performing join predicate push-down  to query block RECEIVER (#0)
OJPPD: Pushing predicate "T1"."N1"="V1"."N1"(+)
OJPPD: Used promoted index: PK_T2

Finally, I’m going to show an extremely unlucky case – with a large outer row source. To do that, first I’m going to insert lots of rows into the outer table T1.

insert into t1 select 1
  from dual connect by level <=1e6 ;
commit ;

exec dbms_stats.gather_table_stats(null, 'T1' );

Expectedly, the execution plan for the query without the hierarchical subquery changed – JPPD isn’t performed anymore because the transformed query cost exploded after increasing the outer row source size.

select /*+ qb_name(MAIN) */ t1.n1 from 
  t1,
  ( 
    select /*+ qb_name(RECEIVER) */ t2.n1 from t2, t4
      where t4.n1 = t2.n1
  ) v1
 where t1.n1 = v1.n1(+) 
;

select * from table(dbms_xplan.display_cursor(NULL,NULL)) ;

-------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |       |       |  1659 (100)|          |
|*  1 |  HASH JOIN RIGHT OUTER|       |  1000K|    15M|  1659   (1)| 00:00:01 |
|   2 |   VIEW                |       |     1 |    13 |  1613   (1)| 00:00:01 |
|*  3 |    HASH JOIN          |       |     1 |    16 |  1613   (1)| 00:00:01 |
|   4 |     INDEX FULL SCAN   | PK_T2 |     1 |    13 |     0   (0)|          |
|   5 |     TABLE ACCESS FULL | T4    |   100K|   292K|  1613   (1)| 00:00:01 |
|   6 |   TABLE ACCESS FULL   | T1    |  1000K|  2929K|    45   (3)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N1"="V1"."N1")
   3 - access("T4"."N1"="T2"."N1")

The optimizer trace shows the exact costs of both transformed and non-transformed queries:

opt_qb.awk -v qb=MAIN DB_ora_13845.trc
---------------
36 Registered qb: MAIN 0x86311848 (HINT MAIN)
1181 JPPD: Performing join predicate push-down (no transformation phase) from query block MAIN (#1) to query block RECEIVER (#2)
1689 Final cost for query block MAIN (#1) - All Rows Plan:
1690   Best join order: 1
1691   Cost: 1659.234715  Degree: 1  Card: 1000000.000000  Bytes: 16000000.000000
---------------
36 Registered qb: MAIN 0x86311848 (HINT MAIN)
1704 JPPD: Performing join predicate push-down (candidate phase) from query block MAIN (#1) to query block RECEIVER (#2)
2065 Final cost for query block MAIN (#1) - All Rows Plan:
2066   Best join order: 1
2067   Cost: 1612996725.467509  Degree: 1  Card: 1000000.000000  Bytes: 5000000.000000
---------------
36 Registered qb: MAIN 0x86311848 (HINT MAIN)
2089 JPPD: Performing join predicate push-down (no transformation phase) from query block MAIN (#1) to query block RECEIVER (#2)
2741 Final cost for query block MAIN (#1) - All Rows Plan:
2742   Best join order: 1
2743   Cost: 1659.234715  Degree: 1  Card: 1000000.000000  Bytes: 16000000.000000

By the way, opt_qb.awk comes in handy for analyzing query transformations. It’s an awk script for extracting the essential information about QBs and is documented in a previous blog post.

In contrast, the execution plan for the query with the hierarchical subquery still contains the PUSHED PREDICATE operation:

-----------------------------------------------------------------------------------------
| Id  | Operation                       | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |       |       |    48 (100)|          |
|   1 |  NESTED LOOPS OUTER             |       |     1 |    29 |    48   (5)| 00:00:01 |
|*  2 |   HASH JOIN                     |       |     1 |    16 |    48   (5)| 00:00:01 |
|   3 |    VIEW                         |       |     1 |    13 |     2   (0)| 00:00:01 |
|*  4 |     CONNECT BY WITHOUT FILTERING|       |       |       |            |          |
|   5 |      TABLE ACCESS FULL          | T3    |     1 |    26 |     2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL            | T1    |  1000K|  2929K|    45   (3)| 00:00:01 |
|   7 |   VIEW PUSHED PREDICATE         |       |     1 |    13 |     0   (0)|          |
|   8 |    NESTED LOOPS                 |       |     1 |    16 |  1613   (1)| 00:00:01 |
|*  9 |     INDEX UNIQUE SCAN           | PK_T2 |     1 |    13 |     0   (0)|          |
|* 10 |     TABLE ACCESS FULL           | T4    |     1 |     3 |  1613   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."N1"="V2"."N1")
   4 - access("N2"=PRIOR NULL)
   9 - access("T2"."N1"="T1"."N1")
  10 - filter(("T4"."N1"="T2"."N1" AND "T4"."N1"="T1"."N1"))

In this particular case, the cost was hugely underestimated – by the factor 33 million.

A possible workaround is to selectively disable pushing predicates with the following hint:

opt_param('_push_join_predicate', 'false')

On the downside, this will also prevent useful predicates from being pushed.

In summary, hierarchical subqueries can lead to suboptimal execution plans in conjunction with JPPD. The cost will be massively underestimated for large outer row sources. This problem can be easily detected by comparing the VIEW PUSHED PREDICATE cost to its children – VIEW PUSHED PREDICATE has a lower cost. In fact, I’m just thinking of writing a procedure for parsing execution plans and reporting such issues.

Update on August 25th 2019

The wrong cost calculation seems to be caused by the bug 22582700. The issue is fixed in Oracle 19.3.
Many thanks to Timur Akhmadeev for providing this information (see the comments section).

Thanks for sharing

Nenad Noveljic

One Comment

  1. In 19.3 things are different; it looks like bug 22582700 is the reason:

    [sourcecode]
    select /*+ qb_name(MAIN) push_pred(@RECEIVER) */ t1.n1 from
    t1,
    (
    select /*+ qb_name(RECEIVER) */ t2.n1 from t2, t4
    where t4.n1 = t2.n1
    ) v1,
    (
    select /*+ qb_name(hierarchical) */ t3.n1 n1 from t3
    connect by n2 = prior n1
    ) v2
    where t1.n1 = v1.n1(+) and t1.n1 = v2.n1
    ;

    Plan hash value: 1501888726

    —————————————————————————————-
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    —————————————————————————————-
    | 0 | SELECT STATEMENT | | | | 27448 (100)| |
    |* 1 | HASH JOIN | | 1 | 39 | 27448 (1)| 00:00:02 |
    | 2 | NESTED LOOPS OUTER | | 1 | 26 | 27446 (1)| 00:00:02 |
    | 3 | TABLE ACCESS FULL | T1 | 1 | 13 | 2 (0)| 00:00:01 |
    | 4 | VIEW PUSHED PREDICATE | | 1 | 13 | 27444 (1)| 00:00:02 |
    | 5 | NESTED LOOPS | | 1 | 16 | 27444 (1)| 00:00:02 |
    |* 6 | INDEX UNIQUE SCAN | PK_T2 | 1 | 13 | 0 (0)| |
    |* 7 | TABLE ACCESS FULL | T4 | 1 | 3 | 27444 (1)| 00:00:02 |
    | 8 | VIEW | | 1 | 13 | 2 (0)| 00:00:01 |
    |* 9 | CONNECT BY WITHOUT FILTERING| | | | | |
    | 10 | TABLE ACCESS FULL | T3 | 1 | 26 | 2 (0)| 00:00:01 |
    —————————————————————————————-

    Predicate Information (identified by operation id):
    —————————————————

    1 – access(“T1″.”N1″=”V2”.”N1″)
    6 – access(“T2″.”N1″=”T1”.”N1″)
    7 – filter((“T4″.”N1″=”T2”.”N1″ AND “T4″.”N1″=”T1”.”N1″))
    9 – access(“N2″=PRIOR NULL)

    Hint Report (identified by operation id / Query Block Name / Object Alias):
    Total hints for statement: 1 (U – Unused (1))
    —————————————————————————

    5 – SEL$09FB3C0F
    U – push_pred(@RECEIVER)
    select /*+ qb_name(MAIN) push_pred(@RECEIVER) opt_param(‘_fix_control’ ‘22582700:off’) */ t1.n1 from
    t1,
    (
    select /*+ qb_name(RECEIVER) */ t2.n1 from t2, t4
    where t4.n1 = t2.n1
    ) v1,
    (
    select /*+ qb_name(hierarchical) */ t3.n1 n1 from t3
    connect by n2 = prior n1
    ) v2
    where t1.n1 = v1.n1(+) and t1.n1 = v2.n1
    ;

    Plan hash value: 1501888726

    —————————————————————————————-
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    —————————————————————————————-
    | 0 | SELECT STATEMENT | | | | 4 (100)| |
    |* 1 | HASH JOIN | | 1 | 39 | 4 (0)| 00:00:01 |
    | 2 | NESTED LOOPS OUTER | | 1 | 26 | 2 (0)| 00:00:01 |
    | 3 | TABLE ACCESS FULL | T1 | 1 | 13 | 2 (0)| 00:00:01 |
    | 4 | VIEW PUSHED PREDICATE | | 1 | 13 | 0 (0)| |
    | 5 | NESTED LOOPS | | 1 | 16 | 27444 (1)| 00:00:02 |
    |* 6 | INDEX UNIQUE SCAN | PK_T2 | 1 | 13 | 0 (0)| |
    |* 7 | TABLE ACCESS FULL | T4 | 1 | 3 | 27444 (1)| 00:00:02 |
    | 8 | VIEW | | 1 | 13 | 2 (0)| 00:00:01 |
    |* 9 | CONNECT BY WITHOUT FILTERING| | | | | |
    | 10 | TABLE ACCESS FULL | T3 | 1 | 26 | 2 (0)| 00:00:01 |
    —————————————————————————————-

    Predicate Information (identified by operation id):
    —————————————————

    1 – access(“T1″.”N1″=”V2”.”N1″)
    6 – access(“T2″.”N1″=”T1”.”N1″)
    7 – filter((“T4″.”N1″=”T2”.”N1″ AND “T4″.”N1″=”T1”.”N1″))
    9 – access(“N2″=PRIOR NULL)

    Hint Report (identified by operation id / Query Block Name / Object Alias):
    Total hints for statement: 1 (U – Unused (1))
    —————————————————————————

    5 – SEL$09FB3C0F
    U – push_pred(@RECEIVER)
    [/sourcecode]

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.