Bypassed Subquery Unnesting for Anti Join

I serendipitously discovered a special case where the optimizer fails to perform subquery unnesting (SU) for anti join (AJ), and consequently comes up with an inferior execution plan.

We’ll need two tables to reproduce the problem (tested on Oracle database version 19.10.0.0.210119):

drop table t1 ;
drop table t2 ;

create table t1 ( n1 number, n2 number ) ;
insert into t1 select level, 1 from dual connect by level <= 1000 ;

create table t2 ( n1 number ) ;
insert into t2 select level from dual connect by level <= 100 ;

commit ;

exec dbms_stats.gather_table_stats(null,'T1') ;
exec dbms_stats.gather_table_stats(null,'T2') ;

The optimizer produces a good plan with AJ for the following query:

explain plan for
select t1.n1 
  from t1 
  where not exists ( 
    select 0
      from t2 
      where t1.n1 = t2.n1
  ) ;
----------------------------------------+-----------------------------------+
| Id  | Operation             | Name    | Rows  | Bytes | Cost  | Time      |
----------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT      |         |       |       |     6 |           |
| 1   |  HASH JOIN RIGHT ANTI |         |   901 |  6307 |     6 |  00:00:01 |
| 2   |   TABLE ACCESS FULL   | T2      |   100 |   300 |     3 |  00:00:01 |
| 3   |   TABLE ACCESS FULL   | T1      |  1000 |  4000 |     3 |  00:00:01 |
----------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - access("T1"."N1"="T2"."N1")

It’s worth noting, that SU transformation is a prerequisite for AJ:

SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$2 (#2)
SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.
SU:  Merge subquery block SEL$2 (#2) into outer QB SEL$1 (#1)

Let’s make a small syntactic change in the subquery – replace the constant 0 with the column t1.n2:

explain plan for
select t1.n1 
  from t1 
  where not exists ( 
    select t1.n2
      from t2 
      where t1.n1 = t2.n1
  ) ;

t1.n2 in the subquery doesn’t project, so this shouldn’t change anything. Simply put, both queries are equivalent.

But the execution plan changed for worse – it uses FILTER and has a higher host:

--------------------------------------+-----------------------------------+
| Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
--------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |         |       |       |  1504 |           |
| 1   |  FILTER             |         |       |       |       |           |
| 2   |   TABLE ACCESS FULL | T1      |  1000 |  7000 |     3 |  00:00:01 |
| 3   |   TABLE ACCESS FULL | T2      |     1 |     3 |     3 |  00:00:01 |
--------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - filter( IS NULL)
3 - filter("T2"."N1"=:B1)

The column t1.n2, which was referenced in the SELECT clause, prevented SU:

942 SU:     SU bypassed: Correlation column in SELECT, GROUP BY, or ORDER BY.
943 SU:   Validity checks failed.

At some point, the optimizer replaced t1.n2 with a constant, but it was too late then, because it had already bypassed SU.

1069  NOT EXISTS (SELECT 0 FROM "T2" "T2")

In other words, ANTI JOIN is feasible, but it didn’t happen, because of the order in which different query transformations were performed.

In conclusion, make sure that you really use a constant in a NOT EXISTS subquery, because optimizer will generate a subpar plan with FILTER instead of AJ otherwise.

Thanks for sharing

Nenad Noveljic

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.