Disjunctive Subquery Optimization

Disjunctive Subqueries

Disjunctive subqueries are the subqueries which are applied to the or operator, like the following one:

select a from t_large l 
  where id in ( select b from t_small s ) or a = 1 

Given that the not null constraint on the t_large.a column is defined, an alternative way to write this query is:

select a from t_large l where a = 1 
union all
select a from t_large l 
  where id in ( select b from t_small s ) and a != 1 

Ideally, the optimizer should apply cost-based query transformations to rewrite the original query with the or operator as the union all variant when appropriate. In this blog post I’ll investigate, how successfully these algorithms work in SQL Server and Oracle.

SQL Server

Setting the Scene

Let me start with SQL Server. I’ll create two test tables, one large with a milion records and the other small, containing just two records:


SELECT TOP (1000000)
id = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id])),
a = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO t_large
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

alter table t_large alter column a integer not null

create table t_small (b integer,c integer) ;
insert into t_small values (1,1) ;
insert into t_small values (2,2) ;

The query to populate a large table with numbers is used by courtesy of Aaron Bertrand.

Query Transformation

First, I’ll examine the execution plan and the statistics of the original query with the or operator:

select a from t_large l where id in ( select b from t_small s ) or a = 1 
StmtText	EstimateRows	TotalSubtreeCost	EstimateExecutions
select a from t_large l where id in ( select b from t_small s ) or a = 1	1000000	9.70106	NULL
  |--Nested Loops(Left Semi Join, OUTER REFERENCES:([l].[id], [l].[a]))	1000000	9.70106	1
       |--Table Scan(OBJECT:([master].[dbo].[t_large] AS [l]))	1000000	3.94106	1
       |--Concatenation	1	1.58	1000000
            |--Filter(WHERE:(STARTUP EXPR([master].[dbo].[t_large].[a] as [l].[a]=(1))))	1	1.48	1000000
            |    |--Constant Scan	1	1	1000000
            |--Table Scan(OBJECT:([master].[dbo].[t_small] AS [s]), WHERE:([master].[dbo].[t_large].[id] as [l].[id]=[master].[dbo].[t_small].[b] as [s].[b]))	1	9.70106	1000000

At the first glance, there are two things in the execution plan above that look suspicious. One is the Table Scan in the Nested Loop Join (NLJ). Usually, the Query Optimizer uses Nested Loop Join Heuristics to pre-empt Table Scans in NLJs. The other is, that the Concatenation operator, which implements union all, is not on the top of the execution plan. Instead, it is placed inside the NLJ. As a consequence, there will be repeated Table Scans of the inner table t_small.

So, let’s examine the execution plan of the equivalent query with the union all operator:

select a from t_large l where a = 1 
union all
select a from t_large l where id in ( select b from t_small s ) and a != 1
StmtText	TotalSubtreeCost	EstimateExecutions
select a from t_large l where id in ( select b from t_small s ) and a != 1	15.3989	NULL
  |--Concatenation	15.3989	1
       |--Table Scan(OBJECT:([master].[dbo].[t_large] AS [l]), WHERE:([master].[dbo].[t_large].[a] as [l].[a]=(1)))	3.94106	1
       |--Hash Match(Right Semi Join, HASH:([s].[b])=([l].[id]), RESIDUAL:([master].[dbo].[t_large].[id] as [l].[id]=[master].[dbo].[t_small].[b] as [s].[b]))	10.97784	1
            |--Table Scan(OBJECT:([master].[dbo].[t_small] AS [s]))	0.0032842	1
            |--Table Scan(OBJECT:([master].[dbo].[t_large] AS [l]), WHERE:([master].[dbo].[t_large].[a] as [l].[a]<>(1)))	3.94106	1

As expected, the union query uses the Hash Join (HJ), and the Concatenation operator is on the top of the exection plan. However, its estimated cost is significantly higher than the cost of the NLJ query, 15.3989 and 9.70106, respectively.
For this reason, it makes sense to cross-check the cost calculation with the IO and timing statistics for both queries:

NLS (or query) :

Table 't_small'. Scan count 1, logical reads 999999, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_large'. Scan count 1, logical reads 3832, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 3766 ms,  elapsed time = 5143 ms.
   
HJ (union all query):
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_large'. Scan count 2, logical reads 7664, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_small'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 328 ms,  elapsed time = 493 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

Curiously, despite its more expensive plan, the union all query with the HJ is by orders of magnitude more efficient than the original query with the or operator which does NLJ. What I mean by that is, considering the high number of logical reads (999999) on t_small, its estimated cost (9.70106) seems quite small when compared to the cost of the HJ (10.97784), which performed in total only 7665 logical reads. To get an idea, by how much the cost of the NLJ was underestimated, I’ll isolate the join in a separate query forcing the join order and NLJ with hints:

select a from t_large l 
  where id in ( select b from t_small s ) 
  option (loop join , force order)

StmtText	TotalSubtreeCost
select a from t_large l 	
  where id in ( select b from t_small s ) option (loop join , force order)	93.60426
  |--Nested Loops(Left Semi Join, WHERE:([master].[dbo].[t_large].[id] as [l].[id]=[master].[dbo].[t_small].[b] as [s].[b]))	93.60426
       |--Table Scan(OBJECT:([master].[dbo].[t_large] AS [l]))	3.94106
       |--Table Scan(OBJECT:([master].[dbo].[t_small] AS [s]))	80.7032 

As opposed to the original query with the or operator, the cost of the scan of the inner table in the NLJ in the query above – 80.7032 – reflects much better the high number of logical reads performed. In conclusion, if this cost had been caculated correctly in the original query, it is not unreasonable to assume that the query would have been transformed correctly.

Indexing

So, how can we overcome the wrong cost calculation?

As the wrong cost calculation applies to the Table Scan, let’s try to eliminate it by creating the following index:

CREATE NONCLUSTERED INDEX [idx_t_large_id]
ON [dbo].[t_large] ([id])
INCLUDE ([a])

Indeed, the transformation is done as expected for the original query with the or clause:


  |--Concatenation
       |--Index Scan(OBJECT:([master].[dbo].[t_large].[idx_t_large_id] AS [l]),  WHERE:([master].[dbo].[t_large].[a] as [l].[a]=(1)))
       |--Nested Loops(Inner Join, OUTER REFERENCES:([s].[b]))
            |--Sort(DISTINCT ORDER BY:([s].[b] ASC))
            |    |--Table Scan(OBJECT:([master].[dbo].[t_small] AS [s]))
            |--Index Seek(OBJECT:([master].[dbo].[t_large].[idx_t_large_id] AS [l]), SEEK:([l].[id]=[master].[dbo].[t_small].[b] as [s].[b]),  WHERE:([master].[dbo].[t_large].[a] as [l].[a]<>(1)) ORDERED FORWARD)

Also, the number of logical reads is identical for both the or and union all query:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_large'. Scan count 3, logical reads 2735, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_small'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Therefore, we can conclude, that the transformation is now successful.

Oracle

Now, let’s see how all of this works (or doesn’t work) in an Oracle database:

create user u identified by Temp_12345 ;
grant dba to u ;

connect u/Temp_12345

create table t_large ( id number , a number not null ) ;
create table t_small ( b number , c number ) ;

insert into t_large
SELECT level,level
FROM   dual
CONNECT BY level <= 1000000;

insert into t_small values (1,1) ;
insert into t_small values (2,2) ;

commit ;

CREATE INDEX idx_t_large_a ON t_large(id) ;

exec dbms_stats.gather_schema_stats('U');

First, I’ll inspect the execution plan and the statistics of the original query containing or:

select /*+ gather_plan_statistics */ a from t_large l 
  where id in ( select b from t_small s ) or a = 1  ;


------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name    | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |      1 |        |       |   280 (100)|          |      2 |00:00:00.01 |    5000K|
|*  1 |  FILTER            |         |      1 |        |       |            |          |      2 |00:00:00.01 |    5000K|
|   2 |   TABLE ACCESS FULL| T_LARGE |      1 |   1000K|  9765K|   280   (2)| 00:00:01 |   1000K|00:00:00.20 |     538 |
|*  3 |   TABLE ACCESS FULL| T_SMALL |    999K|      1 |     3 |     4   (0)| 00:00:01 |      1 |00:00:09.17 |    4999K|
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("A"=1 OR  IS NOT NULL))
   3 - filter("B"=:B1)

The presence of the filter operation indicates that the subquery was not unnested, which in turn means that it was repeatidly executed for each record in t_large. Moreover, the full table scan of t_small was not costed properly. What I mean by this is, there were 4999K logical reads on the table, but the cost was estimated to only 4.

At this point, we might ask ourselves, whether the absence of the query unnesting is the consequence of the wrong cost calculation, or is the Oracle optimizer inherently incapable of transforming disjunctive queries.

We can easily answer this question by verifying whether the execution plan changes after embedding the unnest hint in the subquery:

select /*+ gather_plan_statistics */ a from t_large l 
  where id in ( select /*+ unnest */ b from t_small s ) or a = 1  ;

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |       |       |   280 (100)|          |
|*  1 |  FILTER            |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_LARGE |  1000K|  9765K|   280   (2)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_SMALL |     1 |     3 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("A"=1 OR  IS NOT NULL))
   3 - filter("B"=:B1)

Unfortunately, the execution plan hasn’t improved, which leads us to the conclusion that the Oracle optimizer just doesn’t do unnesting of disjunctive subqueries. Actually, Jonathan Lewis and Mohamed Houry have already elaborated on this topic in their blog posts Subquery with OR and Tuning a disjunctive subquery, repectively.

Finally, let’s quantify the benefit of manually rewriting the query:

select /*+ gather_plan_statistics */ a from t_large l where a = 1 
union all
select a from t_large l where id in ( select b from t_small s ) and a != 1 ;

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |      1 |        |       |   286 (100)|          |      2 |00:00:00.01 |     548 |       |       |          |
|   1 |  UNION-ALL                    |               |      1 |        |       |            |          |      2 |00:00:00.01 |     548 |       |       |          |
|*  2 |   TABLE ACCESS FULL           | T_LARGE       |      1 |      1 |     5 |   279   (2)| 00:00:01 |      1 |00:00:00.01 |     538 |       |       |          |
|   3 |   NESTED LOOPS                |               |      1 |      2 |    26 |     7  (15)| 00:00:01 |      1 |00:00:00.01 |      10 |       |       |          |
|   4 |    NESTED LOOPS               |               |      1 |      2 |    26 |     7  (15)| 00:00:01 |      2 |00:00:00.01 |       9 |       |       |          |
|   5 |     SORT UNIQUE               |               |      1 |      2 |     6 |     4   (0)| 00:00:01 |      2 |00:00:00.01 |       5 |  2048 |  2048 | 2048  (0)|
|   6 |      TABLE ACCESS FULL        | T_SMALL       |      1 |      2 |     6 |     4   (0)| 00:00:01 |      2 |00:00:00.01 |       5 |       |       |          |
|*  7 |     INDEX RANGE SCAN          | IDX_T_LARGE_A |      2 |      1 |       |     1   (0)| 00:00:01 |      2 |00:00:00.01 |       4 |       |       |          |
|*  8 |    TABLE ACCESS BY INDEX ROWID| T_LARGE       |      2 |      1 |    10 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       1 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------

Looking into the buffers statistics, we can see that the number of logical reads slumped from 5 million to just 548. Sadly, Oracle misses a huge optimizing potential by not implementing the unnesting of disjunctive subqueries.

Conclusion

In summary, the unnesting of disjunctive subqueries and merging them into the main query can boost query performance.

Unfortunately, the algorithm hasn’t been implemented in the Oracle optimizer yet. Therefore, developers have to put effort in assisting the optimizer in deriving a good execution plan.

In contrast, the SQL Server optimizer is able to do this transformation all by itself. However, beware of some edge cases where, due to a wrong cost calculation, the optimization process doesn’t yield the expected result, which in turn may result in suboptimal plans. The good news is, that such plans can be easily recognized by Table Scans in Nested Loop Joins and therefore avoided by a good indexing strategy.

Updates

March 9, 2017 – 12.2

The behavior hasn’t changed in 12.2: the disjunctive subquery is still not being unnested.

August 24, 2020 – 19c

Jonathan Lewis shows us here that ORE works with disjunctive queries in 19c. The optimizer team implemented the unnesting of disjunctive subqueries. Keep in mind that every column of the table in the main query block which is referenced in OR must have an index. This means that you have to create an index on t_large.a in the test case above.

The feature isn’t implemented in 18c (tested on 18.5.0.0.190115).

Thanks for sharing

Nenad Noveljic

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.