Correlated Subqueries in the SELECT Clause

In this blog post, I’ll be analyzing how Oracle 18c and SQL Server 2017 optimize queries with a correlated subquery in the SELECT clause, such as the following:

select /*+ qb_name(QB_MAIN) */
  case when exists (
    select /*+ qb_name(QB_EXISTS) */ 1
      from t_1k
      where t_1k.n1 = t_100k.n1
  ) then 1 else 0 end
  from t_100k ;

First, we’re going to explore the limitations of the Oracle optimizer.

Oracle

Tables

The tables t_1k and t_100k are created as follows:

create table t_1k ( n1 integer ) ;

create table t_100k ( n1 integer ) ;

insert into t_1k
  select level
    from dual
    connect by level <= 1000;

insert into t_100k
  select level 
    from dual
    connect by level <= 100000;
    
commit ;

begin
  dbms_stats.gather_table_stats ( null, 'T_1K') ;
  dbms_stats.gather_table_stats ( null, 'T_100K') ;
end ;
/

The larger table t_100k is used in the main query block, the smaller t_1k in the subquery.

Missing Join

What immediately stands out is that no join operation is performed.

-----------------------------------------------------------------------------
| Id  | Operation         | Name   | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |        |       |   356K(100)|          |
|*  1 |  TABLE ACCESS FULL| T_1K   |      1 |     4 |     4   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T_100K |    100K|   488K|    33   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Therefore, it comes as no surprise that both query blocks are optimized independently (they're both marked as OUTLINE_LEAF in the outline section):

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - QB_EXISTS / T_1K@QB_EXISTS
   2 - QB_MAIN   / T_100K@QB_MAIN
 
Outline Data
-------------
 
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"QB_EXISTS")
      OUTLINE_LEAF(@"QB_MAIN")
      FULL(@"QB_MAIN" "T_100K"@"QB_MAIN")
      FULL(@"QB_EXISTS" "T_1K"@"QB_EXISTS")
      END_OUTLINE_DATA
  */

Finally, the predicate and column projection sections tell us that the driving table is T_100K.

Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("T_1K"."N1"=:B1)
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
   2 - "T_100K"."N1"[NUMBER,22]

So, T_1K.N1 is probed for every single row retrieved from T_100K. That's why the number of rows in the driving row source is the main contributor to the total cost, especially for large driving row sources:

total_cost ≈ driving_row_source_cost + k * driving_rows * single_subquery_access_cost
356,000 ≈ 33 + k * 100,000 * 4

By the way, k is around 0.5 for smaller tables and grows toward 1 with table size, presumably to factor in the caching effect.
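Plugging the figures from the plan into the formula above gives a rough value of k for this case. This is only a back-of-the-envelope check against the reported costs, not an exact model of the Oracle costing algorithm:

```python
# Approximate figures taken from the execution plan above.
driving_row_source_cost = 33         # full scan of T_100K
driving_rows = 100_000               # estimated rows in T_100K
single_subquery_access_cost = 4      # full scan of T_1K per probe
total_cost = 356_000                 # cost reported for the whole statement

# Solve total ≈ driving_cost + k * rows * subquery_cost for k.
k = (total_cost - driving_row_source_cost) / (driving_rows * single_subquery_access_cost)
print(round(k, 2))  # ~0.89, consistent with k approaching 1 for large tables
```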

Either way, the query will perform poorly for large driving row sources: this one indeed executed 500,323 logical reads.

Further, our example query isn't an exception; other correlated subqueries suffer from the same problem. Take, for example, a scalar subquery:

select /*+ qb_name(QB_MAIN) */
  (
    select /*+ qb_name(QB_EXISTS) */ n1
      from t_1k
      where t_1k.n1 = t_100k.n1
  )
  from t_100k ;
  
----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |   100K|   488K|   371K  (1)| 00:00:59 |
|*  1 |  TABLE ACCESS FULL| T_1K   |     1 |     4 |     4   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T_100K |   100K|   488K|    33   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("T_1K"."N1"=:B1)

Notice that the first operation executed in this plan isn't the first child, as is usually the case, but the second one.

In conclusion, problematic correlated subqueries have a specific shape with the following characteristics:

  • two children of the SELECT STATEMENT operation without a join,
  • a filter predicate on the first child,
  • a high cost of the parent SELECT STATEMENT driven by the high cardinality of the second child.

I recommend memorizing this pattern, because it can help you immediately identify a problematic correlated subquery even in larger execution plans where the subquery is buried under many layers of cascaded views.

But, is there a better way to execute the same query?

Outer Join

The answer is yes, but we have to manually rewrite it to use a join. For instance, the query

select /*+ qb_name(QB_MAIN) */
  case when exists (
    select /*+ qb_name(QB_EXISTS) */ 1
      from t_1k
      where t_1k.n1 = t_100k.n1
  ) then 1 else 0 end
  from t_100k ;

can be rewritten as:

select  
  case when v.n1 is not null then 1 else 0 end
  from t_100k, ( select distinct n1 from t_1k ) v
  where t_100k.n1 = v.n1(+) ;

--------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |   100K|  1757K|    39   (6)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT OUTER|        |   100K|  1757K|    39   (6)| 00:00:01 |
|   2 |   VIEW                |        |  1000 | 13000 |     5  (20)| 00:00:01 |
|   3 |    HASH UNIQUE        |        |  1000 |  4000 |     5  (20)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | T_1K   |  1000 |  4000 |     4   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL   | T_100K |   100K|   488K|    33   (0)| 00:00:01 |
--------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("T_100K"."N1"="V"."N1"(+))

As you can see, the plan has a significantly lower cost than the original: 39 versus 356,000. The total cost comprises retrieving both row sources and combining them with a hash join.

The modified query performed 161 logical reads, orders of magnitude fewer than the 500,323 performed by the original query.
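The correctness of the rewrite can be sanity-checked on any SQL engine. Below is a minimal sketch using SQLite via Python's sqlite3 module, with table sizes scaled down and SQLite's LEFT JOIN syntax standing in for Oracle's (+) notation; it only demonstrates that the two forms return the same results, not the Oracle cost behavior:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table t_1k (n1 integer)")
cur.execute("create table t_100k (n1 integer)")
# Scaled-down data: 10 and 100 rows instead of 1,000 and 100,000.
cur.executemany("insert into t_1k values (?)", [(i,) for i in range(1, 11)])
cur.executemany("insert into t_100k values (?)", [(i,) for i in range(1, 101)])

# Original form: correlated EXISTS subquery in the SELECT clause.
original = cur.execute("""
    select case when exists (select 1 from t_1k where t_1k.n1 = t_100k.n1)
           then 1 else 0 end
      from t_100k
      order by t_100k.n1
""").fetchall()

# Rewritten form: outer join to a deduplicated inline view
# (LEFT JOIN replaces Oracle's (+) outer-join syntax).
rewritten = cur.execute("""
    select case when v.n1 is not null then 1 else 0 end
      from t_100k left join (select distinct n1 from t_1k) v
        on t_100k.n1 = v.n1
      order by t_100k.n1
""").fetchall()

print(original == rewritten)  # True: both forms return identical results
```

The DISTINCT in the inline view is essential: without it, duplicates in t_1k would multiply the joined rows and break the equivalence with EXISTS.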

We’ve seen so far how the query can be rewritten to execute more efficiently. Now it’s time to check how SQL Server optimizes the same original query.

SQL Server

Below is the equivalent test case for SQL Server:

drop table t_1k ;

drop table t_100k ;

create table t_1k (n1 integer) ;

create table t_100k (n1 integer) ;

insert into t_1k 
SELECT TOP (1000) 
  n1 = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

insert into t_100k
SELECT TOP (100000) 
  n1 = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

set statistics io on 

select
  case when exists (
    select 1
      from t_1k
      where t_1k.n1 = t_100k.n1
  ) then 1 else 0 end
  from t_100k ;

That’s terrific! Unlike Oracle, SQL Server transformed the query to use a merge semi join out of the box:

StmtText	StmtId	NodeId	Parent	EstimateRows	TotalSubtreeCost	EstimateExecutions
  |--Compute Scalar(DEFINE:([Expr1007]=CASE WHEN [Expr1008] THEN (1) ELSE (0) END))	1	2	1	90000	8.64109	1
       |--Merge Join(Left Semi Join, MANY-TO-MANY MERGE:([DERITRADE].[dbo].[t_100k].[n1])=([DERITRADE].[dbo].[t_1k].[n1]), RESIDUAL:([DERITRADE].[dbo].[t_1k].[n1]=[DERITRADE].[dbo].[t_100k].[n1]))	1	3	2	90000	8.632091	1
            |--Sort(ORDER BY:([DERITRADE].[dbo].[t_100k].[n1] ASC))	1	4	3	100000	7.995876	1
            |    |--Table Scan(OBJECT:([DERITRADE].[dbo].[t_100k]))	1	5	4	100000	0.3606894	1
            |--Sort(ORDER BY:([DERITRADE].[dbo].[t_1k].[n1] ASC))	1	6	3	1000	0.03351212	1
                 |--Table Scan(OBJECT:([DERITRADE].[dbo].[t_1k]))	1	7	6	1000	0.006604222	1

The execution resulted in only 339 logical reads without having to manually rewrite the query:

(100000 rows affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_1k'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_100k'. Scan count 1, logical reads 335, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Summary

  • Unfortunately, Oracle doesn’t handle queries with correlated subqueries in the SELECT clause optimally.
  • Such queries can be manually rewritten to use a join instead.
  • Suboptimal correlated subqueries have a specific plan shape that can be easily spotted even in larger execution plans.
  • Unlike Oracle, SQL Server automatically transforms such queries to use more efficient joins.

Nenad Noveljic
