Nested Loop Join Heuristics

Discarded Nested Loop Join

While testing optimizer transformations on disjunctive subqueries, I stumbled by chance upon boundary conditions under which the optimizer did not choose the plan with the lowest cost. More precisely, the optimizer discarded the cheaper Nested Loop join and opted for the Hash Join, which had a higher cost. In particular, the phenomenon occurs when the index on the join column of the inner table is missing and the small outer table contains only 0 or 1 records. Although these conditions are unlikely on a live system and would have only a minor impact on the query response time, I decided to find out the reason for this behaviour. Elaborating the conditions under which the optimizer changes its behavioural patterns almost always provides interesting insights into its internal functioning. I tested the behaviour on SQL Server and Oracle.

SQL Server

First, I’ll create one big and one small table:

SELECT TOP (1000000) 
  id = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id])),
  a = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO t_large
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

create table t_small (b integer,c integer) ;

insert into t_small values (1,1) ;

The query to populate the large table with numbers is used by courtesy of Aaron Bertrand.
Next, I’ll execute the query which joins both of the tables and observe the execution plan. The following output is produced by running the queries with set showplan_all on. For the sake of simplicity, I removed the irrelevant information, leaving only the operation and its cost.
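For reference, the directive is switched on like this; with showplan_all on, the statements are compiled but not executed, and the estimated plan is returned instead:

SET SHOWPLAN_ALL ON;
GO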

select a from t_large l join t_small s on l.id = s.b ;

select a from t_large l join t_small s on l.id = s.b 	10.49783	
  |--Hash Match(Inner Join, HASH:([s].[b])=([l].[id]), RESIDUAL:([master].[dbo].[t_small].[b] as [s].[b]=[master].[dbo].[t_large].[id] as [l].[id]))	10.49783	
       |--Table Scan(OBJECT:([master].[dbo].[t_small] AS [s]))	0.0032831	
       |--Table Scan(OBJECT:([master].[dbo].[t_large] AS [l]))	3.94106	

The optimizer has chosen the Hash Join and estimated the total cost at 10.49783.

Let’s take a look at the plan cost of the Nested Loop join, which can be enforced by a hint:

select a from t_large l join t_small s on l.id = s.b option (loop join);

select a from t_large l join t_small s on l.id = s.b option (loop join);	8.424343
  |--Nested Loops(Inner Join, WHERE:([master].[dbo].[t_small].[b] as [s].[b]=[master].[dbo].[t_large].[id] as [l].[id]))	8.424343	
       |--Table Scan(OBJECT:([master].[dbo].[t_small] AS [s]))	0.0032831	
       |--Table Scan(OBJECT:([master].[dbo].[t_large] AS [l]))	3.94106	

In this case, as expected, the cost of the Nested Loop join is slightly lower than that of the Hash Join; the Hash Join, namely, has the overhead of building the hash table. This can be confirmed by the runtime statistics, which were switched on by the following directives:

SET STATISTICS IO ON
SET STATISTICS TIME ON

Nested Loop:

Table 't_large'. Scan count 1, logical reads 3832, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_small'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 125 ms,  elapsed time = 119 ms.

Hash Join:

Table 't_large'. Scan count 1, logical reads 3832, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_small'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 172 ms,  elapsed time = 159 ms.

We can make two observations. One is that the inner table t_large was scanned only once in both cases. The other is that the Hash Join spent slightly more CPU time on, as already mentioned, building the hash table. In conclusion, the calculated costs correspond well to reality, but in the end the Nested Loop join was not considered unless it was enforced by the hint.

So the question is: why did the optimizer ignore the Nested Loop join?

Heuristics

To answer this question, we’ll trace the transformation rules which were applied by the optimizer during the query optimization. This can be done by comparing the content of the sys.dm_exec_query_transformation_stats DMV just before and immediately after the query compilation. The query transformations for which the succeeded column increased have been considered as alternatives. The exact technique for getting this information for a particular query is described in the book Inside the SQL Server Query Optimizer by Benjamin Nevarez. I publish the queries by courtesy of the author:

GO
-- Warm-up run: create and drop the snapshot tables once, so that the
-- SELECT ... INTO statements are already compiled and their own
-- transformations don't skew the counters in the measured run.
SELECT *
INTO before_query_transformation_stats
FROM sys.dm_exec_query_transformation_stats
GO
SELECT *
INTO after_query_transformation_stats
FROM sys.dm_exec_query_transformation_stats
GO
DROP TABLE after_query_transformation_stats
DROP TABLE before_query_transformation_stats
GO
-- Measured run: snapshot the counters, compile the query under test,
-- then snapshot again.
SELECT *
INTO before_query_transformation_stats
FROM sys.dm_exec_query_transformation_stats
GO
select a from t_large l join t_small s on l.id = s.b 
OPTION (RECOMPILE)
GO
SELECT *
INTO after_query_transformation_stats
FROM sys.dm_exec_query_transformation_stats
GO
-- Transformations whose counters changed were considered for this query.
SELECT a.name, (a.promised - b.promised) as promised
FROM before_query_transformation_stats b
JOIN after_query_transformation_stats a
ON b.name = a.name
WHERE b.succeeded <> a.succeeded
DROP TABLE before_query_transformation_stats
DROP TABLE after_query_transformation_stats

Here are the joins that are considered when we don’t specify the hint:

name	promised
...
JNtoHS	2
JNtoSM	2
...
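Incidentally, if the tracing procedure is repeated with the hinted query below (a hypothetical variation of the traced statement, which I haven't traced here), I would expect JNtoNL to appear among the succeeded transformations as well:

select a from t_large l join t_small s on l.id = s.b
OPTION (LOOP JOIN, RECOMPILE)
GO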

So, in the unhinted output only JNtoSM (Join to Sort Merge Join) and JNtoHS (Join to Hash) have been considered, whereas JNtoNL (Join to Nested Loop) was discarded. Thus, we can deduce that some heuristic kicked in. I couldn’t find any explanation for it, but we can guess that the heuristic is there to pre-empt a disastrous scenario in which we end up with a lot of scans of the large inner table as soon as there is more than one record in the small outer table. To demonstrate this point, I’ll add just one more record to the small table and collect the IO statistics:

insert into t_small values (2,2) ;

Prior to rerunning the query, I’ll disable the Table Spool (Lazy Spool) operation, which would otherwise be introduced and would change the execution plan:

DBCC RULEOFF('BuildSpool')
SET STATISTICS IO ON
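DBCC RULEOFF is an undocumented command, so this should only be done on a test system; after the experiment, the rule can be switched back on with its counterpart:

DBCC RULEON('BuildSpool')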

Nested Loop:

Table 't_large'. Scan count 1, logical reads 7664, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_small'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Hash Join:

Table 't_large'. Scan count 1, logical reads 3832, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_small'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

As expected, the large inner table in the Nested Loop join has to be scanned once more for the additional row in the outer table, doubling the logical reads from 3832 to 7664, while the number of logical reads remained the same for the Hash Join. In conclusion, the number of logical reads in the Nested Loop join grows proportionally to the number of records in the outer table. Therefore, there is indeed a compelling reason for a heuristic rule that avoids the Nested Loop join when the index on the join column of the inner table is missing.
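Consistently with this explanation, the heuristic should no longer be needed once the inner table has an index on the join column. Here is a minimal sketch for verifying that, with the hypothetical index name ix_t_large_id; I would expect the optimizer to consider a Nested Loop join with an index seek on the inner side, but I leave that to verify:

-- hypothetical index on the inner table's join column
create index ix_t_large_id on t_large (id);

select a from t_large l join t_small s on l.id = s.b ;

-- clean up to restore the original test conditions
drop index ix_t_large_id on t_large;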

Oracle

I also checked how the Oracle optimizer behaves under the same conditions. Here’s the test case:

create user u identified by Temp_1234 ;

grant dba to u ;

connect u/Temp_1234

create table t_large ( id number , a number  ) ;
create table t_small ( b number , c number ) ;

insert into t_large
SELECT level,level
FROM   dual
CONNECT BY level <= 1000000;

insert into t_small values (1,1) ;

commit ;

exec dbms_stats.gather_schema_stats('U');

alter session set tracefile_identifier = 'NL' ;
ALTER SESSION SET EVENTS='10053 trace name context forever, level 1' ;
explain plan for select a from t_large l join t_small s on l.id = s.b ;
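The location of the generated trace file can be obtained from the same session, for example via v$diag_info:

select value from v$diag_info where name = 'Default Trace File' ;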

Unlike in SQL Server, in Oracle we don’t need to run the queries with hints to obtain the costing information for the different join methods; we can extract all the information we need from the optimizer 10053 trace. Just like SQL Server, Oracle opted for the Hash Join, although the Nested Loop was cheaper:

--------------------------------------+-----------------------------------+
| Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
--------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |         |       |       |   284 |           |
| 1   |  HASH JOIN          |         |     1 |    13 |   284 |  00:00:06 |
| 2   |   TABLE ACCESS FULL | T_SMALL |     1 |     3 |     4 |  00:00:01 |
| 3   |   TABLE ACCESS FULL | T_LARGE |  977K | 9766K |   278 |  00:00:06 |
--------------------------------------+-----------------------------------+

Best NL cost: 282.775582
...
HA Join
  HA cost: 283.618402
...
Best:: JoinMethod: Hash

If you run the test case on a 12c database, you’ll find a hint in the 10053 trace that there is indeed some Nested Loop join heuristic involved in the optimization process:

DP: skipping adaptive plan due to NLJ heuristics

Conclusion

Both of the observed optimizers employ heuristics to pre-empt expensive repeated full scans of the inner table: the penalty of scanning the inner table once per outer row, when there are multiple records in the outer table, far outweighs the overhead that a Hash Join has over a Nested Loop join when the outer table contains just 0 or 1 records.


Nenad Noveljic
