Index Organized Table vs. Clustered Index

Index Organized Tables (IOTs) and Clustered Indexes are data structures in Oracle and SQL Server, respectively, which, unlike heap tables, store non-key data together with the index key in the B-tree structure. As a consequence, no additional lookup is needed for retrieval when data is queried based on the index key.

In this blog post I’ll compare the efficiency in terms of consistent gets/logical reads of two database products. In particular, a case with a secondary/nonclustered index will be considered. The tests were performed on the currently latest releases of both products – Oracle 12.1 and SQL Server 2016.

IOT

First, I’ll create a table with three columns, a primary key and a secondary index:

drop table t_iot ;

create table t_iot ( 
  n1 number , n2 number , n3 number , constraint pk_t_iot primary key (n1) 
) organization index ;

insert into t_iot 
  SELECT level,level,level
  FROM   dual
  CONNECT BY level <= 1000000 ; create index ix_t_iot_n2 on t_iot (n2) ; 

exec dbms_stats.gather_table_stats(user,'T_IOT',cascade => true ) ;

Then, I’ll execute different queries to measure the number of consistent gets for different access paths:

  • primary key
  • secondary index without a lookup (the selected column is contained in the secondary index)
  • secondary index with a lookup (the selected column is not contained in the secondary index, so it must be retrieved from the table)
set autotrace traceonly explain statistics ;
SQL> set autotrace traceonly explain statistics ;
SQL> select n1 from t_iot where n1=1 ;
                                                                              
------------------------------------------------------------------------------  
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |  
------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT  |          |     1 |     5 |     1   (0)| 00:00:01 |  
|*  1 |  INDEX UNIQUE SCAN| PK_T_IOT |     1 |     5 |     1   (0)| 00:00:01 |  
------------------------------------------------------------------------------  
                                                                                
Statistics
----------------------------------------------------------                      
          2  consistent gets                                                    

		  
SQL> select n2 from t_iot where n2=1 ;
                                                                      
--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |     1 |     5 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IX_T_IOT_N2 |     1 |     5 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------
                                                                                
Statistics
----------------------------------------------------------                      
          3  consistent gets                                                    

SQL> select * from t_iot where n2=1 ;

---------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |             |     1 |    15 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| PK_T_IOT    |     1 |    15 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| IX_T_IOT_N2 |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------
                                                                                
Statistics
----------------------------------------------------------                      
          4  consistent gets                                                    
		  

The following table summarizes the the number of consistent gets for different access paths:

access pathconsistent gets
primary key2
secondary index without lookup3
secondary index with lookup4

Since the lookup after the access to the secondary index caused only one consistent get and the retrieval over primary key took two consistent gets, we can conclude that the lookup was performed over the rowid instead of the primary key.

The reason for this efficiency is that secondary indexes include a physical guess, which is the physical rowid of the index entry when it was first made. If the physical guess is not stale, the row will be retrieved via rowid. The primary key needs to be scanned otherwise. Richard Foote explained this behaviour in more detail in his blog post IOT Secondary Indexes – The Logical ROWID Guess Component Part I (Lucky) .

Oracle developers implemented this cunning idea to avoid primary key scans.

Clustered Index

Let’s see how SQL Server behaves in such a situation. First, I’ll create a similar structure.

drop table t_clustered ;

create table t_clustered (
n1 integer , n2 integer , n3 integer ,
CONSTRAINT PK_t_clustered PRIMARY KEY CLUSTERED (
n1 ASC
)) ;

create index ix_t_clustered_n2 on t_clustered(n2) ;

insert into t_clustered select top(1000000)
CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id])),
CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id])),
CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

Second, I’ll run the same queries as with Oracle:

set statistics io on

select n1 from t_clustered where n1 = 1 ;

|--Clustered Index Seek(OBJECT:([master].[dbo].[t_clustered].[PK_t_clustered]), SEEK:([master].[dbo].[t_clustered].[n1]=CONVERT_IMPLICIT(int,[@1],0)) ORDERED FORWARD)
Table 't_clustered'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

select n2 from t_clustered where n2 = 1 ;

|--Index Seek(OBJECT:([master].[dbo].[t_clustered].[ix_t_clustered_n2]), SEEK:([master].[dbo].[t_clustered].[n2]=CONVERT_IMPLICIT(int,[@1],0)) ORDERED FORWARD)
Table 't_clustered'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

select * from t_clustered where n2 = 1 ;

|--Nested Loops(Inner Join, OUTER REFERENCES:([master].[dbo].[t_clustered].[n1]))
  |--Index Seek(OBJECT:([master].[dbo].[t_clustered].[ix_t_clustered_n2]), SEEK:([master].[dbo].[t_clustered].[n2]=(1)) ORDERED FORWARD)
  |--Clustered Index Seek(OBJECT:([master].[dbo].[t_clustered].[PK_t_clustered]), SEEK:([master].[dbo].[t_clustered].[n1]=[master].[dbo].[t_clustered].[n1]) LOOKUP ORDERED FORWARD)
Table 't_clustered'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Finally, the table below shows the number of logical reads for different access paths. The nonclustered index in SQL Server corresponds with the secondary index in Oracle.

access pathlogical reads
clustered index (primary key)3
nonclustered index without lookup3
nonclustered index with lookup6

Unlike Oracle, SQL Server does a lookup based on the primary key, so clustered index seek needs to be performed in order to retrieve data. In this case three logical reads were done instead of only one.

Conclusion

In conclusion, SQL Server needs to traverse the clustered index to retrieve a row. In contrast, Oracle database stores initial physical rowids in the secondary index. By doing so, in the best case scenario only a single logical read is additionally needed to retrieve the row from the table.

Thanks for sharing

Nenad Noveljic

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.