Extracting Join Order from Optimizer Trace

In my previous blog post, I introduced opt_qb.awk, a script for extracting the most relevant information for a given query block (QB) from optimizer trace:

opt_qb.awk -v qb=QB_MAIN optimizer_trace_file
---------------
24 Registered qb: QB_MAIN 0xc78a0038 (HINT QB_MAIN)
200183 Final cost for query block QB_MAIN (#1) - All Rows Plan:
200184   Best join order: 55
200185   Cost: 9827.131413  Degree: 1  Card: 126870.000000  Bytes: 273404850.000000

The output above contains, among others, the best join order number, though, no any further details about it. The calculations for different join orders were done much before, and as its name implies, the join order section contains the order in which the tables are joined, but also the intermediate cost calculation results for different join types. It’s undoubtedly very important information for understanding the optimizer decisions, but as the whole section commonly consists of thousands of lines, which, in addition, are placed much before the QB summary, it might take some time to find and extract the information of interest.

For this reason, I wrote opt_jo.awk, a script for displaying the cost calculation results summary for a given join order.

Let’s use it now for drilling down. opt_qb.awk showed that the optimal join order is 55, so I’m going to specify it as the input parameter to opt_jo.awk:

opt_jo.awk -v qb=QB_MAIN -v jo=55 optimizer_trace_file

#################
23924 Query Block QB_MAIN (#1)
=================
37199 Join order[55]:  TAB_1#1  TAB_2#9  TAB_3#8  TAB_4#5  TAB_5#4  TAB_6#7  TAB_7#0  TAB_8#2  TAB_9#3  TAB_10#6
-----------------
37202 Now joining: TAB_6#7
37274   Best NL cost: 132871.483354
37302   SM cost: 63265.650482 
37322   HA cost: 8622.012333 swapped 
37325 Best:: JoinMethod: Hash
37326        Cost: 8622.012333  Degree: 1  Resp: 8622.012333  Card: 126869.775717 Bytes: 
-----------------
37329 Now joining: TAB_7#0
37340   Best NL cost: 79989834.996156
37368   SM cost: 64302.055187 
37388   HA cost: 9252.984437 swapped 
37391 Best:: JoinMethod: Hash
37392        Cost: 9252.984437  Degree: 1  Resp: 9252.984437  Card: 126869.775717 Bytes: 
-----------------
37395 Now joining: TAB_8#2
37438   Best NL cost: 10001.244263
37466   SM cost: 64475.729580 
37486   HA cost: 9259.545341 swapped 
37489 Best:: JoinMethod: Hash
37490        Cost: 9259.545341  Degree: 1  Resp: 9259.545341  Card: 126869.775717 Bytes: 
-----------------
37493 Now joining: TAB_9#3
37504   Best NL cost: 48435094.585054
37532   SM cost: 66031.786006 
37552   HA cost: 9641.804044 swapped 
37555 Best:: JoinMethod: Hash
37556        Cost: 9641.804044  Degree: 1  Resp: 9641.804044  Card: 126869.775717 Bytes: 
-----------------
37559 Now joining: TAB_10#6
37636   Best NL cost: 136560.628579
37664   SM cost: 66305.446946 
37684   HA cost: 9746.850757 swapped 
37687 Best:: JoinMethod: HashSemi
37688        Cost: 9746.850757  Degree: 1  Resp: 9746.850757  Card: 126869.775717 Bytes: 

By looking at the output above, we can easily compare the cost of each join type and see why, for example, NL was discarded.

At this point, you might wonder where the calculation for the first five tables is and why the optimizer started the calculation for TAB_6, which is only the 6th table in the join order.

That’s a consequence of the optimization employed in the cost calculation. The calculation for the first five tables had already been done in the join order 39, so the optimizer stored its value for the future calculations.

opt_jo.awk -v qb=QB_MAIN -v jo=39 optimizer_trace_file
 
#################
23924 Query Block QB_MAIN (#1)
=================
32775 Join order[39]:  TAB_1#1  TAB_2#9  TAB_3#8  TAB_4#5  TAB_5#4  TAB_7#0  TAB_10#6  TAB_6#7  TAB_8#2  TAB_9#3
-----------------
32778 Now joining: TAB_5#4
32789   Best NL cost: 382588534.321321
32817   SM cost: 12235.764853 
32829   HA cost: 5880.070943  
32832 Best:: JoinMethod: Hash
32833        Cost: 5880.070943  Degree: 1  Resp: 5880.070943  Card: 126917.972455 Bytes: 
-----------------
32836 Now joining: TAB_7#0
32847   Best NL cost: 80017353.149534
32875   SM cost: 60860.650906 
32895   HA cost: 6511.043248 swapped 
32898 Best:: JoinMethod: Hash
32899        Cost: 6511.043248  Degree: 1  Resp: 6511.043248  Card: 126917.972455 Bytes: 
-----------------
32902 Now joining: TAB_10#6
32979   Best NL cost: 133477.886254
33007   SM cost: 61109.315695 
33027   HA cost: 6616.090160 swapped 
33030 Best:: JoinMethod: HashSemi
33031        Cost: 6616.090160  Degree: 1  Resp: 6616.090160  Card: 126917.972455 Bytes: 
-----------------
33034 Now joining: TAB_6#7
33106   Best NL cost: 133607.502571
33134   SM cost: 64650.100407 
33154   HA cost: 9358.031550 swapped 
33157 Best:: JoinMethod: Hash
33158        Cost: 9358.031550  Degree: 1  Resp: 9358.031550  Card: 126869.775717 Bytes: 
-----------------
33161 Now joining: TAB_8#2
33204   Best NL cost: 10106.291375
33232   SM cost: 65085.111004 
33252   HA cost: 9364.592454 swapped 
33255 Best:: JoinMethod: Hash
33256        Cost: 9364.592454  Degree: 1  Resp: 9364.592454  Card: 126869.775717 Bytes: 
-----------------
33259 Now joining: TAB_9#3
33270   Best NL cost: 48435199.632166
33298   SM cost: 66641.167430 
33318   HA cost: 9746.851157 swapped 
33321 Best:: JoinMethod: Hash
33322        Cost: 9746.851157  Degree: 1  Resp: 9746.851157  Card: 126869.775717 Bytes: 

I usually use opt_jo.awk as I did in the example above. I mean I first get the optimal join order number for the QB in question and then use opt_jo.awk for extracting more detailed information for the best join order. But you can also invoke it without any parameters to get the details for every tried permutation and the QB:

opt_jo [-v qb='QB_NAME'] [-v jo=join_order] optimizer_trace_file

As you have seen, it also prints the line numbers, so you can quickly position yourself within the trace file when looking for more information.

By the way, I also published some other articles about programmatically parsing optimizer trace files:

Thanks for sharing

Nenad Noveljic

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.