In my previous blog post, I introduced opt_qb.awk, a script for extracting the most relevant information for a given query block (QB) from optimizer trace:
opt_qb.awk -v qb=QB_MAIN optimizer_trace_file
---------------
24 Registered qb: QB_MAIN 0xc78a0038 (HINT QB_MAIN)
200183 Final cost for query block QB_MAIN (#1) - All Rows Plan:
200184 Best join order: 55
200185 Cost: 9827.131413 Degree: 1 Card: 126870.000000 Bytes: 273404850.000000
The output above contains, among others, the best join order number, though, no any further details about it. The calculations for different join orders were done much before, and as its name implies, the join order section contains the order in which the tables are joined, but also the intermediate cost calculation results for different join types. It’s undoubtedly very important information for understanding the optimizer decisions, but as the whole section commonly consists of thousands of lines, which, in addition, are placed much before the QB summary, it might take some time to find and extract the information of interest.
For this reason, I wrote opt_jo.awk, a script for displaying the cost calculation results summary for a given join order.
Let’s use it now for drilling down. opt_qb.awk showed that the optimal join order is 55, so I’m going to specify it as the input parameter to opt_jo.awk:
opt_jo.awk -v qb=QB_MAIN -v jo=55 optimizer_trace_file
#################
23924 Query Block QB_MAIN (#1)
=================
37199 Join order[55]: TAB_1#1 TAB_2#9 TAB_3#8 TAB_4#5 TAB_5#4 TAB_6#7 TAB_7#0 TAB_8#2 TAB_9#3 TAB_10#6
-----------------
37202 Now joining: TAB_6#7
37274 Best NL cost: 132871.483354
37302 SM cost: 63265.650482
37322 HA cost: 8622.012333 swapped
37325 Best:: JoinMethod: Hash
37326 Cost: 8622.012333 Degree: 1 Resp: 8622.012333 Card: 126869.775717 Bytes:
-----------------
37329 Now joining: TAB_7#0
37340 Best NL cost: 79989834.996156
37368 SM cost: 64302.055187
37388 HA cost: 9252.984437 swapped
37391 Best:: JoinMethod: Hash
37392 Cost: 9252.984437 Degree: 1 Resp: 9252.984437 Card: 126869.775717 Bytes:
-----------------
37395 Now joining: TAB_8#2
37438 Best NL cost: 10001.244263
37466 SM cost: 64475.729580
37486 HA cost: 9259.545341 swapped
37489 Best:: JoinMethod: Hash
37490 Cost: 9259.545341 Degree: 1 Resp: 9259.545341 Card: 126869.775717 Bytes:
-----------------
37493 Now joining: TAB_9#3
37504 Best NL cost: 48435094.585054
37532 SM cost: 66031.786006
37552 HA cost: 9641.804044 swapped
37555 Best:: JoinMethod: Hash
37556 Cost: 9641.804044 Degree: 1 Resp: 9641.804044 Card: 126869.775717 Bytes:
-----------------
37559 Now joining: TAB_10#6
37636 Best NL cost: 136560.628579
37664 SM cost: 66305.446946
37684 HA cost: 9746.850757 swapped
37687 Best:: JoinMethod: HashSemi
37688 Cost: 9746.850757 Degree: 1 Resp: 9746.850757 Card: 126869.775717 Bytes:
By looking at the output above, we can easily compare the cost of each join type and see why, for example, NL was discarded.
At this point, you might wonder where the calculation for the first five tables is and why the optimizer started the calculation for TAB_6, which is only the 6th table in the join order.
That’s a consequence of the optimization employed in the cost calculation. The calculation for the first five tables had already been done in the join order 39, so the optimizer stored its value for the future calculations.
opt_jo.awk -v qb=QB_MAIN -v jo=39 optimizer_trace_file
#################
23924 Query Block QB_MAIN (#1)
=================
32775 Join order[39]: TAB_1#1 TAB_2#9 TAB_3#8 TAB_4#5 TAB_5#4 TAB_7#0 TAB_10#6 TAB_6#7 TAB_8#2 TAB_9#3
-----------------
32778 Now joining: TAB_5#4
32789 Best NL cost: 382588534.321321
32817 SM cost: 12235.764853
32829 HA cost: 5880.070943
32832 Best:: JoinMethod: Hash
32833 Cost: 5880.070943 Degree: 1 Resp: 5880.070943 Card: 126917.972455 Bytes:
-----------------
32836 Now joining: TAB_7#0
32847 Best NL cost: 80017353.149534
32875 SM cost: 60860.650906
32895 HA cost: 6511.043248 swapped
32898 Best:: JoinMethod: Hash
32899 Cost: 6511.043248 Degree: 1 Resp: 6511.043248 Card: 126917.972455 Bytes:
-----------------
32902 Now joining: TAB_10#6
32979 Best NL cost: 133477.886254
33007 SM cost: 61109.315695
33027 HA cost: 6616.090160 swapped
33030 Best:: JoinMethod: HashSemi
33031 Cost: 6616.090160 Degree: 1 Resp: 6616.090160 Card: 126917.972455 Bytes:
-----------------
33034 Now joining: TAB_6#7
33106 Best NL cost: 133607.502571
33134 SM cost: 64650.100407
33154 HA cost: 9358.031550 swapped
33157 Best:: JoinMethod: Hash
33158 Cost: 9358.031550 Degree: 1 Resp: 9358.031550 Card: 126869.775717 Bytes:
-----------------
33161 Now joining: TAB_8#2
33204 Best NL cost: 10106.291375
33232 SM cost: 65085.111004
33252 HA cost: 9364.592454 swapped
33255 Best:: JoinMethod: Hash
33256 Cost: 9364.592454 Degree: 1 Resp: 9364.592454 Card: 126869.775717 Bytes:
-----------------
33259 Now joining: TAB_9#3
33270 Best NL cost: 48435199.632166
33298 SM cost: 66641.167430
33318 HA cost: 9746.851157 swapped
33321 Best:: JoinMethod: Hash
33322 Cost: 9746.851157 Degree: 1 Resp: 9746.851157 Card: 126869.775717 Bytes:
I usually use opt_jo.awk as I did in the example above. I mean I first get the optimal join order number for the QB in question and then use opt_jo.awk for extracting more detailed information for the best join order. But you can also invoke it without any parameters to get the details for every tried permutation and the QB:
opt_jo [-v qb='QB_NAME'] [-v jo=join_order] optimizer_trace_file
As you have seen, it also prints the line numbers, so you can quickly position yourself within the trace file when looking for more information.
By the way, I also published some other articles about programmatically parsing optimizer trace files: