Event Propagation in Oracle 12.2

Unfortunately, the event propagation is broken in the release 12.2 What I mean by that is if you try to set the event to trace a particular SQL ID and run that SQL in another session, the SQL, sadly, won’t be traced.

For example, setting the trace in one session:

alter system set events='sql_trace[sql: 2asugy1n1r7y7] level=12'; 

and executing the query in another should produce the trace, but it doesn’t:

select /* test_trace */ 1 from dual ;

Suprisingly, the problem doesn’t seem to be widely known, although it’s such a useful feature. So, I posted the question on Oracle-L and obtained the following feedback from Ghassan Salem: “This is bug 25989066 (and 25994378), fixed in 18.1, no backport to 12.2 available yet.” Strangely enough, there isn’t any information about the problem on Metalink yet.

With regard to the topic, I’d like to share what I learned while investigating the problem.

Setting Events

Firstly, Oracle builds a doubly linked list when alter system set events is issued. I put together a DTrace script dbgd_flow_122.d to trace this process:

$dtrace -s dbgd_flow_122.d -p 26017

...
dbgdSetEvents entry
arg2=9845c870 
-----------
dbgdCopyEventNode source: arg2 = ffff80ffbd5b94d8 
*(arg2+0x28): *(ffff80ffbd5b9500) = 1
dbgdCopyEventNode destination: *ret_val = 96b837d8 
*ptr_mask = *ret_val+0x28: *(96b83800) = 1
-----------
dbgdLinkEvent entry arg2:96b837d8 arg3:9845c870
-----------
dbgdCopyEventNode source: arg2 = ffff80ffbd5b7730 
*(arg2+0x28): *(ffff80ffbd5b7758) = 6
dbgdCopyEventNode destination: *ret_val = 96b83458 
*ptr_mask = *ret_val+0x28: *(96b83480) = 6
-----------
dbgdLinkEvent entry arg2:96b83458 arg3:9845c870
-----------
dbgdSetEvents return 
*arg2: *(9845c870) = 80 
...

Let me outline two functions in the trace above. One is dbgdCopyEventNode which transfers the event information from PGA to SGA. The other is dbgdLinkEvent which, as its name says, links it into the list.

By the way, linked lists are a widely used concept in Oracle database. On the subject of linked lists, there is no better source of information than Jonathan Lewis’ Oracle Core Internals.

Propagation

Secondly, a similar process will kick in during login, with the difference that the information will flow in the opposite direction, i.e. from SGA to PGA. Again, the same DTrace script can be used for tracing the process. However, since the event propagation takes place during uga initialization which in turn is a part of session creation, we have to intercept this process at an early stage to be able to attach to it with DTrace. Talking about debugging login process, I documented that troubleshooting technique in my  previous blog post. And now, here’s the stripped output of the trace:

dbgdCopyEventNode source: arg2 = 96b837d8 
*(arg2+0x28): *(96b83800) = 1
dbgdCopyEventNode destination: *ret_val = ffff80ffbabc0c78 
*ptr_mask = *ret_val+0x28: *(ffff80ffbabc0ca0) = 1
-----------
dbgdLinkEvent entry arg2:ffff80ffbabc0c78 arg3:ffff80ffbd7cd610
-----------
dbgdSyncEventGrpsDirect  arg1      = bcffde30 
1st ptr *(arg1+0x10):    *bcffde40 = 9845c870
2nd ptr *(1st_ptr+0xb8): *9845c928 = 96b834c0  
-----------
dbgdCopyEventNode source: arg2 = 96b837d8 
*(arg2+0x28): *(96b83800) = 1
dbgdCopyEventNode destination: *ret_val = ffff80ffbd5ef940 
*ptr_mask = *ret_val+0x28: *(ffff80ffbd5ef968) = 1
-----------
dbgdLinkEvent entry arg2:ffff80ffbd5ef940 arg3:ffff80ffbd5efeb0

One thing that an attentive reader might notice, is that when the events were created two elements were copied to SGA. In particular, we can deduce that from two dbgdCopyEventNode calls. In contrast, only one was read back to the PGA during login.

Doubly Linked List

As for the next step, we’re going to take a look at both elements of the linked list. As I couldn’t identify an x$ fixed table which would externalize those structures, I wrote the utility select_x_linked_events.pl which gets this kind of information by directly reading the memory locations and following the pointers.

linked list ptr to root entry:
BCFFDE18+28: *BCFFDE40 = 9845C870

root_Entry:
9845C870+ 0: *9845C870 = 00000080
9845C870+B8: *9845C928 = 96B834C0

Traversing linked list:
96B83458+68: *96B834C0 = 96B83840
96B837D8+68: *96B83840 = 9845C928

1. element:
96B83458+ 0: *96B83458 = 02160001
96B83458+28: *96B83480 = 00000006
96B83458+38: *96B83490 = 00000000
96B83458+48: *96B834A0 = 96B83188
  => sql: 2asugy1n1r7y7
96B83458+50: *96B834A8 = 00000000
96B83458+68: *96B834C0 = 96B83840
96B83458+70: *96B834C8 = 9845C928
96B83458+B8: *96B83510 = 5F6C7173
  = sql_trace[sql: 2asugy1n1r7y7] level=12

2. element:
96B837D8+ 0: *96B837D8 = 0216000E
96B837D8+28: *96B83800 = 80000001
96B837D8+38: *96B83810 = 96B83458
96B837D8+48: *96B83820 = 00000000
96B837D8+50: *96B83828 = 96B83510
96B837D8+68: *96B83840 = 9845C928
96B837D8+70: *96B83848 = 96B834C0                                                                
96B837D8+B8: *96B83890 = 96B83890

So, let’s try to make some sense out of the memory dump above.

First of all, BCFFDE18 is the address of the first dbgdInitEventGr chunk:

select ksmchptr from x$ksmsp where ksmchcom='dbgdInitEventGr' and ksmchidx=1 
  order by ksmchidx fetch first 1 rows only ;
KSMCHPTR
----------------
00000000BCFFDDE8

Regarding dbgdInitEventGr, there is some background information about it in Tanel Poder’s blog .

BCFFDE40 (BCFFDE18+28) points to the root entry of the linked list which starts on 9845C870.

root entry:
9845C870 starting address
9845C928 (9845C870+B8) contains the address of the first element.

1st element:
96B83458 starting address
96B83480 (96B83458+28) if the third least significant bit is turned to 1, the sql will be traced.
96B834A0 (96B83458+48) contains the pointer to the location where the sql id is stored
96B834C0 (96B83458+68) pointer to the next element (2. element)
96B834C8 (96B83458+70) pointer the previous element (root entry)
96B83510 (96B83458+B8) text of alter system command

2nd element:
96B837D8 starting address
96B83810 (96B837D8+38) pointer to the first element
96B83828 (96B837D8+50) pointer to the alter system command which is contained in the first element

We can draw following conclusions by looking at the observations above:

  • The first element should trigger tracing (offset 0x28=6).
  • The first element is the one which was not copied from SGA to PGA during login.
  • The offset locations 0x38 and 0x50 are not initialized for the first element, whereas the same fields in the second element point to the first element.

The uninitialized 0x50 offset looks a bit odd, so let’s compare it with it’s counterpart in the second element. Clearly, it points to the offset 0xB8 of the first element which in turn contains the specification of the alter system command. Hence, I did a test on a test database and updated the uninitialized offset 0x50 of the first element to point to its B8 offset as well (set *0x96B834A8=0x96B83510 in gdb). Amazingly, after manipulating the memory location the trace started kicking in.

It’s true that the trace started working properly again, but I was unable to remove the event. The attempt ended in ORA-600 in a memory deallocation function, which indicates that the pointers have been messed up:

alter system set events='sql_trace[sql: 2asugy1n1r7y7] off' ;
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [KGHFRE3], [0x096D5FDC8], [], [],
[], [], [], [], [], [], [], []

With Tanel Poder’s fcha.sql it can be easily verified that the referenced address 0x0996d5FDC8 is an address in the dbgdInitEventG chunk that was allocated when the event was set.

Last but not least, let me stress that the change of the memory location content was done on a test database just to get some clues for possible workarounds for the problem. Moreover, the described manipulation is unsafe, unsupported and thus must not be done outside the lab.

Nenad Noveljic

Leave a Reply

Your email address will not be published. Required fields are marked *