Dead Transaction Rollback

A crashed process can leave a dead transaction behind. Generally, SMON is the one who’ll be rolling it back. Under certain circumstances, though, a server process might lend a helping hand. That’s exactly the scenario we’ll be focusing on. The article is based on the Oracle version 12.2.0.1.180116.

A dead transaction can be identified in the transaction table, which in turn is exposed in the memory structure x$ktuxe:


column usn format 99
column slt format 99

select ktuxeusn usn, ktuxeslt slt, ktuxesqn sqn, ktuxesiz siz, 
    '0x' || to_char(ktuxeusn,'FM0xxxx') || '.' || to_char(ktuxeslt,'FM0xxx') ||
    '.' || to_char(ktuxesqn,'FM0xxxxxxxx') xid 
  from x$ktuxe where KTUXECFL='DEAD' ;                                                                                                                            

USN SLT        SQN        SIZ XID
--- --- ---------- ---------- -------------------------
10  12    4948315    5384191 0x0000a.000c.0004b815b

The meanings of the selected columns are as follows:

  • ktuxusn – undo segment number
  • ktuxeslt – slot number in the transaction table
  • ktuxesqn – the number of times that the slot has been used (a.k.a wrap#)
  • ktuxesiz – remaining number of undo blocks to be applied in rollback
  • xid – transaction id (ktuxusn.ktuxeslt.ktuxesqn in hexadecimal)

There are still 5384191 million undo blocks left to apply for rolling back the dead transaction above. During this time ktuxesiz keeps decreasing. After stopping the SMON with pstop, its value will remain constant, which proves that the SMON is the process doing the recovery.

But what will happen if we try to change the data that still have to be rolled back?

To answer this question, we first need to find a block containing some data changed by the dead transaction. Below is the excerpt from the dump of such block showing its interested transaction list (ITL):

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000c.006.0058fe83  0x00d5d0f8.d426.33  C---    0  scn  0x00000004ed0206b5
0x02   0x000c.010.00590c33  0x00d67d44.d42a.26  C---    0  scn  0x00000004ed023e3c
0x03   0x000a.00c.004b815b  0x00c0cf8c.9116.2e  ----    1  fsc  0x021e.00000000

Our dead transaction 0x000a.00c.004b815b is placed in the third slot in the blocks’s ITL.

A DML statement that is changing data in this block won’t expectedly end until the rollback is finished. Session statistics collected with Tanel Poder’s script Snapper provide a cue about what the server process is doing during this time:


@snapper stats,gather=s 5 5 256
------------------------------------------------------------------------------------------------------------------------------
TYPE, STATISTIC                                , DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
------------------------------------------------------------------------------------------------------------------------------
...
STAT, rollback changes - undo records applied  , 36009,      6.77k,         ,             ,          ,           ,   14.13k per execution
...

As you can see, the server process is doing a rollback. Moreover, it applied 36009 undo records in the time interval covered by the snapshot. Yet no explicit rollbacks have been issued – neither “transaction rollback” nor “user rollback” counter has been incremented.

ktuxesiz will keep decreasing after we stop the SMON, which proves that the dedicated process is rolling back our dead transaction. Also, we can double-check that by tracing the Oracle function kturRecoverUndoSegment. This function seems to accept undo segment number and slot as the first and the fifth argument, respectively.

The DTrace script must start before the DML statement:

# pragma D option quiet
pid$target:oracle:kturRecoverUndoSegment:entry
{
  printf("USN: %d, SLOT: %d \n",arg0,arg4);
}

USN und SLOT correspond indeed with our dead transaction:

USN: 10, SLOT: 12

In fact, implementing this behaviour was a smart decision – when a dedicated process wants to change the data that have to be rolled back, it starts doing rollback instead of just idly waiting on SMON to finish the work.

References:

Thanks for sharing

Nenad Noveljic

2 Comments

  1. just discovered your blog through twitter. this is a fantastic post. looking forward to combing through rest of site!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.