We hit ORA-12008 and ORA-00600 errors on materialized view refresh. The ORA-00600 error message were not directly related to the problem. A service request didn’t bear fruit – after getting some random action plans and inquiries for a couple of weeks, we decided to let it go. It was one of those cases where I had to use debugger to understand the root cause and find a workaround.
It turned out, there are two serious problems in how Oracle handles MV refresh. One is that the population of meta data isn’t atomic – if the process breaks, it will leave inconsistent information. As a consequence, the subsequent refreshes will fail. The other is, that the error isn’t raised immediately upon the retrieval of the corrupt metadata. The program continues, instead, and will fail at some random point with a random error message, completely unrelated to the issue. In 21c it was ORA-00600, and in 19c we saw “ORA-01031: insufficient privileges”, even when the privileges were correctly set.
Metadata
A simple table and a materialized view (MV) are sufficient to reproduce the problem:
drop materialized view mv ;
drop table t;
create table t (n number);
create materialized view mv as select * from t ;
The information in the table sys.snap_refop$ is critical. That table is empty after creating the MV:
select sql_txt FROM sys.snap_refop$
where operation# IN (21, 22)
and sowner = user and vname = 'MV' ;
no rows selected
The first refresh populates the table:
exec DBMS_MVIEW.REFRESH( LIST=>'MV', atomic_refresh => false, out_of_place => TRUE);
PL/SQL procedure successfully completed.
select sql_txt FROM sys.snap_refop$
where operation# IN (21, 22)
and sowner = user and vname = 'MV' ;
SQL_TXT
--------------------------------------------------------------------------------
/* MV_REFRESH (CTB) */ CREATE TABLE "U"."%s"
( "N" NUMBER
) SEGMENT CREAT
INSERT /*+ APPEND */ INTO "U"."%s" select * from t
The table contains two SQL statements. One is to create the table that underpins the MV. The other is an insert to populate it. The excerpts from the SQL trace below show that the subsequent refresh indeed reads the table and executes the retrieved commands:
@ev 10046 1
exec DBMS_MVIEW.REFRESH( LIST=>'MV', atomic_refresh => false, out_of_place => TRUE);
...
SELECT operation#, cols, sql_txt FROM sys.snap_refop$ WHERE operation# IN (21, 22) AND sowner = :1 AND vname = :2 AND instsite = :3 ORDER BY operation#
...
/* MV_REFRESH (CTB) *//* MV_REFRESH (CTB) */ CREATE TABLE "U"."RV$705F"
( "N" NUMBER
) SEGMENT CREATION DEFERRED
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
NOCOMPRESS LOGGING
TABLESPACE "USERS_DAYT2CP"
...
/* MV_REFRESH (ITB) */INSERT /*+ APPEND */ INTO "U"."RV$705F" select * from t
In this case everyhing went well because sys.snap_refop$ contained both entries. But, as we shall see, if the second row is missing, the refresh will fail.
When things go wrong
I’m going to demonstrate this by crashing the process just before inserting the INSERT entry. For that purpose, I’m going to set a breakpoint on the Oracle C function rpisplu_internal (“recursive program interface switch user and parse”, according to Frits Hoogland’s orafun). Simply put, this function executes recursive SQLs under SYS user on user’s behalf. It receives the pointer to SQL text in the 5th argument (at least in Oracle 21.4, where I conducted this analysis). Since according to x86 calling conventions, the 5th argument is passed through the CPU registry R8, the following breakpoint will stop at the function entry and print the SQL text:
b rpisplu_internal if $_regex( (char *)$r8, "(?i)insert.*snap_refop")
commands 1
x/s $r8
end
The stack trace on the breakpoint, when the second SQL text is just about to be inserted, looks as follows:
Breakpoint 1, 0x000000000432b680 in rpisplu_internal ()
0x15f77a80 : "INSERT INTO sys.snap_refop$ (sowner, vname, tabnum, operation#, cols, sql_txt, instsite, fcmaskvec, ejmaskvec, setnum) VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10)"
(gdb) bt 3
#0 0x000000000432b680 in rpisplu_internal ()
#1 0x0000000009ebb1ed in kkzfrc_ofp ()
#2 0x0000000009eb9fb4 in kkzfrc ()
rpisplu_internal is called by kkzfrc_ofp. Oracle C function kkzfrc_ofp orchestrates the whole MV refresh and is called only once.
We’ll continue the execution until hitting the breakpoint for the 2nd time.
0x15f77a80 : "INSERT INTO sys.snap_refop$ (sowner, vname, tabnum, operation#, cols, sql_txt, instsite, fcmaskvec, ejmaskvec, setnum) VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10)"
The CREATE entry is already commited at that point:
select sql_txt FROM sys.snap_refop$
where operation# IN (21, 22)
and sowner = user and vname = 'MV' ;
2 3
SQL_TXT
--------------------------------------------------------------------------------
/* MV_REFRESH (CTB) */ CREATE TABLE "U"."%s"
( "N" NUMBER
) SEGMENT CREAT
I’m crashing the process on the breakpoint, so that the second SQL text (for INSERT) won’t be inserted:
(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 1849638) killed]
Since the inserts weren’t atomic, we ended up with an incomplete information in sys.snap_refop$.
The subsequent refresh fails:
exec DBMS_MVIEW.REFRESH( LIST=>'MV', atomic_refresh => false, out_of_place => TRUE);
BEGIN DBMS_MVIEW.REFRESH( LIST=>'MV', atomic_refresh => false, out_of_place => TRUE); END;
*
ERROR at line 1:
ORA-12008: error in materialized view or zonemap refresh path
ORA-00600: internal error code, arguments: [kghstack_alloc], [kkzfrc_ofpBegin],
[18446603592378862232], [], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 91
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 288
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2601
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3194
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3481
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3513
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 16
ORA-06512: at line 1
ORA-00600 was generated in the Oracle C function kghstack_alloc, which allocates the memory on the stack. kkzfrc_ofpBegin ist the allocation reason. The allocation reason is the comment passed to the allocation function, and stored in the allocated memory chunk. The memory dump displays the allocation reason, which is useful for analyzing memory allocations long after the chunk was allocated.
The number 18446603592378862232 looks suspicious because it looks neither as a plausible size in bytes nor as a pointer. First, we need to grasp what that number is. Let’s convert it in the hexadecimal notation:
printf '%X\n' 18446603592378862232
FFFF803BA42FCA98
We can see that the value was passed to kghstack_alloc as the second argument:
kghstack_alloc()+31 call kgeasnmierr() 7FC4616FA060 ? 7FC4615A0050 ?
6 017A5F84C ? 000000258 ?
000000001 00000000F
kkzfrc_ofp()+11747 call kghstack_alloc() 7FC4616FA060 ?
FFFF803BA42FCA98 015F6F32C ?
000000258 ? 000000001 ?
00000000F ?
kkzfrc()+19876 call kkzfrc_ofp() 7FC45BD01FD0 000007078
7FC461596800 000000258 ?
000000001 ? 00000000F ?
This number is huge, but only if represented as an unsigned long integer. The other representation, as a signed long integer, looks more meaningfull:
p/d 18446603592378862232
$2 = -140481330689384
It finally becomes obvious when we see the absolute value in the hexadecimal notation – it’s an address on the heap or on the stack:
p/x 140481330689384
$3 = 0x7fc45bd03568
Simply put, a negative heap (or stack) address was passed as the second argument to kghstack_alloc. It doesn’t make any sense to use negative addresses – this was just an error. Further, kghstack_alloc expects an unsigned long integer as the second argument. To understand the cause for that error, we have to look at the usual argument values for kghstack_alloc. In particular, we’re interested in the second argument. Out of curiosity, we might take a look at the third argument as well because 015F6F32C looks like a string.
We can obtain all that information with the following breakpoint:
# breakpoint on entry
b kghstack_alloc
commands 2
# print 2nd argument as unsigned long
p/d $rsi
# dereferenced string pointer in the 3th argument
x/s $rdx
end
The output is below:
Breakpoint 2, 0x0000000015347780 in kghstack_alloc ()
$16 = 384
0x1707878c: "psdnam:parts"
(gdb) c
Continuing.
Breakpoint 2, 0x0000000015347780 in kghstack_alloc ()
$18 = 4000
0x165b3e34: "kzctxEUsrEnv: tmp ctx"
(gdb) c
Continuing.
Breakpoint 2, 0x0000000015347780 in kghstack_alloc ()
$19 = 84
0x16f2296c: "rpi role space"
The string passed by the fourth argument looks like the allocation reason. The second argument is always a small positive integer (384, 4000, 84) that might be the requested allocation size. But when the error was thrown, kghstack_alloc received a negative pointer instead of the size in bytes.
Now we can reconstruct what happened. The size passed to kghstack_alloc for the allocation reason kkzfrc_ofpBegin is calculated as a difference between two pointers. In case of error, the minuend was a null pointer that was left unchecked, so the difference was a negative value of the subtrahend.
Unhandled errors
The problem first appeared in the Oracle C function kkzdrro. kkzdrro is called by kkzfrc_ofp to select the information from the table sys.snap_refop$ and populates the structure which holds the text of both SQL statements. But since there was no insert statement, the pointer remained uninitialized. Unfortunately, the program didn’t stop and raise the error. It continued, instead, and threw a random error at some random point. In 21.4, the problem manifests itself as ORA-00600 in kghstack_alloc, but it can just as well be any other error message where the program can’t deal with the null pointer anymore.
The analysis would have been much easier if only the developer had decided to throw an error immediately after retrieving the incomplete information. rpifch is another function in the recursive programming interface which we can use for tracking down the root cause. This function fetches the rows from sys queries executed on user’s behalf and is directly called by kkzdrro:
b rpifch
(gdb) bt 3
#0 0x000000000432c200 in rpifch ()
#1 0x0000000009e53647 in kkzdrro ()
#2 0x0000000009eba159 in kkzfrc_ofp ()
It returns an ORA- error code or 0 on good execution. It returned 1403 (“no data found”) when it tried to fetch the second row (INSERT command):
(gdb) p $rax
$31 = 1403
oerr ora 1403
01403, 00000, "no data found"
// *Cause: No data was found from the objects.
// *Action: There was no data from the objects which may be due to end of fetch.
As I already mentioned, this error is ignored until the problem with the memory allocation on the stack. I ran pin debugtrace just to calculate the number of function calls between the ignored “no data error” and failed memory allocation on the stack:
sudo /export/home/oracle/workspace_vbnov/pin-3.21-98484-ge7cd811fd-gcc-linux/pin -pid 1895209 -t source/tools/DebugTrace/obj-intel64/debugtrace.so
3460965: 121462970 Return 0x000000000432c235 21.4.0.0.211019_a/bin/oracle:rpifch+0x000000000035 returns: 0x57b
...
3620569: 127665538 Call 0x0000000009ebce1e 21.4.0.0.211019_a/bin/oracle:kkzfrc_ofp+0x000000002dde -> 0x0000000015347780 21.4.0.0.211019_a/bin/oracle:kghstack_alloc(0x7fda927ce060, 0xffff80256d988aab, ...)
This number is immense – there have been ( 3620569 – 3460965 ) / 2 = 79802 function calls between the retrieval of the inconsistent information and raising the error. Some random error could have been raised anywhere. Whenever a null pointer remains unchecked and propagates it’s going to be difficult to identify the root cause.
Monitoring
This (now) known problem can be monitored with the following query:
SELECT sowner, vname FROM sys.snap_refop$
WHERE operation# IN (21, 22)
group by sowner,vname
having count(*) = 1 ;
Workaround
As a workaround you can recreate the affected MVs, which will also delete the corrupted information in the snapshot table.
Summary
In the course of the first MV refresh, the Oracle process populates the table sys.snap_refop$ with two SQL commands that will be used in future refreshes. The first is CREATE TABLE, the other is INSERT. Unfortunately, these two inserts don’t happen in the same transaction. Consequently, if the process dies in-between, we might end up only with the first entry. The subsequent MV refresh isn’t able to handle this inconsistency. Moreover, the program doesn’t stop after not finding the expected information; it continues, instead, and breaks at some later random point with a random error message.
I’ve provided the query to monitor these inconsistencies in the metadata. Should you encounter such a problem, recreate the MV as a workaround.
Updates
February 18, 2022
The test case produces the same error message on the following Oracle versions:
- Solaris x86-64/non-cdb: 19.10, 19.13
- Linux/multi-tenant: 19.13, 21.4
March 25, 2022
The problem is reproducible in 21.5 as well.
March 28, 2022
I modified the breakpoint definition to make it a little bit easier to reproduce the problem. The breakpoint is ignored for the first time. At the second occurrence, kill is automatically invoked:
b rpisplu_internal if $_regex( (char *)$r8, "(?i)insert.*snap_refop")
commands 1
x/s $r8
kill
end
ignore 1 1
The expected output after the program stops at the breakpoint (answer the question to kill the process with “y”):
Breakpoint 1, 0x000000000432b680 in rpisplu_internal ()
0x15f77a80 : "INSERT INTO sys.snap_refop$ (sowner, vname, tabnum, operation#, cols, sql_txt, instsite, fcmaskvec, ejmaskvec, setnum) VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10)"
Kill the program being debugged? (y or n) y
[Inferior 1 (process 3844841) killed]
August 6, 2022: Bug
Oracle Support filed the following bug for this problem:
Bug 34383371 – INCASE OF FIRST REFRESH OF MVIEW WITH OUT_OF_PLACE=> TRUE BREAKS THEN SUBSEQUENT MVIEW REFRESH MAY FAIL WITH ORA-00600: [KGHSTACK_ALLOC], [KKZFRC_OFPBEGIN]