The aim of this blog post is to warn about a potential performance issue in Oracle 12.2 related to PL/SQL exceptions.
The test case is rather simple. First, I’ll create an empty table.
create table t (a number) ;
Secondly, I’ll run the following anonymous block which executes a simple query on an empty table in a loop.
declare
l_c number ;
i number ;
begin
for i in 1..10000
loop
begin
select a into l_c from t ;
exception when no_data_found then
null ;
end ;
end loop ;
end ;
/
The key point here is that the no_data_found exception handler runs on every loop iteration. Of course, a bare null is a lousy exception handler, but I’m using it here just for demonstration purposes.
In my case, the PL/SQL block completed instantly on a 12.1 database, whereas on 12.2 the execution took 12.49 seconds. That’s way too long.
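The figures below are tkprof output from an extended SQL trace. A minimal sketch of how to capture such a trace yourself, assuming the PL/SQL block runs in the traced session (the trace file name passed to tkprof is hypothetical), looks like this:
exec dbms_monitor.session_trace_enable(waits => true, binds => false)
-- run the anonymous block shown above, then:
exec dbms_monitor.session_trace_disable
-- on the database host:
-- tkprof AVAR19_ora_5522.trc exc_test.out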
Interestingly, the executed Oracle code is not instrumented here: the CPU time and the total wait time don’t add up to the elapsed time.
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      2.25      12.49          0          0          0           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      2.25      12.49          0          0          0           1
...
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        1        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
Therefore, we need some OS diagnostics for further analysis. First of all, I executed pstack on the long-running process on an Oracle 12.2 instance to get a rough idea of which Oracle functions were being executed. This is what I got:
ffff80ffbcf9298a __read () + a
000000000a1f32b3 slmsrd () + 83
0000000017ec5f6b lmsaprb () + 5b
0000000009cc7c8e lmsagb1 () + 11e
0000000009cc7b1e lmsagbf () + 4e
0000000009cc7240 kgebse () + 180
0000000009cc704c kgebem () + 4c
0000000009cc6f19 ksemgm () + 69
0000000009714f85 opiodr () + 455
...
According to Frits Hoogland’s sf.sh, kge functions are kernel generic error functions. In fact, it is easy to verify that these functions are called while the PL/SQL exception handling code executes. In addition, we can inspect the arguments of the functions above with the following DTrace script:
pid$target::kge*:entry,
pid$target::lms*:entry
{
printf("%x %x %x %x %x %x %x %x \n",arg0,arg1,arg2,arg3,arg4,arg5,arg6,arg7);
}
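The script can then be attached to the dedicated server process of the session running the loop, for example (the script file name is hypothetical and DTrace privileges are required):
dtrace -s kge_args.d -p <pid of the dedicated server process>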
...
lmsapsc ffff80ffbd5b02c8 57b ffff80ffbfff62db 1f5 2 203a 0 3aad71e4
lmsapbn ffff80ffbd5b02c8 2 57b ffff80ffbfff62db 1f5 0 0 0
...
As you can see, 0x57b is being passed as an argument to some of the functions. 0x57b is 1403 in decimal, which is precisely the Oracle error code for the “no data found” condition (ORA-01403).
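If you want to double-check the hex conversion from within the database, a one-liner does it:
select to_number('57B', 'XXX') from dual ;   -- returns 1403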
As for the next step, I’m going to compare the number of function calls in both Oracle versions.
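The per-function counts below were collected by instrumenting the same run in both versions. The exact counting script isn’t shown here; a minimal sketch, assuming the probe set from above extended with the slms functions, could look like this:
pid$target::kge*:entry,
pid$target::lms*:entry,
pid$target::slms*:entry
{
    /* count every call per function name; DTrace prints the aggregation on exit */
    @calls[probefunc] = count();
}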
12.2:
kgecss 2
kge_is_oserr 20000
kge_is_resig_mandatory 20000
kgeade 20000
kgebem 20000
kgebse 20000
kgefrp 20000
kgekeep 20000
kgepop 20000
kgeselv 20000
kgespf 20000
kgespo 20000
kgetop 20000
lmsagb1 20000
lmsagbf 20000
lmsapbn 20000
lmsaprb 20000
lmsapsb 20000
lmsapsc 20000
lmsapts 20000
slmsrd 20000
kge_is_resig_mandatory_errframe 40000
kgeresl 40000
kgegec 80000
12.1:
kgecss 1
kge_is_oserr 10000
kge_is_resig_mandatory 10000
kgeade 10000
kgebem 10000
kgebse 10000
kgefrp 10000
kgegbc 10000
kgekeep 10000
kgepop 10000
kgeselv 10000
kgespf 10000
kgespo 10000
lmsagb1 10000
lmsagbf 10000
lmsapsb 10000
lmsapsc 10000
kge_is_resig_mandatory_errframe 20000
kgeresl 20000
kgetop 20000
kgegec 50000
Although the most obvious difference is twice as many calls for almost every function in 12.2 compared to 12.1, this is not the main reason for the severe performance decrease. As a matter of fact, an attentive reader might notice that the slmsrd calls are entirely missing from the 12.1 output.
Again, I’ll use Frits’ sf.sh to figure out the purpose of the Oracle kernel function slmsrd:
./sf.sh slmsrd
slmsrd -- system specific call
slms stands for “system specific call”. And if “rd” stands for read, the question is, what is being read during exception handling? Knowing that every read operation results in a system call, we can truss the Oracle dedicated process to answer this question:
truss -f -tread -p 5522
...
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
5522: read(9, "\f\0 x05\0\0 P\0 y05\0\0".., 512) = 512
...
As you can see, not only are there thousands of reads on the file behind file descriptor 9, but the very same piece of information is read 20000 times for 10000 raised exceptions.
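If you want to confirm that count independently, a DTrace one-liner on the read system call will do (the PID is the one from the example, and file descriptor 9 comes from the truss output above):
dtrace -n 'syscall::read:entry /pid == $target && arg0 == 9/ { @reads = count(); }' -p 5522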
Next, with pfiles we can find out which file is being read:
pfiles 5522
5522: oracleAVAR19 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
...
9: S_IFREG mode:0755 dev:286,65600 ino:4840002 uid:1000 gid:1000 size:1344512
O_RDONLY|O_LARGEFILE FD_CLOEXEC
.../rdbms/mesg/oraus.msb
...
Finally, the culprit is revealed: oraus.msb, the file containing the Oracle error messages. Not only is the error message text completely unnecessary in this case, but the database doesn’t even bother caching it.
Just imagine the mess that might arise on a highly consolidated platform with applications frequently executing exception handling code and thereby heavily reading from oraus.msb at the same time. In addition, a session trace wouldn’t provide any clue that the application is waiting for a completely irrelevant piece of information to be read from the filesystem.
Workarounds
Unfortunately, I couldn’t find any documents on Metalink about the problem. However, I’d like to suggest some possible countermeasures. Before doing that, it’s worth noting that the Oracle home in this case was placed on a separate ZFS file system which was not configured for caching data, i.e. primarycache was set to metadata (only). The reason for doing this is to avoid cache thrashing caused by loading Oracle homes into the cache during OS backups. Frankly, it’s the first time I’ve ever seen query elapsed time depend on Oracle home file access latency. Anyway, after setting primarycache to all, the code started to run fast on 12.2 as well.
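For reference, the ZFS change itself is a one-liner (the dataset name below is hypothetical):
# check the current setting
zfs get primarycache rpool/u01/oraclehome
# allow both data and metadata to be cached in the ARC
zfs set primarycache=all rpool/u01/oraclehome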
In case you can’t cache the Oracle home or put it on an SSD, there are also a couple of things you could do on the application side to reduce the number of message file reads.
First of all, check if you can create conditions in the environment which lead to fewer exceptions in total. For instance, if you insert a row into the table above (see below), no_data_found won’t fire any more and the code will execute fast.
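In this contrived test case, that simply means inserting any row before running the block:
insert into t values (1) ;
commit ;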
Secondly, you could widen the exception handler’s scope if feasible. For example, you can move it out of the loop (note that this changes the semantics: the first no_data_found now terminates the loop):
declare
l_c number ;
i number ;
begin
for i in 1..10000
loop
select a into l_c from t ;
end loop ;
exception when no_data_found then
null ;
end ;
/
Lastly, you could replace the entire exception handler with an explicit cursor like this:
declare
l_c number ;
i number ;
cursor c is select a from t ;
begin
open c ;
for i in 1..10000
loop
fetch c into l_c ;
if c%notfound then
null;
end if ;
end loop ;
close c ;
end ;
/
Summary
In conclusion, a performance issue has been introduced in Oracle 12.2: the message file is repeatedly read by dedicated server processes even when the exception is handled and no error message ever has to be displayed. In particular, applications which frequently raise PL/SQL exceptions could be affected. To make matters worse, the trace file doesn’t provide any clue to the root cause. Currently, there doesn’t seem to be any information on Metalink about it. Nevertheless, I highlighted several ways of mitigating the issue.
Update February 17, 2022
The problem is fixed in recent Oracle releases. I tested it on a 19.14 and a 21.4 PDB. There’s only one read per execution:
sudo /bin/strace -p 3924265 -c
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.81    0.000088           0       284           getrusage
 14.29    0.000015          15         1           read
  1.90    0.000002           2         1           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.000105           0       286           total
Comments
Hi Nenad, great post and investigation!
Please mind that it is important to show the line beneath the function found with sf.sh, which indicates what part of the function name corresponds to known code layers. When you look at slmsrd, you see that the first two letters of the function name mean it’s in the system specific code layer.
$ ./sf.sh slmsrd
slmsrd -- system specific call
--
So ‘sl’ means system specific call; I would say ‘ms’ means message or error message, and ‘rd’ obviously read.
In fact, if you use the ‘wildcard mode’ on sf.sh with sl, you’ll see it’s a code layer that performs access to system calls:
$ ./sf.sh -w sl
sl system specific call
slcpu get cpu usage
slcpums get cpu usage (for getrusage)
sltrgatime64 arbitrary time lookup
sltrgftime64 time lookup main (for clock_gettime)
sltrgftime64_cgt clock_gettime
Hi Frits,
First of all, let me thank you for the kind words. I’m honored.
Also, thank you for the additional clarification on sf.sh. In particular, I wasn’t even aware of the wildcard mode.
By the way, it’s such a helpful utility when analyzing Oracle function calls.
About the comment on filesystem caching for the Oracle home(s): in the past I’ve experienced performance issues when the Oracle home was placed on a high-latency device (NFS in that case). There are multiple files (I’ve forgotten which exactly) which are read very intensely, and quite probably Oracle assumes that they’ll be cached.
Actually, I’ve been completely unaware of this dependency. I mean, I’ve never observed that performance depended on Oracle home file access latency, at least not to such an extent. On the other hand, it might be that I haven’t hit any such issue so far simply because I’ve never installed Oracle homes on a high-latency file system.
Anyway, I’ll keep your comment in mind, especially in cases where the total of CPU and wait time doesn’t account for the elapsed time.
However, in this case, I’d rather attribute this coding anti-pattern to a bug than to a deliberate decision.