The ORA-04036 incident file shows the private memory allocations of the victim process:
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
******************************************************
PRIVATE HEAP SUMMARY DUMP
33 MB total:
32 MB commented, 486 KB permanent
660 KB free (0 KB in empty extents),
23 MB, 3 heaps: "callheap " 38 KB free held
6539 KB, 1 heap: "kxs-heap-c " 257 KB free held
2879 KB, 1 heap: "session heap " 290 KB free held
The information above is useless, because the victim allocated only a tiny fraction of the configured memory: 33 MB out of 2 GB. What we actually need is information about the top consumers, but that’s missing in releases below 19c. Simply put, such an incident file is only useful when the victim itself is a top consumer.
We can get the top consumers from v$process_memory, provided we’re lucky enough to capture them while the error is occurring:
column spid format a5
select spid,pname,round(allocated/1024/1024) alloc_MB
from v$process_memory m, v$process p
where m.pid = p.pid order by allocated;
SPID PNAME ALLOC_MB
----- ----- ----------
...
25465 W01A 155
5316 W00K 155
24014 W00N 155
2432 W01Q 155
24646 W02Q 155
10527 W01N 155
4523 W028 155
7139 W01W 155
28124 W030 155
6468 W012 155
I executed the query above long after the error. Since the space background processes were still holding the memory, I was able to identify the top consumers. But generally that need not be the case. pga_allocated in v$active_session_history wouldn’t have helped either, because these sessions had mostly been inactive and were therefore not captured by ASH.
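Had the offenders been active while holding the memory, a query along the following lines could have surfaced them in ASH. This is only a sketch: the 100 MB threshold is an arbitrary value, and querying ASH requires the Diagnostics Pack license:
-- PGA allocations captured by ASH; shows only sessions that were
-- active (and therefore sampled) while holding the memory
select sample_time, session_id, program,
       round(pga_allocated/1024/1024) pga_mb
from v$active_session_history
where pga_allocated > 100*1024*1024
order by pga_allocated desc;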
The PGA heap dump reveals the main allocation reason:
oradebug setospid 6468
oradebug dump heapdump 1
148560936 bytes, 471 chunks: "control file i/o buffer " SQL
kxs-heap-w ds=7fffaf9dcca0 dsprt=7fffbd929050
It’s unusual for background space processes to keep that much memory allocated for weeks. For some reason none of these processes released their “control file i/o buffer” allocations, but you can kill them once they become inactive.
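As an alternative to scanning the raw trace file, a similar per-category breakdown can be obtained from v$process_memory_detail after requesting a pga_detail_get dump. This is only a sketch: the Oracle PID 42 is a hypothetical value that would first have to be looked up in v$process, and the view is populated asynchronously, so the query may have to be repeated after a short wait:
oradebug setmypid
oradebug dump pga_detail_get 42
-- 42 = v$process.pid of the target process (hypothetical value)
select category, name, heap_name,
       round(bytes/1024/1024) mb, allocation_count
from v$process_memory_detail
where pid = 42
order by bytes desc;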
Notice that I had limited the damage by reducing the number of these processes to 10 (the default of 1024 is far too high):
show parameter bg
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_max_spacebg_slaves                  integer     10
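For reference, this is how such an underscore parameter can be set. It’s only a sketch: changing hidden parameters should be agreed with Oracle Support, and an instance restart may be needed for the new value to take effect:
alter system set "_max_spacebg_slaves"=10 scope=spfile;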
Oracle enhanced the diagnostics in 19c: the incident dump file now includes the top consumers:
Private memory usage per Oracle process
-------------------------
Top 10 processes:
-------------------------
(percentage is of 2074 MB total allocated memory)
77% ospid 11023 (J001): 1580 MB used, 1598 MB alloc <= CURRENT PROC
2% ospid 16559 (M003): 33 MB used, 35 MB alloc, 896 KB freeable
2% ospid 9485 (M000): 31 MB used, 33 MB alloc, 832 KB freeable
1% ospid 9490 (M001): 28 MB used, 31 MB alloc, 896 KB freeable
1% ospid 14965 (M002): 28 MB used, 29 MB alloc, 320 KB freeable
1% ospid 5088 (M004): 25 MB used, 27 MB alloc, 1024 KB freeable
1% ospid 9354 (MMON): 5621 KB used, 22 MB alloc, 320 KB freeable
1% ospid 13713 (TT00): 18 MB used, 19 MB alloc
1% ospid 13715 (ARC0): 18 MB used, 19 MB alloc, 320 KB freeable
1% ospid 13722 (ARC1): 18 MB used, 19 MB alloc
In conclusion, with the improved diagnostics in 19c it has become easy to identify the culprit. In older releases you’d need to sample v$process_memory and hope to capture the top consumers.
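Such sampling could be automated with a scheduler job that periodically copies v$process_memory into a repository table. The sketch below uses made-up names (pga_samples, PGA_SAMPLER) and an arbitrary 5-minute interval:
-- empty repository table with the columns we want to keep
create table pga_samples as
select sysdate sample_time, p.spid, p.pname, m.category, m.allocated
from v$process_memory m, v$process p
where 1 = 0;

begin
  dbms_scheduler.create_job(
    job_name        => 'PGA_SAMPLER',
    job_type        => 'PLSQL_BLOCK',
    job_action      => q'[begin
                            insert into pga_samples
                            select sysdate, p.spid, p.pname, m.category, m.allocated
                            from v$process_memory m, v$process p
                            where m.pid = p.pid;
                            commit;
                          end;]',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=5',
    enabled         => true);
end;
/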
Hi Nenad,
Please help me understand why you picked process id 6468, since all the processes were consuming the same amount of memory.
Hi Krishna,
I randomly picked 6468. It could have been any other process from the list.