Troubleshooting java.lang.OutOfMemoryError in the Oracle Database

java.lang.OutOfMemoryError

The Oracle database JVM imposes some undocumented, hard-coded limits that lead to java.lang.OutOfMemoryError. In this blog post, I’m going to provide general guidelines on how to analyze such exceptions with OS tools when higher-level information is missing. I used DTrace, but you can easily obtain the same information with, for example, gdb or any other tool for tracing application function calls.

One way to provoke java.lang.OutOfMemoryError is to allocate a large array (please let me know if you have other test cases). But let’s first try to reproduce the error in a conventional, i.e. non-database, JVM.

Conventional JVM

The following class allocates an array of integers whose size is specified as a command-line argument:

public class Demo {
  public static void defineIntArray( int p_size) throws Exception {
    int[] v = new int[p_size];
  }
  public static void main( String args[] ) throws Exception {
    System.out.println(args[0]);
    defineIntArray(Integer.parseInt(args[0]));
  }
}

The execution will break for 500 million elements, because the default maximum Java heap size is simply too low:

java Demo 500000000 
500000000
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at Demo.defineIntArray(Demo.java:3)
        at Demo.main(Demo.java:7)

Given that each int takes four bytes, we can easily estimate the required memory: 500,000,000 × 4 bytes ≈ 1.9 GiB for the array alone. After specifying a sufficiently large maximum Java heap size, the program indeed completes without errors:

java -Xmx2g Demo 500000000 
500000000

This experiment confirms that in a conventional JVM, java.lang.OutOfMemoryError really means what it says, that is, not enough memory to complete the operation. Consequently, the problem can be remedied by simply assigning more memory to the JVM.

Database JVM

Now, let’s create the same Java code in an Oracle database (19c, in this case):

create or replace and compile java source named  "Demo" as
public class Demo {	
  public static void defineIntArray( int p_size) throws Exception {
    int[] v = new int[p_size];
  }
};
/

CREATE OR REPLACE procedure p_define_int_array (p_size number) 
  as LANGUAGE JAVA NAME 'Demo.defineIntArray(int)' ;
/

The maximum array size – irrespective of the available memory – is 134217723 (0x7FFFFFB). Adding just one additional element will cause OutOfMemoryError:

exec p_define_int_array(134217724);
BEGIN p_define_int_array(134217724); END;

*
ERROR at line 1:
ORA-29532: Java call terminated by uncaught Java exception:
java.lang.OutOfMemoryError
ORA-06512: at "SYS.P_DEFINE_INT_ARRAY", line 1
ORA-06512: at line 1

Unlike in a conventional JVM, this problem can’t be solved by adding more memory. For example, the PGA limit wasn’t set at all, and there was plenty of free memory both on the server and in the Java pool. Also, the absence of the common memory shortage errors (ORA-04030, ORA-04031 and ORA-04036) suggests that lack of memory isn’t the problem. In other words, this limit is hard-coded and, therefore, cannot be avoided by resizing memory parameters. That’s a serious disadvantage.

On top of that, the error message doesn’t specify which (undocumented) limit was reached. The only clue is the bare error message java.lang.OutOfMemoryError. But, luckily, that error message is also written into the server process trace file. Why is that beneficial? Because we can capture the call stack that writes the error message and, in doing so, get an idea of which Oracle C functions participate in the error processing.
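
A minimal sketch of what call_stack_trace.d might look like, assuming it simply probes write(2) in the target process and dumps the user stack whenever the written buffer contains the string passed as the first script argument (the actual script may be implemented differently):

#!/usr/sbin/dtrace -s

/*
 * Sketch (assumed implementation): print the user stack whenever the
 * traced process writes a buffer containing the string passed as $1,
 * e.g. "java.lang.OutOfMemoryError".
 */
syscall::write:entry
/pid == $target && arg2 > 0 && strstr(copyinstr(arg1, arg2), $1) != NULL/
{
  ustack();
}

With such a script in place, we attach to the server process and pass the error message as the argument: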

dtrace -p 12473 -x ustackframes=100 -s call_stack_trace.d '"java.lang.OutOfMemoryError"'

Exception in thread "Root Thread" java.lang.OutOfMemoryError

libc.so.1`__write+0xa
a.out`sdbgrfuwf_write_file+0x3d
a.out`sdbgrfwf_write_file+0x3c
a.out`dbgtfdFileWrite+0x20d
a.out`dbgtfdFileAccessCbk+0x19f
a.out`dbgtfPutStr+0x2d7
a.out`dbktPri+0x95
a.out`ksdwrf+0x1bc
a.out`ksdout+0xa9
a.out`jox_out_to_trc+0xce
a.out`jox_out+0xa5e
a.out`ioc_ewrite+0x25
a.out`iocbf_ewrite+0x1c
a.out`ioc_do_call+0x13a
a.out`joet_switched_env_callback+0x276
a.out`ioct_ewrite+0x2a
a.out`jonfos_write_bytes+0x29a
a.out`jonfos8_write_bytes__cst__+0x2e2
a.out`jonfos8_write_bytes+0x6b
0x7fffbc7540df
0x7fffbc753f80
0x7fffbc753cf8
0x7fffbc70d2ec
0x7fffbc7132ea
0x7fffbc7128c4
0x7fffbc58a3d5
0x7fffbc3dced7
a.out`sjoninvk_jit+0x411
a.out`joevm_invoke_jit_0ret+0x18a
a.out`joevm_invokenonvirtual_setup_and_go+0x577
a.out`joevm_invokenonvirtual+0x2ec
a.out`joe_run_vm+0x2948
a.out`joe_run+0x81e
a.out`joez_jit_trampoline+0x1e1b
a.out`joe_jit_void_trampoline+0x17
a.out`sjoe_jit_void_callback+0x1a7
0x7fffbc71f50c
a.out`sjoninvk_jit+0x411
a.out`joevm_invoke_jit_0ret+0x18a
a.out`joevm_invokevirtual_setup_and_go+0x600
a.out`joevm_invokevirtualq_first+0xa5
a.out`joevm_invokevirtual+0x827
a.out`joe_run_vm+0x4d04
a.out`joe_run+0x81e
a.out`joet_aux_thread_main+0x1254
a.out`seoa_note_stack_outside+0x1d
a.out`joet_thread_main+0x43
a.out`sjontlo_initialize+0x94
a.out`joe_enter_vm+0x7ef
a.out`ioei_call_java+0x38ca
a.out`ioesub_CALL_JAVA+0x2e5
a.out`seoa_note_stack_outside+0x1d
a.out`ioe_call_java+0x15d
a.out`jox_invoke_java_+0x3e2
a.out`kkxmjexe+0x63d
a.out`kgmexcb+0x3a
a.out`kkxmswu+0x47
a.out`kgmexwi+0x56c
a.out`kgmexec+0x54f
a.out`pefjavacal+0x5dd
a.out`pefcal+0x37d
a.out`pevm_FCAL+0xe4
a.out`pfrinstr_FCAL+0xb9
oracle`pfrrun_no_tool+0x3c
oracle`pfrrun+0x53a
oracle`plsql_run+0x2c0
...

Just-in-time compiler

As you can see, the stack above also contains some private memory addresses. These memory regions contain precompiled Java code that is stored in /tmp and mapped by the process [1]. The code was produced by the just-in-time (JIT) compiler, which is the default way of executing Java code in the database [2]:

show parameter jit

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
java_jit_enabled                     boolean     TRUE

The precompiled code performs better but, on the downside, it’s extremely difficult to analyze the problem without function names on the stack.

Interpreted Java

Therefore, I temporarily switched off the JIT compiler, reproduced the error once again, and, finally, captured the stack. The following stack with interpreted Java code is far more descriptive:

alter session set java_jit_enabled=false ;

dtrace -p 12473 -x ustackframes=100 -s call_stack_trace.d '"java.lang.OutOfMemoryError"'

Exception in thread "Root Thread" java.lang.OutOfMemoryError

libc.so.1`__write+0xa
a.out`sdbgrfuwf_write_file+0x3d
a.out`sdbgrfwf_write_file+0x3c
a.out`dbgtfdFileWrite+0x20d
a.out`dbgtfdFileAccessCbk+0x19f
a.out`dbgtfPutStr+0x2d7
a.out`dbktPri+0x95
a.out`ksdwrf+0x1bc
a.out`ksdout+0xa9
a.out`jox_out_to_trc+0xce
a.out`jox_out+0xa5e
a.out`ioc_ewrite+0x25
a.out`iocbf_ewrite+0x1c
a.out`ioc_do_call+0x13a
a.out`joet_switched_env_callback+0x276
a.out`ioct_ewrite+0x2a
a.out`jonfos_write_bytes+0x29a
a.out`jonfos9_write_bytes+0x21
a.out`joevm_invoke_sysnative+0x20f
a.out`joevm_invokenonvirtual_setup_and_go+0x577
a.out`joevm_invokenonvirtual+0x2ec
a.out`joe_run_vm+0x2948
a.out`joe_run+0x81e
a.out`joet_aux_thread_main+0x1254
a.out`seoa_note_stack_outside+0x1d
a.out`joet_thread_main+0x43
a.out`sjontlo_initialize+0x94
a.out`joe_enter_vm+0x7ef
a.out`ioei_call_java+0x38ca
a.out`ioesub_CALL_JAVA+0x2e5
a.out`seoa_note_stack_outside+0x1d
a.out`ioe_call_java+0x15d
a.out`jox_invoke_java_+0x3e2
a.out`kkxmjexe+0x63d
a.out`kgmexcb+0x3a
a.out`kkxmswu+0x47
a.out`kgmexwi+0x56c
a.out`kgmexec+0x54f
a.out`pefjavacal+0x5dd
a.out`pefcal+0x37d
a.out`pevm_FCAL+0xe4
a.out`pfrinstr_FCAL+0xb9
oracle`pfrrun_no_tool+0x3c
oracle`pfrrun+0x53a
oracle`plsql_run+0x2c0
...

It’s worth noting that many functions begin with joe (which possibly stands for [J]ava [O]racle runtime [e]nvironment or something similar). Therefore, tracing all joe* functions during a good and a bad execution should provide further clues:

#!/usr/sbin/dtrace -s

#pragma D option flowindent
#pragma D option dynvarsize=8m

pid$target:oracle:joe*:entry{}
pid$target:oracle:joe*:return{}
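
Assuming the script above is saved as, say, joe_flow.d (the file name is mine), it is attached to the server process in the same way as before, while the PL/SQL call runs in the traced session:

dtrace -s joe_flow.d -p 12473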

Good execution:

      -> joe_invoke
        -> joe_run
          -> joe_run_vm
            -> joevm_invokerecursive
              -> joevm_first_time_setup
              <- joevm_first_time_setup
              -> joevm_invoke_interp_1loc
              <- joevm_invoke_interp_1loc
            <- joevm_invokerecursive
            -> joe_make_primitive_array
              -> joet_switched_env_callback
              <- joet_switched_env_callback
              -> joet_switched_env_callback
              <- joet_switched_env_callback
              -> joet_switched_env_callback
              <- joet_switched_env_callback
            <- joe_make_primitive_array
            -> joe_return

Bad execution:

-> joe_invoke
  -> joe_run
    -> joe_run_vm     
      -> joevm_invokerecursive
        -> joevm_first_time_setup
        <- joevm_first_time_setup
        -> joevm_invoke_interp_1loc
        <- joevm_invoke_interp_1loc
      <- joevm_invokerecursive
      -> joe_make_primitive_array
        -> joe_really_blow_it_internal
          -> joe_well_known
          <- joe_well_known
          -> joe_blow_status_internal
            -> joet_exf_stack_top
            <- joet_exf_stack_top
          <- joe_blow_status_internal

The key difference between the two executions is the call to joe_really_blow_it_internal, which sounds like a bad comedy name but is actually the Oracle C function for processing unhandled Java exceptions.

Exception handling

Now we can capture the full stack when joe_really_blow_it_internal executes:

dtrace -n 'pid$target:oracle:joe_really_blow_it_internal:entry { ustack() }' -p 12473

oracle`joe_really_blow_it_internal
oracle`joe_make_primitive_array+0x204
oracle`joe_run_vm+0x397d
oracle`joe_run+0x81e
oracle`joe_invoke+0x7e3
oracle`joet_aux_thread_main+0x17b9
a.out`seoa_note_stack_outside+0x1d
oracle`joet_thread_main+0x43
oracle`sjontlo_initialize+0x94
oracle`joe_enter_vm+0x7ef
oracle`ioei_call_java+0x38ca
a.out`ioesub_CALL_JAVA+0x2e5
a.out`seoa_note_stack_outside+0x1d
a.out`ioe_call_java+0x15d
a.out`jox_invoke_java_+0x3e2
a.out`kkxmjexe+0x63d
a.out`kgmexcb+0x3a
a.out`kkxmswu+0x47
a.out`kgmexwi+0x56c
a.out`kgmexec+0x54f

In this case, we can see that the exception was raised by joe_make_primitive_array, which we can map to the array definition in our Java code.

Primitive arrays

From here on, the analysis is case-specific. In our example, we need to focus on joe_make_primitive_array and try to identify the application Java code and conditions that lead to the error.

The array size is passed in the rdx CPU register (that is, as the third function argument, arg2 in DTrace) and can therefore be traced with the following script:

#pragma D option quiet

pid$target:oracle:joe_make_primitive_array:entry {
  printf("%s array size: %d\n", probefunc, arg2);
}

dtrace -s make_primitive_array.d -p 12473

joe_make_primitive_array array size: 134217724

By trying to allocate arrays of different sizes, we can deduce the limit. Then we can record the allocation sizes with the script above to identify the particular call that caused java.lang.OutOfMemoryError.
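
Once the limit is known, the same probe can be narrowed down to the offending allocations. A minimal sketch (the threshold of 134217723 elements is the int limit deduced above and would need adjusting for other data types):

#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Report only primitive-array allocations that exceed the observed int
 * limit and show the C call stack that requested them.
 */
pid$target:oracle:joe_make_primitive_array:entry
/arg2 > 134217723/
{
  printf("%s array size: %d\n", probefunc, arg2);
  ustack();
}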

General guidelines

Finally, we can derive the general process for analyzing opaque JVM exceptions in the database:
1. switch off the JIT compiler to obtain a descriptive call stack
2. collect the call stack for joe_really_blow_it_internal
3. analyze the joe_really_blow_it_internal caller (see the sketch below)
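
A minimal sketch for step 2 of my own, run after java_jit_enabled has been set to false in the affected session; it prints only the top few stack frames, so the caller to analyze in step 3 is immediately visible:

#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Print the top of the user stack every time the Java exception handler
 * fires; the frame right below joe_really_blow_it_internal is the caller
 * to analyze further.
 */
pid$target:oracle:joe_really_blow_it_internal:entry
{
  ustack(5);
}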

Summary

Unlike in a conventional JVM, java.lang.OutOfMemoryError in the database JVM surprisingly doesn’t indicate a memory shortage. The error rather indicates that some hard-coded limit has been exceeded. Unfortunately, there aren’t any further clues about which limit exactly was reached, which makes the analysis difficult. A possible way to go about it is to capture the call stack at the point where the exception is thrown. The caller then has to be analyzed further until we can map it to the application Java code.

Update December 6, 2019 – data types

The maximum number of elements in an array depends on the data type. However, what’s always the same is the maximum total size in bytes: 536870895. In other words, the larger the data type, the fewer elements the array can store; for example, 536870895 / 4 ≈ 134217723 matches the int array limit observed above. By the way, this limit is the same across different releases (tested with 12.1, 18.5 and 19.5).

While that’s usually sufficient, it can become a problem when handling large BLOBs (in fact, that was exactly the use case that prompted this research). A 3rd party application uses database Java to spool a BLOB into a file and, before doing so, allocates a buffer of the size of the BLOB:

byte[] buffer = new byte[bufferSize];

Therefore, OutOfMemoryError will be thrown if the BLOB is larger than the hard-coded limit.

References

[1] Tanel Poder, What the heck are the /dev/shm/JOXSHM_EXT_x files on Linux?, May 9, 2014
[2] Oracle Corp., Database Reference, February 2019


Nenad Noveljic
