Bind Variable Length in PL/SQL, C, Java and Python

Bind variable length has a significant impact on the execution plan. The change in size can prompt optimizer to create a new child cursor. Mohamed Houri described this behavior in his blog post Why my execution plan has not been shared – Part III .

In this article I discuss nuances of how different programming languages handle bind variable length. The experiments were performed in Oracle releases 19.16 and 21.7.

SQLPlus

I’ll start with a short demo in SQLPlus. First, I’ll pass the variable of the length 20 to a query:

create table t (c varchar2(4000));

var v varchar2(20)
exec :v := 'AAAA'
select * from t where c=:v ;

The optimizer created a child cursor for the VARCHAR2(20) variable.

select count(*) from v$sql_shared_cursor where sql_id = '5a8bp2fr4v6w9' ;

  COUNT(*)
----------
         1

If we increase the size in declaration to 90, the optimizer will create another cursor:

var v varchar2(90)
exec :v := 'AAAA'
select * from t where c=:v ;

select count(*) from v$sql_shared_cursor where sql_id = '5a8bp2fr4v6w9' ;

  COUNT(*)
----------
         2

It’s worth noting that I just changed the definition of the bind variable, but the value remained the same (‘AAAA’). Although the value hasn’t changed, the optimizer generated a new child cursor.

Two cursors are for two ranges of VARCHAR2 sizes – one with the upper limit of 2000 characters and the other with the upper limit of 128 characters:

select datatype_string from v$sql_bind_capture where  sql_id = '5a8bp2fr4v6w9' ;
DATATYPE_STRING
------------------------------------------------------------
VARCHAR2(2000)
VARCHAR2(128)

Peculiarity

Let’s take a closer look at the cursors and bind variables lengths that were passed when the cursors were created. The first cursor was created for the bind variable length 20 and the second for 90.

select datatype_string from v$sql_bind_capture where  sql_id = '5a8bp2fr4v6w9' ;
DATATYPE_STRING
------------------------------------------------------------
VARCHAR2(2000)
VARCHAR2(128)

But the bind variable lengths 20 and 90 don’t match the buckets for which the child cursors were created (128 and 2000, respectively). Both bind variable lengths, i.e. 20 and 90, should have fit into the first bucket, that is VARCHAR2(128). Simply put, the second cursor, i.e. for the bucket with the upper limit 2000 shouldn’t have been created.

In order to understand why the second cursor was created we need to find out which bind variable length is being passed to the database.

SQL Trace

The information about the bind variable length is burried deep down in the SQL trace file and it isn’t documented. I identified the field where it’s stored by passing variables of different lengths and correlating a known bind variable length with the cryptic data in the raw SQL trace file. For example, below is the bind entry for the SQLPlus call where I passed the value 90:

 Bind#0
  oacdty=01 mxl=2000(270) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000000 frm=01 csi=873 siz=2000 off=0
  kxsbbbfp=7f9982103b40  bln=2000  avl=04  flg=05
  value="AAAA"

In fact, we’re dealing with two sizes. One is the actual size of the passed data, which is 4 for the string “AAAA”. It’s stored in the field avl, as documented in Cary Millsap’s book The Method R Guide to MASTERING ORACLE TRACE DATA. We won’t consider the actual size further, because it’s irrelevant for the optimizer. The value of interest is mxl, which possibly stands for “maximum length”. Unlike avl, mxl contains the bind vaiable length, and it is relevant for the optimizer. It consists of two values: 2000(270). 270 determines in which bucket the bind variable value will be placed. 2000 is the upper limit of the bucket. Simply put, 270 belongs to the bucket with the upper limit 2000.

The question is: where did the value 270 come from? The variable length in declation, i.e. 90, was multiplied by 3 (90*3=270). Optimizer uses 270 for deciding in which bucket to place the bind variable. To understand where the multiplication by 3 had came from, I did an experiment using the Oracle Call Interface (OCI) with C. In C we have the highest degree of control of what’s being sent to the database.

C

I modified cdemo81.c OCI demo program to bind a known value and correlate it with the raw SQL Trace data. The code for binding the variable looks like this:

static text *selt = (text *)"SELECT 1 FROM t WHERE c = :C";
...
text     *c = (text *)"AAAA";
...
  if ((status = OCIBindByName(stmthp1, &bnd1p, errhp, (text *) ":C",
             -1, (dvoid *) c,
             90, SQLT_STR, (dvoid *) 0,
             (ub2 *) 0, (ub2 *) 0, (ub4) 0, (ub4 *) 0, OCI_DEFAULT)) )
  {
    checkerr(errhp, status);
    finish_demo(svchp, srvhp, authp,stmthp, stmthp1,inserthp);
    return OCI_ERROR;
  }

My call to OCIBindByName binds the text “AAAA” to the bind variable “:C”. The specification of OCIBindByPos is as follows:

sword OCIBindByPos ( OCIStmt       *stmtp, 
                     OCIBind      **bindpp,
                     OCIError      *errhp,
                     ub4            position,
                     void          *valuep,
                     sb4            value_sz,
                     ub2            dty,
                     void          *indp,
                     ub2           *alenp,
                     ub2           *rcodep,
                     ub4            maxarr_len,
                     ub4           *curelep, 
                     ub4            mode );

value_sz is the maximum size possible in bytes for the binding data, which is stored in *valuep. Basically, a developer can set value_sz to any value. I set value_sz to 90 and got the following entry in the SQL trace:

 Bind#0
  oacdty=01 mxl=2000(267) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000000 frm=01 csi=873 siz=2000 off=0
  kxsbbbfp=7f2923d0f920  bln=2000  avl=04  flg=05
  value="AAAA"

I passed 90, but this value was transformed to 267. 1 was subtracted from the original value 90 and then multiplied by 3. This behavior is somewhat consistent to what we saw in SQLPlus. This would mean that SQLPlus makes a similar call for binding values where it passes VARCHAR2 length (from the declaration)+1 as value_sz. As a consequence, the bind values can be assigned to wrong buckets that don’t match the actual data size.

This inconsistency prompted me to inspect how different programming languages behave. I started with PL/SQL.

PL/SQL

I declared the bind variable as VARCHAR2(4000) and binded a 90 characters long VARCHAR2:

declare
  l_n integer ;
  l_v varchar2(4000);
begin
  l_v := lpad('A',90,'A');
  begin
    select 1 into l_n from t where c = l_v;
  exception when no_data_found then
    null ;
  end;
end ;
/

Oracle uses the actual length of the data (90) for bucketizing:

 Bind#0
  oacdty=01 mxl=128(90) mxlc=00 mal=00 scl=00 pre=00
  oacflg=13 fl2=206001 frm=01 csi=873 siz=128 off=0
  kxsbbbfp=7f40ed211c88  bln=128  avl=90  flg=09
  value="AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

Two points are worth noting. One is that, unlike SQLPlus, the actual variable size instead of the size in the variable declaration is passed as the bind variable length (mxl). The other is, that, unlike in C, the actual size isn’t manipulated, so the value will be assigned to the bucket where it really belongs.

The fact that the actual size is passed as the maximum value means that we should get multiple cursors created when binding values of different sizes. The following code binds the strings of lengths between 1 and 32767,

declare
  l_n integer;
  l_v varchar2(32767);
begin
  for i in 1..32767
  loop
  	l_v := lpad('A',i,'A');
    begin
      select 1 into l_n from t where c = l_v;
	exception when no_data_found then
	  null;
	end;
  end loop;
end;
/

This yields several buckets with the following upper limits:

select datatype_string from v$sql_bind_capture where  sql_id = '85ff9tkukjsg0' ;

DATATYPE_STRING
------------------------------------------------------------
VARCHAR2(32767)
VARCHAR2(16386)
VARCHAR2(8192)
VARCHAR2(4000)
VARCHAR2(2000)
VARCHAR2(128)
VARCHAR2(32)

In conclusion, there can be up to 7 child cursor for a query due to bind variable length changes.

Java

I did the same experiment in Java – binding the 90 characters long string and looking in SQL trace to see which maximum data length is used for binding.

import java.sql.*;

public class ExecutePreparedStatement4 {
    public static void main(String[] argv) {
        Connection con = null ;
        try {
            Class.forName("oracle.jdbc.driver.OracleDriver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            return;
        }

        try {
            con = DriverManager.getConnection(
                "jdbc:oracle:thin:@ " + argv[0], argv[1], argv[2]);
            CallableStatement trace=con.prepareCall( "{ call DBMS_MONITOR.session_trace_enable(waits=>FALSE, binds=>TRUE) }" );
            trace.execute();
            trace.close();

            String sql = "select 1 from t where c = ?";
            PreparedStatement s = con.prepareStatement(sql);

            String bindVar ;

            bindVar = new String(new char[90]).replace('\0', 'A');
            s.setString(1,bindVar);
            s.executeQuery();

        } catch (SQLException e) {
            e.printStackTrace();
            return;
        }
    }
}
 Bind#0
  oacdty=01 mxl=2000(360) mxlc=00 mal=00 scl=00 pre=00
  oacflg=03 fl2=1000010 frm=01 csi=873 siz=2000 off=0
  kxsbbbfp=7f9e1ae28a18  bln=2000  avl=90  flg=05
  value="AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

The actual string length was multiplied by 4.

Python

Python behaves exactly like Java – it calculates the bind variable length by multiplying the actual length with 4:

import cx_Oracle

connection = cx_Oracle.connect(user="u", password="Temp_1234",
                               dsn="DFEX5",
                               encoding="UTF-8")

cur = connection.cursor()

cur.execute("begin DBMS_MONITOR.session_trace_enable(waits=>FALSE, binds=>TRUE); end;");
cur.execute("select 1 from t where c = :c1", ["A" * 90])
 Bind#0
  oacdty=01 mxl=2000(360) mxlc=00 mal=00 scl=00 pre=00
  oacflg=07 fl2=1000010 frm=01 csi=873 siz=2000 off=0
  kxsbbbfp=7f74f187d930  bln=2000  avl=90  flg=05
  value="AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

Summary

In summary, optimizer uses the size of the bind variable for deriving execution plans. The lengths are divided into 7 buckets with the following upper limits:

VARCHAR2(32767)
VARCHAR2(16386)
VARCHAR2(8192)
VARCHAR2(4000)
VARCHAR2(2000)
VARCHAR2(128)
VARCHAR2(32)

When the bind variable size changes, and the child cursor for this size doesn’t exist, the optimizer will create a new one.

The term “bind variable size” is ambigous and programming language specific. It’s derived from the value which is specified in the binding OCI call. In C, an application programmer has the direct control over it and can set it to any value. 1 is subtracted from this value and than multiplied by 3. As a consequence, VARCHAR2 values can be assigned to wrong buckets.

In higher level programming languages like Java and Python, this decision is taken away from the application developer – you can’t specify that value. They are set by language developers instead. It’s worth noting that the actual variable size is multiplied by 4, so also here the variables will be assigned to wrong buckets, but in a different manner than in C.

In PL/SQL there aren’t any manipulations on the bind variable length, so the bucketizing is most precise.

Interestingly, SQLPlus developers decided to pass the variable size from the declaration as (incremented by 1) as opposed to the actual size, so unlike in C, PL/SQL, Java and Python, new cursors won’t be generated if the data size changes as long as the declaration remains the same.

As a consequence, optimizer can do different calculations for the same query in different programming languages. That’s good to be aware of if you’re porting an application to another language. Or, for example, if you’re testing your queries in SQLPlus, you might get different execution plans after you embed the the same query in the application code.

Thanks for sharing

Nenad Noveljic

4 Comments

  1. >@ ver1
    PORT_STRING VERSION BANNER CON_ID
    ——————– ———- —————————————————————————- ——
    IBMPC/WIN_NT64-9.1.0 12.2.0.1.0 Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 – 64bit Production 0

    create table t (c varchar2(4000));

    var v varchar2(20)
    exec :v := ‘AAAA’
    select * from t where c=:v ;

    var v varchar2(90)
    exec :v := ‘AAAA’
    select * from t where c=:v ;

    > select datatype_string from v$sql_bind_capture where sql_id = ‘5a8bp2fr4v6w9’ ;
    DATATYPE_STRING
    —————
    VARCHAR2(128)
    VARCHAR2(32)

    –//PL/SQL
    https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/datatypes.htm

    VARCHAR2 Datatype

    You use the VARCHAR2 datatype to store variable-length character data. How the data is represented internally depends on
    the database character set. The VARCHAR2 datatype takes a required parameter that specifies a maximum size up to 32767
    bytes. The syntax follows:

    VARCHAR2(maximum_size [CHAR | BYTE])

    You cannot use a symbolic constant or variable to specify the maximum size; you must use an integer literal in the range 1 .. 32767.

    Small VARCHAR2 variables are optimized for performance, and larger ones are optimized for efficient memory use. The
    cutoff point is 2000 bytes. For a VARCHAR2 that is 2000 bytes or longer, PL/SQL dynamically allocates only enough memory
    to hold the actual value. For a VARCHAR2 variable that is shorter than 2000 bytes, PL/SQL preallocates the full declared
    length of the variable. For example, if you assign the same 500-byte value to a VARCHAR2(2000 BYTE) variable and to a
    VARCHAR2(1999 BYTE) variable, the former takes up 500 bytes and the latter takes up 1999 bytes.

    –// I think cutoff point is 1000,not 2000.

    $ cat aa.txt
    declare
    instring varchar2(&&1);
    begin
    for i in 1..1000 loop
    instring := rpad(‘X’,i,’X’);
    execute immediate ‘select /*+ find_me &&1 */ count(*) from t where c=:instring’ using instring ;
    end loop;
    end;
    /

    @aa.txt 1000
    @aa.txt 1001

    SCOTT@test01p> select sql_text c100 ,sql_id from v$sqlarea where sql_text like ‘%find_me 100%’ and sql_text not like ‘declare%’;
    C100 SQL_ID
    —————————————————————————————————- ————-
    select sql_text,sql_id from v$sqlarea where sql_text like ‘%find_me 100%’ 8rtb6zntzs09f
    select /*+ find_me 1000 */ count(*) from t where c=:instring 1gkmdzsqch6wc
    select /*+ find_me 1001 */ count(*) from t where c=:instring 0xtswwhbwrjy6

    –//sql_id=1gkmdzsqch6wc ,0xtswwhbwrjy6

    > select datatype_string from v$sql_bind_capture where sql_id = ‘1gkmdzsqch6wc’ ;
    DATATYPE_STRING
    —————
    VARCHAR2(2000)

    > select datatype_string from v$sql_bind_capture where sql_id = ‘0xtswwhbwrjy6’ ;
    DATATYPE_STRING
    —————
    VARCHAR2(2000)
    VARCHAR2(128)
    VARCHAR2(32)

    • Thank you for the explanation and the demo that shows that also the variable length in the declaration has an impact on cursor creation. That’s a distinguishing feature more – between languages where the variables are declared with length, such as PL/SQL and C, and those where the length isn’t specified, like Java and Python.

  2. Hi Nenad
    IMO, since you are using AL32UTF8 (873) as character set, the right multiplication factor should be 4 (and not 3) also in SQL*Plus. In fact, with AL32UTF8, every character can take up to 4 bytes.
    Another thing that according to my tests SQL*Plus doesn’t correctly do is to make a difference between VARCHAR2(n CHAR) and VARCHAR2(n BYTE).
    Best,
    Chris

    • Hi Chris,

      Thanks for this information. The topic becomes more and more interesting as people share the results of their own tests.

      1. The multiplication by 3 isn’t specific to SQLPlus – it probably happens in the OCI layer (my demo written in C “suffers” the multiplication by 3, too).

      But is it really necessary to multiply the string length with maximum character size for the used characterset if the string consists only of single byte characters? In PL/SQL there isn’t any multiplication factor at all. A possible explanation is that the server-side PL/SQL is more sophisticated and it takes the actual bind value length as oposed to calculating the maximum string length for a given characterset. An interesting test would be to bind a varchar2 value containing a multi-byte character and see if mxl changes.

      2. The same handling of CHAR and BYTE must be specific to SQLPlus as programming languages other than PL/SQL don’t discern CHAR and BYTE – they just declare strings. Repeating the same test but with a stored PL/SQL procedure would be an interesting test.

      Best regards,
      Nenad

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.