SQL Server
In this blog post I'll measure the overhead of an Extended Events trace. I'll also explain some potential pitfalls when interpreting the trace data.
The testing was conducted on SQL Server 2017, build 14.0.1000.169.
The following SQL batch took 61 seconds to complete:
DECLARE @starttime datetime, @endtime datetime ;
DECLARE @cnt int = 0 ;
SET @starttime = GETDATE() ;
WHILE @cnt < 100000000
BEGIN
    SET @cnt = @cnt + 1 ;
END ;
SET @endtime = GETDATE() ;
SELECT @@SPID spid, DATEDIFF(s, @starttime, @endtime) elapsed_seconds ;
GO
spid elapsed_seconds
58 61
By sampling sys.dm_exec_requests in another session we can verify that the batch is CPU bound (the times are in ms):
declare @spid int = 58, @i int = 1, @samples int = 60 ;
begin
    while ( @i <= @samples )
    begin
        select cpu_time, total_elapsed_time
        from sys.dm_exec_requests
        where session_id = @spid ;
        waitfor delay '00:00:01' ;
        set @i = @i + 1 ;
    end
end ;
cpu_time total_elapsed_time
56802 60483
After starting the sql_batch_completed trace, the execution time increased by more than 20%!
CREATE EVENT SESSION [batch] ON SERVER
ADD EVENT sqlserver.sql_batch_completed(SET collect_batch_text=(1)
WHERE ([sqlserver].[session_id]=(58)))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO
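Note that CREATE EVENT SESSION only defines the session; it has to be started before it captures anything:

ALTER EVENT SESSION [batch] ON SERVER STATE = START ;
GO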
spid elapsed_seconds
58 74
According to the Extended Event Trace the overhead wasn’t spent on the CPU:
cpu_time: 63,781,000 us
duration: 74,653,151 us
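The session scripted above has no target, so one way to get at such figures programmatically is to add a ring_buffer target and shred its XML. A minimal sketch, assuming the session definition was extended with an ADD TARGET package0.ring_buffer clause:

-- minimal sketch, assuming the [batch] session also has a
-- package0.ring_buffer target (not part of the definition above)
SELECT
    e.n.value('(data[@name="cpu_time"]/value)[1]', 'bigint') AS cpu_time_us,
    e.n.value('(data[@name="duration"]/value)[1]', 'bigint') AS duration_us
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t
    ON t.event_session_address = s.address
CROSS APPLY (SELECT CAST(t.target_data AS xml) AS x) c
CROSS APPLY c.x.nodes('RingBufferTarget/event[@name="sql_batch_completed"]') e(n)
WHERE s.name = 'batch' AND t.target_name = 'ring_buffer' ;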
In contrast, according to sys.dm_exec_requests the overhead was completely CPU bound:
cpu_time total_elapsed_time
73632 74270
By observing the CPU consumption in Task Manager we can confirm that the cpu_time reported by sys.dm_exec_requests is correct.
Luckily, not all events are detrimental to performance. For example, the logout event doesn't have any measurable impact, even though it also collects cpu_time:

CREATE EVENT SESSION [logout] ON SERVER
ADD EVENT sqlserver.logout(
WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(58))))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

spid elapsed_seconds
58 62
Oracle
I executed an equivalent test case on an Oracle 12.2 database on Solaris 11.3:
alter session set tracefile_identifier = 'LOOP' ;
alter session set timed_statistics = true ;
-- 10046 level 12 traces the SQL along with wait events and bind variables
alter session set events '10046 trace name context forever, level 12' ;
set timing on
declare
    i integer := 0 ;
begin
    while i < 500000000
    loop
        i := i + 1 ;
    end loop ;
end ;
/
alter session set events '10046 trace name context off' ;
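To inspect the raw trace file afterwards, its location can be retrieved from v$diag_info (the tracefile_identifier set above makes the file easy to spot in the trace directory):

select value from v$diag_info where name = 'Default Trace File' ;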
The tracing didn't introduce any measurable difference. The Oracle database CPU instrumentation is very efficient: it performs just two OS kernel calls to get the cumulative process CPU time, once before and once after the SQL execution. The difference between the two readings is the time the process spent on the CPU.
Conclusions
So, we can draw the following conclusions from the experiment:
- Some SQL Server Extended Events, like sql_batch_completed, can cause a significant increase in execution time.
- This overhead is CPU bound.
- The overhead CPU time is not part of the cpu_time reported in the Extended Events trace.
- It is, however, contained in sys.dm_exec_requests.cpu_time.
- The Oracle database code instrumentation rocks.
It therefore makes sense to quantify the overhead of individual events with the methodology described above before interpreting the data. Also, in some cases the trace data can be combined with sys.dm_exec_requests to get accurate results.
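As a minimal sketch of such a combination, the cumulative session CPU time could be sampled before and after the traced batch (session_id 58 is the test session from above) and the delta used as the accurate CPU figure alongside the duration taken from the trace:

-- sample the cumulative cpu_time (ms) of the monitored session
-- before the traced batch runs ...
SELECT cpu_time FROM sys.dm_exec_sessions WHERE session_id = 58 ;
-- ... and again after it completes; the delta is the CPU time
-- actually consumed, including the tracing overhead
SELECT cpu_time FROM sys.dm_exec_sessions WHERE session_id = 58 ;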