SOS_SCHEDULER_YIELD waits during low CPU utilization

The SOS_SCHEDULER_YIELD wait event measures the time threads spend waiting in the SQL Server OS (SOS) scheduler’s runnable queue. Therefore, it inherently starts increasing on an oversubscribed system.
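
These figures can be inspected in the sys.dm_os_wait_stats DMV. A minimal query, just as a sketch, might look like this:

-- Cumulative SOS_SCHEDULER_YIELD figures since the last restart or counter reset
SELECT wait_type,
       wait_time_ms,
       signal_wait_time_ms  -- time spent waiting in the runnable queue
FROM sys.dm_os_wait_stats
WHERE wait_type = 'SOS_SCHEDULER_YIELD';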

Paul Randal warns that on a virtualized platform you also have to consider the CPU utilization of the whole physical server. Understandably, threads will queue up on an oversubscribed physical host and cause SOS_SCHEDULER_YIELD waits, even when the VM’s own CPU utilization is low.

But I’ve occasionally been seeing high SOS_SCHEDULER_YIELD signal wait times even when both the VM and the physical server were underutilized.

This prompted me to run a series of tests on SQL Server 2017 to learn exactly when this happens. For each run I measured the SOS_SCHEDULER_YIELD wait time, the CPU time per scheduler and the elapsed time. The benchmark and the statistical analysis are fully automated and can be downloaded from my GitHub repository.
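
The actual harness is in the repository; conceptually, the wait and per-scheduler CPU figures for a single run can be collected by resetting or snapshotting the cumulative DMV counters before the run and reading them again afterwards. A simplified sketch of that idea (not the real scripts):

-- Before the run: reset the cumulative wait statistics
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

-- Before the run: snapshot the per-scheduler CPU counters
SELECT scheduler_id, total_cpu_usage_ms
INTO #cpu_before
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';

-- ... execute the concurrent load ...

-- After the run: CPU time consumed by each scheduler during the run
SELECT s.scheduler_id,
       s.total_cpu_usage_ms - b.total_cpu_usage_ms AS total_cpu_usage_ms
FROM sys.dm_os_schedulers AS s
JOIN #cpu_before AS b ON b.scheduler_id = s.scheduler_id
WHERE s.status = 'VISIBLE ONLINE';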

What I found out is that SQL Server starts throttling once the load exceeds roughly 50% of the CPU capacity: it puts threads into the runnable queue instead of dispatching them to the idle CPUs. That’s why we sometimes see SOS_SCHEDULER_YIELD waits even when there’s plenty of spare CPU capacity.

By the way, I also performed similar load tests with a C program, and it scaled perfectly. I did this to rule out the usual suspects, such as the physical host, the OS, the virtualization layer, hyperthreading, etc.

The following table quantifies the effects of the scheduler throttling (the server has 12 CPU cores):

Load AVG SOS waits(%) MAX SOS waits(%)   Run id
---- ---------------- ---------------- --------
  23             75.1            101.2     1554
  22             71.3               96      235
  21             68.6             89.5     1263
  20             63.4             84.3     1326
  19             58.1             78.2      465
  18             46.9             65.8     1868
  17             43.2             60.2     2126
  16             35.5             42.6      398
  15             28.2             42.8       91
  14             19.6             34.3      778
  13             15.4             22.5     1665
  12             10.8             24.2     1617
  11              9.7             12.7     1435
  10              9.3             13.9      136
   9              7.4             15.9      329
   8              5.9             17.9     1560
   7              3.9             21.9     1361
   6              0.3             24.3     1432
   5              0.1              0.1      887
   4              0.1              0.5     2181
   3              0.1              0.1     1808
   2              0.1              0.2     1711

The meaning of the columns is as follows:

  • Load – the number of sessions concurrently executing the load
  • AVG SOS waits(%) – the average SOS_SCHEDULER_YIELD wait overhead for the given load
  • MAX SOS waits(%) – the maximum SOS_SCHEDULER_YIELD wait overhead for the given load
  • Run id – the run id of the test with the highest SOS_SCHEDULER_YIELD overhead for the given load

As you can see, SQL Server already started throttling when it had to serve just 7 concurrent sessions. There were even some outliers with 6 concurrent sessions, such as the test with run id 1432.

Let’s zoom in on it:

===============================
Run:  1432
Threads:  6
Elapsed times (ms):
9245.1896
6905.6312
6819.9694
6463.959
9178.9062
6591.4602


Count    : 6
Average  : 7534.18593333333
Sum      : 45205.1156
Maximum  : 9245.1896
Minimum  : 6463.959


wait_type           wait_time_ms signal_wait_time_ms
---------           ------------ -------------------
SOS_SCHEDULER_YIELD         8840                8837



scheduler_id total_cpu_usage_ms
------------ ------------------
           0               6822
           1               6375
           2                  2
           3                  0
           4               6512
           5                  1
           6                  0
           7                  0
           8               9394
           9               6726
          10                  0
          11                  1

In the excerpt above we can clearly see that only 5 schedulers were busy, even though there were 6 active sessions. In other words, on average one session had to wait despite there being enough CPU capacity in both the VM and the physical server. Moreover, not all of the sessions were equally penalized: the two slowest sessions took around 40% longer to complete than the fastest one. In short, you can experience highly volatile response times even at moderate CPU usage, without any obvious reason.

This variability in relation to the mean can be measured with the coefficient of variation (CV), which is defined as the ratio of the standard deviation to the mean and is often expressed as a percentage. The higher the CV, the more volatile the execution times.
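
As a worked example, plugging the elapsed times of run 1432 into a quick T-SQL calculation with the population standard deviation (STDEVP) reproduces the 15.9% reported for that run in the table below (this is just my cross-check, not the script from the repository):

-- CV of the elapsed times of run 1432, in percent
SELECT 100.0 * STDEVP(elapsed_ms) / AVG(elapsed_ms) AS cv_pct  -- ~15.9
FROM (VALUES (9245.1896), (6905.6312), (6819.9694),
             (6463.959),  (9178.9062), (6591.4602)) AS t(elapsed_ms);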

The table below shows how the CV increases with the load (Run id 1 is the run with the maximum CV, Run id 2 the run with the maximum SOS_SCHEDULER_YIELD overhead for the given load):

Load AVG CV AVG SOS waits(%) MAX CV Run id 1 MAX SOS waits(%) Run id 2
---- ------ ---------------- ------ -------- ---------------- --------
  23   19.7             75.1   32.6     2246            101.2     1554
  22     21             71.3   33.8     2209               96      235
  21   20.3             68.6   41.4     1198             89.5     1263
  20   20.7             63.4   30.8     1183             84.3     1326
  19   20.2             58.1   33.5     2372             78.2      465
  18   19.9             46.9   29.1      812             65.8     1868
  17     20             43.2   30.3     1262             60.2     2126
  16   19.4             35.5   37.9      402             42.6      398
  15     18             28.2   23.4     1671             42.8       91
  14   14.7             19.6   22.1     2314             34.3      778
  13   12.2             15.4   25.5     1158             22.5     1665
  12   10.1             10.8   18.4     2382             24.2     1617
  11    9.5              9.7   14.9      397             12.7     1435
  10    8.8              9.3   14.1      136             13.9      136
   9      8              7.4   15.5      329             15.9      329
   8    7.4              5.9   16.7     1787             17.9     1560
   7    6.6              3.9   21.1     1361             21.9     1361
   6      5              0.3   15.9     1432             24.3     1432
   5    4.1              0.1   10.4     1788              0.1      887
   4    3.9              0.1   12.9      543              0.5     2181
   3    3.2              0.1   16.8     2015              0.1     1808
   2    0.4              0.1    1.7      317              0.2     1711
   1      0              0.1      0     2398              0.4     1853

Notice that the variability is often relatively high even when the SOS_SCHEDULER_YIELD wait time is insignificant, which means that these waits aren’t the only contributor to the volatile response times.

This unbalanced scheduling seems to be a known problem. For instance, Chris Adkin suggests combating it with trace flag 8008, an undocumented flag that forces the assignment of tasks to the scheduler with the least load.
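
An undocumented flag like this should only be tried on test systems. For a test it can be enabled globally with DBCC TRACEON (or, alternatively, with the -T8008 startup parameter), for example:

-- Enable trace flag 8008 globally until the next restart (test systems only)
DBCC TRACEON (8008, -1);

-- Check whether the flag is active
DBCC TRACESTATUS (8008);

-- Disable it again
DBCC TRACEOFF (8008, -1);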

So I repeated the tests with trace flag 8008 set and, unfortunately, obtained not so spectacular results:

Load AVG CV AVG SOS waits(%) MAX CV Run id 1 MAX SOS waits(%) Run id 2
---- ------ ---------------- ------ -------- ---------------- --------
  23   15.7             69.9   33.1     1639             90.9     1578
  22   17.7             68.3   38.8      592             85.3     1589
  21   18.2             65.1   31.3      462               80     1803
  20   19.5             57.9   31.5      546             74.5      305
  19   19.4             56.2   29.8      227             78.9     1542
  18   19.6             50.8   22.7      635             63.4      566
  17   19.9             42.1   30.6     1515               49      957
  16   19.3             36.6     29     1047             50.7     1499
  15   18.5             28.4     26        1             31.2      200
  14   14.3             19.4   19.6     1184             32.9     1184
  13   12.4             15.5   20.1     1959             36.8     1959
  12    8.7              9.5   16.4     2325             24.5     2325
  11      9              9.7   17.2     1532             27.1     1532
  10    7.4              7.6     13      204             13.8      204
   9    6.8                6   14.9     1901             15.8     1901
   8    6.1                4   18.1      847             18.6      847
   7    4.9              1.1   20.5     1332             22.2     1332
   6    4.8              0.1   10.8     2373              0.1       20
   5    4.1              0.1     11     1504              0.1      676
   4    4.1              0.1   12.6     2122              0.1     1396
   3    3.4              0.1   17.9     1816              0.1     2014
   2    0.5              0.1    1.8      263              0.1      593
   1      0                0      0     2320              0.1     1565

With the flag, the systematic throttling started at a somewhat higher load, 8 concurrent sessions instead of 7, but the magnitude of the problem remained the same.

In conclusion, throttling can occur even though there’s enough CPU capacity to handle the workload. Trace flag 8008 alleviates the problem a little, but doesn’t resolve it. CPU overprovisioning can postpone the throttling and can therefore serve as a workaround.

Nenad Noveljic
