Solaris 11.3 Hangs Because of Kernel Object Manager

After upgrading to Solaris 11.3, I started occasionally hitting hangs lasting approximately two minutes.

There is extensive kmem_task activity just before the hang, which I recorded with the following DTrace script:


#!/usr/sbin/dtrace -s

profile:::profile-997Hz
/ execname == "kmem_task" /
{
    @[stack()] = count();
}

profile:::tick-1s
{
    printf("\n\n%Y\n", walltimestamp);
    printa("%k %@d\n", @);
    trunc(@);
}

[...truncated...]

genunix`kom_cachetag_destructor+0x27
genunix`kmem_magazine_destroy+0x67
genunix`kmem_depot_ws_reap+0x77
genunix`kmem_cache_reap+0x76
genunix`kmem_do_processing+0x28
genunix`kmem_async+0x159
genunix`kmem_sysdc_pinit+0x9f
unix`thread_start+0x8

168

Note: I keep an eye on kmem_task because I’ve already discovered another performance problem caused by this process.

kom_cachetag_destructor, the topmost function on the stack, is part of the Kernel Object Manager (KOM), the new kernel memory allocation mechanism introduced in Solaris 11.3 (see Metalink note 1430323.1, How to Understand ZFS File Data Value by mdb and ZFS ARC Size).
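
KOM appears to back the ZFS ARC data buffers in 11.3 (the kom_class kstat shown below is named arc_data), so a quick way to cross-check how much kernel memory is tied up in ZFS file data is mdb’s ::memstat dcmd. This is just a convenient sanity check, not part of the troubleshooting itself:

# Kernel memory breakdown, including ZFS file data (run as root);
# the exact category names vary between Solaris releases
echo ::memstat | mdb -k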

Therefore, I started collecting KOM-related performance data. Below are the kstat snapshots taken just before and after the freeze:

kstat -c kom_class

[...truncated...]

module: genunix instance: 1
name: arc_data class: kom_class
crtime 1434292.13355787
defrag_freed 102481
defrag_nomem 52794
defrag_partial 974
defrag_skipped 37792
mem_in_use 300896756224
mem_total 302138785792
snaptime 1575865.05633055

[...truncated...]

module: genunix instance: 1
name: arc_data class: kom_class
crtime 1434292.13355787
defrag_freed 181442
defrag_nomem 219274
defrag_partial 980
defrag_skipped 179675
mem_in_use 136538190336
mem_total 136616869888
snaptime 1575889.24564696

[...truncated...]

The counters show that kmem_task performs heavy defragmentation while the problem is happening: between the two snapshots defrag_freed, defrag_nomem and defrag_skipped increase sharply, and mem_in_use drops from roughly 280 GB to 127 GB.
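
To track these counters continuously instead of taking ad-hoc snapshots, kstat can be invoked with an interval and a count; a minimal sketch (the 10-second interval and one-hour duration are arbitrary choices):

# All kom_class kstats in parseable form every 10 seconds for one hour,
# each sample prefixed with a date(1) timestamp
kstat -p -T d -c kom_class 10 360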

The bug described in Metalink Doc ID 2129106.1, “arc_throttle causes heavy fragmentation on KOM slabs leading to hangs”, has similar symptoms.

This is still a work in progress. I’ll keep updating the blog post with relevant information, so stay tuned!

Update on 20 June 2016:

Oracle delivered an IDR patch containing the fixes for the following bugs, which completely resolved the issue:

  • 22347071 KOM fragmentation issue
  • 21748206 KOM defrag goes off the rails when ARC throttles under heavy I/O load
  • 23005679 KOM vacate has several rare race conditions
  • 18507051 Can’t boot systems with more than 100 cpus after tuning segkmem_lpsize
  • 23340416 KOM slab double free in capture

The bug fixes are not downloadable via Metalink, but you can obtain them by requesting an IDR from Oracle Support, which I highly recommend doing if you’re on Solaris 11.3.
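
For reference, an IDR delivered as an IPS package archive is typically installed roughly as shown below; the IDR number and archive file name are placeholders, so always follow the README shipped with your particular IDR:

# Install the IDR from the delivered .p5p archive (names are placeholders);
# a new boot environment is usually created and a reboot activates it
pkg install -g /var/tmp/idr2128.2.p5p idr2128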

For the final fix, Solaris engineering is currently targeting SRU 11.

Update on 1 December 2016:

The above-mentioned patches resolved the hanging issue completely. However, we still occasionally see I/O outliers caused by reaping the ZFS ARC. Recently, Oracle published the Metalink note “Solaris 11.3: kmem cache reaping of buffers with large number of entries can lead to long delays which could cause Cluster node eviction” (Doc ID 2205638.1), which relates to the problem.

If you’ve found the information in this blog post useful, you might also be interested in ZFS ARC Resizing (user_reserve_hint_pct).
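
To check whether the reaps correlate with the I/O outliers, you can measure how long kmem_depot_ws_reap, one of the functions from the stack trace above, runs. A minimal sketch, assuming the fbt entry/return probes for this function are available on your kernel build:

# Latency distribution of kmem_depot_ws_reap calls in nanoseconds
dtrace -n '
fbt::kmem_depot_ws_reap:entry
{
    self->ts = timestamp;
}

fbt::kmem_depot_ws_reap:return
/ self->ts /
{
    @["kmem_depot_ws_reap (ns)"] = quantize(timestamp - self->ts);
    self->ts = 0;
}'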
