slabinfo

All of the memory on one of our Oracle Linux VMs was exhausted, and the out-of-memory (OOM) killer started killing processes:

Jun 29 10:32:48 host kernel: OSWatcher.sh invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=1, oom_score_adj=0

OSWatcher had been collecting /proc/meminfo, which proved invaluable for the analysis. Almost all of the memory was consumed by slabs. Slab grew from 2928720 kB to 77052224 kB within a couple of minutes:

grid@host:/u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/archive/oswmeminfo> egrep "zzz|Slab|MemAvailable" host_meminfo_22.06.29.1000.dat
zzz ***Wed Jun 29 10:24:15 CEST 2022
MemAvailable:   50734336 kB
Slab:            2928720 kB
zzz ***Wed Jun 29 10:24:45 CEST 2022
MemAvailable:   50754784 kB
Slab:            2928428 kB
zzz ***Wed Jun 29 10:25:15 CEST 2022
MemAvailable:   50748708 kB
Slab:            2928400 kB
zzz ***Wed Jun 29 10:25:45 CEST 2022
MemAvailable:    4890828 kB
Slab:           47753392 kB
zzz ***Wed Jun 29 10:26:15 CEST 2022
MemAvailable:     853208 kB
Slab:           52054036 kB
zzz ***Wed Jun 29 10:26:45 CEST 2022
MemAvailable:     767924 kB
Slab:           52249192 kB
zzz ***Wed Jun 29 10:27:15 CEST 2022
MemAvailable:     750048 kB
Slab:           52442408 kB
zzz ***Wed Jun 29 10:27:45 CEST 2022
MemAvailable:     673256 kB
Slab:           52612572 kB
zzz ***Wed Jun 29 10:28:15 CEST 2022
MemAvailable:     692748 kB
Slab:           53122860 kB
zzz ***Wed Jun 29 10:28:45 CEST 2022
MemAvailable:     595100 kB
Slab:           53553064 kB
zzz ***Wed Jun 29 10:29:16 CEST 2022
MemAvailable:     568432 kB
Slab:           53969576 kB
zzz ***Wed Jun 29 10:29:46 CEST 2022
MemAvailable:     572700 kB
Slab:           54801556 kB
zzz ***Wed Jun 29 10:30:16 CEST 2022
MemAvailable:     615584 kB
Slab:           55338836 kB
zzz ***Wed Jun 29 10:30:46 CEST 2022
MemAvailable:     599560 kB
Slab:           55715568 kB
zzz ***Wed Jun 29 10:31:16 CEST 2022
MemAvailable:     555600 kB
Slab:           55861032 kB
zzz ***Wed Jun 29 10:31:47 CEST 2022
MemAvailable:     556200 kB
Slab:           56051892 kB
zzz ***Wed Jun 29 10:32:17 CEST 2022
MemAvailable:     491812 kB
Slab:           56165600 kB
zzz ***Wed Jun 29 10:32:48 CEST 2022
MemAvailable:     468500 kB
Slab:           56284980 kB
zzz ***Wed Jun 29 10:33:18 CEST 2022
MemAvailable:    2326488 kB
Slab:           74419980 kB
zzz ***Wed Jun 29 10:35:40 CEST 2022
MemAvailable:     227620 kB
Slab:           77052224 kB
zzz ***Wed Jun 29 10:38:11 CEST 2022
MemAvailable:     229028 kB
Slab:           77059908 kB
zzz ***Wed Jun 29 10:42:13 CEST 2022
MemAvailable:     227812 kB
Slab:           77067504 kB
zzz ***Wed Jun 29 10:48:35 CEST 2022
MemAvailable:     236408 kB
Slab:           77081200 kB
zzz ***Wed Jun 29 10:49:42 CEST 2022
MemAvailable:     224536 kB
Slab:           77095736 kB
zzz ***Wed Jun 29 10:54:03 CEST 2022
MemAvailable:     232752 kB
Slab:           77107492 kB
zzz ***Wed Jun 29 10:55:58 CEST 2022
MemAvailable:     222448 kB
Slab:           77118004 kB
zzz ***Wed Jun 29 10:57:48 CEST 2022
MemAvailable:     227048 kB
Slab:           77127064 kB
zzz ***Wed Jun 29 11:00:05 CEST 2022
MemAvailable:     215600 kB
Slab:           77137560 kB
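
If no OSWatcher archive is available, the same two values can also be watched live. A simple sketch using watch with a 30-second interval, matching the OSWatcher sampling shown above:

watch -n 30 'grep -E "MemAvailable|Slab" /proc/meminfo'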

The next step would have been to break down the slab allocations to see which slab cache was causing the problem.

This can be done, for example, with slabtop (the outputs below were taken after the problem disappeared):

slabtop
 Active / Total Objects (% used)    : 4618223 / 4645386 (99.4%)
 Active / Total Slabs (% used)      : 78179 / 78179 (100.0%)
 Active / Total Caches (% used)     : 101 / 141 (71.6%)
 Active / Total Size (% used)       : 826284.55K / 837056.44K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.18K / 8.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
1029405 1029181  99%    0.20K  26395       39    211160K dentry
566076 566076 100%    0.09K  13478       42     53912K kmalloc-rcl-96
417280 417280 100%    0.01K    815      512      3260K kmalloc-8
378250 378250 100%    0.02K   2225      170      8900K avtab_node
334976 332731  99%    0.03K   2617      128     10468K kmalloc-32
303461 302987  99%    0.05K   4157       73     16628K Acpi-Parse
231040 231040 100%    0.06K   3610       64     14440K kmalloc-64
220672 220672 100%    0.02K    862      256      3448K kmalloc-16
200736 200489  99%    0.04K   1968      102      7872K avtab_extended_perms
195120 193680  99%    1.06K   6504       30    208128K xfs_inode
128632 128632 100%    0.57K   2297       56     73504K radix_tree_node
 94878  94350  99%    0.19K   2259       42     18072K dmaengine-unmap-16
 67524  66956  99%    0.62K   1324       51     42368K inode_cache
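
For scripted collection, slabtop can also print a single non-interactive snapshot sorted by cache size; a minimal example, where -o prints the output once and exits and -s c selects cache size as the sort criterion:

slabtop -o -s c | head -20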

Another way is vmstat -m:

vmstat -m | sort -k 3 -n -r
Cache                       Num  Total   Size  Pages
dentry                   1057538 1058109    208     39
kmalloc-rcl-96           566076 566076     96     42
kmalloc-8                417280 417280      8    512
avtab_node               378250 378250     24    170
kmalloc-32               335889 337024     32    128
Acpi-Parse               329632 330106     56     73
kmalloc-64               233600 233600     64     64
kmalloc-16               220928 220928     16    256
avtab_extended_perms     201042 201042     40    102
xfs_inode                195202 195240   1088     30
radix_tree_node          136808 136808    584     56
dmaengine-unmap-16        94920  94920    192     42
inode_cache               67493  67626    640     51
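
Since vmstat -m prints object counts and object sizes, a rough per-cache memory estimate can be derived from it as well. The awk one-liner below is only a sketch: it multiplies the total number of objects (column 3) by the object size (column 4), so it ignores per-slab overhead and unused space within partially filled slabs:

vmstat -m | awk 'NR > 1 { printf "%-24s %10.0f KB\n", $1, $3 * $4 / 1024 }' | sort -k 2 -n -r | head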

You can also read /proc/slabinfo directly:

cat /proc/slabinfo | sort -k 2 -n -r
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
dentry            1075951 1079676    208   39    2 : tunables    0    0    0 : slabdata  27684  27684      0
kmalloc-rcl-96    566160 566160     96   42    1 : tunables    0    0    0 : slabdata  13480  13480      0
kmalloc-8         417280 417280      8  512    1 : tunables    0    0    0 : slabdata    815    815      0
avtab_node        378250 378250     24  170    1 : tunables    0    0    0 : slabdata   2225   2225      0
Acpi-Parse        344990 349086     56   73    1 : tunables    0    0    0 : slabdata   4782   4782      0
kmalloc-32        339085 340352     32  128    1 : tunables    0    0    0 : slabdata   2659   2659      0
kmalloc-64        245855 245952     64   64    1 : tunables    0    0    0 : slabdata   3843   3843      0
kmalloc-16        230400 230400     16  256    1 : tunables    0    0    0 : slabdata    900    900      0
avtab_extended_perms 211344 211344     40  102    1 : tunables    0    0    0 : slabdata   2072   2072      0
xfs_inode         207539 207540   1088   30    8 : tunables    0    0    0 : slabdata   6918   6918      0
radix_tree_node   155786 156016    584   56    8 : tunables    0    0    0 : slabdata   2786   2786      0
dmaengine-unmap-16 100842 100842    192   42    2 : tunables    0    0    0 : slabdata   2401   2401      0
inode_cache        68388  68850    640   51    8 : tunables    0    0    0 : slabdata   1350   1350      0
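
The slabinfo columns also let you calculate each cache's actual memory footprint: the number of slabs times the pages per slab times the page size. A minimal awk sketch, run as root and assuming the usual 4 KiB page size (check with getconf PAGESIZE); $6 is pagesperslab and $15 is num_slabs, so multiplying by 4 yields KB:

awk 'NR > 2 { printf "%-24s %10d KB\n", $1, $15 * $6 * 4 }' /proc/slabinfo | sort -k 2 -n -r | head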

OSWatcher uses the last approach to get the slab information: it reads the file /proc/slabinfo.

Unfortunately, this file is readable only by root, so OSWatcher, which runs as the grid user, can't collect this information:

ls -l /proc/slabinfo
-r--------. 1 root root 0 Jun 29 15:04 /proc/slabinfo

The purpose of this blog post is to show how to display slab allocations and to warn that you need to change the permissions on /proc/slabinfo so that OSWatcher can collect this information.
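
A minimal sketch of the permission change, run as root, assuming you accept making the kernel slab statistics world-readable on this host; note that the change does not survive a reboot, so it has to be reapplied at boot time, for example from a startup script:

# make /proc/slabinfo readable by all users, including grid
chmod 444 /proc/slabinfo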

