Oracle Trace File Analyzer (TFA) starts OSWatcher under the user grid.
ps -e -o pid,user,cmd | grep OSW
5007 grid /bin/sh ./OSWatcher.sh 30 48 NONE /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/archive
5689 grid /bin/sh ./OSWatcherFM.sh 48 /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/archive
From a security point of view, it’s correct to run a program with the least privileges necessary. However, on Linux, OSWatcher can’t read from /proc/slabinfo when it runs as grid, because the file is readable only by root:
ls -l /proc/slabinfo
-r--------. 1 root root 0 Jul 14 17:16 /proc/slabinfo
That’s the reason why the OSWatcher archive for slabinfo (oswslabinfo) is empty:
ls /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/archive/oswslabinfo/
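Other collectors might be failing silently in the same way, so it can be worth listing every empty subdirectory of the archive rather than checking oswslabinfo alone. The following is a minimal sketch, assuming GNU find; the function name empty_archives is made up for this illustration.

```shell
# empty_archives ARCHIVE_DIR
# Prints every direct subdirectory of ARCHIVE_DIR that contains no files.
# Assumes GNU find (-mindepth, -maxdepth and -empty are GNU extensions).
empty_archives() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -empty | sort
}
```

Called with the archive path shown above, it should print the oswslabinfo directory, plus any other collector directory that has received no data.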
The historical slabinfo information is essential for troubleshooting kernel memory leaks like this one, caused by ACFS.
One way to solve the permissions problem is to ask your Unix administrator to extend the read privileges on slabinfo. Another is to run OSWatcher as root.
I couldn’t find anything in the documentation on how to change the user. So, I traced the TFA service start with the eBPF BCC utilities execsnoop and opensnoop. With both utilities you can inspect programs you know nothing about. execsnoop traces process creation, and opensnoop traces system calls for opening files. opensnoop is particularly useful for finding out which configuration files a process reads. execsnoop records all created processes, even those that ran only for a short time, for example the TFA boot script. We can correlate the information in the opensnoop and execsnoop output via the PID.
sudo /usr/share/bcc/tools/execsnoop > execsnoop.log
sudo /usr/share/bcc/tools/opensnoop -u 0 > opensnoop.log
systemctl start oracle-tfa.service
execsnoop captured the process that starts TFA:
PCOMM PID PPID RET ARGS
perl 73233 73214 0 /bin/perl /opt/oracle.ahf/tfa/bin/tfactl.pl -initstart
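The correlation via PID can also be scripted instead of read off by eye. The sketch below assumes the column layouts seen in this post: "PCOMM PID PPID RET ARGS" for execsnoop, and "PID COMM FD ERR PATH" for opensnoop started without extra columns. The function names are invented for this example, and paths containing spaces would be cut off at the first space.

```shell
# pids_for_args EXECSNOOP_LOG REGEX
# Prints the PIDs of all traced processes whose argument list matches REGEX.
pids_for_args() {
    PATTERN="$2" awk '
        $1 == "PCOMM" { next }                      # skip the header line
        {
            args = ""
            for (i = 5; i <= NF; i++)               # ARGS start at field 5
                args = (i == 5 ? "" : args " ") $i
            if (args ~ ENVIRON["PATTERN"]) print $2
        }
    ' "$1" | sort -u
}

# files_opened_by OPENSNOOP_LOG PID
# Prints the distinct paths that the given PID tried to open.
files_opened_by() {
    awk -v pid="$2" '$1 == pid { print $5 }' "$1" | sort -u
}
```

For the trace above, pids_for_args execsnoop.log 'tfactl\.pl -initstart' should return 73233, and that PID can then be passed to files_opened_by opensnoop.log.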
The following shell script looks for a possible configuration entry in all the relevant files opened by the tfactl.pl script:
for file in $(grep -E '^73233 ' opensnoop.log | awk '{print $5}' | sort -u | grep -E 'config|prop|\.xml')
do
echo "$file"
grep grid "$file"
done
The following entry stands out:
...
/u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/.osw.prop
runuser=grid
After changing the “runuser” parameter to root, killing the old OSWatcher processes and restarting TFA, OSWatcher indeed runs under root:
ps -e -o pid,user,cmd | grep OSW
92676 root /bin/sh ./OSWatcher.sh 30 48 NONE /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/root/archive
92954 root /bin/sh ./OSWatcherFM.sh 48 /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/root/archive
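The edit to .osw.prop can be sketched as a small helper. This is an illustration only, not a documented Oracle procedure: the function name set_runuser is made up, GNU sed is assumed for the -i option, and the helper keeps a backup copy so the change can be reverted.

```shell
# set_runuser PROP_FILE USER
# Rewrites the runuser= line in PROP_FILE and prints the resulting entry.
# Assumes GNU sed (-i) and a user name without "/" or "&" characters.
set_runuser() {
    prop_file=$1
    new_user=$2
    cp -p "$prop_file" "$prop_file.bak"                     # backup for rollback
    sed -i "s/^runuser=.*/runuser=$new_user/" "$prop_file"
    grep '^runuser=' "$prop_file"
}
```

It would be called as root with the .osw.prop path found above and the argument root. The old OSWatcher processes still have to be killed and oracle-tfa.service restarted before the change takes effect.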
The question we must ask is: who owns the OSWatcher executables? If it isn’t root, the owner could modify the scripts and thereby escalate privileges to root. Fortunately, everything’s clean: the OSWatcher scripts belong to root:
pwdx 92676
92676: /opt/oracle.ahf/tfa/ext/oswbb
pwdx 92954
92954: /opt/oracle.ahf/tfa/ext/oswbb
ls -l /opt/oracle.ahf/tfa/ext/oswbb/OSW*
-rwxr-xr-x. 1 root root 8035 Jun 18 2021 /opt/oracle.ahf/tfa/ext/oswbb/OSWatcherFM.sh
-rwxr-xr-x. 1 root root 55636 Jun 18 2021 /opt/oracle.ahf/tfa/ext/oswbb/OSWatcher.sh
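The same ownership check can be automated, for instance to repeat it after an AHF upgrade. The following sketch flags every file that is not owned by the expected user or that is writable by group or others. The function name not_locked_down is invented here, and GNU find is assumed for -perm /022. The directory that holds the scripts deserves the same check, because write access to it allows the files to be replaced.

```shell
# not_locked_down OWNER FILE...
# Prints each FILE that is not owned by OWNER or is group/world writable.
# No output means every file passed the check.
# Assumes GNU find ("-perm /022" matches if any of the given bits is set).
not_locked_down() {
    owner=$1
    shift
    find "$@" -maxdepth 0 \( ! -user "$owner" -o -perm /022 \) -print
}
```

For the files listed above, not_locked_down root /opt/oracle.ahf/tfa/ext/oswbb/OSW* should print nothing.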
It’s worth noting that prior to switching to root, some other OSWatcher files – belonging to grid – were executed:
pwdx 5007
5007: /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/oswbb
pwdx 5689
5689: /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/oswbb
ls -l /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/oswbb/OSW*
-rwxr-xr-x. 1 grid oinstall 8035 Oct 19 2021 /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/oswbb/OSWatcherFM.sh
-rwxr-xr-x. 1 grid oinstall 55636 Oct 19 2021 /u00/oracle/GI/gridbase/oracle.ahf/data/repository/suptools/host/oswbb/grid/oswbb/OSWatcher.sh
So it’s safe to let OSWatcher run under root, because Oracle then switches to a different set of OSWatcher scripts that are writable only by root.
In summary, in the default setup, OSWatcher doesn’t collect the slabinfo information because it runs as the user grid, and grid doesn’t have read access to /proc/slabinfo. A possible workaround is to run OSWatcher as root. You can achieve that by changing the “runuser” parameter in the file .osw.prop. Disclaimer: this isn’t a documented procedure.