Profiling DB Resource Manager – VKRM process

VKRM seems to be a largely unknown background process. I did a little investigation that should help in understanding both the mechanics of profiling Oracle (thank you, Frits Hoogland) and one of Oracle's most underestimated features: Resource Manager.

VKRM manages CPU scheduling for all Oracle processes, including the CPU scheduling done for the Database Resource Manager (DBRM). Your active DBRM plan (parameter resource_manager_plan) relies on VKRM to ensure that all of its plan directives are fulfilled.
VKRM is a special background process because it simply goes away when it is not needed (at least in 11gR2); every time Resource Manager CPU scheduling kicks in, the DBRM process spawns VKRM again. Please note that DBRM is the “main” process for all Resource Manager tasks, while VKRM handles only CPU scheduling.

There is no documentation explaining how VKRM works in detail, so all that is left for us is to try some profiling and reach some (basic?) conclusions.

The first thing to know about VKRM is that you simply can’t control its behavior, except through a hidden parameter called _vkrm_schedule_interval, which is the VKRM schedule interval (surprise, surprise) and is set to 10 milliseconds by default:

SQL> @phidden _vkrm

-------------------------------------------------- --------------------------------------------------
_vkrm_schedule_interval 			   10
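
The phidden script itself is not shown in this post. As a rough sketch only, a common way to list hidden parameters is to join the x$ksppi and x$ksppcv fixed tables while connected as SYS. The helper name hidden_param_sql and the column formatting below are my own assumptions, not the script used above:

```shell
# Print a query that lists hidden parameters whose name contains a fragment.
# Assumptions: run as SYS (x$ tables are not visible to other users); this is
# NOT the phidden script from the post, only a sketch of the usual approach.
hidden_param_sql() {
  cat <<EOF
set linesize 200
column name format a50
column value format a50
select i.ksppinm name, v.ksppstvl value
  from x\$ksppi i, x\$ksppcv v
 where i.indx = v.indx
   and i.ksppinm like '%$1%';
EOF
}
```

It could be used as `hidden_param_sql _vkrm | sqlplus -s "/ as sysdba"`. Whichever script is used, the value of interest here is the 10 reported for _vkrm_schedule_interval.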

The 10-millisecond interval is easily confirmed by running strace against the PID of the VKRM background process:

[oracle@baco scripts]$ ps -ef | grep ora_vkrm
oracle    2566     1  0 Nov01 ?        00:00:25 ora_vkrm_bacodb1
oracle    8965  7296  0 01:02 pts/3    00:00:00 grep ora_vkrm

[root@baco scripts]# strace -p 2566 -o ora_vkrm_strace.out
Process 2566 attached - interrupt to quit
^CProcess 2566 detached

The result is a stream of nanosleep() calls. nanosleep() is a Linux system call that suspends the calling thread until at least the specified time (10000000 nanoseconds) has elapsed or a signal interrupts it. In simple terms, VKRM sleeps for 10 milliseconds at a time. On a successful sleep, nanosleep() returns 0.

nanosleep({0, 10000000}, 0x7fff271b1160) = 0
nanosleep({0, 10000000}, 0x7fff271b1160) = 0
nanosleep({0, 10000000}, 0x7fff271b1160) = 0
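
To read the interval straight from the strace file instead of counting zeros, something like the following sketch could be used. The function name nanosleep_ms is hypothetical, and it assumes each nanosleep() call starts at the beginning of the line, as in the output above (no PID prefix from strace -f):

```shell
# Convert the {seconds, nanoseconds} argument of each nanosleep() call
# found on stdin into a sleep interval in milliseconds, one per line.
# Accepts both the older "{0, 10000000}" and the newer
# "{tv_sec=0, tv_nsec=10000000}" strace argument styles.
nanosleep_ms() {
  sed -nE 's/^nanosleep\(\{(tv_sec=)?([0-9]+), (tv_nsec=)?([0-9]+)\}.*/\2 \4/p' |
    awk '{ printf "%g\n", $1 * 1000 + $2 / 1000000 }'
}
```

For example, `nanosleep_ms < ora_vkrm_strace.out | sort -n | uniq -c` gives a count per distinct interval; each of the three lines above converts to 10.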

Changing _vkrm_schedule_interval to 5000 milliseconds results in a different argument to nanosleep() and a different period (every 5 seconds).
This will probably change the behavior of VKRM and CPU scheduling: the greater the value, the less precise your scheduling will be. As the strace output shows, it is possible to change _vkrm_schedule_interval while the database is running (scope=memory), and the change takes effect immediately:

SQL> alter system set "_vkrm_schedule_interval"=5000 scope=memory;
SQL> alter system set "_vkrm_schedule_interval"=6000 scope=memory;
nanosleep({0, 10000000}, 0x7fff271b1160) = 0
nanosleep({0, 10000000}, 0x7fff271b1160) = 0
nanosleep({5, 0}, 0x7fff271b1160)       = 0
nanosleep({5, 0}, 0x7fff271b1160)       = 0
nanosleep({5, 0}, 0x7fff271b1160)       = 0
nanosleep({6, 0}, 0x7fff271b1160)       = 0

Trace files will also reveal your change:

*** 2014-11-02 04:02:43.992
kskvkrmschedintmod: setting VKRM scheduling interval from (6000)ms to [(10)ms (10000)us]
*** 2014-11-02 04:11:53.078
kskvkrmschedintmod: setting VKRM scheduling interval from (10)ms to [(5000)ms (5000000)us]
kskvkrmschedintmod: setting VKRM scheduling interval from (5000)ms to [(10)ms (10000)us]
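
If the trace file is long, the interval changes can be pulled out with a small filter. This is only a sketch that assumes the kskvkrmschedintmod line format shown above; the function name vkrm_interval_changes is hypothetical:

```shell
# Print every VKRM scheduling interval change found on stdin as
# "<old ms> -> <new ms>", based on the kskvkrmschedintmod trace lines.
vkrm_interval_changes() {
  sed -nE 's/^kskvkrmschedintmod: setting VKRM scheduling interval from \(([0-9]+)\)ms to \[\(([0-9]+)\)ms.*/\1 -> \2/p'
}
```

Feed it the VKRM trace file on standard input, for example `vkrm_interval_changes < "$tracefile"`, where tracefile is a placeholder for wherever your instance writes that trace.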

Another step in profiling the VKRM process is to use perf on Linux to see if anything more interesting shows up. Below is the result of a perf report against the VKRM process. The top three entries are functions executed in kernel mode: __do_softirq, finish_task_switch and _raw_spin_unlock_irqrestore.
Most of the work is done in kernel mode, with Linux kernel software interrupts (softirq) and scheduler functions (finish_task_switch) enabling the high-precision CPU scheduling performed by VKRM.
Also worth mentioning is that the user-mode Oracle function kskvkrmmain represents only 3.03% of all the work done by VKRM.

[root@baco outputs]# perf record -g -p 2542 -e cpu-clock
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.451 MB (~19697 samples) ]

[oracle@baco outputs]$ perf report
[vdso] with build id 553f611ad979d16f78a66945dca52ba113827329 not found, continuing without symbols
 39.24%  ora_vkrm_bacodb  [kernel.kallsyms]   [k] __do_softirq
                 -- 99.05%-- do_nanosleep
34.31%  ora_vkrm_bacodb  [kernel.kallsyms]   [k] finish_task_switch
14.22%  ora_vkrm_bacodb  [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore
3.03%  ora_vkrm_bacodb  oracle              [.] kskvkrmmain
            --- kskvkrmmain

1.25%  ora_vkrm_bacodb  oracle              [.] sltrusleep
            --- sltrusleep
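
To put a number on the kernel versus user split, the symbol lines of the report can be summed by their [k] and [.] markers. The following is a sketch that assumes plain-text report lines like the ones above (for example from perf report --stdio); the function name perf_mode_split is hypothetical:

```shell
# Sum the overhead percentages of perf report symbol lines on stdin,
# split into kernel-mode ([k]) and user-mode ([.]) symbols.
# Call-chain lines (e.g. "-- 99.05%-- do_nanosleep") are skipped because
# their first field is not a bare percentage.
perf_mode_split() {
  awk '
    $1 ~ /^[0-9.]+%$/ && /\[k\]/   { kernel += $1 }
    $1 ~ /^[0-9.]+%$/ && /\[\.\]/  { user += $1 }
    END { printf "kernel %.2f%% user %.2f%%\n", kernel, user }
  '
}
```

Applied to the five symbol lines listed above, this adds up to 87.77% kernel and 4.28% user; the listing is only an excerpt of the report, so the two do not sum to 100%.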

Another approach is to use oradebug to see which wait events are associated with VKRM:

SQL> oradebug setospid 2542
Oracle pid: 10, Unix process pid: 2542, image: oracle@baco (VKRM)
SQL>  oradebug unlimit
Statement processed.
SQL> oradebug event 10046 trace name context forever, level 8;
Statement processed.

*** 2014-11-09 14:06:38.559
WAIT #0: nam='latch free' ela= 21980 address=2722482696 number=467 tries=0 obj#=-1 tim=6866775549

*** 2014-11-09 14:09:41.598
WAIT #0: nam='latch free' ela= 31774 address=2722482696 number=467 tries=0 obj#=-1 tim=7049814301

The only event in this trace is the latch free wait event. A simple query (see below) identifies which latch it refers to. The latch is obviously related to Resource Manager CPU scheduling.

SQL> select latch#, name from v$latchname where latch# = 467;

---------- ----------------------------------------------------------------
       467 resmgr:resource group CPU method
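
With a longer 10046 trace it can help to total the latch free waits per latch number before looking the numbers up in v$latchname. The following is a sketch that assumes the WAIT line format shown above and that ela is reported in microseconds; the function name latch_free_waits is hypothetical:

```shell
# Summarize 'latch free' WAIT lines from a 10046 trace on stdin:
# number of waits and total elapsed time per latch number.
# Assumption: ela values are microseconds. Any markup wrapped around the
# number=NNN field is stripped by keeping only its digits.
latch_free_waits() {
  awk '/nam=.latch free./ {
    ela = ""; num = ""
    for (i = 1; i <= NF; i++) {
      if ($i == "ela=") ela = $(i + 1)
      if ($i ~ /number=[0-9]+/) { num = $i; gsub(/[^0-9]/, "", num) }
    }
    if (num != "") { count[num]++; total[num] += ela }
  }
  END {
    for (n in count)
      printf "latch# %s: %d waits, %d us\n", n, count[n], total[n]
  }'
}
```

For the two WAIT lines in the trace above this reports two waits on latch# 467, totalling 53754 microseconds.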

This post has no great conclusions; it is just an exercise in understanding a little more about a largely unknown Oracle background process.

