Transparent HugePages notes for DBAs

Intro

Linux 2.6.38 introduced a new feature called Transparent Hugepage Support. It works by reducing the number of TLB (Translation Lookaside Buffer) misses; the TLB is basically a cache for virtual address translation. The main idea behind transparent hugepages is to handle large pages just like normal pages rather than through a separate, parallel memory management subsystem. That means if a large page is available to the application, the system will provide it; otherwise the standard 4k page will be used.
This “connects” the large page and normal page memory subsystems, which makes large pages swappable, for example. It sounds even better if you think about the problems of having only small pages or only large pages. A small page size causes a large management overhead (17GB SGA + 1000 connections = 4.4 million pages and a page table size of 16GB), while a large page size (for example 2MB or 4MB) suffers from fragmentation: allocating huge pages becomes more difficult as memory fragments and contiguous memory is no longer available. THP tries to address these issues by regularly scanning the areas marked as huge page candidates and replacing a bunch of small pages with huge pages, as well as by directly allocating huge pages when possible.
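
To make the page counts concrete, here is a minimal sketch of the arithmetic for the 17GB SGA used as the example. The helper name pages_for is made up for this illustration, and GB is treated as GiB, which is an assumption.

```shell
#!/bin/bash
# pages_for SIZE_GIB PAGE_KIB: number of pages needed to map SIZE_GIB of memory.
# Hypothetical helper; GB is treated as GiB here (an assumption).
pages_for() {
   local size_gib=$1 page_kib=$2
   echo $(( size_gib * 1024 * 1024 / page_kib ))
}

echo "4k pages:  $(pages_for 17 4)"      # 4456448, the "4.4 million pages"
echo "2MB pages: $(pages_for 17 2048)"   # 8704
```

With 2MB pages the same SGA needs roughly 500 times fewer page table entries, which is where the reduced management overhead comes from.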

Oracle SGA and THP

The current Transparent HugePages (THP) implementation only maps anonymous memory regions and therefore only affects the heap. Your SGA (probably the larger part of your memory allocation) lives in shared memory, so THP is useless for it until THP supports shared memory. In short, THP only supports anonymous memory regions, while traditional HugePages only work with shared memory. Besides being limited to anonymous pages, THP only handles one huge page size (2MB). So THP is not suitable for the SGA: you need to preallocate traditional HugePages, and Oracle must map them explicitly. Once they are pinned, these pages are not swappable. That is the main problem with HugePages: they require a lot of effort from a System Administrator / DBA point of view and the application must support this method, which is why HugePages are used mainly in large RDBMS systems.
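
Preallocation mostly comes down to telling the kernel how many huge pages to reserve. The following is only a rough sketch of that calculation, assuming the SGA size is known in MB and the huge page size is 2048 kB; hugepages_for_sga is a hypothetical helper, and real sizing normally adds some headroom on top.

```shell
#!/bin/bash
# hugepages_for_sga SGA_MB [HUGEPAGE_KB]: huge pages needed to hold an SGA,
# rounded up. Hypothetical helper; the page size defaults to 2048 kB.
hugepages_for_sga() {
   local sga_kb=$(( $1 * 1024 ))
   local hp_kb=${2:-2048}
   echo $(( (sga_kb + hp_kb - 1) / hp_kb ))
}

# Example: a 17GB SGA (17 * 1024 MB) with 2MB huge pages
hugepages_for_sga $(( 17 * 1024 ))   # prints 8704
```

The result is the kind of value that would go into the vm.nr_hugepages kernel parameter.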

Oracle PGA and THP

As you know, the PGA is heap memory and cannot use traditional huge pages, so you don’t consider it in HugePages calculations, but it may benefit from THP. The amount of memory currently backed by THP on the system is available using this command:

[oracle@obox ~]$ grep -i AnonHugePages /proc/meminfo
AnonHugePages: 253952 kB
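
Since THP only handles 2MB pages, the kB figure can be turned into a page count. A small sketch, assuming a 2048 kB huge page size; thp_count is a hypothetical helper that reads meminfo-style text on stdin.

```shell
#!/bin/bash
# thp_count: read meminfo-style text on stdin and print how many 2MB huge
# pages the AnonHugePages line corresponds to (hypothetical helper).
thp_count() {
   awk '/^AnonHugePages:/ {print $2 / 2048}'
}

# On a live system: thp_count < /proc/meminfo
echo "AnonHugePages: 253952 kB" | thp_count   # prints 124
```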

Apart from this, we can check whether any Oracle process is using THP. Just read /proc/PID/smaps and look at the AnonHugePages value of each mapping. Replace PID with the PID of the dedicated server process that handles your connection to the database. I found no AnonHugePages in use, meaning that my PGA is probably not using THP.

[oracle@obox ~]$ cat /proc/25894/smaps | grep AnonHugePages
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
...

Since THP is enabled, /proc/vmstat has some counters to monitor how successfully the system is providing huge pages for use:

[oracle@obox ~]$ cat /proc/vmstat | grep thp_
thp_fault_alloc 368
thp_fault_fallback 264
thp_collapse_alloc 301
thp_collapse_alloc_failed 2642
thp_split 222

thp_fault_alloc is incremented every time a huge page is allocated to handle a page fault. Since a huge page was allocated 368 times, we are actually using THP. thp_fault_fallback indicates that the kernel failed to allocate a huge page and fell back to the normal page size. Since vmstat is reporting THP use and this machine runs only an Oracle database, Oracle must be using THP somewhere.
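
One way to read these two counters together is as a fallback ratio: of all page faults where the kernel tried to use a huge page, how many ended up with small pages. This is only a sketch; thp_fallback_pct is a hypothetical helper, and treating alloc + fallback as the total number of attempts is my simplification.

```shell
#!/bin/bash
# thp_fallback_pct: read vmstat-style text on stdin and print the percentage
# of huge page fault attempts that fell back to small pages (hypothetical helper).
thp_fallback_pct() {
   awk '$1 == "thp_fault_alloc"    {ok = $2}
        $1 == "thp_fault_fallback" {fb = $2}
        END { if (ok + fb > 0) printf "%.1f\n", 100 * fb / (ok + fb); else print "n/a" }'
}

# On a live system: thp_fallback_pct < /proc/vmstat
printf 'thp_fault_alloc 368\nthp_fault_fallback 264\n' | thp_fallback_pct   # prints 41.8
```

With the counters shown above, about 42% of the attempts fell back to 4k pages, which hints at memory fragmentation on this box.
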
To find out where, I drilled down into every Oracle process and searched for any use of THP. For that I wrote a shell script that walks through the PIDs of all Oracle processes:

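#!/bin/bash
# all background processes (ora_*)
for i in $(pgrep -f ora_)
do
   echo "PID:$i"
   # print the distinct AnonHugePages values found in the process mappings
   grep AnonHugePages /proc/$i/smaps | sort -u
done

# all dedicated server processes (oracle<SID>; here the SID starts with TN)
# the pattern is a regular expression, so no trailing * is needed

for i in $(pgrep -f oracleTN)
do
   echo "PID:$i"
   grep AnonHugePages /proc/$i/smaps | sort -u
done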

The result was a little surprising to me. Let’s analyse:

PID:58223
AnonHugePages:         0 kB
AnonHugePages:      4096 kB
PID:58363
AnonHugePages:         0 kB
AnonHugePages:      2048 kB
PID:58365
AnonHugePages:         0 kB
AnonHugePages:      2048 kB

[oracle@obox ~]$ ps -ef | grep 58223
oracle   36939 18818  0 16:30 pts/0    00:00:00 grep 58223
oracle   58223     1  0 May06 ?        00:35:34 ora_dbw0_TNMSAM
[oracle@obox ~]$ ps -ef | grep 58363
oracle   36943 18818  0 16:31 pts/0    00:00:00 grep 58363
oracle   58363     1  0 May06 ?        00:05:08 ora_arc0_TNMSAM
[oracle@obox ~]$ ps -ef | grep 58365
oracle   36946 18818  0 16:31 pts/0    00:00:00 grep 58365
oracle   58365     1  0 May06 ?        00:05:05 ora_arc1_TNMSAM

According to the results, the DBWR and ARCH Oracle processes are using THP.

Update: I improved the Bash script to detect THP usage by Oracle processes. Script and output below:

#!/bin/bash

# list "PID|process name" for every Oracle background process (ora_*);
# the [o] keeps grep from matching its own command line
for i in $(ps -eo pid,cmd | grep '[o]ra_' | awk '{print $1"|"$2}')
do
   PID=${i%%|*}
   PNAME=${i#*|}
   # sum the AnonHugePages value (kB) of every mapping; print 0 when none is found
   THP=$(grep AnonHugePages /proc/$PID/smaps 2>/dev/null | awk '{sum+=$2} END {print sum+0}')

   echo "PID $PID ( $PNAME ) is using $THP Kb of THP"

done

Output:
[Image: Usage of THP for Linux]

4 thoughts on “Transparent HugePages notes for DBAs”

  1. Hi
    Does the AnonHugePages number mean THP? I have turned off THP, but AnonHugePages is not zero. What might be wrong?

    # pgrep -f ora_dbw0_
    2371
    # cat /proc/2371/smaps | grep AnonHugePages | grep -v 0
    AnonHugePages: 6144 kB
    # cat /sys/kernel/mm/transparent_hugepage/enabled
    always madvise [never]

    • Thanks for the info on a new and not well understood topic.

      Greg

      I modded the second script to include ASM and eliminate 0 kB entries. On my high-memory server, running as a single-node RAC (instance name changed to be generic), I see the following:

      PID 37004 ( asm_diag_+ASM1 ) is using 2048 Kb of THP
      PID 37008 ( asm_dia0_+ASM1 ) is using 4096 Kb of THP
      PID 49630 ( ora_diag_ORCL ) is using 2048 Kb of THP
      PID 49664 ( ora_dia0_ORCL ) is using 4096 Kb of THP
      PID 49729 ( ora_dbw0_ORCL ) is using 6144 Kb of THP
      PID 49769 ( ora_dbw1_ORCL ) is using 6144 Kb of THP
      PID 49772 ( ora_dbw2_ORCL ) is using 6144 Kb of THP
      PID 49782 ( ora_dbw3_ORCL ) is using 6144 Kb of THP
      PID 49786 ( ora_dbw4_ORCL ) is using 6144 Kb of THP
      PID 49795 ( ora_dbw5_ORCL ) is using 6144 Kb of THP
      PID 49797 ( ora_dbw6_ORCL ) is using 6144 Kb of THP
      PID 49799 ( ora_dbw7_ORCL ) is using 6144 Kb of THP
      PID 49866 ( ora_mark_ORCL ) is using 6144 Kb of THP
      PID 49921 ( ora_rsmn_ORCL ) is using 2048 Kb of THP
      PID 50180 ( ora_nsv1_ORCL ) is using 4096 Kb of THP
      PID 50298 ( ora_arc0_ORCL ) is using 30720 Kb of THP
      PID 50305 ( ora_arc1_ORCL ) is using 2048 Kb of THP
      PID 50310 ( ora_arc2_ORCL ) is using 2048 Kb of THP
      PID 50366 ( ora_arc4_ORCL ) is using 2048 Kb of THP
      PID 50401 ( ora_nsa2_ORCL ) is using 12288 Kb of THP
      PID 50409 ( ora_arc5_ORCL ) is using 24576 Kb of THP
      PID 50413 ( ora_arc6_ORCL ) is using 2048 Kb of THP
      PID 50415 ( ora_arc7_ORCL ) is using 2048 Kb of THP
      PID 50418 ( ora_arc8_ORCL ) is using 2048 Kb of THP
      PID 50443 ( ora_arc9_ORCL ) is using 24576 Kb of THP

      It’s a 256GB system with both kinds of huge pages (THP and HugePages) configured.

      AnonHugePages: 913408 kB
      HugePages_Total: 81920
      HugePages_Free: 49450
      HugePages_Rsvd: 44331
      HugePages_Surp: 0
      Hugepagesize: 2048 kB

  2. See Doc ID 1557478.1 from Oracle:
    Because Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may cause problems even in a single-instance database environment with unexpected performance problems or delays.
