Thursday, January 09, 2014

Linux Kernel Tuning - swappiness and overcommit_memory

The Linux kernel tunable vm.swappiness (/proc/sys/vm/swappiness) defines how aggressively memory pages are swapped to disk. Linux can move memory pages that have not been accessed for some time to swap space even when enough free memory is available. Changing the value in /proc/sys/vm/swappiness adjusts this behavior.

A high swappiness value means that the kernel will be more apt to unmap mapped pages. A low swappiness value means the opposite: the kernel will be less apt to unmap mapped pages. In other words, the higher the vm.swappiness value, the more the system will swap.

vm.swappiness takes a value between 0 and 100 to change the balance between swapping applications and freeing cache. At 100, the kernel will always prefer to find inactive pages and swap them out; in other cases, whether a swap out occurs depends on how much application memory is in use and how poorly the cache is doing at finding and releasing inactive items.

# echo 0 > /proc/sys/vm/swappiness
# cat /proc/sys/vm/swappiness
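
The read above can be wrapped in a small helper that also checks that the value is a plain integer. This is only a sketch for a POSIX shell: read_swappiness is a hypothetical name, and the optional file argument is an assumption added so the function can be pointed at a test file instead of the real /proc entry.

```shell
# read_swappiness [FILE]
# Hypothetical helper, not part of any standard tool.
# Prints the integer stored in FILE (default: /proc/sys/vm/swappiness)
# and returns non-zero if the file is unreadable or holds anything else.
read_swappiness() {
    swp_file="${1:-/proc/sys/vm/swappiness}"
    if [ ! -r "$swp_file" ]; then
        echo "cannot read $swp_file" >&2
        return 1
    fi
    swp_value=$(cat "$swp_file")
    case "$swp_value" in
        ''|*[!0-9]*)
            # Empty, or contains a character that is not a digit.
            echo "unexpected value: $swp_value" >&2
            return 1
            ;;
    esac
    echo "$swp_value"
}
```

Calling read_swappiness with no argument prints the current setting. Where the sysctl utility is installed, running sysctl vm.swappiness reports the same value.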

Processes commonly allocate memory by calling the function malloc(). The kernel decides whether enough memory is available and either grants or denies the allocation request. Linux (and a few other Unix variants) supports the ability to overcommit memory; that is, to permit more memory to be allocated than is available in physical RAM plus swap. This is scary, but sometimes it is necessary since applications commonly allocate memory for “worst case” scenarios but never use it.
There are three possible settings for vm.overcommit_memory:

  • 0 (zero): Heuristic overcommit, the default. The kernel refuses requests that are obviously larger than the memory it could ever provide and allows the rest. If a request is refused, an error is returned to the application.
  • 1 (one): The kernel’s equivalent of “all bets are off,” a setting of 1 tells the kernel to always return success to an application’s request for memory. This is absolutely as weird and scary as it sounds.
  • 2 (two): Strict accounting. Total allocations are limited to swap plus a percentage of physical RAM, as defined by vm.overcommit_ratio. For instance, a vm.overcommit_ratio of 50 (the default) and 1 GB of RAM would mean the kernel would permit up to 0.5 GB, plus swap, of memory to be allocated before a request failed.
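
The limit enforced under strict accounting (vm.overcommit_memory=2) is swap plus vm.overcommit_ratio percent of RAM. The arithmetic can be sketched as a shell function. commit_limit_kb is a hypothetical helper name, and the sketch ignores huge pages, which the kernel subtracts from RAM before applying the ratio.

```shell
# commit_limit_kb RAM_KB SWAP_KB RATIO
# Hypothetical helper: approximates the strict-mode commit limit in kB
# as swap + RAM * ratio / 100 (huge pages are ignored in this sketch).
commit_limit_kb() {
    cl_ram_kb=$1
    cl_swap_kb=$2
    cl_ratio=$3
    echo $(( cl_swap_kb + cl_ram_kb * cl_ratio / 100 ))
}

# 1 GB of RAM, no swap, ratio 50 -> 524288 kB (0.5 GB)
commit_limit_kb 1048576 0 50

# 1 GB of RAM, 2 GB of swap, ratio 50 -> 2621440 kB (2.5 GB)
commit_limit_kb 1048576 2097152 50
```

The value the kernel actually computed is reported as CommitLimit in /proc/meminfo.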

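When a process calls the fork() function, its entire page table is cloned. In other words, the child process gets a complete copy of the parent’s address space. Linux copies the pages lazily (copy-on-write), so the RAM is not physically duplicated up front, but the kernel still has to account for the possibility that the child will touch every page, which means as much as twice the parent’s memory must be available to commit. If the child’s intention is to immediately call exec() (which replaces one process with another), the act of cloning the parent’s memory is a waste of time. Because this pattern is so common, the vfork() function was created, which, unlike fork(), does not clone the parent’s memory, instead suspending the parent until the child either calls exec() or exits. The problem is that the HotSpot JVM developers implemented Java’s fork operation using fork() rather than vfork(). A JVM with a large heap that launches an external program can therefore have its fork() refused for lack of committable memory; setting vm.overcommit_memory to 1 avoids that failure.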

# echo 1 >  /proc/sys/vm/overcommit_memory
# cat /proc/sys/vm/overcommit_memory
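
To see how much memory the kernel has already promised to processes, /proc/meminfo exposes two fields: CommitLimit and Committed_AS. The sketch below prints the difference between them. commit_headroom_kb is a hypothetical helper name, and the optional file argument is an assumption added so the function can be tried on a saved copy of meminfo.

```shell
# commit_headroom_kb [MEMINFO_FILE]
# Hypothetical helper: prints CommitLimit - Committed_AS in kB,
# read from MEMINFO_FILE (default: /proc/meminfo).
# Returns non-zero if either field is missing.
commit_headroom_kb() {
    ch_file="${1:-/proc/meminfo}"
    ch_limit=$(awk '/^CommitLimit:/ {print $2}' "$ch_file" 2>/dev/null)
    ch_used=$(awk '/^Committed_AS:/ {print $2}' "$ch_file" 2>/dev/null)
    if [ -z "$ch_limit" ] || [ -z "$ch_used" ]; then
        echo "CommitLimit or Committed_AS not found in $ch_file" >&2
        return 1
    fi
    echo $(( ch_limit - ch_used ))
}
```

With vm.overcommit_memory set to 0 or 1 the limit is reported but not enforced, so the result can be negative; that simply means more memory has been promised than strict accounting would allow.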

To make the change permanent:
# vi /etc/sysctl.conf
Add the following lines (without the dashed separators):
---------------------------------
# Hadoop Kernel tuning
vm.swappiness=0
vm.overcommit_memory=1
---------------------------------
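
Settings in /etc/sysctl.conf are read at boot; running sysctl -p as root loads them immediately. To confirm that the running kernel matches the file, something like the following sketch could be used. check_sysctl_file is a hypothetical helper, the second argument is an assumption added for testing, and the parsing only handles simple single-value key=value lines like the ones above.

```shell
# check_sysctl_file CONF_FILE [PROC_ROOT]
# Hypothetical helper: for every key=value line in CONF_FILE, compares
# the value with the matching file under PROC_ROOT (default: /proc/sys).
# Returns non-zero if any setting differs.
check_sysctl_file() {
    cs_conf=$1
    cs_root="${2:-/proc/sys}"
    cs_status=0
    while IFS='=' read -r cs_key cs_value || [ -n "$cs_key" ]; do
        # Drop spaces and tabs around the key and the value.
        cs_key=$(printf '%s' "$cs_key" | tr -d ' \t')
        cs_value=$(printf '%s' "$cs_value" | tr -d ' \t')
        # Skip blank lines and comments.
        case "$cs_key" in
            ''|'#'*|';'*) continue ;;
        esac
        # vm.swappiness -> PROC_ROOT/vm/swappiness
        cs_path="$cs_root/$(printf '%s' "$cs_key" | tr . /)"
        cs_current=$(cat "$cs_path" 2>/dev/null)
        if [ "$cs_current" = "$cs_value" ]; then
            echo "ok $cs_key=$cs_value"
        else
            echo "MISMATCH $cs_key: wanted $cs_value, found ${cs_current:-nothing}"
            cs_status=1
        fi
    done < "$cs_conf"
    return $cs_status
}
```

Running check_sysctl_file /etc/sysctl.conf after sysctl -p should print an ok line for each setting.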
