Linux Ate My RAM…


Hello everyone,

Excuse the title. I’m trying to do something very specific that goes against some common assumptions.

I am aware of how Linux uses available memory to cache. This, in almost all cases, is desirable. I’ve spent years explaining to users how to properly read the free output.
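For completeness, the figure I tell users to read on CentOS 6 (whose procps has no "available" column) is the free value on the "-/+ buffers/cache" line — a minimal sketch; the awk field position assumes the old two-line `free` format:

```shell
# CentOS 6's procps has no "available" column; the number that matters is
# the "free" figure on the "-/+ buffers/cache" line (4th field), which
# counts reclaimable page cache as available memory:
free -m | awk '/buffers\/cache/ { print "really available:", $4, "MB" }'
```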

I’m now trying to increase VM density on host systems (by host, I mean the physical machine, not the guest machines running on it).

VMware can over-allocate memory as long as it’s not actually being used. Because of caching, from VMware’s perspective all Linux memory is “used”. I am aware of the inherent risks of over-allocating resources on a VMware system. This tuning is strictly for development systems where performance and stability are less critical; the increase in VM density is an acceptable tradeoff.

My questions:
1) What options are available in CentOS to limit the page cache? SuSE has vm.pagecache_limit_mb and vm.pagecache_limit_ignore_dirty which, in conjunction with swappiness tweaks, appear to do what I need.
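For reference, the two knobs named above are SuSE kernel patches and do not exist in the stock CentOS kernel; the closest mainline sysctls only apply reclaim pressure rather than a hard cap. A sketch of an /etc/sysctl.conf fragment (values illustrative, not recommendations):

```
# SuSE-only — these fail with "unknown key" on a stock CentOS kernel:
# vm.pagecache_limit_mb = 512
# vm.pagecache_limit_ignore_dirty = 1

# Mainline levers available on CentOS (pressure, not a hard limit):
vm.swappiness = 10            # prefer dropping cache over swapping app pages
vm.vfs_cache_pressure = 200   # reclaim dentry/inode caches more aggressively
vm.min_free_kbytes = 65536    # keep a larger free-memory reserve
```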

2) Any experience with enabling /sys/kernel/mm/ksm/run on non-KVM workloads? Since KSM only applies to non-pagecache memory, it doesn’t directly help me here, but it could be incrementally useful (https://www.kernel.org/doc/Documentation/vm/ksm.txt).
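For what it’s worth, enabling KSM is just a couple of sysfs writes (root required) — a sketch, with the caveat from the kernel doc above that KSM only merges pages an application has marked with madvise(MADV_MERGEABLE), which typical non-KVM workloads never do:

```shell
# Start the KSM scanner (root required). Without applications calling
# madvise(MADV_MERGEABLE), pages_sharing will likely stay at 0 on a
# non-KVM host.
echo 1 > /sys/kernel/mm/ksm/run
echo 200 > /sys/kernel/mm/ksm/sleep_millisecs  # throttle the scan rate
cat /sys/kernel/mm/ksm/pages_sharing           # pages currently deduplicated
```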

3) Is there any way to control /proc/sys/vm/drop_caches and limit it to a number of entries or an age? Dropping the entire filesystem cache frees those pages, but it has performance implications.
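As far as I can tell there is no age or entry-count granularity — drop_caches is all-or-nothing per cache class. The usual (blunt) invocation, sketched for reference:

```shell
# drop_caches accepts only three values, with no age or size limit:
#   1 = page cache, 2 = dentries and inodes, 3 = both
sync                               # write out dirty pages first; drop_caches
                                   # releases only clean pages
echo 1 > /proc/sys/vm/drop_caches  # root required
```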

Thanks in advance for any input or links.

6 thoughts on - Linux Ate My RAM…

  • Nope. VMware’s memory ballooning feature purposely keeps some of the guest’s RAM locked away from the kernel. This is where RAM comes from when another guest needs more physical RAM than it currently has access to:

    https://blogs.vmware.com/virtualreality/2008/10/memory-overcomm.html

    There are downsides.

    One is that pages locked up by the balloon driver aren’t being used by Linux’s buffer cache. But on the other hand, the hypervisor itself fulfills some of that role, which is why rebooting a VM guest is typically much faster than rebooting the same OS on the same bare hardware.

    Another, of course, is that oversubscription risks running out of RAM if all of the guests try to use all the RAM the host told them they have. All of the guests then end up being forced to deflate their balloons until there is no balloon memory left.

    Instead of oversubscribing the real RAM of the system, consider starting and stopping VMs on demand, so that only a subset of them are running at any given time. That lets you host more VMs under a given hypervisor than could run simultaneously, as long as you don’t need too many of them at once.

    This pattern works well for a suite of test VMs, since you probably don’t need to test all configurations in parallel. You might need only one or two of the guests at any given time.

    Again, you should not be tuning Linux’s virtual memory manager to make the VM host happy. That’s one of the jobs VMware Tools performs.

  • Warren Young wrote:


    Note that in ’09, VMware was advising us (where I was working at the time) not to go over 2 or 2.5 times real memory….

    mark

  • That’s fine if you’re after the isolation feature of VMs, but no good if you’re trying to run entirely different OSes. nspawn/containers/jails don’t let me build EL5 RPMs on my EL7 box, and they certainly aren’t going to run my Windows XP IE7 VM, which I use to ensure that our web app is still semi-functional on it.

    Still, good tip, and worth keeping in mind the next time I find myself considering a heavier solution.

  • Warren:
    Thanks for the good info and link.

    Hmm.. I may be misunderstanding how the balloon driver works…

    I’m looking at section 3.3 in this guide:
    https://www.vmware.com/files/pdf/perf-vsphere-memory_management.pdf

    When a guest starts up, cached memory is very low. This is reflected in the VMware hypervisor view, which shows a small percentage of host memory in use. After disk activity, the host memory allocation grows until all of the configured memory appears allocated in the hypervisor view. The guest’s ‘free’ still shows the majority of memory as available (though “cached”).

    “vmware-toolbox-cmd stat balloon” reports 0MB have been ballooned on these instances.

    From the PDF above, it seems that ballooning only kicks in under memory pressure at the hypervisor level. Unfortunately, I don’t have a way to safely test this.

    This is interesting. We may be double-caching, then, if the VMware host is also doing some caching.

    This is a possibility. It will be a hard sell but may work for some.

    Agreed.. I don’t want to do too much on the guest side, but we’re getting heat to increase density, driven by some app owners who throw memory at systems as the first step in troubleshooting. :D

    Thanks again for your feedback..

    Kwan

  • Interesting.. Not an option for us currently, but perhaps as an alternative to Docker it will come in handy.

    Thanks for the feedback.

    Kwan