Reducing host memory consumption
When you start UML, it will create a file whose size is the same as
the UML physical memory size, which is whatever you specified with the
'mem=' switch or its default. Initially, none of that file is backed
with real host memory, but that changes as UML touches its physical
memory.
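For example, an instance might be started with something like this,
where the image name and memory size are only illustrative:
./linux mem=128M ubd0=root_fs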
UML's host memory consumption will asymptotically approach its
physical memory limit as it reads data into its page cache. That
consumption will not tend to decrease, for two reasons:
- UML, like any other Linux kernel, will keep data in cache as long
  as there is no pressing need for memory. This makes sense for a
  native kernel running on hardware since there is no other possible
  use for the memory. However, for UML, the host quite possibly could
  make better use of any under-used memory.
- Even if UML did release a bunch of cache, it has no way of
  communicating that to the host, which will still see that memory as
  dirty, and therefore as needing to be preserved, by swapping it out
  if necessary.
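One way to see this from the host is simply to watch free memory
while a UML instance is busy, for example with:
vmstat 5
As the UML reads data into its cache, the host's free memory will
drop by up to the instance's 'mem=' size and tend not to come back.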
Another problem is the unnecessary duplication of data in host
memory. If you have a number, say 'n', of UMLs booted from similar
disk images, the data they have in common is present 2 * n times in
host memory: n times in the host page cache and n times in the UML
page caches.
Booting those UMLs from the same image with COW files will reduce the
number of copies in the host page cache to 1, but there will still be
n copies in the UML page caches, one in each UML. These copies can be
eliminated with the use of the 'ubd=mmap' switch, which causes the ubd driver to
mmap pages from disk images rather than using read() and write().
This causes UML to share page cache with the host.
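As a sketch, with hypothetical COW and umid names, a set of UMLs
sharing one backing file might be started like this:
./linux mem=64M ubd0=cow1,root_fs ubd=mmap umid=uml1
./linux mem=64M ubd0=cow2,root_fs ubd=mmap umid=uml2
Each instance gets its own COW file for its private writes, while the
read-only backing file, root_fs, is shared by all of them.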
In order to see any reduction in host memory use, it is necessary to
free the UML physical memory pages that were over-mapped by the ubd
driver. If these are mapped in from tmpfs or any other filesystem,
they won't be freed since filesystems preserve data even when the data
isn't mapped in anywhere. So, this is where the host /dev/anon driver
comes in. It has the property that when a page is mapped in, and
subsequently unmapped, the page will be freed back to the host, and
any data that was stored there will be lost. Obviously, this is wrong
for a filesystem, but this is exactly what's needed for UML.
To have UML use /dev/anon, you need to do the following:
- apply the devanon patch (available from the download page) to your
  host kernel
- make /dev/anon a character device with major number 1 and minor
  number 10:
  mknod /dev/anon c 1 10
  and give it read-write permissions for everyone (or at least
  everyone who will be running UML):
  chmod 666 /dev/anon
- get a UML that supports /dev/anon (2.4.23-2 or later)
- run it with the 'ubd=mmap' switch
- make sure that the UML filesystems have a 4K (one page) blocksize
NOTE: At this point, 'ubd=mmap' is known to eat filesystems, so
don't try this yet with data that you care about. If you use COW
files, the backing files are safe since they're opened read-only, but
any data that UML has write access to is at risk from 'ubd=mmap'.
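Putting the host-side steps together, a session might look roughly
like this, where the device numbers come from the list above and the
file names are only illustrative:
mknod /dev/anon c 1 10
chmod 666 /dev/anon
./linux mem=64M ubd0=cow,root_fs ubd=mmap
If you need to create a new filesystem image for the UML, running
something like 'mkfs.ext2 -b 4096 -F' on the image file will give it
the one-page blocksize mentioned above.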
Use of the 'ubd=mmap' switch is needed in order to get any memory use
benefits from /dev/anon. However, UML will use /dev/anon regardless
of whether the ubd driver is doing mmap. This is almost the same as
having UML use files in /tmp for its physical memory, except that the
UMLs won't be limited to the capacity of /tmp. This makes the host
management somewhat easier. Without /dev/anon, you need tmpfs mounted
on /tmp for optimal UML performance. With it, you get the same
performance without needing to make sure that your tmpfs mount is big
enough for the UMLs running on the system.
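For reference, the tmpfs alternative on a host without /dev/anon is
something like the following, where the size is only an example and
has to be large enough to cover all the UMLs you plan to run:
mount -t tmpfs -o size=2048M none /tmp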
In my testing, which involved booting a number of instances from a
Debian image to a login prompt, the combination of /dev/anon and
'ubd=mmap' resulted in about a 25% decrease in host memory
consumption. This was measured by looking at the decrease in free
host memory per UML when the host was not under memory pressure.
Without 'ubd=mmap', this was ~28M per instance; with 'ubd=mmap', it
went down to ~21M. I cross-checked this by counting the number of
instances I could boot before the host went into swap, which
increased from 16 to 20, again about 25%.
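One way to reproduce this sort of measurement is to compare the
host's free memory before and after an instance boots, for example
(the COW and image names are hypothetical, and the free commands are
run from a second shell on the host):
free -m
./linux mem=64M ubd0=cow1,debian_root_fs ubd=mmap umid=uml1
free -m
The drop in free memory between the two readings, taken once the
instance reaches a login prompt and with the host not under memory
pressure, is roughly the per-UML cost.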
The size of the saving is obviously workload-dependent. Workloads
that involve lots of read-only data, such as code, will benefit more
than those that involve
read-write data.
Performance of 'ubd=mmap'
People commonly ask whether switching the ubd driver from read() and
write() (and the data copying they do) to mmap (and the TLB flushing
it does) will help the performance of a single UML. The answer is
that the performance is probably about the same either way. The
reason is that mmap is an expensive operation for the CPU. People
have measured how large a data copy has to be before it becomes more
expensive than playing games with mmap.
The size that I've seen is about 8K, or two pages on x86. Since UML
is doing maps with 1 page granularity, which is in the ballpark, you'd
expect the performance to be about the same. My testing, which has
mostly been kernel builds, bears this out. The speed is pretty close,
to within the error, either way.
This work is leading to more flexibility in managing the memory
consumption of UMLs and the host. The next step is to allow the host
to add and remove memory from UMLs at run-time, i.e., hot-plug memory.
This will allow a utility on the host to keep track of the memory
usage of the UMLs and the host, and shuffle memory between the UMLs
and to and from the host in order to implement whatever policy is
desired.
I am planning on shipping a sample utility which implements a
host-should-not-swap policy. This would keep track of the free memory
on the host, and, when it gets low, it would remove memory from an
idle UML and release it to the host. When the host has ample memory,
it would look for a UML that's under memory pressure and give it some
memory.
It is also possible to use this facility to reverse the monotonic
increase in UML memory consumption described above. By pulling memory
out of a UML until it squeals, you will force it to drop its page
cache, resetting its host memory consumption to what it was much
earlier in its life. Doing this periodically to all the UMLs on the
host will result in their cached data being much more up-to-date, at
the expense of some performance loss from old cache data not being
available and needing to be reread.
These policies are probably a decent start for any UML site, and might
be sufficient for many, but some sites may need something different. For
example, a UML ISP might have different classes of UMLs on the same
host, and would want the expensive UMLs to have preference over the
cheap ones for whatever memory is available. UML will provide
mechanism, plus an example policy or two, and it will be up to the
user community to implement whatever policies are appropriate for host
memory management.