Understanding “vmalloc region overlap”

I recently came across the following disconcerting message in my kernel’s boot output:

Truncating RAM at 40000000-5fffffff to -57ffffff (vmalloc region overlap).
Kernel command line: console=ttySC0,115200 mem=512M
Memory: 384MB = 384MB total

This is the kernel’s way of saying “I understand there may be some RAM here, but I’m not going to use it all”. So what causes this warning, and what do we need to do to reclaim the lost RAM?

To fully understand what is going on here we need to understand how the kernel utilises its virtual memory. Thankfully, during boot the kernel prints a useful ‘Virtual kernel memory layout’ table, as shown below:

Memory: 384MB = 384MB total
Memory: 385876k/385876k available, 7340k reserved, 0K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    DMA     : 0xfde00000 - 0xffe00000   (  32 MB)
    vmalloc : 0xd8800000 - 0xe0000000   ( 120 MB)
    lowmem  : 0xc0000000 - 0xd8000000   ( 384 MB)
    modules : 0xbf000000 - 0xc0000000   (  16 MB)
      .init : 0xc0008000 - 0xc0021000   ( 100 kB)
      .text : 0xc0021000 - 0xc036b000   (3368 kB)
      .data : 0xc0384000 - 0xc03b05c0   ( 178 kB)

The first point to make is that the kernel splits its virtual address space into kernel and user memory. Of the 4GB of available address space (assuming a 32-bit system), the kernel usually gives the lower 3GB to whatever user process is currently running and keeps the top 1GB exclusively for itself. I say usually because the split is configurable through Kconfig (have a look at Kernel Features | Memory Split). The virtual address at which the kernel region starts is given by CONFIG_PAGE_OFFSET; it’s usually 0xc0000000.

To make life easy for itself, the kernel permanently maps as much physical memory into its virtual address space as possible. This memory is called lowmem and is very efficient for the kernel to access. It’s mapped in at PAGE_OFFSET. As you can see from the above table, the kernel has mapped this RAM from 0xc0000000 to 0xd8000000, but as we are aware it was only able to map in 384 MB and not the full 512 MB we have installed. The problem is that the kernel has already set aside some virtual memory for a region called vmalloc. If we mapped in all the physical memory we wanted, it would overlap this area (hence the warning).
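The overlap can be checked with a little arithmetic. The following is a minimal sketch, assuming lowmem is a simple linear mapping and taking the physical start of RAM (0x40000000) from the boot message above; the helper names are hypothetical and are not the kernel’s own macros.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000u /* start of the kernel's virtual region */
#define PHYS_OFFSET 0x40000000u /* physical start of RAM, from the boot message */

/* Hypothetical helper: lowmem is modelled as a fixed linear mapping,
 * so virtual = physical - PHYS_OFFSET + PAGE_OFFSET. */
static uint32_t lowmem_phys_to_virt(uint32_t phys)
{
    assert(phys >= PHYS_OFFSET);
    return phys - PHYS_OFFSET + PAGE_OFFSET;
}

/* Hypothetical helper: the inverse translation. */
static uint32_t lowmem_virt_to_phys(uint32_t virt)
{
    assert(virt >= PAGE_OFFSET);
    return virt - PAGE_OFFSET + PHYS_OFFSET;
}
```

Under these assumptions the last byte of RAM, physical 0x5fffffff, would land at virtual 0xdfffffff, which is inside the vmalloc region shown in the table. Going the other way, the end of lowmem at virtual 0xd8000000 corresponds to physical 0x58000000, which matches the “-57ffffff” truncation point in the warning.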

We’ll come back to this shortly – but let’s quickly examine the other areas in the kernel’s virtual memory map:

  • The ‘modules’ region is where loadable kernel modules are placed, while the .init, .text and .data sections refer to the kernel executable itself. The kernel’s address in virtual memory is usually hardcoded at 0xc0008000 (CONFIG_PAGE_OFFSET + TEXT_OFFSET).
  • The ‘vector’ section refers to the processor’s exception table and is usually defined as CONFIG_VECTORS_BASE.
  • The ‘fixmap’ region refers to an area of virtual memory set aside for simple, fixed mappings whose virtual addresses must be known at compile time and whose corresponding physical addresses can be changed later on. It lives at FIXADDR_START through to FIXADDR_TOP, though it seems to be rarely used on ARM.
  • The ‘DMA’ region refers to memory set aside for DMA operations – defined by CONSISTENT_BASE and CONSISTENT_END.
  • Finally, the ‘vmalloc’ region is used for dynamic allocations of virtually contiguous memory within the kernel. It lives in an area of memory defined by VMALLOC_START through to VMALLOC_END. The VMALLOC_END macro is defined by specific platforms (we’ll find out why soon), usually in a file such as arch/arm/mach-xxx/include/mach/vmalloc.h. The VMALLOC_START address is a set distance (VMALLOC_OFFSET) away from the end of the lowmem region (high_memory).
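The relationship between high_memory and VMALLOC_START can be sketched as follows. This is an illustration rather than the kernel source: it assumes VMALLOC_OFFSET is 8MB (the gap described later in this article) and that the result is rounded down to a VMALLOC_OFFSET boundary, which is how I recall ARM kernels of this era doing it. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VMALLOC_OFFSET (8u * 1024u * 1024u) /* assumed 8MB guard gap */

/* Hypothetical helper: place the start of vmalloc one VMALLOC_OFFSET above
 * the end of lowmem, rounded down to a VMALLOC_OFFSET boundary. */
static uint32_t vmalloc_start(uint32_t high_memory)
{
    /* The mask trick below only works for power-of-two offsets. */
    assert((VMALLOC_OFFSET & (VMALLOC_OFFSET - 1u)) == 0);
    return (high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET - 1u);
}
```

With high_memory at 0xd8000000 this gives 0xd8800000, the vmalloc start address in the table above. With VMALLOC_END at 0xe0000000, that leaves 0x07800000 bytes, or 120MB, for the region.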

So back to our problem: the kernel cannot map in all the physical memory because it has set aside other regions of virtual memory for other purposes at fixed addresses. However, the sharp-eyed amongst us may notice there are gaps in this table, so there is scope for moving things about a little so that we can fit everything in. The following illustration provides a graphical representation of the kernel’s memory map, which should make the gaps more visible.

Kernel Virtual Memory Map

You may ask why the kernel can’t dynamically shuffle these regions around for us. The reason is that some of these gaps may actually be used by platform code to map in its own I/O peripherals. Platforms do this with functions such as iotable_init, often called from the board’s .map_io hook, defining both the physical address and the corresponding virtual address of each intended mapping. They use the IO_ADDRESS macro, defined by the platform code, to perform the address translation. This is also why platforms define VMALLOC_END: to ensure the kernel won’t use virtual memory that may be used by the IO_ADDRESS macro. It’s all a bit ugly.

Therefore, to solve our problem we need to redefine the IO_ADDRESS macro so that every mapping it produces ends up higher in virtual memory and doesn’t clash with any of the other kernel regions. Once we have done this we can increase VMALLOC_END to reflect the space reclaimed from I/O. Provided the platform code doesn’t use stacks of statically mapped I/O, the space below VMALLOC_END should then be large enough to fit all of our RAM as well as the vmalloc region.
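To illustrate the idea, here is a sketch of a made-up platform. Everything in it is an assumption for the sake of the example: the io_address function stands in for a platform’s IO_ADDRESS macro, the 16MB window and both base addresses are invented, and real platforms define their translations quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IO_WINDOW_SIZE 0x01000000u /* hypothetical: 16MB of static I/O */
#define OLD_IO_BASE    0xe0000000u /* hypothetical: I/O window before the change */
#define NEW_IO_BASE    0xe8000000u /* hypothetical: window moved up by 128MB */

/* Stand-in for a platform IO_ADDRESS macro: keep the offset within the
 * window and relocate it to the chosen virtual base. */
static uint32_t io_address(uint32_t phys, uint32_t io_base)
{
    assert((io_base & (IO_WINDOW_SIZE - 1u)) == 0); /* base must be aligned */
    return (phys & (IO_WINDOW_SIZE - 1u)) + io_base;
}

/* True if the half-open ranges [a_start, a_end) and [b_start, b_end) overlap. */
static bool ranges_overlap(uint32_t a_start, uint32_t a_end,
                           uint32_t b_start, uint32_t b_end)
{
    return a_start < b_end && b_start < a_end;
}
```

In this made-up layout, raising VMALLOC_END to 0xe8000000 while leaving the I/O window at OLD_IO_BASE would make the two ranges overlap, whereas moving the window to NEW_IO_BASE first keeps them apart. The same overlap check is worth repeating against every other region in the table before settling on a new base.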

Of course, with the 3GB/1GB memory split the kernel will only ever be able to use less than 1GB of RAM as low memory. If you want to use more, you need to enable high memory support, CONFIG_HIGHMEM (found in Kconfig under Kernel Features | High Memory Support). With this option enabled, the kernel can reach otherwise inaccessible memory through temporary mappings created when required, though the overhead of managing those mappings comes at a cost to performance. We could have worked around our ‘vmalloc region overlap’ warning by simply switching on CONFIG_HIGHMEM, as the inaccessible memory would then be treated as high memory.

The other workaround would have been to use the ‘vmalloc’ kernel argument, which lets you specify (in bytes) the precise size of the vmalloc region, within limits. By default the kernel ensures the region is 120MB* in size and will truncate your physical RAM in order to meet this constraint (lowering the VMALLOC_START address as a result). Reducing the size of this region will therefore let you reclaim some of the lost RAM, at the cost of less vmalloc space.

* The kernel ensures there is a gap of 8MB between the end of the mapped-in physical RAM and the vmalloc region (to catch errors). The default size of the vmalloc area (vmalloc_reserve) is actually 128MB, but this includes the dead space.
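The effect of the reserve on lowmem can be modelled with a few lines of arithmetic. This is a simplified sketch, not the kernel’s code: it assumes the top of lowmem is simply VMALLOC_END minus the reserve, and the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000u
#define MB(x)       ((uint32_t)(x) << 20)

/* Hypothetical model: how much RAM fits in lowmem for a given VMALLOC_END
 * and vmalloc reserve (the reserve includes the 8MB guard gap). */
static uint32_t usable_lowmem_bytes(uint32_t ram_bytes,
                                    uint32_t vmalloc_end,
                                    uint32_t vmalloc_reserve)
{
    /* The reserve must leave some room above PAGE_OFFSET for lowmem. */
    assert(vmalloc_reserve < vmalloc_end - PAGE_OFFSET);

    uint32_t lowmem_limit = vmalloc_end - vmalloc_reserve; /* virtual address */
    uint32_t max_lowmem = lowmem_limit - PAGE_OFFSET;      /* bytes */

    return ram_bytes < max_lowmem ? ram_bytes : max_lowmem;
}
```

With VMALLOC_END at 0xe0000000 and the default 128MB reserve, this model gives 384MB of lowmem for a 512MB board, matching the boot output. Halving the reserve to 64MB would give 448MB, and keeping the 128MB reserve but raising VMALLOC_END to 0xe8000000 would fit the full 512MB.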

After careful consideration of how my particular platform was using static I/O, I was able to raise both VMALLOC_END and the addresses produced by IO_ADDRESS by 128MB, which allowed me to use all of my physical RAM as low memory!

Given the use of dynamic I/O mapping functions such as ioremap and the difficulties caused by platform code using iotable_init – what are the benefits of iotable_init? Should it be deprecated? I’d love to hear your thoughts on the matter. [© 2011 embedded-bits.co.uk]

About Andrew Murray

Andrew is an experienced commercial Linux developer with a first class degree in Software Engineering and is the founder of Embedded Bits Limited. His day-to-day role fulfils his passion for learning and provides him with plenty of embedded Linux experience including kernel and embedded applications development on a wide variety of platforms. He loves to talk about boot time reduction and has performed a number of presentations on the topic at technical conferences - he has also been successful in achieving sub-second cold boot on Linux based products. Feel free to drop him an email at amurray@embedded-bits.co.uk