Linux manages its physical memory in clever and often efficient ways – as a result, it's common to only think about how the memory in your system is being used once you run into performance issues. And this is where the frustration can begin – without fully understanding how memory is managed, it can be very difficult to answer seemingly straightforward questions like 'How much free memory do I have?' or 'How much memory is this process taking?'. There are many complications, and as a result performance monitoring can be a challenge during the development of an IoT solution.
I was determined to understand precisely what the various memory figures reported by the kernel mean, and to understand – on a practical level – the implications of Linux's memory management for our performance-sensitive applications.
In this multi-part post we'll attempt to unravel many of the mysteries of Linux's memory management.
MemTotal / Memory Available
We'll start off by looking at the 'MemTotal' line, as reported by 'cat /proc/meminfo'. For the purposes of this tutorial, I'll be using a Linux 2.6.35 kernel on a virtualized (with QEMU) ARM versatilepb board.
# cat /proc/meminfo
MemTotal:          29372 kB
...

# cat /proc/cmdline
console=ttyAMA0 mem=32M loglevel=9 root=/dev/ram rdinit=/sbin/init
The first thing you may notice is that there is a difference between the amount of memory we've allowed the kernel to use – 32 MB – and the amount the kernel reports as its total: 29372 kB, or 28.68 MB. In other words, we seem to have already lost around 3.3 MB. To shed some light on this, we'll examine the kernel log.
# dmesg
...
On node 0 totalpages: 8192
free_area_init_node: node 0, pgdat c05e7dcc, node_mem_map c0613000
  Normal zone: 64 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 8128 pages, LIFO batch:0
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 8128
Kernel command line: console=ttyAMA0 mem=32M loglevel=9 root=/dev/ram rdinit=/sbin/init
PID hash table entries: 128 (order: -3, 512 bytes)
Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
Memory: 32MB = 32MB total
Memory: 26264k/26264k available, 6504k reserved, 0K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    DMA     : 0xffc00000 - 0xffe00000   (   2 MB)
    vmalloc : 0xc2800000 - 0xd8000000   ( 344 MB)
    lowmem  : 0xc0000000 - 0xc2000000   (  32 MB)
    modules : 0xbf000000 - 0xc0000000   (  16 MB)
      .init : 0xc0008000 - 0xc0311000   (3108 kB)
      .text : 0xc0311000 - 0xc05be000   (2740 kB)
      .data : 0xc05be000 - 0xc05d83e0   ( 105 kB)
Our first observation comes from the following lines:
Memory: 32MB = 32MB total
Memory: 26264k/26264k available, 6504k reserved, 0K highmem
We can determine that the kernel has picked up our request for it to use 32 MB of memory. We can also conclude that of the total 32 MB, 26264k is available and 6504k has been 'reserved'. The sharp-eyed among us may also have noticed that the 'available' memory reported here is different to the MemTotal displayed by /proc/meminfo at the end of boot. This difference can be explained via my previous post on the init call mechanism in Linux – though in a nutshell: upon boot the kernel is able to free memory previously occupied by initialisation code, as this code will never be executed again. This code is contained within the '.init' section of the kernel image and its size is reported in the 'Virtual kernel memory layout' table shown above. Finally, we can explain where our MemTotal and available memory figures come from:
Available = total  - reserved
   26264k = 32768k - 6504k

MemTotal  = total  - reserved + .init
   29372k = 32768k - 6504k    + 3108k
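For the curious, the reclaim itself happens in free_initmem(). The following is a simplified sketch of what the ARM implementation does in kernels of this era – paraphrased from arch/arm/mm/init.c rather than quoted verbatim – showing where MemTotal's extra 3108k comes from:

/* Simplified sketch of arch/arm/mm/init.c:free_initmem() from a
 * 2.6-era ARM kernel (paraphrased, not verbatim). free_area() is
 * an internal helper that releases the page frames between two
 * PFNs back to the page allocator and logs a 'Freeing init
 * memory' message. */
void free_initmem(void)
{
    /* __init_begin/__init_end are linker symbols bounding the
     * .init sections; converting them to page frame numbers
     * gives the physical range to hand back. */
    totalram_pages += free_area(__phys_to_pfn(__pa(__init_begin)),
                                __phys_to_pfn(__pa(__init_end)),
                                "init");
}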
Reserved Memory
We've seen from the kernel output that 6504k of memory gets 'reserved' and thus eats into our available memory – so what is this reserved memory? Taking a high-level view, it tends to be memory which the kernel reserves or allocates for its own use and which it never intends to release. To give you an example, I added some instrumentation into the kernel to find out exactly where this reserved memory goes:
6184 kB - arch/arm/mm/mmu.c:reserve_node_zero (memory occupied by the kernel itself, _stext to _end)
  56 kB - mm/page_alloc.c:alloc_node_mem_map
  16 kB - fs/dcache.c:dcache_init_early (directory entry cache for the VFS subsystem)
  16 kB - arch/arm/mm/mmu.c:reserve_node_zero (memory for page tables)
   8 kB - kernel/pid.c:pidhash_init (hash table for PID lookups)
   4 kB - arch/arm/mm/init.c:bootmem_init_node (memory for the bootmem allocator bitmap)
   4 kB - arch/arm/mm/init.c:free_area_init_node
   8 kB - arch/arm/mm/mmu.c:paging_init (zero page)
   8 kB - arch/arm/mm/mmu.c:create_mapping
   4 kB - arch/arm/kernel/setup.c:request_standard_resources
The kernel itself, of course, occupies some memory, and thus a large portion of the reserved space is used to cover its own image (preventing those pages from being allocated to anything else). You'll notice that this size is similar to that reported by running 'size' on your vmlinux (proper) image. Some of the reserved memory is used for various caches and hash tables, and the remainder is used to initialise the structures required by the memory subsystem and the physical (zone) memory allocator.
There is a bit of a chicken-and-egg situation here – in order to set up the complex memory allocators and subsystem, the kernel needs an allocator to provide memory for their own initialisation! To achieve this, the kernel has a 'bootmem' allocator just for this purpose. Its life ends very early during boot, once the physical memory allocator has been initialised. In fact, the 'Memory: 26264k/26264k available, 6504k reserved, 0K highmem' line marks the point where it hands its free pages over to the physical memory allocator. The 'reserved' memory is all the memory the bootmem allocator has allocated and not freed by that point in time.
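To make this concrete, here is a minimal sketch of how early boot code in a 2.6-era kernel might carve memory out of the bootmem allocator. my_early_setup and my_early_table are hypothetical names for illustration; alloc_bootmem is the real interface from <linux/bootmem.h>:

#include <linux/bootmem.h>
#include <linux/init.h>

/* Hypothetical early-boot code: allocate a table before the page
 * allocator exists. alloc_bootmem() returns zeroed memory and
 * panics on failure. Anything still held when the bootmem
 * allocator retires ends up counted as 'reserved' memory. */
static unsigned long *my_early_table;

void __init my_early_setup(void)
{
    /* One 4 kB page taken from bootmem - an allocation like this
     * is what feeds into the 6504k 'reserved' figure. */
    my_early_table = alloc_bootmem(PAGE_SIZE);
}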
Takeaways
The one useful thing we can take from this is that we can maximise the system RAM available to user processes by reducing the amount of memory reserved by the kernel. We can do this in the following two ways:
- Reduce kernel size – we can do this by removing unnecessary drivers and functionality (for example kallsyms, IKCONFIG).
- Ensure maximum use of the .init section – make sure that all initialisation code and data is correctly placed in the .init sections (via __init and __initdata) so that it can be reclaimed after boot – see the sketch after this list.
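As an illustration of the second point, marking boot-only code and data with __init and __initdata (from <linux/init.h>) places it in the .init sections that, as we saw above, are reclaimed once boot completes. my_board_setup and probe_order are hypothetical names:

#include <linux/init.h>

/* Boot-only data: placed in .init.data and reclaimed with the
 * rest of the .init sections after boot. */
static int probe_order[] __initdata = { 2, 0, 1 };

/* Boot-only code: placed in .init.text. It must never be called
 * after init memory has been freed. */
static int __init my_board_setup(void)
{
    /* ...one-time hardware bring-up using probe_order... */
    return 0;
}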
Further Information
Whilst trying to understand all of this, I came across some very useful documentation on how the kernel manages its memory. If you wish to read more, try here, and some information on QEMU here.
First publication date on the Embedded Bits blog: 2011