Turning on an ARM MMU and Living to tell the Tale: Some Theory


In my previous article, “Create your own MLO for a Beagleboard XM”, I wrote some bare-metal code which ran on a BeagleBoard xM as an MLO. I’d like to extend this by running that code with the MMU switched on. I want to write the absolute minimum amount of code required to turn on an ARM MMU and come out the other side in one piece. This post describes the basic principles of operation of an MMU – we’ll come on to writing the code in my next post.

One of the most fundamental tasks of an MMU is to translate virtual addresses into physical addresses. How virtual addresses map onto physical addresses is entirely a matter of software design – the ARM MMU provides great flexibility in this area. To illustrate this, and to demonstrate the capability of these MMUs, I’ve come up with some perfectly valid schemes (though some may at first seem nonsensical):

  • Split the entire virtual address range into 1KB areas (let’s call these areas ‘pages’) which all point to the same 1KB of physical memory, i.e. a many-to-one mapping.
  • Map multiple areas of physical memory to the same area of virtual memory – but only map one of them at a time.
  • Don’t map a region of virtual address space to anything at all – but use it anyway.
  • Map the entire virtual address range to the entire physical address range – in other words, an identity mapping – a type of one-to-one mapping.

However strange these mappings may sound, most operating systems use a combination of them to form the basis of some complex performance optimisations. For example, consider the first scheme.

Quite often there is a need to access memory that is all zeroed (in other words, memset to 0) – this is often the case when allocating memory for users – it’s important not to give users previously used memory, which may contain data that is sensitive or presents security concerns. However, memsetting memory can be time-consuming. And this is where the MMU comes in – when users request memory, the MMU creates a set of mappings specifically for the user which point as many pages of virtual memory as the user requires at a single physical page of zeroed memory (sometimes called the zero page).

The obvious flaw here is that if the user writes to this memory, all the other pages in their mapping (and other users’ mappings) will appear to change too. Fortunately, the MMU also allows us to set permissions (amongst other attributes) which let us make these regions of virtual memory read-only – when the user tries to write, the operating system copies the zero page somewhere else in physical RAM and updates the mappings. This technique is sometimes called ‘copy on write’. More generally, this type of many-to-one mapping is also used to support things like shared libraries and the fork system call – but we can come onto these another time.
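The zero-page and copy-on-write idea can be sketched in plain C. This is purely a host-side simulation, not real MMU code: the “page table” is just an array of pointers, and the hypothetical `page_write()` plays the part of the operating system’s fault handler (all names here are illustrative).

```c
/* Host-side sketch of the zero-page + copy-on-write scheme. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 1024          /* 1KB pages, matching the first scheme */
#define NUM_PAGES 8

static char zero_page[PAGE_SIZE];       /* the single shared zeroed page  */
static char *page_table[NUM_PAGES];     /* "virtual" page -> backing page */
static int  writable[NUM_PAGES];        /* 0 = read-only (still shared)   */

/* Map every virtual page at the shared zero page, read-only. */
void map_zero_pages(void)
{
    for (int i = 0; i < NUM_PAGES; i++) {
        page_table[i] = zero_page;
        writable[i] = 0;
    }
}

/* Write one byte; the first write "faults" and copies the zero page. */
void page_write(int page, int offset, char value)
{
    if (!writable[page]) {              /* the copy-on-write "fault" */
        char *copy = malloc(PAGE_SIZE);
        memcpy(copy, page_table[page], PAGE_SIZE);
        page_table[page] = copy;
        writable[page] = 1;
    }
    page_table[page][offset] = value;
}

char page_read(int page, int offset)
{
    return page_table[page][offset];
}
```

After a write, only the written page gets its own private copy – every other page still shares the untouched zero page, which is exactly why the real scheme is so cheap.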

The second mapping scheme illustrates that mappings can change on the fly. Most operating systems give the illusion that each process has the entire virtual address space – and indeed applications depend on this as they are often linked to run at specific memory addresses.

In order to support this, each time the scheduler lets your process run, it maps that process’s image into the virtual address space in place of the last process’s. Without this, processes would either have to be built as position-independent or would have to have prior knowledge of where in memory they will live and how much memory they can use.
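This per-process remapping can be pictured the same way: each process owns its own first-level translation table, and the “context switch” is simply a change of which table the MMU walks – on ARM, a write to the Translation Table Base Register (TTBR0), plus the required TLB maintenance. In the host-side sketch below (illustrative names, a plain pointer swap stands in for the TTBR0 write) two processes see the same virtual section backed by different physical memory.

```c
/* Host-side sketch: per-process translation tables, switched on a
 * "context switch". Entry format loosely follows an ARM first-level
 * section descriptor: bits[31:20] = 1MB section base, bits[1:0] = 0b10. */
#include <assert.h>
#include <stdint.h>

#define NUM_SECTIONS 4096

static uint32_t table_a[NUM_SECTIONS];   /* process A's table */
static uint32_t table_b[NUM_SECTIONS];   /* process B's table */
static uint32_t *active_table;           /* stands in for TTBR0 */

/* Both processes map virtual section 0, but to different physical RAM. */
void setup_tables(void)
{
    table_a[0] = 0x80000000u | 0x2;      /* A: VA 0 -> PA 0x80000000 */
    table_b[0] = 0x90000000u | 0x2;      /* B: VA 0 -> PA 0x90000000 */
}

/* The "context switch": just repoint the active table. */
void switch_to(uint32_t *table)
{
    active_table = table;
}

/* Look up a virtual address the way the MMU would for a section entry. */
uint32_t translate(uint32_t va)
{
    uint32_t entry = active_table[va >> 20];
    return (entry & 0xFFF00000u) | (va & 0x000FFFFFu);
}
```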

The third mapping scheme is also used by operating systems to improve performance. When something tries to access a virtual address that has no corresponding physical page (or the permissions are invalid) a page fault occurs. The operating system can provide a handler for this and act accordingly.

This can be very useful – for example, when you execute a process in Linux it doesn’t bother to load the image into RAM prior to starting it – it just jumps to the place in RAM where the image should be. Of course, when it does this a page fault occurs – it’s only then that the page is actually loaded into RAM! This provides a performance boost, as not all the pages in a process are always used, and certainly not straight away. This technique is known as demand paging – it’s also used to support page swapping (virtual memory), i.e. allowing a process to use more RAM than is physically available.
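On Linux you can actually watch demand paging happen from userspace: an anonymous `mmap()` reserves virtual pages, but `mincore()` reports them as non-resident until the first touch triggers a page fault and the kernel allocates real memory. A minimal sketch (Linux-specific; exact residency behaviour can vary with kernel settings such as transparent huge pages):

```c
/* Linux-specific sketch: demand paging observed via mincore(). */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page containing addr is resident in RAM, 0 if not.
 * addr must be page-aligned, as mincore() requires. */
int page_is_resident(char *addr)
{
    unsigned char vec = 0;
    if (mincore(addr, 1, &vec) != 0)
        return -1;   /* should not happen for a valid mapping */
    return vec & 1;
}
```

Touching only the first page of a fresh anonymous mapping should make that page resident while its neighbours remain unbacked – the kernel’s page fault handler allocated exactly one page, on demand.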

And finally, the last mapping is what we intend to implement – a very simple mapping where the conversion between virtual and physical addresses can be expressed by a formula (e.g. va = pa) – in other words, an identity mapping.
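An identity mapping can be expressed very compactly with the ARM short-descriptor format: a first-level table of 4096 entries, each describing a 1MB “section” whose physical base equals its virtual base. The sketch below only builds the table as data on a host machine – real code would additionally need to place the table on a 16KB boundary, write its address to TTBR0 and enable the MMU, which is what the next post deals with.

```c
/* Host-side sketch of an ARM short-descriptor first-level table for an
 * identity mapping. Section entry: bits[31:20] = section base address,
 * AP[1:0] at bits[11:10] (0b11 = full access), bits[1:0] = 0b10. */
#include <assert.h>
#include <stdint.h>

#define NUM_SECTIONS   4096          /* 4096 x 1MB covers the 4GB space */
#define SECTION_TYPE   0x2u          /* bits[1:0] = 0b10 -> section     */
#define SECTION_AP_RW  (0x3u << 10)  /* AP[1:0] = 0b11 -> full access   */

static uint32_t l1_table[NUM_SECTIONS];

/* Every section's physical base equals its virtual base: va = pa. */
void build_identity_map(void)
{
    for (uint32_t i = 0; i < NUM_SECTIONS; i++)
        l1_table[i] = (i << 20) | SECTION_AP_RW | SECTION_TYPE;
}

/* Look up a virtual address the way the MMU would for a section entry. */
uint32_t translate(uint32_t va)
{
    uint32_t entry = l1_table[va >> 20];
    return (entry & 0xFFF00000u) | (va & 0x000FFFFFu);
}
```

With sections there is no second-level table at all – the entire 4GB address space is described by a single 16KB table, which is why this is the simplest possible starting point for turning the MMU on.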

There are a few resources (a combined 5330 pages) that may prove essential if you plan to explore this area (with a BeagleBoard xM) – I’d strongly recommend being aware of the following:

    • A.N. Sloss, D. Symes, C. Wright, ARM System Developer’s Guide, Elsevier, 2004 [Available online] – This book provides a very good introduction to the ARM architecture from a software perspective. It includes information, with very good examples, on how to utilize things like exceptions, caches, MMUs, etc.
    • ARM Architecture Reference Manual, ARM, ddi0100e, 2000 [Available online] – This reference manual provides the authoritative description of the ARM architecture.
    • BeagleBoard-xM Rev C System Reference Manual, beagleboard.org, BB_SRM_xM rev1.0, 2010 Available here – Hardware description of the BeagleBoard xM

Want to go further? Discover our next article: Turning on an ARM MMU and Living to tell the tale: The code

Andrew Murray - UK Managing Director
12 January 2018