Demystifying the MMU - Part II
The Story So Far …
In Part I of this article series, we saw the problems OS designers faced when they had to share the computer’s memory between multiple processes. We hinted that a separate hardware unit called the MMU was added to help OS designers solve these problems. In this article, we will look at what the MMU does and how exactly it solves these problems. But before that, let’s take a quick digression into emergency numbers in telephone systems.
To help people understand the MMU, we would like to use the Emergency Number service as an analogy. Every phone has a phone number associated with it, and a call can be made to the phone by dialing that number. Let’s call the number associated with a phone its physical phone number. The following figure shows a bunch of phones and their associated physical phone numbers.
Apart from the regular phone numbers, there are emergency numbers like 100, 101 and 102, for Police, Fire and Ambulance respectively (this is specific to India, but we hope others can relate to it). These numbers do not correspond to a physical phone. When any of these numbers is dialed, the local telephone exchange maps it to a physical phone number and redirects the call to that physical phone. The exchange maintains a table that maps the emergency phone numbers to physical phone numbers. We call these emergency numbers virtual phone numbers, since they are always converted by the local telephone exchange to a physical phone number. A virtual phone number is not directly associated with any particular physical phone.
The advantage of emergency numbers is that, when the user moves from one city to another, the local exchange maps the emergency numbers to the local services of that city. That way the user does not have to remember or look up the physical phone number of each local emergency service.
Key Points to Note
Virtual phone numbers are always converted to physical phone numbers by the exchange using a mapping table.
When the user moves from one city to another, she gets connected to a different exchange, which has a different mapping table that maps the same virtual phone numbers to a different set of physical phone numbers.
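The two key points can be sketched in a few lines of Python. This is only an illustration of the analogy: the city names and the physical phone numbers below are made up, and `dial` is a hypothetical helper.

```python
# Hypothetical mapping tables, one per local exchange.
# The physical phone numbers are invented for illustration.
EXCHANGE_TABLES = {
    "city_a": {"100": "2345-0001", "101": "2345-0002", "102": "2345-0003"},
    "city_b": {"100": "6789-0001", "101": "6789-0002", "102": "6789-0003"},
}

def dial(city, virtual_number):
    """Return the physical number the local exchange redirects the call to."""
    return EXCHANGE_TABLES[city][virtual_number]

# The same virtual number reaches a different physical phone in each city.
police_a = dial("city_a", "100")
police_b = dial("city_b", "100")
print(police_a, police_b)
```

The caller only ever dials 100. Which physical phone rings depends entirely on the table of the exchange she is connected to.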
The CPU and memory are connected using the System Bus, as shown in the following diagram.
The MMU sits between the CPU and System Bus.
With the MMU in place, the addresses generated by the CPU are called Virtual Addresses. The MMU takes these addresses and translates them to Physical Addresses, using a mapping table. Just like the telephone exchange translates virtual phone numbers to physical phone numbers, the MMU translates virtual addresses to physical addresses.
Pages and Page Tables
In the case of Emergency Numbers, only a handful of mappings are required. But in the case of memory, a huge set of mappings would be required if the mapping were done on an address-to-address basis. Hence, the mapping is not done address by address. Instead, the virtual address space is divided into 4 KB chunks called pages, and each virtual page is given a unique page number. The splitting of a 32-bit address space into pages is shown in the following diagram.
Similarly, the physical address space is also split into pages.
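To make the split concrete, here is a minimal sketch of how a 32-bit address breaks down into a page number and an offset within the page, assuming the 4 KB page size used in this article. The function name `split_address` is ours, not part of any real API.

```python
PAGE_SIZE = 4096      # 4 KB pages, as in the article
OFFSET_BITS = 12      # log2(4096): low bits that address a byte within a page
ADDRESS_BITS = 32     # 32-bit address space

def split_address(addr):
    """Split an address into (page number, offset within the page)."""
    page_number = addr >> OFFSET_BITS
    offset = addr & (PAGE_SIZE - 1)
    return page_number, offset

# Number of pages in a 32-bit address space with 4 KB pages.
num_pages = 2 ** (ADDRESS_BITS - OFFSET_BITS)

print(split_address(0x0000_1ABC))  # (1, 2748): page 1, offset 0xABC
print(num_pages)                   # 1048576
```

The same split applies to physical addresses, since physical pages have the same size.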
Now, the mappings are specified on a page-to-page basis, as shown in the following diagram. This table that maps virtual pages to physical pages is called the page table.
With the above mapping in place, the virtual address 0x0000_0000 (first address of virtual page 0) will be translated to the physical address 0x0000_1000 (first address of physical page 1). Further examples of translations from virtual to physical addresses are shown in the following table.
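A page-to-page translation can be sketched as follows. Only the mapping of virtual page 0 to physical page 1 comes from the example above. The other entries in `page_table` are hypothetical, and `translate` is a toy model of what the MMU does in hardware.

```python
PAGE_SIZE = 4096  # 4 KB pages

# virtual page number -> physical page number
# 0 -> 1 is the example from the text; the other entries are made up.
page_table = {0: 1, 1: 5, 2: 3}

def translate(vaddr, table):
    """Translate a virtual address using a page-to-page mapping."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # split into page number and offset
    return table[vpn] * PAGE_SIZE + offset   # swap the page number, keep the offset

print(hex(translate(0x0000_0000, page_table)))  # 0x1000
print(hex(translate(0x0000_1004, page_table)))  # 0x5004
```

Note that the offset within the page passes through unchanged. Only the page number is looked up.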
Using the MMU
Now that we know how the MMU functions, let’s see how the OS utilizes the MMU.
When a process P1 is started, the OS loads the corresponding code and data into available physical pages. It then creates page table entries that map a set of virtual pages to those physical pages, as shown in the figure below. When the CPU generates virtual addresses in the mapped range ending at 0x0000_2000, it ends up accessing P1’s code and data.
When another process P2 is started, the existing page table entries are cleared. The OS loads P2’s code and data into available physical pages. It then creates page table entries that map a set of virtual pages to those physical pages, as shown in the figure below. When the CPU generates virtual addresses in the mapped range ending at 0x0000_2000, it ends up accessing P2’s code and data.
It should be noted that the OS gives each process its own page table. When the OS switches to a process, that process’s page table entries are loaded, and the corresponding mapping takes effect. The same virtual address 0x0000_1000 is mapped to a different physical address depending upon which process is executing.
Process P1: VA 0x0000_1000 is mapped to a physical address in P1’s physical pages
Process P2: VA 0x0000_1000 is mapped to a physical address in P2’s physical pages
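The effect of switching page tables can be sketched as below. The physical page numbers in `page_tables` are hypothetical, and `ToyMMU` is a simplified model that we made up for illustration, not how any particular processor exposes its MMU.

```python
PAGE_SIZE = 4096  # 4 KB pages

# One page table per process; the physical page numbers are made up.
page_tables = {
    "P1": {0: 2, 1: 3},
    "P2": {0: 6, 1: 7},
}

class ToyMMU:
    """Translates addresses using whichever page table is currently loaded."""

    def __init__(self):
        self.active_table = None

    def switch_to(self, process_name):
        # On a process switch, the OS makes that process's table the active one.
        self.active_table = page_tables[process_name]

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.active_table[vpn] * PAGE_SIZE + offset

mmu = ToyMMU()
mmu.switch_to("P1")
pa_p1 = mmu.translate(0x0000_1000)
mmu.switch_to("P2")
pa_p2 = mmu.translate(0x0000_1000)
print(hex(pa_p1), hex(pa_p2))  # 0x3000 0x7000
```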
This is similar to moving from one city to another city and dialing the same emergency number. Even though the emergency number is the same, since the mapping has changed, the user ends up calling a different physical phone number.
Note that it is not necessary to have a mapping for every virtual page. If the CPU generates a virtual address that falls in an unmapped virtual page, the MMU raises a Page Fault interrupt to the CPU.
The OS handles the page fault interrupt, and will generally terminate the offending process. That is when the user sees the notorious "Segmentation Fault" message on the terminal.
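Here is a rough sketch of that behaviour, with a Python exception standing in for the hardware page fault. The names `PageFault`, `translate` and `access`, and the contents of `p1_table`, are all hypothetical.

```python
PAGE_SIZE = 4096  # 4 KB pages

class PageFault(Exception):
    """Stands in for the fault the MMU raises on an unmapped virtual page."""

def translate(vaddr, table):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in table:
        raise PageFault(hex(vaddr))   # no mapping for this virtual page
    return table[vpn] * PAGE_SIZE + offset

def access(process_name, vaddr, table):
    """Toy OS handler: a page fault terminates the offending process."""
    try:
        return hex(translate(vaddr, table))
    except PageFault:
        return f"{process_name}: Segmentation fault"

p1_table = {0: 2, 1: 3}  # made-up mapping for P1
print(access("P1", 0x0000_1000, p1_table))  # 0x3000
print(access("P1", 0x0040_0000, p1_table))  # P1: Segmentation fault
```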
In this section, we will look at how the issues discussed in Part I of the article are resolved.
Let’s reconsider the previous scenario now, with page tables in place. There are two processes, P1 and P2. If P1 generates virtual addresses in the mapped range ending at 0x0000_2000, it ends up accessing its own code and data. If P1 generates a virtual address outside this range, it results in a page fault and P1 will be terminated by the OS. The same applies to P2 as well.
There is no way for a process to generate an address that results in another process’s code and data being accessed. In essence, the OS, with the help of the MMU, has narrowed each process’s view of physical memory to its own code and data.
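One way to see this isolation is to list the physical pages each process can reach through its own page table. The tables below are hypothetical, and `reachable_physical_pages` is a helper we made up for the sketch.

```python
# One page table per process; the physical page numbers are made up.
page_tables = {
    "P1": {0: 2, 1: 3},
    "P2": {0: 6, 1: 7},
}

def reachable_physical_pages(table):
    """A process can only touch the physical pages its page table maps."""
    return set(table.values())

p1_pages = reachable_physical_pages(page_tables["P1"])
p2_pages = reachable_physical_pages(page_tables["P2"])

# No physical page is reachable from both processes.
print(p1_pages & p2_pages)  # set()
```

As long as the OS never maps the same physical page into both tables, no virtual address P1 can generate will land in P2’s memory.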
Let’s try to understand how the problem of fragmentation is solved. Suppose there is no contiguous block of physical memory large enough to hold a process. The OS can slice the process into pieces and load the pieces into whatever physical memory locations are available. The OS then maps these non-contiguous pieces to contiguous locations in the virtual address space. Even though the process is non-contiguous in the physical address space, it is contiguous in the virtual address space.
Let’s try to understand this better with an array example. Let’s say an array is sliced right through the middle, and the two halves are placed at different addresses in the physical address space. The program generates contiguous virtual addresses to access the array, and the MMU translates them to the corresponding non-contiguous locations in physical memory. In fact, the process does not even realize that it has been placed in non-contiguous physical memory locations. The OS, with the help of the MMU, is able to hide the fact that the process is scattered across physical memory!
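The array example can be simulated with a small toy model. Physical memory is a `bytearray`, and the page table deliberately maps two adjacent virtual pages to physical pages that are far apart. The page numbers and the helper names (`store`, `load`) are hypothetical.

```python
PAGE_SIZE = 4096  # 4 KB pages

physical_memory = bytearray(8 * PAGE_SIZE)   # toy physical memory: 8 pages
page_table = {0: 5, 1: 2}                    # adjacent virtual pages, scattered physical pages

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset

def store(vaddr, value):
    physical_memory[translate(vaddr)] = value

def load(vaddr):
    return physical_memory[translate(vaddr)]

# An 8-byte array that straddles the boundary between virtual pages 0 and 1.
array_base = PAGE_SIZE - 4
for i in range(8):
    store(array_base + i, i + 1)

values = [load(array_base + i) for i in range(8)]
physical = [translate(array_base + i) for i in range(8)]
print(values)    # [1, 2, 3, 4, 5, 6, 7, 8]
print(physical)  # [24572, 24573, 24574, 24575, 8192, 8193, 8194, 8195]
```

The program walks eight consecutive virtual addresses, while the bytes actually live in two separate places in physical memory.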
Note that all the addresses used within a program are virtual addresses. A program generally does not deal with physical addresses at all.
When a process is swapped out and later swapped in, it is likely to be relocated to a different address. Let’s try to understand how the problem of relocation is solved. The following diagram shows a scenario in which process P2 is swapped out to disk and swapped in to a different physical address. When a process is swapped in to a different physical address, the page table mappings are updated so that the virtual addresses point to the new set of physical addresses.
Since all the pointers within a program hold virtual addresses, and since the virtual addresses have not changed during relocation, the program continues to execute without realizing that it has been relocated in the physical address space!
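A toy version of swap-out and swap-in is sketched below. Physical pages are modelled as a dictionary of page contents, and all page numbers, contents and function names are hypothetical.

```python
PAGE_SIZE = 4096  # 4 KB pages

# Toy model: physical page number -> contents of that page.
physical_pages = {4: "P2 code", 5: "P2 data"}
page_table_p2 = {0: 4, 1: 5}   # P2's virtual pages 0 and 1
disk = {}                      # swapped-out pages, keyed by virtual page number

def read(vaddr):
    """What P2 sees at a virtual address (page granularity only)."""
    vpn = vaddr // PAGE_SIZE
    return physical_pages[page_table_p2[vpn]]

def swap_out():
    for vpn, ppn in list(page_table_p2.items()):
        disk[vpn] = physical_pages.pop(ppn)   # move contents to disk
        del page_table_p2[vpn]                # mapping no longer valid

def swap_in(free_pages):
    for vpn in sorted(disk):
        ppn = free_pages.pop(0)               # may differ from the old page
        physical_pages[ppn] = disk.pop(vpn)
        page_table_p2[vpn] = ppn              # update the mapping

before = read(0x0000_1000)
swap_out()
swap_in(free_pages=[8, 9])
after = read(0x0000_1000)
print(before, after, page_table_p2)  # P2 data P2 data {0: 8, 1: 9}
```

The virtual address 0x0000_1000 returns the same contents before and after, even though the physical page behind it has changed.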
Translation Look-Aside Buffer (TLB)
Even though the page table maps only on a page-to-page basis, the number of page table entries is still quite large. For a 32-bit address space with a 4 KB page size, 1,048,576 entries are required. Providing enough memory within the MMU to store all these page table entries is not feasible in practice.
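The entry count follows directly from the address width and the page size. The sketch below also estimates the size of a flat table. The 4-byte entry size is our assumption for illustration, since the article does not specify one.

```python
ADDRESS_BITS = 32   # 32-bit virtual address space, as in the article
PAGE_SIZE = 4096    # 4 KB pages
ENTRY_SIZE = 4      # bytes per page table entry: an assumption for illustration

num_entries = 2 ** ADDRESS_BITS // PAGE_SIZE
table_bytes = num_entries * ENTRY_SIZE

print(num_entries)             # 1048576
print(table_bytes // 2 ** 20)  # 4 (MiB for one flat table, under the assumption above)
```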
So processors provide space for only a small number of page table entries, say 512 or 1024, in a structure called the Translation Look-Aside Buffer (TLB). The actual mapping is stored in RAM, using a data structure optimized for space and lookup. When the MMU needs to do a translation, it first looks in the TLB for a mapping. If the mapping is not found, the required entry is fetched from the data structure in memory into the TLB, and the translation is retried.
In summary, the entire page table is not stored within the MMU. Rather, a cache called the TLB is used to store recently accessed page table entries, and the required page table entries are loaded into the TLB on demand.
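The TLB’s role as a cache can be sketched as follows. This is a toy model: the in-memory page table is a plain dictionary, the capacity is tiny so that evictions are visible, and the least-recently-used replacement policy is our assumption. Real processors differ in how they pick the entry to replace.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # 4 KB pages

# Stand-in for the full page table kept in RAM (made-up mappings).
page_table = {vpn: vpn + 100 for vpn in range(8)}

class ToyTLB:
    """Small cache of recently used page table entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # vpn -> ppn, oldest use first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)          # mark as recently used
        else:
            self.misses += 1
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)   # evict least recently used (assumed policy)
            self.entries[vpn] = page_table[vpn]    # fetch the entry from memory
        return self.entries[vpn]

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.lookup(vpn) * PAGE_SIZE + offset

tlb = ToyTLB(capacity=2)
for vpn in [0, 0, 1, 0, 2, 1]:
    tlb.translate(vpn * PAGE_SIZE)
print(tlb.hits, tlb.misses)  # 2 4
```

Repeated accesses to the same page hit in the TLB, while a page that has been evicted has to be fetched from the in-memory table again.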
Through this two-part article series, we hope we have achieved the following:
Highlight the problems leading to the inclusion of the MMU
Explain how the OS uses it to overcome them
Explain the functionality of the MMU through models of increasing complexity
Please feel free to share your thoughts in the comments section below.