Memory management on the x86

The development of the x86 family of processors has seen two major memory management techniques, real mode and protected mode, both of which are based on the principle of memory segmentation.

At the very least, the next time your computer crashes and you see those weird memory locations and CPU register dumps on the blue screen of death, you'll know what they mean.

Memory Segmentation

Memory is divided into blocks of bytes called segments. Each byte in a segment is indexed by its offset. The segment number and the offset are typically written separated by a colon. For example, 0010:00123456 refers to byte number 0x123456 (1193046 decimal) of segment number 0x10 (16 decimal).
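As a small illustration, the notation can be taken apart mechanically. The sketch below assumes both fields are hexadecimal, as in the example above; the function name parse_seg_off is made up for this illustration.

```python
def parse_seg_off(text):
    """Split a 'SEGMENT:OFFSET' string (both fields in hex) into two integers."""
    segment, offset = text.split(":")
    return int(segment, 16), int(offset, 16)


segment, offset = parse_seg_off("0010:00123456")
print(segment, offset)  # 16 1193046
```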

The pair SEGMENT:OFFSET is built from two kinds of registers: SEGMENT is a segment register (CS, DS, ES, FS, GS, SS) and OFFSET is usually a general purpose register (AX/EAX, BX/EBX, CX/ECX, DX/EDX, BP/EBP, SP/ESP...) or, for code, the instruction pointer IP/EIP. Each segment register has a specific task. The CS register is called the Code Segment register and is used in conjunction with the Instruction Pointer IP register to point to the instruction that is being executed (CS:IP). Other segment registers include the Data Segment DS register, the Stack Segment SS register and the Extra Segment ES register. The SS register is used with the Stack Pointer SP register to point to the top of the stack (SS:SP). Apart from these two pairings, the OFFSET can come from almost any general purpose register (16-bit code is more restrictive than 32-bit code about which ones may be used).

Say a program's entry point is loaded at memory location 0010:00123456. The processor branches to this point to run the program (CS=0010 and EIP=00123456). Once it has executed, say, a 2-byte instruction, it steps to the next one at 0010:00123458, and so on.

Splitting memory into different segments is a good idea, but physical memory is in fact a linear space, and memory segmentation was invented not because it was flexible but because registers had become too small. Physical addresses start at byte 0 and end at byte MAX_MEM-1. The first problem appeared when MAX_MEM-1 became greater than the capacity of the offset registers. With a 16-bit register, you can't index more than 2^16 bytes = 65536 bytes = 64 KB.
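The arithmetic behind that limit is easy to check. The snippet below is only an illustration: addressable_bytes is a made-up helper, and the masking line imitates how a 16-bit register wraps around when it is incremented past its largest value.

```python
def addressable_bytes(register_bits):
    """Number of distinct byte addresses a register of this width can hold."""
    return 2 ** register_bits


print(addressable_bytes(16))          # 65536
print(addressable_bytes(16) // 1024)  # 64 (KB)

# A 16-bit register cannot count past 0xFFFF: adding 1 wraps back to 0.
print((0xFFFF + 1) & 0xFFFF)          # 0
```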

16-bit Real mode Memory Segmentation

Chronologically, the first memory model to appear was 16-bit real mode segmentation. Starting with the 8086, the address bus was composed of 20 wires. In other words, the 8086 could address up to 2^20 bytes of memory (= 1 MB). 16-bit registers were not enough to cover that range on their own, so memory segmentation was introduced.

A 16-bit segment has a fixed size of 64 KB. When the processor sees a SEGMENT:OFFSET memory location, it has to translate it into the corresponding physical memory location. For example, when it has to execute the instruction MOV AX, [DS:BX] (copy the contents of memory location DS:BX to register AX), it has to translate DS:BX to the corresponding physical address φ, send the command 'read the word at physical address φ' to the memory circuit, and store the result in AX.

Translation is done as follows :

    φ = SEGMENT * 0x10 + OFFSET

The insightful reader will have noticed that different SEGMENT:OFFSET combinations can represent the same physical address. That is perfectly correct: 0000:0010 and 0001:0000 both refer to byte number 16 (0x10) of memory. At that time, memory sure was a really big mess.
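The translation rule is simple enough to try out. The sketch below applies the formula literally; real_mode_physical is a name chosen for this illustration, and the function does not model what the hardware does when the sum needs more than 20 bits.

```python
def real_mode_physical(segment, offset):
    """Apply phi = SEGMENT * 0x10 + OFFSET to 16-bit segment and offset values."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    return segment * 0x10 + offset


# Two different SEGMENT:OFFSET pairs, one physical address.
print(hex(real_mode_physical(0x0000, 0x0010)))  # 0x10
print(hex(real_mode_physical(0x0001, 0x0000)))  # 0x10

# Segment A000 starts at physical address 0xA0000 (640 KB).
print(hex(real_mode_physical(0xA000, 0x0000)))  # 0xa0000
```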

32-bit Protected mode Memory Segmentation

When the 80386 appeared, Intel introduced 32-bit protected mode. (A more primitive, 16-bit version of protected mode had already been introduced with the 80286.) Firstly, the old 20-bit address bus was getting too small (despite many computer engineers thinking that "640K ought to be enough for anybody") and was extended to 32 bits, after an intermediate 24 bits on the 80286. Secondly, segments could now have whatever size was needed and could have their base address anywhere in memory.

Segment Descriptors and Segment Selectors

A segment can start at an arbitrary physical address and have any given length. This information (along with the segment's type (code, data, stack), its privilege level 0-3...) is stored in a data structure called the segment descriptor. Segment descriptors are stored in a table called the Global Descriptor Table (GDT). A new register, GDTR, was introduced to hold the address at which the GDT can be found.

Segment registers are now said to hold segment selectors, because their value no longer maps directly to a physical address but points to an entry of the descriptor table. When the processor needs to translate a memory location SEGMENT:OFFSET to its corresponding physical address φ, it takes the following steps:

  1. Find the start of the descriptor table (GDTR register)
  2. Find the entry of the table designated by SEGMENT (the upper 13 bits of the selector are the entry index); this is the segment descriptor corresponding to the segment SEGMENT.
  3. Find the base physical address ψ of the segment
  4. Compute φ = ψ + OFFSET

Of course, the CPU also takes the additional steps needed to ensure that the offset lies within the segment's bounds, along with other checks on whether the segment may be read or written. If any test fails, it raises an exception (typically a general protection fault).
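The lookup and the checks can be sketched in a few lines. This is a simplified model, not how any particular OS or CPU implements it: SegmentDescriptor, ProtectionFault and translate are names invented for the illustration, the descriptor holds only three of the real fields, and the table contents are made up. It assumes the table index sits in the upper 13 bits of the selector and that entry 0 of the table is never used, as on real hardware.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Stands in for the exception the CPU raises when a check fails."""


@dataclass
class SegmentDescriptor:
    base: int       # physical address where the segment starts (psi)
    limit: int      # highest valid offset inside the segment
    writable: bool  # whether writes through this segment are allowed


def translate(gdt, selector, offset, write=False):
    """Translate SEGMENT:OFFSET to a physical address using a descriptor table."""
    index = selector >> 3  # drop the 3 low bits to get the table index
    if index == 0 or index >= len(gdt):
        raise ProtectionFault("selector does not name a usable descriptor")
    descriptor = gdt[index]
    if offset > descriptor.limit:
        raise ProtectionFault("offset lies outside the segment")
    if write and not descriptor.writable:
        raise ProtectionFault("segment is not writable")
    return descriptor.base + offset  # phi = psi + OFFSET


# Hypothetical table: entry 0 unused, entry 1 code, entry 2 data.
gdt = [
    None,
    SegmentDescriptor(base=0x00100000, limit=0xFFFF, writable=False),
    SegmentDescriptor(base=0x00400000, limit=0xFFFFFF, writable=True),
]

print(hex(translate(gdt, 0x10, 0x123456)))  # 0x523456
```

With this table, selector 0x10 picks entry 2, so 0010:00123456 lands at 0x00400000 + 0x123456. Asking for an offset beyond the limit, or for a write through the code segment at selector 0x08, raises ProtectionFault instead of returning an address.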

At boot time, x86 processors (this also applies to the latest Pentium) start in real mode. One of the first things a modern OS does is run a small piece of startup code that switches the processor to protected mode.

Protected Mode's strengths

Memory segmentation under protected mode is very flexible and offers many advantages. Unfortunately not many operating systems take advantage of them.

Because the segment selector is not directly linked to the physical address, it is possible for the OS to move the entire segment to another location without the application noticing it. This is a very useful feature because memory tends to get fragmented and needs to be cleaned up.
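A toy model makes the point concrete. Everything here is hypothetical (a 64-byte bytearray standing in for physical memory, a dictionary standing in for the descriptor table, and the helper names); it only shows that an access through the selector returns the same byte before and after the segment is moved.

```python
memory = bytearray(64)                  # pretend physical memory
gdt = {0x08: {"base": 0, "limit": 3}}   # selector -> descriptor (toy layout)


def read_byte(selector, offset):
    """What the application does: access SEGMENT:OFFSET through the table."""
    descriptor = gdt[selector]
    if offset > descriptor["limit"]:
        raise IndexError("offset lies outside the segment")
    return memory[descriptor["base"] + offset]


def move_segment(selector, new_base):
    """What the OS does: copy the bytes, then update only the base address."""
    descriptor = gdt[selector]
    size = descriptor["limit"] + 1
    old_base = descriptor["base"]
    memory[new_base:new_base + size] = memory[old_base:old_base + size]
    descriptor["base"] = new_base


memory[0:4] = b"DATA"
before = read_byte(0x08, 2)
move_segment(0x08, 32)
after = read_byte(0x08, 2)
print(before == after)  # True
```

The application keeps using selector 0x08 and offset 2 throughout; only the base stored in the descriptor has changed.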

Segments also have access rights (this accounts for the word 'protected'): they are given a privilege level, ranging from ring 0 (the most privileged) to ring 3 (the least privileged). Code running in a ring n segment can't access ring k segments where k < n.
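The ring rule boils down to a single comparison. The function below is a deliberately reduced sketch with a made-up name; the real check involves further details that are left out here.

```python
def can_access(current_ring, segment_ring):
    """Code in ring n may touch a ring k segment only if k is not below n."""
    return segment_ring >= current_ring


print(can_access(3, 0))  # False: an application cannot reach a ring 0 segment
print(can_access(0, 3))  # True: ring 0 code can reach a ring 3 segment
print(can_access(3, 3))  # True: same ring
```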

Segments also have flags that say whether they can be written, read and executed. For example, a code segment is never writable and can be made unreadable as well, which prevents an application from modifying code through it. But under Linux the stack segment can contain executable code. This can be considered a security weakness. It is possible to have more restrictive rights. Under Solaris, for example, the stack can be made non-executable.

Advanced Memory Management

With a 20-bit address bus, computers were limited to 1 MB of RAM. But the system reserved the upper 384 KB of it (for example, the entire segment A000 was used to store what was displayed on the screen), leaving only 640 KB to the user. This is the famous 640 KB limit.

The Expanded Memory Specification (EMS) was introduced to allow applications to use more than 1 MB of memory. The memory manager defines a window located below 1 MB that maps to a variable location of the same size in the additional memory. More than 1 MB of memory could be installed in the computer, but applications could still address only 640 KB, plus whatever the window currently showed, at any one time.
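The window mechanism can be imitated with a list of pages and a "currently mapped" index. This is a toy model under stated assumptions, not the EMS programming interface: the class, its method names and the single 16 KB window are all choices made for the illustration.

```python
PAGE_SIZE = 16 * 1024  # size of the window, chosen for this sketch


class ExpandedMemory:
    """A single window below 1 MB that can show any one page at a time."""

    def __init__(self, page_count):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(page_count)]
        self.mapped = 0  # index of the page currently visible in the window

    def map_page(self, page_number):
        """Point the window at a different page of the extra memory."""
        if not 0 <= page_number < len(self.pages):
            raise IndexError("no such page")
        self.mapped = page_number

    def read(self, offset):
        return self.pages[self.mapped][offset]

    def write(self, offset, value):
        self.pages[self.mapped][offset] = value


ems = ExpandedMemory(page_count=128)  # 128 * 16 KB = 2 MB behind one window
ems.write(0, 1)        # lands in page 0
ems.map_page(5)
ems.write(0, 2)        # same window offset, different page
ems.map_page(0)
print(ems.read(0))     # 1
```

The application always uses the same window offsets; which bytes it reaches depends on the page the memory manager has mapped in.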

With the 80286, the address bus was enlarged and the EMS technology was replaced by the eXtended Memory Specification (XMS), which took advantage of protected mode to achieve the same thing as EMS.

Closing Words...

Good memory management involves good knowledge of the base system and of the habits of programmers. Protected mode has been a great improvement since its introduction. However, it is a real pity that so few operating systems take full advantage of its features.

See memory allocation for an explanation on how applications deal with their memory needs.

http://www.intel.com/ Intel processor references