The 80386 is a 32-bit processor, with a 32-bit addressable memory space. The designers of the Paging subsystem noted that a 4K page design mapped into those 32 bits in quite a neat way - 10 bits, 10 bits and 12 bits:
+-----------+------------+------------+
| Dir index | Page index | Byte index |
+-----------+------------+------------+
3 2 2 1 1 0 Bit
1 2 1 2 1 0 number
That meant that the Byte index was 12 bits wide, which would index into a 4K Page. The Directory and Page indexes were 10 bits, which would each map into a 1,024-entry table - and if those table entries were each 4 bytes, that would be 4K per table: also a Page!
So that's what they did:
Both the top-level Directory and the next-level Page Table is comprised of 1,024 Page Entries. The most important part of these entries is the address of what it is indexing: a Page Table or an actual Page. Note that this address doesn't need the full 32 bits - since everything is a Page, only the top 20 bits are significant. Thus the other 12 bits in the Page Entry can be used for other things: whether the next level is even present; housekeeping as to whether the page has been accessed or written to; and even whether writes should even be allowed!
+--------------+----+------+-----+---+---+
| Page Address | OS | Used | Sup | W | P |
+--------------+----+------+-----+---+---+
Page Address = Top 20 bits of Page Table or Page address
OS = Available for OS use
Used = Whether this page has been accessed or written to
Sup = Whether this page is Supervisory - only accessible by the OS
W = Whether this page is allowed to be Written
P = Whether this page is even Present
Note that if the P
bit is 0, then the rest of the Entry is allowed to have anything that the OS wants to put in there - such as where the Page's contents mught be on the hard disk!
PDBR
)If each program has its own Directory, how does the hardware know where to start mapping? Since the CPU is only running one program at a time, it has a single Control Register to hold the address of the current program's Directory. This is the Page Directory Base Register (CR3
). As the OS swaps between different programs, it updates the PDBR
with the relevant Page Directory for the program.
Every time the CPU accesses memory, it has to map the indicated virtual address into the appropriate physical address. This is a three-step process:
PDBR
to get the address of the appropriate Page Table;Because both steps 1. and 2. above use Page Entries, each Entry could indicate a problem:
When such a problem is noted by the hardware, instead of completing the access it raises a Fault: Interrupt #14, the Page Fault. It also fills in some specific Control Registers with the information of why the Fault occurred: the address referenced; whether it was a Supervisor access; and whether it was a Write attempt.
The OS is expected to trap that Fault, decode the Control Registers, and decide what to do. If it's an invalid access, it can terminate the faulting program. If it's a Virtual Memory access though, the OS should allocate a new Page (which may need to vacate a Page that is already in use!), fill it with the required contents (either all zeroes, or the previous contents loaded back from disk), map the new Page into the appropriate Page Table, mark it as present, then resume the faulting instruction. This time the access will progress successfully, and the program will proceed with no knowledge that anything special happened (unless it takes a look at the clock!)