How exactly is a page table used to look up an address?
The CPU has a page table base register (PTBR)which points to the base (entry 0) of the level-0 page table. Each process has its own page table, and so in a context switch, the PTBR is updated along with the other context registers. The PTBR contains a physical address, not a virtual address.When theMMU receives a virtual address which it needs to translate to a physical address, it uses the PTBR to go to the the level-0 page table. Then it uses the level-0 index fromthemost-signi?cant bits (MSBs) of the virtual address to ?nd the appropriate table entry, which contains a pointer to the base address of the appropriate level-1 page table. Then, from that base address, it uses the level-1 index to ?nd the appropriate entry. In a 2-level page table, the level-1 entry is a PTE, and points to the physical page itself. In a 3-level (or higher) page table, there would be more steps:
This sounds pretty slow: N page table lookups for everymemory access. But is it necessarily slow? A special cache called a TLB1 caches the PTEs from recent lookups, and so if a page's PTE is in the TLB cache, this improves a multi-level page table access time down to the access time for a single-level page table.
When a scheduler switches processes, it invalidates all the TLB entries (also known as TLB shoot- down). The new process then starts with a "cold cache" for its TLB, and takes a while for the TLB to "warm up". The scheduler therefore should not switch too frequently between processes, since a "warm" TLB is critical to making memory accesses fast. This is one reason that threads are so useful: switching threads within a process does not require the TLB to be invalidated; switching to a new thread within the same process lets it start up with a "warm" TLB cache right away. So what are the drawbacks of TLBs? The main drawback is that they need to be extremely fast, fully associative caches. Therefore TLBs are very expensive in terms of power consumption, and have an impact on chip real estate, and increasing chip real estate drives up price dramatically. The TLB can account a signi?cant fraction of the total power consumed by a microprocessor, on the order of 10% or more. TLBs are therefore kept relatively small, and typical sizes are between 8 and 2048 entries.