Page Table Structure and Hardware Support:
Different operating systems have different methods for implementing page tables.
Most allocate one page table per process. A pointer to the page table of each
process is maintained, along with all of the other register values associated
with the process, and stored in its PCB.
Reloading these register values and defining the correct hardware page table
values from the stored process/user page table is part of a context-switch.
The actual implementation of the page table can be accomplished in several
different ways. The simplest technique involves defining a dedicated set of
registers to be used as the page table. Since every memory access requires going
through the page table, these registers need very high-speed logic associated
with them to make the address translation efficient. This register technique
works reasonably well if the page table is relatively small.
Most modern systems allow extremely large page tables, e.g. a million entries or
more. For such machines, the register implementation of the page table is not
feasible, primarily due to cost considerations. Instead of registers, such
machines keep the page table in main memory and use a page table base register (PTBR)
to maintain a pointer to the page table. Changing page tables during a context
switch requires changing only the value in this register rather than physically
loading a large number of registers.
This approach has an inherent problem: the time required to access a logical
memory address. To access logical address n, the page table entry must first be
read from main memory, at the location given by the PTBR offset by the logical
page number in which n is located. This requires a memory access, and it yields
the physical frame number in which the logical page holding n is currently
located. A second memory access then fetches the datum at that physical address.
This scheme therefore requires two memory accesses for every logical address
generated by the CPU, doubling the effective memory access time. Such an
increase in the time required to perform a memory access is not tolerable.
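The two-access scheme just described can be sketched as follows; this is a
minimal illustrative model, not a prescribed implementation, and the page size,
names, and page-table contents are assumptions for the example:

```python
PAGE_SIZE = 4096  # assumed bytes per page/frame

def translate(page_table, logical_addr):
    """Return (physical_addr, memory_accesses_needed) for one logical address."""
    page = logical_addr // PAGE_SIZE
    offset = logical_addr % PAGE_SIZE
    frame = page_table[page]           # access 1: read the page table entry in memory
    physical_addr = frame * PAGE_SIZE + offset
    return physical_addr, 2            # access 2 fetches the datum itself

page_table = {0: 7, 1: 3}              # logical page -> physical frame (illustrative)
addr, accesses = translate(page_table, 1 * PAGE_SIZE + 100)
# addr == 3 * 4096 + 100 == 12388; accesses == 2
```

In this model the PTBR is implicit: `page_table` stands for the table the PTBR
points at, and switching processes would mean handing a different table to
`translate`, exactly as changing only the register value does in hardware.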
The standard solution to this problem is to use content-associative memory
(CAM), also called content-addressable memory, associative registers, or
translation look-aside buffers (TLBs). CAM is built from extremely high-speed
memory in which each cell (these can be thought of as registers) consists of two
parts: a key and a value. When the CAM is presented with an item to match, that
item is compared with all the keys simultaneously; if one of the cells' keys
matches the item, that cell's value component is output.
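The match step can be sketched as below. In hardware all key comparisons happen
at once; the loop here only simulates that parallelism, and the function name
and cell contents are illustrative assumptions:

```python
def cam_lookup(cells, item):
    """cells: list of (key, value) pairs. Returns the matching value, else None."""
    for key, value in cells:   # in real CAM hardware, all comparisons occur in parallel
        if key == item:
            return value
    return None                # no key matched: a miss

# Illustrative cells: (logical page number, physical frame number) pairs
cells = [(5, 17), (9, 3), (2, 40)]
```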
When used as a page table, the CAM is presented with an item to match that
represents a logical page number. Each cell in the CAM represents one page table
entry where the value part of the cell holds the physical frame number in which
the logical page currently resides; this is the value output by the cell. If
the logical page number is found in the CAM, its frame number becomes
immediately available and is used to access the physical memory. While this type
of memory is quite expensive, it is also extremely fast. Typically a CAM used
for such purposes contains between 8 and 2048 cells. If the page number is not
in the CAM, then a memory reference to the page table (in memory) must be made.
Once the frame number is obtained from memory, it is used to complete the
translation and perform the second memory access. This page number and frame
number are then added to the CAM so that on the next request the translation
will be found there.
If the CAM is already full, the operating system must select a CAM entry for
removal so that the new one can be entered. The operating system uses a CAM
entry replacement policy as the basis for this decision. Each context switch will
require that the CAM be flushed to ensure that the next process does not use the
translation information left behind by the process just switched out.
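The behavior just described — lookup, insertion with replacement when the CAM is
full, and a flush on each context switch — can be modeled with a small sketch.
The FIFO replacement policy, the capacity, and all names here are illustrative
assumptions, not a prescribed design:

```python
from collections import OrderedDict

class TLB:
    """Toy CAM/TLB model: logical page -> physical frame, FIFO replacement."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = OrderedDict()          # page -> frame, in insertion order

    def lookup(self, page):
        return self.entries.get(page)         # None signals a CAM miss

    def insert(self, page, frame):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry (assumed FIFO policy)
        self.entries[page] = frame

    def flush(self):
        self.entries.clear()                  # performed on every context switch
```

On a miss, the memory-resident page table would be consulted and the resulting
pair passed to `insert`; `flush` models the wipe that prevents one process from
using another's translations.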
Each logical address request generated by the CPU whose translation information
is in the CAM at the time of the request is called a CAM hit. The percentage of
requests for which this occurs is called the CAM hit ratio. An 86% hit ratio
means that 86% of the time
the necessary translation information is in the CAM. For example, if it takes 15
nanoseconds to search the CAM and a memory access requires 100 nanoseconds, then
the mapped memory access requires a total of 115 nanoseconds if there is a CAM
hit. If there is a CAM miss on the logical address request, then the total time
required will be 215 nanoseconds, since two memory accesses are required in
addition to the CAM search time. Assume negligible time to add the new entry to
the CAM, although in reality this time is not negligible. To find the effective
memory access time (in effect, the average access time under these conditions),
each case must be weighted by its probability, which gives:
Effective memory access time = (0.86 x 115) + (0.14 x 215) = 98.9 + 30.1 = 129
Thus, the effective memory access time is 129 nanoseconds, which represents an
approximately 29% slowdown relative to the unmapped 100-nanosecond memory
access.
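The calculation above can be written as a short sketch (the function name and
parameter names are illustrative):

```python
def effective_access_time(hit_ratio, cam_search_ns, memory_ns):
    """Weighted average of the hit and miss cases."""
    hit_time = cam_search_ns + memory_ns        # CAM search + one memory access
    miss_time = cam_search_ns + 2 * memory_ns   # CAM search + page table + datum
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time

# Values from the text: 86% hit ratio, 15 ns CAM search, 100 ns memory access
eat = effective_access_time(0.86, 15, 100)      # approximately 129 ns
```

Varying `hit_ratio` in this sketch shows directly why a larger CAM, with its
higher hit ratio, pulls the effective time back toward the single-access cost.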
The hit ratio is related to the number of cells in the CAM. When the number of
cells ranges between 16 and 512, a hit ratio of between 80% and 98% can be
achieved. Intel's 80486 chip uses 32 cells. The following diagram shows the
address translation that occurs when using a CAM to speed up page table look-up.