In accord with the beautification principle, paging makes the main memory of the computer look more “beautiful” in several ways.
· It gives each process its own virtual memory, which looks like a private version of the main memory of the computer. In this sense, paging does for memory what the process abstraction does for the CPU. Even though the computer hardware may have only one CPU (or perhaps a few CPUs), each “user” can have his own private virtual CPU (process). Similarly, paging gives each process its own virtual memory, which is separate from the memories of other processes and protected from them.
· Each virtual memory looks like a linear array of bytes, with addresses starting at zero. This feature simplifies relocation: Every program can be compiled under the assumption that it will start at address zero.
· It makes the memory look bigger, by keeping infrequently used portions of the virtual memory space of a process on disk rather than in main memory. This feature both promotes more efficient sharing of the scarce memory resource among processes and allows each process to treat its memory as essentially unbounded in size. Just as a process doesn’t have to worry about doing some operation that may block because it knows that the OS will run some other process while it is waiting, it doesn’t have to worry about allocating lots of space to a rarely (or sparsely) used data structure because the OS will only allocate real memory to the part that’s actually being used.
Segmentation carries this feature one step further by allowing each process to have multiple “simulated memories.” Each of these memories (called a segment) starts at address zero, is independently protected, and can be separately paged. In a segmented system, a memory address has two parts: a segment number and a segment offset. Most systems have some sort of segmentation, but often it is quite limited. Unix has exactly three segments per process. One segment (called the text segment) holds the executable code of the process. It is generally read-only, fixed in size when the process starts, and shared among all processes running the same program. Sometimes read-only data (such as constants) are also placed in this segment. Another segment (the data segment) holds the memory used for global variables. Its protection is read/write (but usually not executable), and it is normally not shared between processes. There is a special system call to extend the size of the data segment of a process. The third segment is the stack segment. As the name implies, it is used for the process’ stack, which is used to hold information used in procedure calls and returns (return address, saved contents of registers, etc.) as well as local variables of procedures. Like the data segment, the stack is read/write but usually not executable. The stack is automatically extended by the OS whenever the process causes a fault by referencing an address beyond the current size of the stack (usually in the course of a procedure call). It is not shared between processes. Some variants of Unix have a fourth segment, which contains part of the OS data structures. It is read-only and shared by all processes.
Many application programs would be easier to write if they could have as many segments as they liked. As an example of an application program that might want multiple segments, consider a compiler. In addition to the usual text, data, and stack segments, it could use one segment for the source of the program being compiled, one for the symbol table, etc. (see Fig. 9.18 on page 287). Breaking the address space up into segments also helps sharing (see Fig. 9.19 on page 288). For example, most programs in Unix include the library routine printf. If the executable code of printf were in a separate segment, that segment could easily be shared by multiple processes, allowing (slightly) more efficient sharing of physical memory.
If you think of the virtual address as being the concatenation of the segment number and the segment offset, segmentation looks superficially like paging. The main difference is that the application programmer is aware of the segment boundaries, but can ignore the fact that the address space is divided up into pages.
The implementation of segmentation is also superficially similar to the implementation of paging (see Fig. 9.17 on page 286). The segment number is used to index into a table of “segment descriptors,” each of which contains the length and starting address of a segment as well as protection information. If the segment offset is not less than the segment length, the MMU traps with a segmentation violation. Otherwise, the segment offset is added to the starting address in the descriptor to get the resulting physical address. There are several differences between the implementation of segments and pages, all derived from the fact that the size of a segment is variable, while the size of a page is “built-in.”
· The size of the segment is stored in the segment descriptor and compared with the segment offset. The size of a page need not be stored anywhere because it is always the same. It is always a power of two and the page offset has just enough bits to represent any legal offset, so it is impossible for the page offset to be out of bounds. For example, if the page size is 4k (4096) bytes, the page offset is a 12-bit field, which can only contain numbers in the range 0…4095.
· The segment descriptor contains the physical address of the start of the segment. Since all page frames are required to start at an address that is a multiple of the page size, which is a power of two, the low-order bits of the physical address of a frame are always zero. For example, if pages are 4k bytes, the physical address of each page frame ends with 12 zeros. Thus a page table entry contains a frame number, which is just the high-order bits of the physical address of the frame, and the MMU concatenates the frame number with the page offset, in contrast to segmentation, where the segment offset is added to the segment’s starting address.