Address Spaces and Processes
2002 Jean Bellec
Before stored-program computers were born, programs were implemented through a connection panel, a matrix of wires that defined functions like code conversions, field selections, etc., derived from the first tabulators of the first half of the 20th century. A new feature was created using the same technology, the "programming step", where a sequence of instructions could be executed at the rhythm of the mechanical clock. Those programs were not dynamically modifiable, and their address space was limited to a few registers. At the end of such a small procedure, operating on a large amount of punched cards (later emulated on the first magnetic media), the program had to be changed manually by introducing a new connection panel. That was the era of punched-card tabulators and of their associated calculators, such as the Gamma 3, but also of the first scientific computers (ASCC, ENIAC).

Then came the stored-program computer and the so-called von Neumann
architecture, where programs are located in writable storage. That was initially considered a stroke of genius, because programs could modify themselves, and the architecture raised a lot of speculation about the possibility of an artificial intelligence. But in fact, apart from the design of innovations like LISP, that feature led nowhere, because the bulk of the market was the processing of large amounts of data for scientific or business use. Stored-program computers had to wait for language compilers (in the late 1950s) to compete usefully against connection panels. The addressing of storage was non-uniform in many machines, because a large part of the address space lay within secondary storage (usually a drum).

At that time, there was one address space, the physical memory, and it belonged to a single execution of a program (i.e. it contained the program and the registers). It was the time of uniprogramming (from the IBM 704 to the 1401). In those machines, the application program was booted on a bare machine after being linked with some
machines, the application program was booted on a bare machine after being linked to some
system subroutines. Later, at the end of the 1950s, when a job terminated (almost)
normally, an embryonic operating system, sometimes called monitor, took the place of the
terminating job and loaded automatically the following job, decreasing the burden on the
operator. Among the functions linked to the user application, a piece of software,
called an overlay manager insure a multiplexing of the (physical) address space on several
pieces of code loaded in memory at different time. Those overlays had to be carefully
planned by the programmer to allow the sharing of a part of the address space between
successive overlays. Then there was a thing (later considered as a process), the
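To make the mechanism concrete, here is a minimal sketch of the overlay idea in modern C; the names (load_overlay, overlay_region, the pass1/pass2 images) are hypothetical, and a real overlay manager was generated by the linker and read code images from drum or disc rather than simulating them:

    #include <stdio.h>
    #include <string.h>

    #define OVERLAY_SIZE 4096

    /* The single region of the address space shared by successive overlays. */
    static unsigned char overlay_region[OVERLAY_SIZE];
    static const char *resident = "";      /* which overlay occupies it now */

    /* Bring an overlay into the shared region, evicting the previous one.
     * A real manager would read the code image from secondary storage. */
    static void load_overlay(const char *name)
    {
        if (strcmp(resident, name) != 0) {
            memset(overlay_region, 0, sizeof overlay_region);
            resident = name;
            printf("overlay manager: loaded %s\n", name);
        }
    }

    int main(void)
    {
        load_overlay("pass1");   /* e.g. read and sort the card deck  */
        load_overlay("pass2");   /* e.g. compute and punch the totals */
        load_overlay("pass1");   /* reloading pass1 evicts pass2      */
        return 0;
    }

The planning burden on the programmer is visible in the last call: whatever pass2 left in the shared region is gone once pass1 is reloaded.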
Then there was a thing (later considered a process), the supervisor, sometimes also called a monitor, that allocated a separate, distinct address space to each of several program executions. It was the time of multiprogramming (GECOS II, OS/MFT). Those address spaces were allocated for the duration of a job and then disappeared.

Then the minicomputer appeared. A human controlled the resources via
commands. Each command was allocated the totality of the physical memory. When the command was the execution of a user program, it was likely that the machine would have to be rebooted afterwards. Those systems reigned over part of the world, from the IBM 1620 to CP/M and MS-DOS.

Then time-sharing was born (the time of CTSS and CP/CMS). Each user was meant to feel as if he had a minicomputer to himself. Each user was given a virtual processor and ran a succession of commands (processes) operating in a virtual address space that looked like a subset of the physical one. When a command was not executing, the shell was in control of the virtual processor. The concept of the daemon was also introduced: special virtual users that did things like copying cards into files or printing files.
Then came MULTICS. Instead of defining a limited virtual space for each user, it gave him the whole world of the file system. More exactly, each user had a dynamic virtual space onto which segmentation mapped the useful part of the file system. In early MULTICS, each user had a single virtual processor, and he was mapped onto a process. Daemons were also considered processes.
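A rough modern analog of that segment-to-file mapping can be sketched in C with POSIX mmap; this is only an approximation for today's readers, since MULTICS relied on hardware segmentation and dynamic linking rather than on any such API:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hosts", O_RDONLY);     /* any existing file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);

        /* The file now appears as ordinary memory: a poor man's segment. */
        char *seg = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (seg == MAP_FAILED) { perror("mmap"); return 1; }

        fwrite(seg, 1, st.st_size, stdout);        /* read it as memory */

        munmap(seg, st.st_size);
        close(fd);
        return 0;
    }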
The SABRE system, born at the same time as CTSS, had to handle many users, far more than a time-sharing system does. But all the programs were written by American Airlines or IBM, and presumably debugged. The tasks of each terminal shared a large common address space (procedures and windows into the database); they had only a limited additional private address space for working storage (the stack and the context of the transaction). However, it was somewhat impractical to allocate a virtual processor to each reservation clerk, so those tasks were mapped onto a handful of processes.

The idea of letting an application program exploit parallelism with several
(virtual or real) processors probably came from the scientific world (tbc). There was an argument that the complexity of such an operation came from the lack of appropriate clauses in programming languages. So was born PL/1, which offered task forking and synchronization. The address spaces of subtasks were defined within the primary task's address space by means of the block structure of the source program. PL/1 thus introduced a concept that would apparently allow a transaction system like SABRE to be programmed easily. However, there was no way of protecting one task against another, except by checking program correctness.

UNIX came in the early 1970s; originally it was an operating system for a
minicomputer with a Multics-like file system. When UNIX became multi-user, it allocated to each user a single address space (initially only real, then virtual), distinct from the file system. The word process became equivalent to a user and to that address space. Daemons were also processes, created by God (the root) or by men (users). That was the paradigm of all UNIX systems until 1990.
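That identification of a process with a command and its private address space survives in the fork/exec pair; a minimal sketch in C (the ls command run by the child is an arbitrary choice):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();              /* duplicate the address space */
        if (pid == 0) {
            /* Child: replace the copied image with a new program. */
            execlp("ls", "ls", "-l", (char *)NULL);
            _exit(127);                  /* reached only if exec failed */
        } else if (pid > 0) {
            int status;
            waitpid(pid, &status, 0);    /* the shell regains control   */
        } else {
            perror("fork");
            return EXIT_FAILURE;
        }
        return 0;
    }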
The kernel of most UNIX systems had been monolithic and unstructured before a few micro-kernels, like CMU's MACH and Chorus, were designed in the late 1980s. Micro-kernels implemented what they called threads to structure asynchronous actions in the kernel, not just to improve reliability through better structuring, but also to be able to distribute the kernel across separate processors (including NUMA systems). The micro-kernel architecture was challenged in the name of efficiency, and OSF did not pursue the MACH approach.
The availability of UNIX on multiprocessor systems (including supercomputers) raised again the demand for multi-threading several tasks on behalf of a single user. There were, for some time, SMP-safe UNIX systems where all system commands and services were controlled by one big system lock, but real multi-threading progressively became available on UNIX (from the 1990s).
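Both points can be shown in a minimal C sketch with POSIX threads, which postdate the systems described here: all the threads of one program share a single address space, and one coarse lock, in the spirit of the early big system locks, serializes their updates:

    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;                 /* shared: one address space */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);       /* serialize the update */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        /* Prints 400000: every thread updated the same variable. */
        printf("counter = %ld\n", counter);
        return 0;
    }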
UNIX on the Intel x86 architecture (not only Linux) used almost exclusively the linear address space of the 386 and did not use the segmented space, a 286 legacy. Portability was the prime motivation behind that decision, but 286 segmentation had also acquired a negative image from the design flaws of the 286. In any case, a revisiting of the UNIX design would have been necessary to use segments.

Sixteen-bit Windows did multi-threading on behalf of a single user, all threads
sharing the same address space and bumping into each other. Win32 allocated a separate address space to each asynchronous command launched from the shell. Windows NT (and its successors) supported, at least conceptually, several users (i.e. several shells). It also allowed multi-threading but, like UNIX, it shares the same address space among all the threads of the same program.

Open systems entered the transaction processing market via the client/server
approach and avoided, at least temporarily, a difficult software architecture dilemma.
Transactions are performed in a server (a potentially multi-threaded daemon) that accesses a database and maintains journals of updates. The server's address space has a relatively modest size; obviously, the database is not directly included in that address space. The state of the transaction is stored in the user's computer, which acts not only as a terminal but also keeps the transaction context. The solution works well as long as the loss of that storage in the user's computer does not impact the business of the owner of the database. It may require complex recovery procedures (at the programmer's level) when the transaction is mission-critical. Many client/server operations did no better than the first TP systems of the 1960s.

GCOS8 is a derivative of a traditional architecture born in the 1960s. In
the mid-1970s it was extended with a capability mechanism implemented through a
segmentation mechanism. Unhappily, the new system architecture was not used coherently. Much of the batch system was unchanged and did not use it. Transaction processing eventually used a subset of the architecture to provide the reentrancy of TPRs. The linkage conventions were not uniformly implemented and remained language-dependent. Multi-threading was progressively extended, but the support of several processors by a single TP subsystem came only a little earlier than IBM's.

GCOS64 (the future GCOS7) was born in the 1970s (around the same time as
UNIX). Its marketing objectives were oriented towards traditional mainframe applications and excluded technical time-sharing applications. However, high reliability and the efficient support of complex applications, such as the simultaneous operation of native mode and emulators, led us to propose a structured system architecture using several features borrowed from Multics. Level 64 took from Multics a hardware segmentation mechanism, somewhat constrained by the 32-bit word, and a ring-protection mechanism. In addition, it froze in firmware a micro-kernel mechanism that architected the concept of process (including process synchronization) and a segmented address space, divided into a public part, a dynamically linked part, a process-group-shared part and a private part. The new concept was the process group, which covered miscellaneous entities like the system itself, the non-permanent daemons, the transaction subsystems, and the emulators. It eventually made it possible to run a UNIX port as an emulator. Conventional job steps in execution were loaded as process groups, although few of them were multi-threaded into several processes. But a transaction server was usually multi-threaded as some two dozen processes, onto which the server mapped thousands of transactions.

The limited address space due to the 32-bit word was long
bypassed by software implementations that increasingly adopted a server approach, which reduced the need for address space. In the late 1980s, Bull and NEC designed an extension of the architecture, named XSA, that removed most of those limits. Unhappily, Bull had already decided to wind down GCOS7 in favor of non-proprietary architectures, and NEC alone made only limited use of XSA.

It
should also be recognized that networked computers transfer the software emphasis to distinct address spaces addressing a global file system (referenced by URLs). Each computer's space becomes only a cache for software modules, documents and windows into databases.

Mainframe computers are evolving towards clusters of servers run under a variety of
operating systems, frequently invisible to end users and even to programmers. The last avatar of GCOS7 will be to operate, emulated on Intel IA-64 hardware, within a cluster of commodity machines. The reliability its native architecture offers as a transaction server will be kept, while many functions (essentially those related to interactive processing) move to the several varieties of open architectures.

The discussion of many features of that software architecture is still
interesting as a textbook case. New operating systems sometimes reinvent the wheel, and a better knowledge of their ancestors may help young designers. It is not certain that such discussions have an economic impact. If the GCOS64 architecture was kept secret by Honeywell management in 1970-1974, it was not because management estimated its value to be great; it was because Honeywell believed that the software architecture features were only local mechanisms to run applications. They eventually proved right, but they ignored the fact that a not-so-different architecture (Intel's 386), invented ten years later, succeeded in dominating the world and made a small semiconductor company a quasi-monopoly for more than two decades.