Chapter 3A Flashcards

1
Q

A process

A

is a program in execution, and the status of the current activity of a process is represented by
the value of the program counter and the contents of the processor's registers.

2
Q

A process is independent

A

if it does not share data with any other processes executing in the system

3
Q

A process is cooperating

A

if it can affect or be affected by the other processes executing in the system.
Clearly, any process that shares data with other processes is a cooperating process

4
Q

As a process executes, it changes state

A

The state of a process is defined in part by the current activity of
that process

5
Q

Process states:

A
  • New. The process is being created.
  • Running. Instructions are being executed.
  • Waiting. The process is waiting for some event to occur (such as an I/O completion or reception of a
    signal).
  • Ready. The process is waiting to be assigned to a processor.
  • Terminated. The process has finished execution.
6
Q

Each process is represented in the operating system by a process control block (PCB)—also called a task
control block

A

A PCB is shown in Figure 3.3. It contains many pieces of information associated with a specific process, including these:
  • Process state. The state may be new, ready, running, waiting, halted, and so on.
  • Program counter. The counter indicates the address of the next instruction to be executed for this process.

7
Q
  • CPU registers.
A

The registers vary in number and type, depending on the computer architecture. They
include accumulators, index registers, stack pointers, and general-purpose registers, plus any
condition-code information. Along with the program counter, this state information must be saved when
an interrupt occurs, to allow the process to be continued correctly afterward when it is rescheduled to run.

8
Q
  • CPU-scheduling information
A

This information includes a process priority, pointers to scheduling
queues, and any other scheduling parameters. (Chapter 5 describes process scheduling.)

9
Q

Memory-management information

A

This information may include such items as the value of the base and
limit registers and the page tables, or the segment tables, depending on the memory system used by the
operating system (Chapter 9).

10
Q
  • Accounting information.
A

This information includes the amount of CPU
and real time used, time limits, account numbers, job or process numbers, and so on.

11
Q
  • I/O status information.
A

This information includes the list of I/O devices allocated to the process, a list of open files,
and so on.
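
Taken together, the fields listed in the cards above can be pictured as a C structure. The sketch below is purely illustrative; the field names, types, and sizes are invented for the example and do not come from any particular kernel:

    #define PCB_NREGS 16        /* illustrative register count */
    #define PCB_MAX_FILES 32    /* illustrative open-file limit */

    struct pcb {
        enum { NEW, READY, RUNNING, WAITING, TERMINATED } state; /* process state */
        unsigned long program_counter;        /* address of the next instruction */
        unsigned long registers[PCB_NREGS];   /* saved CPU registers */
        int priority;                         /* CPU-scheduling information */
        struct pcb *next;                     /* link used by the scheduling queues */
        unsigned long base, limit;            /* memory-management information */
        unsigned long cpu_time_used;          /* accounting information */
        int open_files[PCB_MAX_FILES];        /* I/O status information */
    };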

12
Q

Threads

A

The process model discussed so far has implied that a process is a program that performs a single
thread of execution. For example, when a process is running a word-processor program, a single thread of
instructions is being executed. This single thread of control allows the process to perform only one task at
a time. Thus, the user cannot simultaneously type in characters and run the spell checker. Most modern
operating systems have extended the process concept to allow a process to have multiple threads of
execution and thus to perform more than one task at a time. This feature is especially beneficial on
multicore systems, where multiple threads can run in parallel. A multithreaded word processor could, for
example, assign one thread to manage user input while another thread runs the spell checker. On systems
that support threads, the PCB is expanded to include information for each thread. Other changes
throughout the system are also needed to support threads.

13
Q

The number of processes currently in memory is known as the degree of multiprogramming.

A

Balancing
the objectives of multiprogramming and time sharing also requires taking the general behavior of a
process into account. In general, most processes can be described as either I/O bound or CPU bound. An
I/O-bound process is one that spends more of its time doing I/O than it spends doing computations. A
CPU-bound process, in contrast, generates I/O requests infrequently, using more of its time doing
computations.

14
Q

As processes enter the system, they are put into a ready queue, where they are ready and waiting to
execute on a CPU’s core

A

This queue is generally stored as a linked list; a ready-queue header contains
pointers to the first PCB in the list, and each PCB includes a pointer field that points to the next PCB in
the ready queue. The system also includes other queues. When a process is allocated a CPU core, it
executes for a while and eventually terminates, is interrupted, or waits for the occurrence of a particular
event, such as the completion of an I/O request. Suppose the process makes an I/O request to a device
such as a disk. Since devices run significantly slower than processors, the process will have to wait for the
I/O to become available. Processes that are waiting for a certain event to occur — such as completion of
I/O — are placed in a wait queue (Figure 3.4). A common representation of process scheduling is a
queueing diagram, such as that in Figure 3.5. Two types of queues are present: the ready queue and a set
of wait queues. The circles represent the resources that serve the queues, and the arrows indicate the flow
of processes in the system.

15
Q

Once the process is allocated a CPU core and is executing, one of several events could occur:

A
  • The process could issue an I/O request and then be placed in an I/O wait queue.
  • The process could create a new child process and then be placed in a wait queue while it awaits the child’s termination.
  • The process could be removed forcibly from the core, as a result of an interrupt or having its time slice expire, and be put back in the ready queue.
In the first two cases, the process eventually switches from the waiting state to the ready state and is then put back in the ready queue. A process continues this cycle until it terminates, at which time it is removed from all queues and has its PCB and resources deallocated.
16
Q

Tree of processes

A

During the course of execution, a process may create several new processes. As
mentioned earlier, the creating process is called a parent process, and the new processes are called the
children of that process. Each of these new processes may in turn create other processes, forming a tree of
processes
PID -> Most operating systems (including UNIX, Linux, and Windows) identify processes according to a
unique process identifier (or pid), which is typically an integer number. The pid provides a unique value
for each process in the system, and it can be used as an index to access various attributes of a process
within the kernel.

17
Q

In general, when a process creates a child process,

A

that child process will need certain resources (CPU
time, memory, files, I/O devices) to accomplish its task. A child process may be able to obtain its
resources directly from the operating system, or it may be constrained to a subset of the resources of the
parent process. The parent may have to partition its resources among its children, or it may be able to
share some resources (such as memory or files) among several of its children. Restricting a child process
to a subset of the parent’s resources prevents any process from overloading the system by creating too
many child processes.

18
Q

When a process creates a new process, two possibilities for execution exist:

A
  1. The parent continues to execute concurrently with its children.
  2. The parent waits until some or all of its children have terminated.
19
Q

There are also two address-space possibilities for the new process:

A
  1. The child process is a duplicate of the parent process (it has the same program and data as the parent).
  2. The child process has a new program loaded into it.
20
Q

FORK() SYSTEM CALL

A

A new process is created by the fork() system call. The new process consists of
a copy of the address space of the original process. This mechanism allows the parent process to
communicate easily with its child process. Both processes (the parent and the child) continue execution at
the instruction after the fork(), with one difference: the return code for the fork() is zero for the new
(child) process, whereas the (nonzero) process identifier of the child is returned to the parent.
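
A minimal C sketch of this behavior, using only the standard POSIX calls named above (the printed messages are just for illustration):

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* create a child process */
        if (pid < 0) {                   /* fork failed */
            perror("fork");
            return 1;
        } else if (pid == 0) {           /* child: fork() returned 0 */
            printf("child, pid=%d\n", getpid());
        } else {                         /* parent: fork() returned the child's pid */
            printf("parent of child %d\n", pid);
        }
        return 0;
    }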

21
Q

EXEC() SYSTEM CALL

A

The exec() system call is typically used after a fork() to replace the process’s memory space with a new program.
The exec() system call loads a binary file into memory (destroying the memory image of the program
containing the exec() system call) and starts its execution. In this manner, the two processes are able to
communicate and then go their separate ways. The parent can then create more children; or, if it has
nothing else to do while the child runs, it can issue a wait() system call to move itself off the ready queue
until the termination of the child. Because the call to exec() overlays the process’s address space with a
new program, exec() does not return control unless an error occurs.
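
A common UNIX pattern combines fork(), exec(), and wait(). The sketch below uses execlp() to run /bin/ls, an arbitrary program chosen for the example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {
            /* child: overlay the address space with a new program */
            execlp("/bin/ls", "ls", NULL);
            perror("execlp");            /* exec returns only on error */
            exit(1);
        } else if (pid > 0) {
            wait(NULL);                  /* parent leaves the ready queue until the child exits */
            printf("child complete\n");
        }
        return 0;
    }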

22
Q

The parent waits for the child process

A

to complete with the wait() system call. When the child process
completes (by either implicitly or explicitly invoking exit()), the parent process resumes from the call to
wait(), where it completes using the exit() system call.

23
Q

Windows system calls

A

As an alternative example, we next consider process creation in Windows.
Processes are created in the Windows API using the CreateProcess() function, which is similar to fork() in
that a parent creates a new child process. However, whereas fork() has the child process inheriting the
address space of its parent, CreateProcess() requires loading a specified program into the address space of
the child process at process creation. Furthermore, whereas fork() is passed no parameters,
CreateProcess() expects no fewer than ten parameters. The C program shown in Figure 3.10 illustrates the
CreateProcess() function, which creates a child process that loads the application mspaint.exe. We opt for
many of the default values of the ten parameters passed to CreateProcess(). Readers interested in pursuing
the details of process creation and management in the Windows API are encouraged to consult the
bibliographical notes at the end of this chapter. Two of the parameters passed to the CreateProcess() function
are instances of the STARTUPINFO and PROCESS_INFORMATION structures. STARTUPINFO
specifies many properties of the new process, such as window size and appearance and handles to
standard input and output files. The PROCESS_INFORMATION structure contains a handle and the
identifiers to the newly created process and its thread. We invoke the ZeroMemory() function to zero out the
memory for each of these structures before proceeding with CreateProcess(). The first two parameters
passed to CreateProcess() are the application name and command-line parameters. If the application name
is NULL (as it is in this case), the command-line parameter specifies the application to load.
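
A condensed sketch in the spirit of the Figure 3.10 program described above; the path to mspaint.exe and the use of ANSI (char) strings are assumptions for the example:

    #include <stdio.h>
    #include <windows.h>

    int main(void) {
        STARTUPINFO si;
        PROCESS_INFORMATION pi;
        /* command line for the child; mspaint.exe is just an example program */
        char cmd[] = "C:\\WINDOWS\\system32\\mspaint.exe";

        /* zero out the structures before use */
        ZeroMemory(&si, sizeof(si));
        si.cb = sizeof(si);
        ZeroMemory(&pi, sizeof(pi));

        if (!CreateProcess(NULL,   /* application name taken from the command line */
                           cmd,    /* command line */
                           NULL,   /* process security attributes */
                           NULL,   /* thread security attributes */
                           FALSE,  /* do not inherit handles */
                           0,      /* no creation flags */
                           NULL,   /* use parent's environment block */
                           NULL,   /* use parent's current directory */
                           &si, &pi)) {
            fprintf(stderr, "CreateProcess failed\n");
            return 1;
        }

        /* the parent waits for the child to complete */
        WaitForSingleObject(pi.hProcess, INFINITE);
        printf("child complete\n");

        CloseHandle(pi.hProcess);
        CloseHandle(pi.hThread);
        return 0;
    }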

24
Q

Process Termination

A

A process terminates when it finishes executing its final statement and asks the
operating system to delete it by using the exit() system call. At that point, the process may return a status
value (typically an integer) to its waiting parent process (via the wait() system call). All the resources of
the process —including physical and virtual memory, open files, and I/O buffers—are deallocated and
reclaimed by the operating system. Termination can occur in other circumstances as well. A process can
cause the termination of another process via an appropriate system call (for example, TerminateProcess()
in Windows). Usually, such a system call can be invoked only by the parent of the process that is to be
terminated. Otherwise, a user— or a misbehaving application—could arbitrarily kill another user’s
processes. Note that a parent needs to know the identities of its children if it is to terminate them. Thus,
when one process creates a new process, the identity of the newly created process is passed to the parent.
A parent may terminate the execution of one of its children for a variety of reasons, such as these:
  • The child has exceeded its usage of some of the resources that it has been allocated. (To determine whether this has occurred, the parent must have a mechanism to inspect the state of its children.)
  • The task assigned to the child is no longer required.
  • The parent is exiting, and the operating system does not allow a child to continue if its parent terminates.

25
Q

Cascading termination

A

is the termination of all of a process's children (and, recursively, their children) when that process itself
terminates; it is normally initiated by the operating system. To illustrate process execution and
termination, consider that, in Linux and UNIX systems, we can terminate a process by using the exit()
system call, providing an exit status as a parameter: exit(1); /* exit with status 1 */

26
Q

under normal termination -> exit()

A

will be called either directly (as shown above) or indirectly, as the C
run-time library (which is added to UNIX executable files) will include a call to exit() by default

27
Q

wait() system call

A

A parent process may wait for the termination of a child process by using the wait()
system call. The wait() system call is passed a parameter that allows the parent to obtain the exit status of
the child. This system call also returns the process identifier of the terminated child so that the parent can
tell which of its children has terminated: pid_t pid; int status; pid = wait(&status); When a process
terminates, its resources are deallocated by the operating system. However, its entry in the process table
must remain there until the parent calls wait(), because the process table contains the process’s exit status.
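
A short sketch of a parent collecting a child's exit status; WIFEXITED() and WEXITSTATUS() are the standard POSIX macros for examining the status value filled in by wait():

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid == 0)
            exit(7);                        /* child terminates with status 7 */

        int status;
        pid_t child = wait(&status);        /* blocks until a child terminates */
        if (WIFEXITED(status))
            printf("child %d exited with status %d\n", child, WEXITSTATUS(status));
        return 0;
    }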

28
Q

Orphans -

A

Now consider what would happen if a parent did not invoke wait() and instead terminated,
thereby leaving its child processes as orphans. Traditional UNIX systems addressed this scenario by
assigning the init process as the new parent to orphan processes. (Recall from Section 3.3.1 that init
serves as the root of the process hierarchy in UNIX systems.) The init process periodically invokes wait(),
thereby allowing the exit status of any orphaned process to be collected and releasing the orphan’s process
identifier and process-table entry. Although most Linux systems have replaced init with systemd, the
latter process can still serve the same role, although Linux also allows processes other than systemd to
inherit orphan processes and manage their termination.

29
Q

Android Process Hierarchy

A

Android ranks processes in an importance hierarchy according to how visible they are to the user, from most important to least important: foreground, visible, service, background, and empty processes. When the system must reclaim resources, it terminates the least important processes first.
30
Q
  • Foreground process—
A

The current process visible on the screen, representing the application the user is
currently interacting with.

31
Q
  • Visible process—
A

A process that is not directly visible on the foreground but
that is performing an activity that the foreground process is referring to (that is, a process performing an
activity whose status is displayed on the foreground process)

32
Q
  • Service process
A

A process that is similar to a background process but is performing an activity that is
apparent to the user (such as streaming music)

33
Q
  • Background process—
A

A process that may be performing
an activity but is not apparent to the user

34
Q
  • Empty process
A

A process that holds no active components
associated with any application

35
Q

Context switch

A

As mentioned in Section 1.2.1, interrupts cause the operating system to change a CPU core from its
current task and to run a kernel routine. Such operations happen frequently on general-purpose systems.
When an interrupt occurs, the system needs to save the current context of the process running on the CPU
core so that it can restore that context when its processing is done, essentially suspending the process and
then resuming it. The context is represented in the PCB of the process. It includes the value of the CPU
registers, the process state (see Figure 3.2), and memory-management information. Generically, we
perform a state save of the current state of the CPU core, be it in kernel or user mode, and then a state
restore to resume operations. Switching the CPU core to another process requires performing a state save
of the current process and a state restore of a different process. This task is known as a context switch and
is illustrated in Figure 3.6. When a context switch occurs, the kernel saves the context of the old process
in its PCB and loads the saved context of the new process scheduled to run. Context-switch time is pure
overhead, because the system does no useful work while switching. Switching speed varies from machine
to machine, depending on the
memory speed, the number of registers that must be copied, and the existence of special instructions (such
as a single instruction to load or store all registers). A typical speed is several microseconds.
There are several reasons for providing an environment that allows process cooperation:

36
Q
  • Information sharing
A

Since several applications may be interested in the same piece of information (for
instance, copying and pasting), we must provide an environment to allow concurrent access to such
information.

37
Q
  • Computation speedup
A

If we want a particular task to run faster, we must break it into subtasks, each of
which will be executing in parallel with the others. Notice that such a speedup can be achieved only if the
computer has multiple processing cores.

38
Q
  • Modularity.
A

We may want to construct the system in a modular fashion, dividing the system functions
into separate processes or threads, as we discussed in Chapter 2.

39
Q

Cooperating processes require an interprocess communication (IPC)

A

mechanism that will allow them to
exchange data— that is, send data to and receive data from each other

40
Q

Shared memory -

A
  • In the shared-memory model, a region of memory that is shared by the cooperating
    processes is established. Processes can then exchange information by reading and writing data to the
    shared region. Shared memory can be faster than message passing, since message-passing systems are
    typically implemented using system calls and thus require the more time-consuming task of kernel
    intervention. In shared-memory systems, system calls are required only to establish shared memory
    regions. Once shared memory is established, all accesses are treated as routine memory accesses, and no
    assistance from the kernel is required.
41
Q

Message-passing model

A

communication takes place by means of messages exchanged between the
cooperating processes. Message passing is useful for exchanging smaller amounts of data, because no
conflicts need be avoided. Message passing is also easier to implement in a distributed system than shared
memory.

42
Q

Interprocess communication using shared memory requires communicating processes to establish a region
of shared memory

A

Typically, a shared-memory region resides in the address space of the process creating
the shared-memory segment. Other processes that wish to communicate using this shared-memory
segment must attach it to their address space. Recall that, normally, the operating system tries to prevent
one process from accessing another process’s memory. Shared memory requires that two or more
processes agree to remove this restriction. They can then exchange information by reading and writing
data in the shared areas. The form of the data and the location are determined by these processes and are
not under the operating system’s control. The processes are also responsible for ensuring that they are not
writing to the same location simultaneously

43
Q

the producer–consumer problem

A
  • which is a common paradigm for cooperating processes. A producer
    process produces information that is consumed by a consumer process. For example, a compiler may
    produce assembly code that is consumed by an assembler. The assembler, in turn, may produce object
    modules that are consumed by the loader. The producer–consumer problem also provides a useful
    metaphor for the client–server paradigm. We generally think of a server as a producer and a client as a
    consumer. For example, a web server produces (that is, provides) web content such as HTML files and
    images, which are consumed (that is, read) by the client web browser requesting the resource. One
    solution to the producer–consumer problem uses shared memory. To allow producer and consumer
    processes to run concurrently, we must have available a buffer of items that can be filled by the producer
    and emptied by the consumer. This buffer will reside in a region of memory that is shared by the producer
    and consumer processes. A producer can produce one item while the consumer is consuming another
    item. The producer and consumer must be synchronized, so that the consumer does not try to consume an
    item that has not yet been produced
44
Q

Two types of buffers for the producer–consumer problem

A

The unbounded buffer and the bounded buffer.
45
Q
  • The unbounded buffer
A

places no practical limit on the size
of the buffer. The consumer may have to wait for new items, but the producer can always produce new
items. The bounded buffer assumes a fixed buffer size. In this case, the consumer must wait if the buffer
is empty, and the producer must wait if the buffer is full.

46
Q

The shared buffer

A

is implemented as a circular array with two logical pointers: in and out. The variable in
points to the next free position in the buffer; out points to the first full position in the buffer. The buffer is
empty when in == out; the buffer is full when ((in + 1) % BUFFER_SIZE) == out. The code for the
producer process is shown in Figure 3.12, and the code for the consumer process is shown in Figure 3.13.
The producer process has a local variable next_produced in which the new item to be produced is stored.
The consumer process has a local variable next_consumed in which the item to be consumed is stored.
This scheme allows at most BUFFER_SIZE − 1 items in the buffer at the same time. We leave it as an
exercise for you to provide a solution in which BUFFER_SIZE items can be in the buffer at the same time.
In Section 3.7.1, we illustrate the POSIX API for shared memory
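
A sketch of the producer and consumer routines described above, in the style of Figures 3.12 and 3.13. The item type and buffer size are placeholders, and in a real program the two routines would run in separate cooperating processes (or threads) sharing this buffer region:

    #define BUFFER_SIZE 10

    typedef int item;                 /* assumed item type for the sketch */

    item buffer[BUFFER_SIZE];
    int in = 0;                       /* next free position */
    int out = 0;                      /* first full position */

    void producer(void) {
        item next_produced = 0;       /* the item to add (produced here in a real program) */
        while (1) {
            while (((in + 1) % BUFFER_SIZE) == out)
                ;                     /* buffer full: do nothing */
            buffer[in] = next_produced;
            in = (in + 1) % BUFFER_SIZE;
        }
    }

    void consumer(void) {
        item next_consumed;
        while (1) {
            while (in == out)
                ;                     /* buffer empty: do nothing */
            next_consumed = buffer[out];
            out = (out + 1) % BUFFER_SIZE;
            /* consume the item in next_consumed */
        }
    }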

47
Q

Message passing provides a mechanism

A

to allow processes to communicate and to synchronize their
actions without sharing the same address space. It is particularly useful in a distributed environment,
where the communicating processes may reside on different computers connected by a network. For
example, an Internet chat program could be designed so that chat participants communicate with one
another by exchanging messages. A message-passing facility provides at least two operations:
send(message) and receive(message). Messages sent by a process can be either fixed or variable in size. If
only fixed-sized messages can be sent, the system-level implementation is straightforward. This
restriction, however, makes the task of programming more difficult. Conversely, variable-sized messages
require a more complex system-level implementation, but the programming task becomes simpler.

48
Q

Direct communication-

A
  • Under direct communication, each process that wants to communicate must
    explicitly name the recipient or sender of the communication. In this scheme, the send() and receive()
    primitives are defined as:
    • send(P, message)—Send a message to process P.
    • receive(Q, message)—Receive a message from process Q.
49
Q

A communication link in this scheme has the following properties

A

  • A link is established automatically between every pair of processes that want to communicate. The processes need to know only each other’s identity to communicate.
  • A link is associated with exactly two processes.
  • Between each pair of processes, there exists exactly one link.

50
Q

A process can communicate with another process via a number of different mailboxes, but two processes
can communicate only if they have a shared mailbox. The send() and receive() primitives are defined as
follows: *

A

  • send(A, message)—Send a message to mailbox A.
  • receive(A, message)—Receive a message from mailbox A.
In this scheme, a communication link has the following properties:
  • A link is established between a pair of processes only if both members of the pair have a shared mailbox.
  • A link may be associated with more than two processes.
  • Between each pair of communicating processes, a number of different links may exist, with each link corresponding to one mailbox.
Now suppose that processes P1, P2, and P3 all share mailbox A. Process P1 sends a message to A, while both P2 and P3 execute a receive() from A. Which process will receive the message sent by P1? The answer depends on which of the following methods we choose:
  • Allow a link to be associated with two processes at most.
  • Allow at most one process at a time to execute a receive() operation.
  • Allow the system to select arbitrarily which process will receive the message (that is, either P2 or P3, but not both, will receive the message). The system may define an algorithm for selecting which process will receive the message (for example, round robin, where processes take turns receiving messages). The system may identify the receiver to the sender.

51
Q

Communication between processes takes place through calls to send() and receive() primitives. There are
different design options for implementing each primitive. Message passing may be either blocking or
nonblocking— also known as synchronous and asynchronous.

A

(Throughout this text, you will encounter the concepts of synchronous and asynchronous behavior in relation to various operating-system algorithms.)
  • Blocking send. The sending process is blocked until the message is received by the receiving process or by the mailbox.
  • Nonblocking send. The sending process sends the message and resumes operation.
  • Blocking receive. The receiver blocks until a message is available.
  • Nonblocking receive. The receiver retrieves either a valid message or a null.

52
Q

When both send() and receive() are blocking, we have a rendezvous between the sender and the receiver.

A

The solution to the producer–consumer problem becomes trivial when we use blocking send() and
receive() statements. The producer merely invokes the blocking send() call and waits until the message is
delivered to either the receiver or the mailbox. Likewise, when the consumer invokes receive(), it blocks
until a message is available

53
Q

messages exchanged by communicating processes reside in a temporary queue

A

Basically, such queues can be implemented in three ways:
  • Zero capacity. The queue has a maximum length of zero; thus, the link cannot have any messages waiting in it. In this case, the sender must block until the recipient receives the message.
  • Bounded capacity. The queue has finite length n; thus, at most n messages can reside in it. If the queue is not full when a new message is sent, the message is placed in the queue (either the message is copied or a pointer to the message is kept), and the sender can continue execution without waiting. The link’s capacity is finite, however. If the link is full, the sender must block until space is available in the queue.
  • Unbounded capacity. The queue’s length is potentially infinite; thus, any number of messages can wait in it. The sender never blocks.
The zero-capacity case is sometimes referred to as a message system with no buffering. The other cases are referred to as systems with automatic buffering.

54
Q

POSIX shared memory

A

is organized using memory-mapped files, which associate the
region of shared memory with a file. A process must first create a shared-memory object using the
shm_open() system call, as follows: fd = shm_open(name, O_CREAT | O_RDWR, 0666); The first parameter
specifies the name of the shared-memory object. Processes that wish to access this shared memory must
refer to the object by this name. The subsequent parameters specify that the shared-memory object is to be
created if it does not yet exist (O_CREAT) and that the object is open for reading and writing (O_RDWR).
The last parameter establishes the file-access permissions of the shared-memory object. A successful call
to shm_open() returns an integer file descriptor for the shared-memory object. Once the object is
established, the ftruncate() function is used to configure the size of the object in bytes. The call
ftruncate(fd, 4096); sets the size of the object to 4,096 bytes. Finally, the mmap() function establishes a
memory-mapped file containing the shared-memory object. It also returns a pointer to the
memory-mapped file that is used for accessing the shared-memory object
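
A minimal producer-side sketch of the calls just described; the object name and the message written are arbitrary choices for the example, and error checking is omitted:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        const char *name = "/myshm";     /* arbitrary object name */
        const int SIZE = 4096;

        /* create the shared-memory object */
        int fd = shm_open(name, O_CREAT | O_RDWR, 0666);
        ftruncate(fd, SIZE);             /* set its size in bytes */

        /* map the object into the address space */
        char *ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

        /* write to shared memory with ordinary memory operations */
        sprintf(ptr, "Hello from the producer");
        return 0;
    }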

55
Q

Mach Message Passing

A

Mach was especially designed for distributed systems, but was shown to be
suitable for desktop and mobile systems as well, as evidenced by its inclusion in the macOS and iOS
operating systems, as discussed in Chapter 2. The Mach kernel supports the creation and destruction of
multiple tasks, which are similar to processes but have multiple threads of control and fewer associated
resources. Most communication in Mach—including all intertask communication—is carried out by
messages. Messages are sent to, and received from, mailboxes, which are called ports in Mach. Ports are
finite in size and unidirectional; for two-way communication, a message is sent to one port, and a
response is sent to a separate reply port. Each port may have multiple senders, but only one receiver.
Mach uses ports to represent resources such as tasks, threads, memory, and processors, while message
passing provides an object-oriented approach for interacting with these system resources and services.
Message passing may occur between any two ports on the same host or on separate hosts on a distributed
system. Associated with each port is a collection of port rights that identify the capabilities necessary for a
task to interact with the port. For example, for a task to receive a message from a port, it must have the
capability MACH_PORT_RIGHT_RECEIVE for that port. The task that creates a port is that port’s owner,
and the owner is the only task that is allowed to receive messages from that port. A port’s owner may also
manipulate the capabilities for a port.

56
Q

When a task is created, two special ports— the Task Self port and the Notify port—are also created

A

The
kernel has receive rights to the Task Self port, which allows a task to send messages to the kernel. The
kernel can send notification of event occurrences to a task’s Notify port (to which, of course, the task has
receive rights). The mach_port_allocate() function call creates a new port and allocates space for its queue
of messages. It also identifies the rights for the port. Each port right represents a name for that port, and a
port can only be accessed via a right. Port names are simple integer values and behave much like UNIX
file descriptors.

57
Q

All messages are delivered reliably and have the same priority. Mach guarantees that multiple messages
from the same sender are queued in first-in, first-out (FIFO)

A

order but does not guarantee an absolute
ordering. For instance, messages from two senders may be queued in any order. Mach messages contain the following two fields:
  • A fixed-size message header containing metadata about the message, including the size of the message as well as source and destination ports. Commonly, the sending thread expects a reply, so the port name of the source is passed on to the receiving task, which can use it as a “return address” in sending a reply.
  • A variable-sized body containing data.

58
Q

Messages may be either simple or complex. A simple message contains ordinary, unstructured user data
that are not interpreted by the kernel.

A

A complex message may contain pointers to memory locations
containing data (known as “out-of-line” data) or may also be used for transferring port rights to another
task. Out-of-line data pointers are especially useful when a message must pass large chunks of data. A
simple message would require copying and packaging the data in the message; out-of-line data
transmission requires only a pointer that refers to the memory location where the data are stored. The
function mach_msg() is the standard API for both sending and receiving messages. The value of one of the
function’s parameters—either MACH_SEND_MSG or MACH_RCV_MSG—indicates if it is a send or
receive operation. We now illustrate how it is used when a client task sends a simple message to a server
task. Assume there are two ports—client and server—associated with the client and server tasks,
respectively. The code in Figure 3.18 shows the client task constructing a header and sending a message
to the server, as well as the server task receiving the message sent from the client. The mach_msg()
function call is invoked by user programs for performing message passing. mach_msg() then invokes the
function mach_msg_trap(), which is a system call to the Mach kernel. Within the kernel, mach_msg_trap() next calls the function mach_msg_overwrite_trap(), which then handles the actual passing of the message.
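
A rough sketch of a client sending a simple message, in the spirit of the Figure 3.18 description. The message layout and the client and server port variables are assumptions for the example, and details such as obtaining the ports are omitted:

    #include <mach/mach.h>
    #include <string.h>

    /* A simple message: fixed-size header followed by inline data. */
    struct message {
        mach_msg_header_t header;
        int data;
    };

    /* Assumed to have been created earlier, e.g. with mach_port_allocate(). */
    extern mach_port_t client;   /* the client's (reply) port */
    extern mach_port_t server;   /* the server's port */

    void send_to_server(int value) {
        struct message msg;
        memset(&msg, 0, sizeof(msg));

        /* Header: how the destination and reply ports are to be treated. */
        msg.header.msgh_bits =
            MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, MACH_MSG_TYPE_MAKE_SEND);
        msg.header.msgh_size = sizeof(msg);
        msg.header.msgh_remote_port = server;   /* destination port */
        msg.header.msgh_local_port  = client;   /* "return address" for a reply */
        msg.data = value;

        /* Send the message; MACH_RCV_MSG would be used to receive instead. */
        mach_msg(&msg.header, MACH_SEND_MSG,
                 sizeof(msg),            /* size of the message being sent */
                 0,                      /* maximum receive size (not receiving) */
                 MACH_PORT_NULL,         /* no receive port */
                 MACH_MSG_TIMEOUT_NONE,  /* no timeout */
                 MACH_PORT_NULL);        /* no notify port */
    }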

59
Q

If the port’s queue is full, the sender has several options (specified via parameters to mach_msg()):

A
  1. Wait indefinitely until there is room in the queue.
  2. Wait at most n milliseconds.
  3. Do not wait at all but rather return immediately.
  4. Temporarily cache a message. Here, a message is given to the operating system to keep, even though the queue to which that message is being sent is full. When the message can be put in the queue, a notification message is sent back to the sender. Only one message to a full queue can be pending at any time for a given sending thread.
60
Q

The message-passing facility in Windows is called the advanced local procedure call (ALPC) facility. It is
used for communication between two processes on the same machine.

A

It is similar to the standard remote
procedure call (RPC) mechanism that is widely used, but it is optimized for and specific to Windows.
(Remote procedure calls are covered in detail in Section 3.8.2.) Like Mach, Windows uses a port object to
establish and maintain a connection between two processes. Windows uses two types of ports: connection
ports and communication ports. Server processes publish connection-port objects that are visible to all
processes. When a client wants services from a subsystem, it opens a handle to the server’s
connection-port object and sends a connection request to that port. The server then creates a channel and
returns a handle to the client. The channel consists of a pair of private communication ports: one for
client–server messages, the other for server–client messages. Additionally, communication channels
support a callback mechanism that allows the client and server to accept requests when they would
normally be expecting a reply

61
Q

When an ALPC channel is created, one of three message-passing techniques is chosen: 1. For small
messages (up to 256 bytes), the port’s message queue is used as intermediate storage, and the messages
are copied from one process to the other

A

2. Larger messages must be passed through a section object, which is a region of shared memory associated with the channel.
3. When the amount of data is too large to fit into a section object, an API is available that allows server processes to read and write directly into the address space of a client.

62
Q

A pipe acts as a conduit allowing two processes to communicate. Pipes were one of the first IPC
mechanisms in early UNIX systems. They typically provide one of the simpler ways for processes to
communicate with one another, although they also have some limitations. In implementing a pipe, four
issues must be considered

A
  1. Does the pipe allow bidirectional communication, or is communication unidirectional?
  2. If two-way communication is allowed, is it half duplex (data can travel only one way at a time) or full duplex (data can travel in both directions at the same time)?
  3. Must a relationship (such as parent–child) exist between the communicating processes?
  4. Can the pipes communicate over a network, or must the communicating processes reside on the same machine?
63
Q

Ordinary pipes allow two processes to communicate

A

in standard producer– consumer fashion: the
producer writes to one end of the pipe (the write end) and the consumer reads from the other end (the read
end). As a result, ordinary pipes are unidirectional, allowing only one-way communication. If two-way
communication is required, two pipes must be used, with each pipe sending data in a different direction.
The parent process creates a pipe and then invokes fork(), creating the child process. What occurs after
the fork() call depends on how the data are to flow through the pipe. In this instance, the parent writes to
the pipe, and the child reads from it. It is important to notice that both the parent process and the child
process initially close their unused ends of the pipe
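
A sketch of this pattern using the POSIX pipe() call, with the parent writing and the child reading; the message text is arbitrary:

    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define READ_END  0
    #define WRITE_END 1

    int main(void) {
        int fd[2];
        char write_msg[] = "Greetings";   /* arbitrary message */
        char read_msg[64];

        if (pipe(fd) == -1) {             /* create the pipe before fork() */
            perror("pipe");
            return 1;
        }

        pid_t pid = fork();
        if (pid > 0) {                    /* parent: writes to the pipe */
            close(fd[READ_END]);          /* close the unused read end */
            write(fd[WRITE_END], write_msg, strlen(write_msg) + 1);
            close(fd[WRITE_END]);
        } else if (pid == 0) {            /* child: reads from the pipe */
            close(fd[WRITE_END]);         /* close the unused write end */
            read(fd[READ_END], read_msg, sizeof(read_msg));
            printf("read: %s\n", read_msg);
            close(fd[READ_END]);
        }
        return 0;
    }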

64
Q

Ordinary pipes on Windows systems are termed anonymous pipes, and they behave similarly to their
UNIX counterparts: they are unidirectional and employ parent–child relationships between the
communicating processes

A

. In addition, reading and writing to the pipe can be accomplished with the
ordinary ReadFile() and WriteFile() functions. The Windows API for creating pipes is the CreatePipe()
function, which is passed four parameters. The parameters provide separate handles for (1) reading and
(2) writing to the pipe, as well as (3) a pointer to a SECURITY_ATTRIBUTES structure, which can mark the
pipe handles as inheritable, and (4) the suggested size of the pipe’s buffer. The child process is then given
the inherited handles through the STARTUPINFO structure passed to CreateProcess().

65
Q

Named Pipes

A

Named pipes provide a much more powerful communication tool. Communication can be bidirectional,
and no parent–child relationship is required. Once a named pipe is established, several processes can use
it for communication. In fact, in a typical scenario, a named pipe has several writers. Additionally, named
pipes continue to exist after communicating processes have finished. Both UNIX and Windows systems
support named pipes, although the details of implementation differ greatly. Next, we explore named pipes
in each of these systems. Named pipes are referred to as FIFOs in UNIX systems. Once created, they
appear as typical files in the file system. A FIFO is created with the mkfifo() system call and manipulated
with the ordinary open(), read(), write(), and close() system calls. It will continue to exist until it is
explicitly deleted. Although FIFOs allow bidirectional communication, only half-duplex transmission is
permitted. If data must travel in both directions, two FIFOs are typically used. Additionally, the
communicating processes must reside on the same machine. If intermachine communication is required,
sockets (Section 3.8.1) must be used. Named pipes on Windows systems provide a richer communication
mechanism than their UNIX counterparts. Full-duplex communication is allowed, and the communicating
processes may reside on either the same or different machines. Additionally, only byte-oriented data may
be transmitted across a UNIX FIFO, whereas Windows systems allow either byte- or message-oriented
data. Named pipes are created with the CreateNamedPipe() function, and a client can connect to a named
pipe using ConnectNamedPipe(). Communication over the named pipe can be accomplished using the
ReadFile() and WriteFile() functions.
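
A minimal sketch of two unrelated UNIX processes communicating through a FIFO; the pathname /tmp/myfifo is an arbitrary choice, and the writer and reader would be built as separate programs:

    /* writer.c */
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        const char *path = "/tmp/myfifo";       /* arbitrary FIFO name */
        mkfifo(path, 0666);                     /* create the named pipe (ignore error if it exists) */
        int fd = open(path, O_WRONLY);          /* blocks until a reader opens the FIFO */
        write(fd, "hello", 6);
        close(fd);
        return 0;
    }

    /* reader.c */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        char buf[64];
        int fd = open("/tmp/myfifo", O_RDONLY); /* open the same FIFO for reading */
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        if (n > 0) { buf[n] = '\0'; printf("read: %s\n", buf); }
        close(fd);
        return 0;
    }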

66
Q

A socket

A

is defined as an endpoint for communication. A pair of processes communicating over a network
employs a pair of sockets—one for each process. A socket is identified by an IP address concatenated
with a port number. In general, sockets use a client–server architecture. The server waits for incoming
client requests by listening to a specified port. Once a request is received, the server accepts a connection
from the client socket to complete the connection. Servers implementing specific services (such as SSH,
FTP, and HTTP) listen to well-known ports (an SSH server listens to port 22; an FTP server listens to port
21; and a web, or HTTP, server listens to port 80). All ports below 1024 are considered well known and
are used to implement standard services.
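
A small client sketch in C that connects to a server on the loopback address and prints whatever it receives. Port 6013 is assumed here only because the date-server example in the next card listens there:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int sock = socket(AF_INET, SOCK_STREAM, 0);       /* TCP socket */

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(6013);                      /* assumed server port */
        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);  /* loopback address */

        if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        char buf[128];
        ssize_t n = read(sock, buf, sizeof(buf) - 1);     /* read the server's reply */
        if (n > 0) {
            buf[n] = '\0';
            printf("%s\n", buf);
        }
        close(sock);
        return 0;
    }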

67
Q

Java provides three different types of sockets.

A

Connection-oriented (TCP) sockets are implemented with
the Socket class. Connectionless (UDP) sockets use the DatagramSocket class. Finally, the
MulticastSocket class is a subclass of the DatagramSocket class. A multicast socket allows data to be sent
to multiple recipients. Our example describes a date server that uses connection-oriented TCP sockets.
The operation allows clients to request the current date and time from the server. The server listens to port
6013, although the port could have any arbitrary, unused number greater than 1024. When a connection is
received, the server returns the date and time to the client.
The IP address 127.0.0.1 is a special IP address known as the loopback. When a computer refers to IP
address 127.0.0.1, it is referring to itself. This mechanism allows a client and server
on the same host to communicate using the TCP/IP protocol.
A port in this context is simply a number included at the start of a message packet. Whereas a system
normally has one network address, it can have many ports within that address to differentiate the many
network services it supports. If a remote process needs a service, it addresses a message to the proper port.
For instance, if a system wished to allow other systems to be able to list its current users, it would have a
daemon supporting such an RPC attached to a port—say, port 3027. Any remote system could obtain the
needed information (that is, the list of current users) by sending an RPC message to port 3027 on the
server. The data would be received in a reply message. The semantics of RPCs allows a client to invoke a
procedure on a remote host as it would invoke a procedure locally. The RPC system hides the details that
allow communication to take place by providing a stub on the client side. Typically, a separate stub exists
for each separate remote procedure. When the client invokes a remote procedure, the RPC system calls
the appropriate stub, passing it the parameters provided to the remote procedure. This stub locates the port
on the server and marshals the parameters. The stub then transmits a message to the server using message
passing. A similar stub on the server side receives this message and invokes the procedure on the server.
If necessary, return values are passed back to the client using the same technique. On Windows systems,
stub code is compiled from a specification written in the Microsoft Interface Definition Language (MIDL),
which is used for defining the interfaces between client and server programs. Parameter marshaling
addresses the issue concerning differences in data representation on the client and server machines.
Consider the representation of 32-bit integers. Some systems (known as big-endian) store the most
significant byte first, while other systems (known as little-endian) store the least significant byte first.
Neither order is “better” per se; rather, the choice is arbitrary within a computer architecture. To resolve
differences like this, many RPC systems define a machine-independent representation of data. One such
representation is known as external data representation (XDR). On the client side, parameter marshaling
involves converting the machine-dependent data into XDR before they are sent to the server. On the
server side, the XDR data are unmarshaled and converted to the machine-dependent representation for the
server

68
Q

The Android operating system has a rich set of IPC mechanisms contained in its binder framework,
including RPCs that allow one process to request services from another process.

A

Android defines an
application component as a basic building block that provides utility to an Android application, and an
app may combine multiple application components to provide functionality to an app. One such application component is a
service, which has no user interface but instead runs in the background while executing long-running
operations or performing work for remote processes. Examples of services include playing music in the
background and retrieving data over a network connection on behalf of another process, thereby
preventing the other process from blocking as the data are being downloaded. When a client app invokes
the bindService() method of a service, that service is “bound” and available to provide client-server
communication using either message passing or RPCs. A bound service must extend the Android class
Service and must implement the method onBind(), which is invoked when a client calls bindService(). In
the case of message passing, the onBind() method returns a Messenger service, which is used for sending
messages from the client to the service. The Messenger service is only one-way; if the service must send a
reply back to the client, the client must also provide a Messenger service, which is contained in the
replyTo field of the Message object sent to the service. The service can then send messages back to the
client

69
Q

A thread

A

is a basic unit of CPU utilization; it comprises a thread ID, a program counter (PC), a register
set, and a stack. It shares with other threads belonging to the same process its code section, data section,
and other operating-system resources, such as open files and signals. A traditional process has a single
thread of control. If a process has multiple threads of control, it can perform more than one task at a time.
Figure 4.1 illustrates the difference between a traditional single-threaded process and a multithreaded
process.

70
Q

The benefits of multithreaded programming can be broken down into four major categories:

A
  1. Responsiveness. Multithreading an interactive application may allow a program to continue running
    even if part of it is blocked or is performing a lengthy operation, thereby increasing responsiveness to the
    user. This quality is especially useful in designing user interfaces. For instance, consider what happens
    when a user clicks a button that results in the performance of a time-consuming operation. A
    single-threaded application would be unresponsive to the user until the operation had been completed. In
    contrast, if the time-consuming operation is performed in a separate, asynchronous thread, the application
    remains responsive to the user.
  2. Resource sharing. Processes can share resources only through techniques such as shared memory and
    message passing. Such techniques must be explicitly arranged by the programmer. However, threads share
    the memory and the resources of the process to which they belong by default. The benefit of sharing code
    and data is that it allows an application to have several different threads of activity within the same
    address space.
  3. Economy. Allocating memory and resources for process creation is costly. Because threads share the
    resources of the process to which they belong, it is more economical to create and context-switch threads.
    Empirically gauging the difference in overhead can be difficult, but in general thread creation consumes
    less time and memory than process creation. Additionally, context switching is typically faster between
    threads than between processes.
  4. Scalability. The benefits of multithreading can be even greater in a multiprocessor architecture, where
    threads may be running in parallel on different processing cores. A single-threaded process can run on
only one processor, regardless of how many are available.
71
Q

concurrency vs parallelism:

A

A concurrent system supports more than one task by allowing all the tasks to
make progress. In contrast, a parallel system can perform more than one task simultaneously. Thus, it is
possible to have concurrency without parallelism. Before the advent of multiprocessor and multicore
architectures, most computer systems had only a single processor, and CPU schedulers were designed to
provide the illusion of parallelism by rapidly switching between processes, thereby allowing each process
to make progress. Such processes were running concurrently, but not in parallel.
5 challenges in multicore programming
1. Identifying tasks. This involves examining applications to find areas that can be divided into separate,
concurrent tasks. Ideally, tasks are independent of one another and thus can run in parallel on individual
cores.
2. Balance. While identifying tasks that can run in parallel, programmers must also ensure that the tasks
perform equal work of equal value. In some instances, a certain task may not contribute as much value to
the overall process as other tasks. Using a separate execution core to run that task may not be worth the
cost.
3. Data splitting. Just as applications are divided into separate tasks, the data accessed and manipulated by
the tasks must be divided to run on separate cores.
4. Data dependency. The data accessed by the tasks
must be examined for dependencies between two or more tasks. When one task depends on data from
another, programmers must ensure that the execution of the tasks is synchronized to accommodate the
data dependency. We examine such strategies in Chapter 6.
5. Testing and debugging. When a program is running in parallel on multiple cores, many different
execution paths are possible. Testing and debugging such concurrent programs is inherently more difficult
than testing and debugging single-threaded applications.

72
Q

Data parallelism focuses

A

on distributing subsets of the same data across multiple computing cores and
performing the same operation on each core. Consider, for example, summing the contents of an array of
size N. On a single-core system, one thread would simply sum the elements [0] …[N − 1]. On a dual-core
system, however, thread A, running on core 0, could sum the elements [0] . . . [N∕2 − 1] while thread B,
running on core 1, could sum the elements [N∕2] . . . [N − 1]. The two threads would be running in parallel
on separate computing cores

73
Q

Task parallelism involves

A

distributing not data but tasks (threads) across multiple computing cores. Each
thread is performing a unique operation. Different threads may be operating on the same data, or they may
be operating on different data. Consider again our example above. In contrast to that situation, an example
of task parallelism might involve two threads, each performing a unique statistical operation on the array
of elements. The threads again are operating in parallel on separate computing cores, but each is
performing a unique operation.

74
Q

Kernel vs User

A

Support for threads may be provided either at the user level, for user threads, or by the kernel, for kernel threads. User threads are supported above
the kernel and are managed without kernel support, whereas kernel threads are supported and managed
directly by the operating system. Virtually all contemporary operating systems—including Windows,
Linux, and macOS— support kernel threads.

75
Q

Many-to-One Model

A

The many-to-one model (Figure 4.7) maps many user-level threads to one kernel thread. Thread
management is done by the thread library in user space, so it is efficient (we discuss thread libraries in
Section 4.4). However, the entire process will block if a thread makes a blocking system call. Also,
because only one thread can access the kernel at a time, multiple threads are unable to run in parallel on
multicore systems. Green threads—a thread library available for Solaris systems and adopted in early
versions of Java—used the many-to-one model. However, very few systems continue to use the model
because of its inability to take advantage of multiple processing cores, which have now become standard
on most computer systems

76
Q

One-to-One Model

A

The one-to-one model (Figure 4.8) maps each user thread to a kernel thread. It provides more concurrency
than the many-to-one model by allowing another thread to run when a thread makes a blocking system
call. It also allows multiple threads to run in parallel on multiprocessors. The only drawback to this model
is that creating a user thread requires creating the corresponding kernel thread, and a large number of
kernel threads may burden the performance of a system. Linux, along with the family of Windows
operating systems, implement the one-to-one model.

77
Q

Many-to-Many Model

A

The many-to-many model (Figure 4.9) multiplexes many user-level threads to a smaller or equal number
of kernel threads. The number of kernel threads may be specific to either a particular application or a
particular machine (an application may be allocated more kernel threads on a system with eight
processing cores than a system with four cores). Although the many-to-many model appears to be the
most flexible of the models discussed, in practice it is difficult to implement. In addition, with an
increasing number of processing cores appearing on most systems, limiting the number of kernel threads
has become less important. As a result, most operating systems now use the one-to-one model.

78
Q

Two lvl model

A

Let’s consider the effect of this design on concurrency. Whereas the many-to-one model allows the
developer to create as many user threads as she wishes, it does not result in parallelism, because the
kernel can schedule only one kernel thread at a time. The one-to-one model allows greater concurrency,
but the developer has to be careful not to create too many threads within an application. (In fact, on some
systems, she may be limited in the number of threads she can create.) The many-to-many model suffers
from neither of these shortcomings: developers can create as many user threads as necessary, and the
corresponding kernel threads can run in parallel on a multiprocessor. Also, when a thread performs a
blocking system call, the kernel can schedule another thread for execution. One variation on the
many-to-many model still multiplexes many user-level threads to a smaller or equal number of kernel
threads but also allows a user-level thread to be bound to a kernel thread.

79
Q

A thread library provides

A

the programmer with an API for creating and managing threads. There are two
primary ways of implementing a thread library. The first approach is to provide a library entirely in user
space with no kernel support. All code and data structures for the library exist in user space. This means
that invoking a function in the library results in a local function call in user space and not a system call.
The second approach is to implement a kernel-level library supported directly by the operating system. In
this case, code and data structures for the library exist in kernel space. Invoking a function in the API for
the library typically results in a system call to the kernel.

80
Q

For POSIX and Windows threading,

A

any data declared globally— that is, declared outside of any
function—are shared among all threads belonging to the same process. Because Java has no equivalent
notion of global data, access to shared data must be explicitly arranged between threads.
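
A minimal sketch (with illustrative names such as shared_count and worker) of how data declared globally, outside any function, is shared by every Pthreads thread in a process; real code would protect the concurrent update with a mutex. Compile with -lpthread.

    #include <pthread.h>
    #include <stdio.h>

    int shared_count = 0;                /* global data: visible to all threads */

    void *worker(void *arg) {
        shared_count++;                  /* both threads touch the same variable;
                                            a mutex would be needed in real code */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("shared_count = %d\n", shared_count);
        return 0;
    }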

81
Q

With asynchronous threading

A

Once the parent creates a child thread, the parent resumes its execution, so
that the parent and child execute concurrently and independently of one another. Because the threads are
independent, there is typically little data sharing between them. Asynchronous threading is the strategy
used in the multithreaded server illustrated in Figure 4.2 and is also commonly used for designing
responsive user interfaces.

82
Q

Synchronous threading

A

occurs when the parent thread creates one or more children and then must wait for
all of its children to terminate before it resumes. Here, the threads created by the parent perform work
concurrently, but the parent cannot continue until this work has been completed. Once each thread has
finished its work, it terminates and joins with its parent. Only after all of the children have joined can the
parent resume execution. Typically, synchronous threading involves significant data sharing among
threads. For example, the parent thread may combine the results calculated by its various children. All of
the following examples use synchronous threading

83
Q

Pthreads refers to the POSIX standard (IEEE 1003.1c)

A

defining an API for thread creation and
synchronization. This is a specification for thread behavior, not an implementation. Operating-system
designers may implement the specification in any way they wish.

84
Q

Windows Threads

A

Threads are created in the Windows API using the CreateThread() function, and—just as in Pthreads—a
set of attributes for the thread is passed to this function. These attributes include security information, the
size of the stack, and a flag that can be set to indicate if the thread is to start in a suspended state. In this
program, we use the default values for these attributes. (The default values do not initially set the thread
to a suspended state and instead make it eligible to be run by the CPU scheduler.) Once the summation
thread is created, the parent must wait for it to complete before outputting the value of Sum, as the value
is set by the summation thread. Recall that the Pthread program (Figure 4.11) had the parent thread wait
for the summation thread using the pthread_join() statement. We perform the equivalent of this in the
Windows API using the WaitForSingleObject() function, which causes the creating thread to block until
the summation thread has exited. In situations that require waiting for multiple threads to complete, the
WaitForMultipleObjects() function is used. This function is passed four parameters:
1. The number of objects to wait for
2. A pointer to the array of objects
3. A flag indicating whether all objects have been signaled
4. A timeout duration (or INFINITE)
For example, if THandles is an array of thread HANDLE objects of size N, the parent thread can wait for all its child threads to complete with this statement:
WaitForMultipleObjects(N, THandles, TRUE, INFINITE);
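
A condensed sketch of the Windows summation program described above, using only the calls named in this card (CreateThread() with default attributes and WaitForSingleObject()); the function and variable names are illustrative.

    #include <windows.h>
    #include <stdio.h>

    DWORD Sum;                                    /* shared with the child thread */

    DWORD WINAPI Summation(LPVOID Param) {
        DWORD upper = *(DWORD *)Param;
        for (DWORD i = 1; i <= upper; i++)
            Sum += i;
        return 0;
    }

    int main(void) {
        DWORD ThreadId, Param = 5;
        HANDLE ThreadHandle = CreateThread(
            NULL,          /* default security attributes              */
            0,             /* default stack size                       */
            Summation,     /* thread function                          */
            &Param,        /* parameter passed to the thread           */
            0,             /* default flags: not created suspended     */
            &ThreadId);

        if (ThreadHandle != NULL) {
            WaitForSingleObject(ThreadHandle, INFINITE);  /* parent waits for child */
            CloseHandle(ThreadHandle);
            printf("sum = %lu\n", Sum);
        }
        return 0;
    }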

85
Q

Java Threads

A

Threads are the fundamental model of program execution in a Java program, and the Java
language and its API provide a rich set of features for the creation and management of threads. All Java
programs comprise at least a single thread of control—even a simple Java program consisting of only a
main() method runs as a single thread in the JVM. Java threads are available on any system that provides
a JVM including Windows, Linux, and macOS. The Java thread API is available for Android applications
as well. There are two techniques for explicitly creating threads in a Java program. One approach is to
create a new class that is derived from the Thread class and to override its run() method. An
alternative—and more commonly used — technique is to define a class that implements the Runnable
interface. This interface defines a single abstract method with the signature public void run(). The code in
the run() method of a class that implements Runnable is what executes in a separate thread. An example is
shown below:

    class Task implements Runnable {
        public void run() {
            System.out.println("I am a thread.");
        }
    }

Rather than defining a separate class, a lambda expression can be used instead to supply the run() behavior.

86
Q

Thread creation in Java involves

A

creating a Thread object and passing it an instance of a class that
implements Runnable, followed by invoking the start() method on the Thread object. This appears in the
following example:

    Thread worker = new Thread(new Task());
    worker.start();

Invoking the start() method
for the new Thread object does two things: 1. It allocates memory and initializes a new thread in the JVM.
2. It calls the run() method, making the thread eligible to be run by the JVM. (Note again that we never
call the run() method directly. Rather, we call the start() method, and it calls the run() method on our
behalf.)

87
Q

What problems does thread pool solve?

A

In this situation, whenever the server receives a request, it creates a separate thread to service the request.
Whereas creating a separate thread is certainly superior to creating a separate process, a multithreaded
server nonetheless has potential problems. The first issue concerns the amount of time required to create
the thread, together with the fact that the thread will be discarded once it has completed its work. The
second issue is more troublesome. If we allow each concurrent request to be serviced in a new thread, we
have not placed a bound on the number of threads concurrently active in the system. Unlimited threads
could exhaust system resources, such as CPU time or memory. One solution to this problem is to use a
thread pool.

88
Q

Thread pools

A

work well when the tasks submitted to the pool can be executed asynchronously. Thread
pools offer these benefits:
1. Servicing a request with an existing thread is often faster than waiting to create a thread.
2. A thread pool limits the number of threads that exist at any one point. This is particularly important on systems that cannot support a large number of concurrent threads.
3. Separating the task to be performed from the mechanics of creating the task allows us to use different strategies for running the task. For example, the task could be scheduled to execute after a time delay or to execute periodically.
The number of threads in the pool can be set heuristically based on factors such as the number of CPUs in the system, the amount of physical memory, and the expected number of concurrent client requests. More sophisticated thread-pool architectures can dynamically adjust the number of threads in the pool according to usage patterns. Such architectures provide the further benefit of having a smaller pool—thereby consuming less memory—when the load on the system is low.

89
Q

Java Thread Pools

A

The java.util.concurrent package includes an API for several varieties of thread-pool architectures. Here,
we focus on the following three models:
1. Single thread executor—newSingleThreadExecutor()—creates a pool of size 1.
2. Fixed thread executor—newFixedThreadPool(int size)—creates a thread pool with a specified number of threads.
3. Cached thread executor—newCachedThreadPool()—creates an unbounded thread pool, reusing threads in many instances.
We have, in fact, already seen the use of a Java thread pool in Section 4.4.3, where we created a newSingleThreadExecutor in the program example shown in Figure 4.14. In that section, we noted that the Java executor framework can be used to construct more robust threading tools. We now describe how it can be used to create thread pools. A thread pool is created using one of the factory methods in the Executors class:
* static ExecutorService newSingleThreadExecutor()
* static ExecutorService newFixedThreadPool(int size)
* static ExecutorService newCachedThreadPool()
Each of these factory methods creates and returns an object instance that implements the ExecutorService interface.

90
Q

THE JVM AND THE HOST OPERATING SYSTEM

A

The JVM is typically implemented on top of a
host operating system (see Figure 18.10). This setup allows the JVM to hide the implementation details of
the underlying operating system and to provide a consistent, abstract environment that allows Java
programs to operate on any platform that supports a JVM. The specification for the JVM does not indicate
how Java threads are to be mapped to the underlying operating system, instead leaving that decision to the
particular implementation of the JVM. For example, the Windows operating system uses the one-to-one
model; therefore, each Java thread for a JVM running on Windows maps to a kernel thread. In addition,
there may be a relationship between the Java thread library and the thread library on the host operating
system. For example, implementations of a JVM for the Windows family of operating systems might use
the Windows API when creating Java threads; Linux and macOS systems might use the Pthreads API.
Thread pool -> The general idea behind a thread pool is to create a number of threads at start-up and
place them into a pool, where they sit and wait for work. When a server receives a request, rather than
creating a thread, it instead submits the request to the thread pool and resumes waiting for additional
requests. If there is an available thread in the pool, it is awakened, and the request is serviced
immediately. If the pool contains no available thread, the task is queued until one becomes free. Once a
thread completes its service, it returns to the pool and awaits more work.

91
Q

Fork Join

A

The strategy for thread creation covered in Section 4.4 is often known as the fork-join model. Recall that
with this method, the main parent thread creates (forks) one or more child threads and then waits for the
children to terminate and join with it, at which point it can retrieve and combine their results. This
synchronous model is often characterized as explicit thread creation, but it is also an excellent candidate
for implicit threading. In the latter situation, threads are not constructed directly during the fork stage;
rather, parallel tasks are designated. This model is illustrated in Figure 4.16. A library manages the
number of threads that are created and is also responsible for assigning tasks to threads. In some ways,
this fork-join model is a synchronous version of thread pools in which a library determines the actual
number of threads to create— for example, by using the heuristics described in Section 4.5.1.

92
Q

OpenMP

A

OpenMP is a set of compiler directives as well as an API for programs written in C, C++, or
FORTRAN that provides support for parallel programming in shared-memory environments. OpenMP
identifies parallel regions as blocks of code that may run in parallel. Application developers insert
compiler directives into their code at parallel regions, and these directives instruct the OpenMP run-time
library to execute the region in parallel.
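
A small OpenMP sketch (illustrative): the first directive marks a parallel region executed by the whole thread team, and the second divides the loop iterations among the threads. Compile with an OpenMP flag such as -fopenmp on gcc or clang.

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        /* Sequential code here runs in a single thread. */
        #pragma omp parallel
        {
            /* Every thread in the team executes this block. */
            printf("I am a parallel region.\n");
        }

        /* Work-sharing: loop iterations are divided among the threads,
           and the partial sums are combined by the reduction clause. */
        int sum = 0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= 100; i++)
            sum += i;
        printf("sum = %d\n", sum);
        return 0;
    }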

93
Q

Grand Central Dispatch

A

Grand Central Dispatch (GCD) is a technology developed by Apple for its macOS and iOS operating
systems. It is a combination of a run-time library, an API, and language extensions that allow developers
to identify sections of code (tasks) to run in parallel. Like OpenMP, GCD manages most of the details of
threading. GCD schedules tasks for run-time execution by placing them on a dispatch queue. When it
removes a task from a queue, it assigns the task to an available thread from a pool of threads that it
manages. GCD identifies two types of dispatch queues: serial and concurrent. Tasks placed on a serial
queue are removed in FIFO order. Once a task has been removed from the queue, it must complete
execution before another task is removed. Each process has its own serial queue (known as its main
queue), and developers can create additional serial queues that are local to a particular process. (This is
why serial queues are also known as private dispatch queues.) Serial queues are useful for ensuring the
sequential execution of several tasks. Tasks placed on a concurrent queue are also removed in FIFO order,
but several tasks may be removed at a time, thus allowing multiple tasks to execute in parallel. There are
several system-wide concurrent queues (also known as global dispatch queues), which are divided into
four primary quality-of-service classes:
* QOS_CLASS_USER_INTERACTIVE—The user-interactive class represents tasks that interact with the user, such as the user interface and event handling, to ensure a responsive user interface. Completing a task belonging to this class should require only a small amount of work.
* QOS_CLASS_USER_INITIATED—The user-initiated class is similar to the user-interactive class in that tasks are associated with a responsive user interface; however, user-initiated tasks may require longer processing times. Opening a file or a URL is a user-initiated task, for example. Tasks belonging to this class must be completed for the user to continue interacting with the system, but they do not need to be serviced as quickly as tasks in the user-interactive queue.
* QOS_CLASS_UTILITY—The utility class represents tasks that require a longer time to complete but do not demand immediate results. This class includes work such as importing data.
* QOS_CLASS_BACKGROUND—Tasks belonging to the background class are not visible to the user and are not time sensitive. Examples include indexing a mailbox system and performing backups.
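
A small C sketch of submitting tasks to a GCD concurrent queue, assuming an Apple platform with libdispatch; it uses the function-pointer variant dispatch_group_async_f() so no block extension is needed, and the task strings are illustrative.

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    static void do_work(void *context) {
        /* Runs on a thread that GCD picks from the pool it manages. */
        printf("running task: %s\n", (const char *)context);
    }

    int main(void) {
        /* A system-wide concurrent queue in the user-initiated QoS class. */
        dispatch_queue_t queue =
            dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0);
        dispatch_group_t group = dispatch_group_create();

        /* Submit two tasks; several may run in parallel on a concurrent queue. */
        dispatch_group_async_f(group, queue, "task one", do_work);
        dispatch_group_async_f(group, queue, "task two", do_work);

        /* Wait until both tasks have completed. */
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
        return 0;
    }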

94
Q

The fork() and exec() System Calls

A

In Chapter 3, we described how the fork() system call is used to create a separate, duplicate process. The
semantics of the fork() and exec() system calls change in a multithreaded program. If one thread in a
program calls fork(), does the new process duplicate all threads, or is the new process single-threaded?
Some UNIX systems have chosen to have two versions of fork(), one that duplicates all threads and
another that duplicates only the thread that invoked the fork() system call. The exec() system call typically
works in the same way as described in Chapter 3. That is, if a thread invokes the exec() system call, the
program specified in the parameter to exec() will replace the entire process—including all threads. Which
of the two versions of fork() to use depends on the application. If exec() is called immediately after
forking, then duplicating all threads is unnecessary, as the program specified in the parameters to exec()
will replace the process. In this instance, duplicating only the calling thread is appropriate. If, however,
the separate process does not call exec() after forking, the separate process should duplicate all threads.
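
A minimal sketch of the fork()-then-exec() pattern discussed above; the child immediately replaces itself with /bin/ls (an illustrative choice), so duplicating only the calling thread would suffice.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* create a duplicate process */

        if (pid < 0) {                   /* fork failed */
            perror("fork");
            exit(1);
        } else if (pid == 0) {           /* child: replace the process image */
            execlp("ls", "ls", "-l", (char *)NULL);
            perror("execlp");            /* reached only if exec fails */
            exit(1);
        } else {                         /* parent: wait for the child */
            wait(NULL);
            printf("child complete\n");
        }
        return 0;
    }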

95
Q

A signal

A

is used in UNIX systems to notify a process that a particular event has occurred. A signal may be
received either synchronously or asynchronously, depending on the source of and the reason for the event
being signaled. All signals, whether synchronous or asynchronous, follow the same pattern: 1. A signal is
generated by the occurrence of a particular event. 2. The signal is delivered to a process. 3. Once
delivered, the signal must be handled. Examples of synchronous signals include illegal memory access
and division by 0. If a running program performs either of these actions, a signal is generated.
Synchronous signals are delivered to the same process that performed the operation that caused the signal
(that is the reason they are considered synchronous). When a signal is generated by an event external to a
running process, that process receives the signal asynchronously. Examples of such signals include
terminating a process with specific keystrokes (such as control-C) and having a timer expire. Typically, an
asynchronous signal is sent to another process. A signal may be handled by one of two possible handlers:
1. A default signal handler
2. A user-defined signal handler
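
A small sketch of installing a user-defined handler that overrides the default action for SIGINT (the signal generated by control-C), using sigaction(); the handler and variable names are illustrative.

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_signal = 0;

    static void handler(int signo) {
        got_signal = signo;              /* keep work inside the handler minimal */
    }

    int main(void) {
        struct sigaction sa = {0};
        sa.sa_handler = handler;         /* user-defined handler */
        sigemptyset(&sa.sa_mask);
        sigaction(SIGINT, &sa, NULL);    /* override the default action */

        printf("press control-C within 10 seconds...\n");
        sleep(10);                       /* returns early if interrupted by the signal */

        if (got_signal)
            printf("caught signal %d\n", (int)got_signal);
        return 0;
    }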

96
Q

Every signal has a default signal handler that the kernel runs when handling that signal. This default
action can be overridden by a user-defined signal handler that is called to handle the signal.

A

Signals are handled in different ways. Some signals may be ignored, while others (for example, an illegal
memory access) are handled by terminating the program. Handling signals in single-threaded programs is
straightforward: signals are always delivered to a process. However, delivering signals is more
complicated in multithreaded programs, where a process may have several threads. Where, then, should a
signal be delivered? In general, the following options exist:
1. Deliver the signal to the thread to which the signal applies.
2. Deliver the signal to every thread in the process.
3. Deliver the signal to certain threads in the process.
4. Assign a specific thread to receive all signals for the process.
The method for delivering a signal depends on the type of signal generated. For example, synchronous signals need to be delivered to the thread causing the signal and not to other threads in the process. However, the situation with asynchronous signals is not as clear. Some asynchronous signals—such as a signal that terminates a process (control-C, for example)—should be sent to all threads.

97
Q

The standard UNIX function for delivering a signal is kill(pid_t pid, int signal)

A

This function specifies the
process (pid) to which a particular signal (signal) is to be delivered. Most multithreaded versions of UNIX
allow a thread to specify which signals it will accept and which it will block. Therefore, in some cases, an
asynchronous signal may be delivered only to those threads that are not blocking it. However, because
signals need to be handled only once, a signal is typically delivered only to the first thread found that is
not blocking it. POSIX Pthreads provides the following function, which allows a signal to be delivered to
a specified thread (tid): pthread_kill(pthread_t tid, int signal). Although Windows does not explicitly
provide support for signals, it allows us to emulate them using asynchronous procedure calls (APCs). The
APC facility enables a user thread to specify a function that is to be called when the user thread receives
notification of a particular event. As indicated by its name, an APC is roughly equivalent to an
asynchronous signal in UNIX. However, whereas UNIX must contend with how to deal with signals in a
multithreaded environment, the APC facility is more straightforward, since an APC is delivered to a
particular thread rather than a process
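
A sketch of directed signal delivery in a multithreaded program, assuming SIGUSR1 purely for illustration: the worker blocks the signal with pthread_sigmask() and retrieves it synchronously with sigwait(), while the main thread delivers it to that specific thread with pthread_kill(). Compile with -lpthread.

    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>

    static void *worker(void *arg) {
        sigset_t set;
        int sig;
        sigemptyset(&set);
        sigaddset(&set, SIGUSR1);
        pthread_sigmask(SIG_BLOCK, &set, NULL);  /* this thread blocks SIGUSR1 */
        sigwait(&set, &sig);                     /* wait for it synchronously */
        printf("worker received signal %d\n", sig);
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        sigset_t set;

        /* Block SIGUSR1 before creating the worker so the new thread inherits
           the mask and the signal cannot trigger its default action early. */
        sigemptyset(&set);
        sigaddset(&set, SIGUSR1);
        pthread_sigmask(SIG_BLOCK, &set, NULL);

        pthread_create(&tid, NULL, worker, NULL);
        pthread_kill(tid, SIGUSR1);              /* deliver to that specific thread */
        pthread_join(tid, NULL);
        return 0;
    }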

98
Q

Thread cancellation

A

involves terminating a thread before it has completed. For example, if multiple threads are concurrently
searching through a database and one thread returns the result, the remaining threads might be canceled.
Another situation might occur when a user presses a button on a web browser that stops a web page from
loading any further. Often, a web page loads using several threads—each image is loaded in a separate
thread. When a user presses the stop button on the browser, all threads loading the page are canceled.

99
Q

A thread that is to be canceled is often referred to as the target thread. Cancellation of a target thread may
occur in two different scenarios

A
1. Asynchronous cancellation. One thread immediately terminates the target thread.
2. Deferred cancellation. The target thread periodically checks whether it should terminate, allowing it an opportunity to terminate itself in an orderly fashion.

100
Q

The difficulty with cancellation occurs in situations where resources have been allocated to a canceled
thread or where a thread is canceled while in the midst of updating data it is sharing with other threads.

A

This becomes especially troublesome with asynchronous cancellation. Often, the operating system will
reclaim system resources from a canceled thread but will not reclaim all resources. Therefore, canceling a
thread asynchronously may not free a necessary system-wide resource. With deferred cancellation, in
contrast, one thread indicates that a target thread is to be canceled, but cancellation occurs only after the
target thread has checked a flag to determine whether or not it should be canceled. The thread can perform
this check at a point at which it can be canceled safely. In Pthreads, thread cancellation is initiated using
the pthread_cancel() function. The identifier of the target thread is passed as a parameter to the function.
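
A minimal sketch of initiating deferred cancellation with pthread_cancel(); the worker and its loop are illustrative. Compile with -lpthread.

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *worker(void *arg) {
        while (1) {
            /* sleep() is a cancellation point, so a pending cancellation
               request takes effect here. */
            sleep(1);
        }
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        void *result;

        pthread_create(&tid, NULL, worker, NULL);   /* target thread */
        sleep(2);                                   /* let it run briefly */
        pthread_cancel(tid);                        /* request cancellation */
        pthread_join(tid, &result);
        if (result == PTHREAD_CANCELED)
            printf("worker was canceled\n");
        return 0;
    }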

101
Q

Pthreads supports three cancellation modes. Each mode is defined as a state and a type, as illustrated in
the table below. A thread may set its cancellation state and type using an API

A

The three cancellation modes, each defined by a state and a type:
* Off: state disabled, type (none)
* Deferred (the default): state enabled, type deferred
* Asynchronous: state enabled, type asynchronous

102
Q

The default cancellation type is deferred cancellation. However, cancellation occurs only when a thread
reaches a cancellation point.

A

Most of the blocking system calls in the POSIX and standard C library are
defined as cancellation points, and these are listed when invoking the command man pthreads on a Linux
system. For example, the read() system call is a cancellation point that allows canceling a thread that is
blocked while awaiting input from read(). One technique for establishing a cancellation point is to invoke
the pthread_testcancel() function. If a cancellation request is found to be pending, the call to
pthread_testcancel() will not return, and the thread will terminate; otherwise, the call to the function will return,
and the thread will continue to run. Additionally, Pthreads allows a function known as a cleanup handler
to be invoked if a thread is canceled. This function allows any resources a thread may have acquired to be
released before the thread is terminated.
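
A sketch combining an explicit cancellation point with a cleanup handler, using pthread_testcancel() and the pthread_cleanup_push()/pthread_cleanup_pop() pair; the buffer and worker are illustrative. Compile with -lpthread.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void cleanup(void *arg) {
        free(arg);                          /* release the acquired resource */
        printf("cleanup handler ran\n");
    }

    static void *worker(void *arg) {
        char *buffer = malloc(1024);
        pthread_cleanup_push(cleanup, buffer);   /* runs if the thread is canceled */

        for (;;) {
            /* ... CPU-bound work with no implicit cancellation points ... */
            pthread_testcancel();           /* explicit cancellation point */
        }

        pthread_cleanup_pop(1);             /* never reached; pairs with push */
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);
        sleep(1);
        pthread_cancel(tid);
        pthread_join(tid, NULL);
        return 0;
    }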

103
Q

Thread-Local Storage

A

Threads belonging to a process share the data of the process. Indeed, this data sharing provides one of the
benefits of multithreaded programming. However, in some circumstances, each thread might need its own
copy of certain data. We will call such data thread-local storage (or TLS). For example, in a
transaction-processing system, we might service each transaction in a separate thread. Furthermore, each
transaction might be assigned a unique identifier. To associate each thread with its unique transaction
identifier, we could use thread-local storage. It is easy to confuse TLS with local variables. However,
local variables are visible only during a single function invocation, whereas TLS data are visible across
function invocations. Additionally, when the developer has no control over the thread creation process—
for example, when using an implicit technique such as a thread pool— then an alternative approach is
necessary. In some ways, TLS is similar to static data; the difference is that TLS data are unique to each
thread. (In fact, TLS is usually declared as static.) Most thread libraries and compilers provide support for
TLS. For example, Java provides a ThreadLocal class with set() and get() methods for ThreadLocal
objects. Pthreads includes the type pthread_key_t, which provides a key that is specific to each thread. This
key can then be used to access TLS data. Microsoft's C# language simply requires adding the storage
attribute [ThreadStatic] to declare thread-local data. The gcc compiler provides the storage-class keyword
__thread for declaring TLS data. For example, if we wished to assign a unique identifier for each thread, we
would declare it as follows: static __thread int threadID;
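
A minimal sketch of the gcc __thread declaration mentioned above: each thread sees its own copy of threadID even though the variable is declared static; the worker function is illustrative. Compile with gcc and -lpthread.

    #include <pthread.h>
    #include <stdio.h>

    static __thread int threadID;            /* one copy per thread */

    static void *worker(void *arg) {
        threadID = (int)(long)arg;           /* set this thread's own copy */
        printf("my threadID is %d\n", threadID);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, (void *)1L);
        pthread_create(&t2, NULL, worker, (void *)2L);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }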

104
Q

Scheduler Activations

A

A final issue to be considered with multithreaded programs concerns communication between the kernel
and the thread library, which may be required by the many-to-many and two-level models discussed in
Section 4.3.3. Such coordination allows the number of kernel threads to be dynamically adjusted to help
ensure the best performance. Many systems implementing either the many-to-many or the two-level
model place an intermediate data structure between the user and kernel threads. This data structure—
typically known as a lightweight process, or LWP—is shown in Figure 4.20. To the user-thread library,
the LWP appears to be a virtual processor on which the application can schedule a user thread to run.
Each LWP is attached to a kernel thread, and it is kernel threads that the operating system schedules to
run on physical processors. If a kernel thread blocks (such as while waiting for an I/O operation to
complete), the LWP blocks as well. Up the chain, the user-level thread attached to the LWP also blocks.
An application may require any number of LWPs to run efficiently. Consider a CPU-bound application
running on a single processor. In this scenario, only one thread can run at a time, so one LWP is sufficient.
An application that is I/O intensive may require multiple LWPs to execute, however. Typically, an LWP is
required for each concurrent blocking system call. Suppose, for example, that five different file-read
requests occur simultaneously. Five LWPs are needed, because all could be waiting for I/O completion in
the kernel. If a process has only four LWPs, then the fifth request must wait for one of the LWPs to return
from the kernel. One scheme for communication between the user-thread library and the kernel is known
as scheduler activation. It works as follows: The kernel provides an application with a set of virtual
processors (LWPs), and the application can schedule user threads onto an available virtual processor.
Furthermore, the kernel must inform an application about certain events. This procedure is known as an
upcall. Upcalls are handled by the thread library with an upcall handler, and upcall handlers must run on a
virtual processor. One event that triggers an upcall occurs when an application thread is about to block. In
this scenario, the kernel makes an upcall to the application informing it that a thread is about to block and
identifying the specific thread. The kernel then allocates a new virtual processor to the application. The
application runs an upcall handler on this new virtual processor, which saves the state of the blocking
thread and relinquishes the virtual processor on which the blocking thread is running. The upcall handler
then schedules another thread that is eligible to run on the new virtual processor. When the event that the
blocking thread was waiting for occurs, the kernel makes another upcall to the thread library informing it
that the previously blocked thread is now eligible to run. The upcall handler for this event also requires a
virtual processor, and the kernel may allocate a new virtual processor or preempt one of the user threads
and run the upcall handler on its virtual processor. After marking the unblocked thread as eligible to run,
the application schedules an eligible thread to run on an available virtual processor.

105
Q

Windows Threads

A

A Windows application runs as a separate process, and each process may contain one or more threads.
The Windows API for creating threads is covered in Section 4.4.2. Additionally, Windows uses the
one-to-one mapping described in Section 4.3.2, where each user-level thread maps to an associated kernel
thread. The general components of a thread include:
* A thread ID uniquely identifying the thread
* A register set representing the status of the processor
* A program counter
* A user stack, employed when the thread is running in user mode, and a kernel stack, employed when the thread is running in kernel mode
* A private storage area used by various run-time libraries and dynamic link libraries (DLLs)
The register set, stacks, and private storage area are known as the context of the thread. The primary data structures of a thread include:
* ETHREAD—executive thread block
* KTHREAD—kernel thread block
* TEB—thread environment block
The key components of the ETHREAD include a pointer to the process
to which the thread belongs and the address of the routine in which the thread starts control. The
ETHREAD also contains a pointer to the corresponding KTHREAD. The KTHREAD includes
scheduling and synchronization information for the thread. In addition, the KTHREAD includes the
kernel stack (used when the thread is running in kernel mode) and a pointer to the TEB. The ETHREAD
and the KTHREAD exist entirely in kernel space; this means that only the kernel can access them. The
TEB is a user-space data structure that is accessed when the thread is running in user mode. Among other
fields, the TEB contains the thread identifier, a user-mode stack, and an array for thread-local storage. The
structure of a Windows thread is illustrated in Figure 4.21

106
Q

Linux Threads

A

Linux provides the fork() system call with the traditional functionality of duplicating a process, as
described in Chapter 3. Linux also provides the ability to create threads using the clone() system call.
However, Linux does not distinguish between processes and threads. In fact, Linux uses the term task
—rather than process or thread— when referring to a flow of control within a program. When clone() is
invoked, it is passed a set of flags that determine how much sharing is to take place between the parent
and child tasks. Some of these flags are listed in Figure 4.22. For example, suppose that clone() is passed
the flags CLONE_FS, CLONE_VM, CLONE_SIGHAND, and CLONE_FILES. The parent and child tasks
will then share the same file-system information (such as the current working directory), the same
memory space, the same signal handlers, and the same set of open files. Using clone() in this fashion is
equivalent to creating a thread as described in this chapter, since the parent task shares most of its
resources with its child task. However, if none of these flags is set when clone() is invoked, no sharing
takes place, resulting in functionality similar to that provided by the fork() system call. The varying level
of sharing is possible because of the way a task is represented in the Linux kernel. A unique kernel data
structure (specifically, struct task_struct) exists for each task in the system. This data structure, instead of
storing data for the task, contains pointers to other data structures where these data are stored— for
example, data structures that represent the list of open files, signal-handling information, and virtual
memory. When fork() is invoked, a new task is created, along with a copy of all the associated data
structures of the parent process. A new task is also created when the clone() system call is made.
However, rather than copying all data structures, the new task points to the data structures of the parent
task, depending on the set of flags passed to clone(). Finally, the flexibility of the clone() system call can
be extended to the concept of containers, a virtualization topic which was introduced in Chapter 1. Recall
from that chapter that a container is a virtualization technique provided by the operating system that
allows creating multiple Linux systems (containers) under a single Linux kernel that run in isolation from
one another. Just as certain flags passed to clone() can distinguish between creating a task that behaves
more like a process or a thread based upon the amount of sharing between the parent and child tasks, there
are other flags that can be passed to clone() that allow a Linux container to be created. Containers will be
covered more fully in Chapter 18