P2L5 Thread Performance Considerations Flashcards

1
Q

Which threading model is better? Boss/Worker or Pipeline?

A

It depends …

It depends on which metric is most important.

Total time or average time per order …
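A small worked example can make the trade-off concrete. The numbers below are assumptions chosen for illustration (11 orders; a boss/worker shop with 5 workers at 120 ms per order; a 6-stage pipeline at 20 ms per stage), and the helper names are hypothetical. The sketch computes when each order completes under each model and then reports total time and average completion time.

```python
def boss_worker_times(orders, workers, time_per_order):
    # Orders are served in batches of `workers`; batch k finishes at (k + 1) * time_per_order.
    return [time_per_order * (i // workers + 1) for i in range(orders)]

def pipeline_times(orders, stages, time_per_stage):
    # The first order takes stages * time_per_stage; each later order
    # leaves the pipeline one stage time after the previous one.
    first = stages * time_per_stage
    return [first + i * time_per_stage for i in range(orders)]

def summarize(times):
    # Returns (total time to finish all orders, average completion time per order).
    return max(times), sum(times) / len(times)

if __name__ == "__main__":
    print("boss/worker:", summarize(boss_worker_times(11, 5, 120)))
    print("pipeline:   ", summarize(pipeline_times(11, 6, 20)))
```

Under these assumed numbers the pipeline finishes all orders sooner (320 ms vs. 360 ms), while boss/worker has the lower average time per order (about 196 ms vs. 220 ms), so the better model depends on the metric.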

2
Q

Review: Give 4 ways threads are useful

A
  1. parallelization (speed up)
  2. specialization (hot cache)
  3. efficiency (lower memory usage, cheaper synchronization - read/write shared variables vs. IPC)
  4. On a single CPU, threads hide the latency of I/O
3
Q

What metric is useful for:

  1. Matrix multiply application?
  2. Web service application?
  3. Hardware?
A
  1. execution time
  2. requests/sec and response time
  3. CPU utilization (time CPU is working)
4
Q

Give 3 metrics important for toy shop and OS. Give examples of the metrics for toy shop vs. OS

A
  1. Throughput
    • toys/hour
    • process completion rate
  2. Response time
    • avg. time to respond to order
    • avg. time to respond to input (e.g. mouseclick)
  3. Utilization
    • busy workbenches
    • % CPU
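As a rough illustration of how the three metrics differ, the sketch below computes them from a list of jobs. The `(arrival, start, finish)` tuple layout, the function name, and the sample numbers are all assumptions made for this example, not something defined in the course material.

```python
def metrics(jobs, workers=1):
    """jobs: list of (arrival, start, finish) times in seconds."""
    first_arrival = min(a for a, _, _ in jobs)
    last_finish = max(f for _, _, f in jobs)
    elapsed = last_finish - first_arrival

    throughput = len(jobs) / elapsed                            # jobs completed per second
    avg_response = sum(f - a for a, _, f in jobs) / len(jobs)   # arrival to completion
    busy = sum(f - s for _, s, f in jobs)                       # time spent actually working
    utilization = busy / (elapsed * workers)                    # fraction of capacity in use
    return throughput, avg_response, utilization

if __name__ == "__main__":
    sample = [(0, 0, 2), (1, 2, 4), (6, 6, 8)]
    print(metrics(sample))
```

In the sample, the second job waits one second before it starts, which raises the average response time without changing throughput or utilization.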
5
Q

Which of the following are performance metrics?

  1. performance/$
  2. performance/W (per watt)
  3. percentage of SLA violations
  4. client perceived performance
  5. aggregate performance
  6. platform efficiency
  7. throughput
  8. wait time
A

all of the above!

6
Q

Define a ‘test bed’

A

Ideally, we will obtain metrics running real software on real machines with real workloads.

Often, this is not feasible for many different reasons. In these cases, we may have to settle on “toy” experiments that are representative of realistic situations that we may encounter.

We refer to these experimental settings as a testbed. The testbed tells us where and how the experiments were carried out and which metrics were gathered.

7
Q

What is a metric?

A

A metric is some measurable quantity we can use to reason about the behavior of a system.

8
Q
  1. Is ‘it depends’ a correct answer to “Are threads useful?”
  2. Is ‘it depends’ an accepted answer to “Are threads useful?”
A
  1. Yes
  2. No

For example, some graph traversal algorithms work best on sparse graphs, while others work best on dense graphs. Some filesystems are optimized for read access, while others might be optimized for a write-heavy system.

The answer is: It depends! While this answer is almost always correct, it is rarely accepted. It is more useful to modify the question, extending it to include the context you wish to examine and the metrics you wish to obtain.

9
Q

Refer to the diagram

  1. Which steps are computationally expensive?
  2. Which steps involve interaction with the network?
  3. Which steps involve interaction with the disk?
A
  1. parser step, header creation
  2. accepting a connection, sending data
  3. read/write file
10
Q

State 1 advantage and 4 disadvantages of performance speedup by making the web server multi-process

A

Advantage: simple programming. We already have a working, debugged single-process server, so we just spawn multiple copies of it.

Disadvantages:

  1. higher memory footprint, which can hurt performance.
  2. high cost of a context switch whenever we want to run a different process.
  3. hard/costly to maintain shared state across processes due to IPC constraints.
  4. difficult to have multiple processes listening on a specific port.
11
Q

State 2 advantages and 2 disadvantages of improving web server performance by multi-threading

A

Advantages:

  1. Cheaper context switch because shared address space.
  2. Lighter memory requirements because of shared information across all threads in the process.

Disadvantages:

  1. Software complexity: multithreaded code requires explicit application-level synchronization.
  2. Depends on underlying operating-system-level support for threads, although this is less of an issue now than it was in the past. (true for Solaris paper)
12
Q

Describe the event-driven model for an application

A

An event-driven application is implemented in a single address space with a single thread of control. The main part of the process is an event dispatcher that loops, looking for incoming events and invoking one or more registered handlers based on each event.
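The dispatcher loop described above can be sketched roughly as follows. This is a minimal illustration, not a real server: the `Dispatcher` class, the event names, and the in-memory event queue are all assumptions, and a real implementation would block on an OS call such as `select` or `epoll` to learn about new events.

```python
from collections import deque

class Dispatcher:
    """Single-threaded event loop: pull an event, run its registered handlers."""

    def __init__(self):
        self.handlers = {}     # event type -> list of handler functions
        self.events = deque()  # pending (event type, payload) pairs

    def register(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def post(self, event_type, payload):
        self.events.append((event_type, payload))

    def run(self):
        # A real server would block here until the OS reports a new event.
        while self.events:
            event_type, payload = self.events.popleft()
            for handler in self.handlers.get(event_type, []):
                handler(self, payload)

def demo():
    log = []
    d = Dispatcher()

    def on_request(dispatcher, path):
        log.append(f"parse {path}")
        dispatcher.post("file_ready", path)  # the next step becomes a new event

    def on_file_ready(dispatcher, path):
        log.append(f"send {path}")

    d.register("request", on_request)
    d.register("file_ready", on_file_ready)
    d.post("request", "/a.html")
    d.post("request", "/b.html")
    d.run()
    return log

if __name__ == "__main__":
    print(demo())
```

Each handler does a small piece of work and returns to the loop, so both requests make progress even though only one thread is running.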

13
Q

How does the event-driven model support concurrency?

A

In the event-driven model, the processing of multiple requests is interleaved within a single execution context.
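One way to picture this interleaving is with Python generators, where each `yield` marks a point at which a real handler would have to wait. The function names and the step labels are assumptions made for this sketch; it shows only the ordering of work, not actual I/O.

```python
from collections import deque

def handle_request(name, steps, log):
    for step in steps:
        log.append(f"{name}:{step}")
        yield  # a real handler would have to wait for I/O here

def run_interleaved(tasks):
    # One thread of control: run each request until it must wait, then move on.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)  # not finished yet, resume it later
        except StopIteration:
            pass                # request fully processed

def demo():
    log = []
    run_interleaved([
        handle_request("A", ["parse", "read", "send"], log),
        handle_request("B", ["parse", "send"], log),
    ])
    return log

if __name__ == "__main__":
    print(demo())
```

The log shows the steps of A and B alternating, even though no second thread or process exists.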

14
Q

Why does the event-driven model with one thread work for concurrency?

A

Because there is no idle time.

Context switching just wastes cycles that could have been used for request processing.

In the event-driven model, a request is processed exactly until a wait is necessary, at which point the execution context switches to servicing another request.

15
Q

What about if we have multiple CPUs?

Note the gotcha

A

If we have multiple CPUs, the event-driven model still makes sense, especially when we have to service more concurrent requests than we have CPUs. Each CPU could host a single event-driven process and handle multiple requests concurrently within that process.

This can be done with less overhead than if each CPU had to context switch among multiple processes or multiple threads.

Gotcha: It is important to have mechanisms that will steer the right set of events to the right CPU.

16
Q

Describe implementation of event-driven model

A

The operating system uses sockets as an abstraction over the network, and files as an abstraction over the disk.

  1. single address space
  2. single execution context
  3. no shared access variables

The single thread jumps around the code base as it executes different handlers, but that costs less than context switching.

17
Q

List the problems with the event-driven model. And some solutions

A

If one of the handlers initiates a blocking call, the entire process can be blocked.

Solution: asynchronous I/O operations.

18
Q

What is an asynchronous I/O operation?

A

Asynchronous calls let the process or thread continue execution now and check their results later.
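The "continue now, check later" pattern can be sketched with a future. Note the assumption: a thread pool from Python's `concurrent.futures` stands in for an OS-level asynchronous I/O facility here, and the `read_file` and `demo` names are hypothetical. Only the calling pattern is the point.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_file(path):
    with open(path, "rb") as f:
        return f.read()

def demo():
    # Create a small file to read.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
        path = tmp.name
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(read_file, path)  # returns immediately
            other_work = sum(range(1000))          # the caller keeps executing
            data = future.result()                 # check the result later
    finally:
        os.unlink(path)
    return other_work, data

if __name__ == "__main__":
    print(demo())
```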

19
Q

Describe helpers

If the kernel is not multithreaded - it wasn’t back in the day - the helpers need to be processes. The model was called the Asymmetric Multi-Process Event-Driven Model or AMPED. The multithreaded equivalent acronym is AMTED.

A

Asynchronous I/O calls are not available on all platforms or for all types of devices.

A handler passes the blocking operation to a helper and returns to the event dispatcher.

The helper will block, but the main event loop will not!

The helper handles the blocking I/O operation and interacts with the dispatcher as necessary.

In this way, the synchronous I/O call is handled by the helper.
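The hand-off to a helper can be sketched with a thread and two queues. This is an illustration under stated assumptions: `blocking_read` fakes a blocking disk read with a short sleep, the queue-based protocol and all names are hypothetical, and a real AMPED/AMTED server would typically signal completion through a pipe or socket that the dispatcher already watches.

```python
import queue
import threading
import time

def blocking_read(name):
    time.sleep(0.05)  # stands in for a blocking disk read
    return f"contents of {name}"

def helper(requests, completions):
    # The helper is the one that blocks; the main loop never does.
    while True:
        name = requests.get()
        if name is None:  # shutdown signal
            break
        completions.put((name, blocking_read(name)))

def serve(names):
    requests, completions = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=helper, args=(requests, completions))
    worker.start()

    for name in names:
        requests.put(name)  # handler hands off the I/O and returns at once

    results = {}
    for _ in names:
        # A real dispatcher would keep handling other events while waiting;
        # this sketch only collects the helper's completion events.
        name, data = completions.get()
        results[name] = data

    requests.put(None)
    worker.join()
    return results

if __name__ == "__main__":
    print(serve(["a.html", "b.html"]))
```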

20
Q

Which model requires the least amount of memory? Why?

  1. Boss-Worker model
  2. Pipeline
  3. Event-driven model
A

Event driven model.

Because the other models rely on multiple threads, and each thread needs memory for its execution context (the state tracked for context switching).

The event-driven model needs extra memory only for the helper threads that handle blocking I/O.

21
Q

At about 100MB Flash becomes better than SPED. Why?

A
  • Because Flash can handle I/O without blocking
  • Because at 100MB, the workload becomes I/O bound. So the system that performs better on I/O wins.