Module 5b - RPC Performance Flashcards

1
Q

In RPC client-server systems, what three assumptions are made when calculating RPC performance?

A
  1. The workload is CPU-intensive
  2. The software architecture has a layer of clients and a layer of servers (two-layer architecture)
  3. Clients use synchronous RPCs
2
Q

In RPC performance what do the following letters denote?

D
L
T
C

A

D - the time required to process one request in one thread at the server

L - the one-way network latency between the client and the server layer (roundtrip network time is 2L)

T - the total number of threads at the client layer

C - the total number of cores allocated to the server layer

3
Q

In many systems, the performance of the system is limited by the server’s CPU resources.

Explain how the server’s CPU resources limit throughput such that Throughput <= C/D

Does the client’s implementation influence the Throughput <= C/D law?

A

Each server core can process at most 1/D requests per second if it is busy 100% of the time.

Since there are C cores, the server layer can process at most C/D requests per second, which bounds the throughput.

This holds irrespective of how the client is implemented and of other overheads. The server simply cannot work faster than C/D requests per second.

4
Q

In some systems, the performance of the system is limited by the client layer (and not the server layer)

Explain how the client’s threads limit throughput such that Throughput <= T/(2L + D)

A

Client threads spend most of their time waiting for RPCs to complete (after sending a request). They do not perform CPU-intensive work.

A single thread must wait 2L + D for each RPC to complete, so it can issue at most 1/(2L + D) requests per second. With T threads this gives us:

Throughput <= T/(2L + D)
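
As a rough illustration of this bound, the sketch below evaluates T/(2L + D) for one thread and for ten threads. The function name and the numbers (L = 2 ms, D = 6 ms) are hypothetical and chosen only to make the arithmetic easy to follow; with times in milliseconds, the result is in requests per millisecond.

```python
def client_throughput_bound(threads, one_way_latency, service_time):
    """Upper bound T / (2L + D) on the request rate the client layer can generate."""
    # Each synchronous RPC occupies its thread for a full round trip plus processing.
    time_per_rpc = 2 * one_way_latency + service_time
    return threads / time_per_rpc


# Hypothetical values, all times in milliseconds.
one_thread = client_throughput_bound(threads=1, one_way_latency=2, service_time=6)
ten_threads = client_throughput_bound(threads=10, one_way_latency=2, service_time=6)

print(one_thread)   # requests per millisecond achievable by a single thread
print(ten_threads)  # the bound scales linearly with the number of threads
```

With these numbers each RPC takes 10 ms, so one thread can complete at most one request every 10 ms, and ten threads together at most one request per millisecond.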

5
Q

How can you combine the throughput bounds of the server and the client RPCs to get an upper bound on the overall throughput of the system? What would this throughput be?

A

We know the server’s maximum throughput is C/D, since it processes requests in parallel on C cores.

We know the client’s maximum throughput is T/(2L + D), since each thread is blocked until its synchronous RPC completes.

Therefore, the throughput of the overall system is bounded by the smaller of the two:
Throughput <= min( C/D, T/(2L+D) )
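
The combined bound can be sketched in a few lines of Python. The function name, the tie-breaking choice and the example values are assumptions made for illustration; times are in milliseconds, so the rates are in requests per millisecond.

```python
def throughput_bound(cores, service_time, threads, one_way_latency):
    """Return (upper bound on system throughput, name of the limiting layer)."""
    server_bound = cores / service_time                            # C / D
    client_bound = threads / (2 * one_way_latency + service_time)  # T / (2L + D)
    # On an exact tie neither layer limits the other; this sketch reports "client".
    limiter = "server" if server_bound < client_bound else "client"
    return min(server_bound, client_bound), limiter


# Hypothetical system: C = 4 cores, D = 6 ms, L = 2 ms.
many_threads = throughput_bound(cores=4, service_time=6, threads=20, one_way_latency=2)
few_threads = throughput_bound(cores=4, service_time=6, threads=5, one_way_latency=2)

print(many_threads)  # limited by the server layer
print(few_threads)   # limited by the client layer
```

With 20 client threads the server's C/D is the smaller term, while with only 5 threads the client's T/(2L + D) is.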

6
Q

What does it mean in terms of throughputs of the client and server when we say:

  1. Performance is limited by the server
  2. Performance is limited by the client
A
  1. Tput_server < Tput_client
  2. Tput_client < Tput_server

Note that:
Tput_server = C / D
Tput_client = T / (2L+D)

7
Q

In order to maximize the performance of a client-server RPC system, what do we need to ensure in terms of the number of client threads (T) and server cores (C)?

A

We don’t want the client layer to bottleneck the system. Since the Tputs are:
Tput_server = C / D
Tput_client = T / (2L+D)

we need Tput_client >= Tput_server, which gives T >= C(2L + D)/D. Since 2L + D > D, this means T > C: there must be more client threads than there are server cores.
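
To see how many threads are needed in a concrete case, set T/(2L + D) >= C/D and solve for T, which gives T >= C(2L + D)/D. The sketch below rounds that value up to a whole number of threads. The function name and the example values are hypothetical, and the result ignores any overhead not captured by L and D.

```python
import math


def min_client_threads(cores, one_way_latency, service_time):
    """Smallest whole T for which T / (2L + D) is at least C / D."""
    # Rearranged inequality: T >= C * (2L + D) / D
    return math.ceil(cores * (2 * one_way_latency + service_time) / service_time)


# Hypothetical system: C = 4 cores, L = 2 ms, D = 6 ms.
needed = min_client_threads(cores=4, one_way_latency=2, service_time=6)
print(needed)  # number of client threads needed to keep all four cores busy

# With no network latency (L = 0) the requirement collapses to T = C.
print(min_client_threads(cores=4, one_way_latency=0, service_time=6))
```

The larger the network latency is relative to the processing time, the more client threads are needed per server core.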

8
Q

From the client’s perspective, an RPC request will have latency of 2L + D (since it has to traverse the network and compute the request)

In what case would the latency observed by the client be more than this? Which layer is limiting the Throughput?

A

Whenever more requests are outstanding at the server than it has cores, so that some requests must wait in a queue. The queuing delay is added on top of 2L + D, and the throughput of the system is limited by the server and not the client.

9
Q

Suppose the system has a throughput of “Tput_sys” (which is the minimum of the client and server Tput), and the client has T threads.

What is an expression for the end-to-end latency of a request?

A

latency = T / Tput_sys

This only applies if the latency is constant and the client threads share the load equally.
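
A small sketch of this formula, assuming the system actually runs at its throughput bound min(C/D, T/(2L + D)). The function name and numbers are hypothetical, with times in milliseconds.

```python
def end_to_end_latency(cores, service_time, threads, one_way_latency):
    """Latency seen by each client thread, computed as T / Tput_sys."""
    server_bound = cores / service_time
    client_bound = threads / (2 * one_way_latency + service_time)
    tput_sys = min(server_bound, client_bound)  # assumed to be fully achieved
    return threads / tput_sys


# Hypothetical system: C = 4 cores, D = 6 ms, L = 2 ms, so 2L + D = 10 ms.
client_limited = end_to_end_latency(cores=4, service_time=6, threads=5, one_way_latency=2)
server_limited = end_to_end_latency(cores=4, service_time=6, threads=20, one_way_latency=2)

print(client_limited)  # matches 2L + D: no queuing at the server
print(server_limited)  # larger than 2L + D: the excess is queuing delay
```

With 5 threads the client is the limit and the latency equals 2L + D = 10 ms. With 20 threads the server is the limit and the latency rises to about 30 ms, roughly 20 ms of which is time spent queued at the server.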
