Distributed Algorithms Flashcards

1
Q

What does M stand for in distributed analysis?

A

is the number of message transmissions (usually includes unsuccessful transmissions in addition to successful ones). aka ‘message cost’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Lnode

A

the entity workload, that is, the
number of messages per entity,

Lnode = M/|V|

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Llink

A

the transmission load, that is,
the number of messages per link.

Llink = M/|E|,

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

B

A

bit complexity, the number of transmitted bits

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

why may it be necessary to consider the bit complexity of a distributed protocol?`

A

Messages are sequences of bits; some protocols might employ messages that are very short (e.g., O(1) bit signals), others very long (e.g., .gif files). Thus, for a more
accurate assessment of a protocol, or to compare different solutions to the same problem that use different sizes of messages, it might be necessary to use as a cost
measure the number of transmitted bits B also called bit complexity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

total execution delay

A

the delay between the time the first entity starts the execution of a computation and
the time the last entity terminates its execution.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

in the general model of distributed computing, what assumptions are there about execution time?

A

None, except for the fact that communication delays for a single message are finite (but widely varying) in absence of failure. in other words, communication delays are in general unpredictable.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

T

A

the ideal execution delay or ideal time complexity, the execution delay experienced when the system is synchronous and (in the absence of failure) takes one unit of time for a message to arrive and to be processed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

When to do parallel processing?

A

The simplest performance model for parallelism is the “NQ” model, where N is the number of elements, and Q is the computation per element. In general, you need the product NQ to exceed some threshold before you start getting a performance benefit. For a low-Q problem like “add up numbers from 1 to N”, you will generally see a breakeven between N=1000 and N=10000. With higher-Q problems, you’ll see breakevens at lower thresholds.

You should look into doing parallel processing.The total time to execute the sequential version exceeds a minimum threshold. These days [ 2014], the threshold is roughly (within a factor of ten of) 100 microseconds across most platforms. You don’t need to measure this precisely though. You can estimate this well enough in practice by multiplying N (the number of elements) by Q (cost per element of F), in turn guestimating Q as the number of operations or lines of code, and then checking that N * Q is at least 10000. (If you are feeling cowardly, add another zero or two.) So when F is a tiny function like x -> x + 1, then it would require N >= 10000 elements for parallel execution to be worthwhile. And conversely, when F is a massive computation like finding the best next move in a chess game, the Q factor is so high that N doesn’t matter so long as the collection is completely splittable.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Is using Java data type implementations that use ‘synchronized’ keyword enough?

A

No

Consider this:

if (vector.isEmpty()){
vector.add(data);
}
Even though the methods involved are synchronized, because they are being locked and unlocked individually, two unfortunately timed threads can create a vector with two elements.

So in effect, you have to synchronize in your application code as well.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly