Parallel Architecture Flashcards
Hypercube bisection width
p/2
Square Toroidal Mesh Bisection Width
2*sqrt(p)
Mesh Bisection Width
min(n, m) for an n x m mesh
Fully Connected Bisection Width
p^2/4
Crossbar Bisection Width
p
Omega Bisection Width
p/2
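The bisection widths above can be collected into a small reference sketch. Function names are my own, chosen for illustration; each formula matches the flashcard it comes from.

```python
import math

def hypercube_bw(p):
    # Hypercube: p/2
    return p // 2

def square_torus_bw(p):
    # Square toroidal mesh: 2*sqrt(p); p assumed to be a perfect square
    return 2 * math.isqrt(p)

def mesh_bw(n, m):
    # n x m mesh: min(n, m)
    return min(n, m)

def fully_connected_bw(p):
    # Fully connected: p^2/4
    return p * p // 4

def crossbar_bw(p):
    # Crossbar: p
    return p

def omega_bw(p):
    # Omega network: p/2
    return p // 2

print(hypercube_bw(64))        # 32
print(square_torus_bw(64))     # 16
print(fully_connected_bw(64))  # 1024
```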
Message transmission time
l + n/b, where l is the latency (seconds), n is the message size (bytes), and b is the bandwidth (bytes/second)
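As a quick worked example of the formula, here is a sketch with assumed values (1 microsecond latency, a 1 MB message, 1 GB/s bandwidth):

```python
def message_time(latency_s, n_bytes, bandwidth_bytes_per_s):
    # total time = latency + transfer time (n/b)
    return latency_s + n_bytes / bandwidth_bytes_per_s

# assumed values: l = 1e-6 s, n = 1 MB, b = 1 GB/s
t = message_time(1e-6, 1_000_000, 1_000_000_000)
print(t)  # 0.001001 seconds -- the transfer term dominates the latency
```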
Snooping Cache Coherence
The idea behind snooping comes from bus-based systems: When the cores share a bus, any signal transmitted on the bus can be “seen” by all the cores connected to the bus. Thus when core 0 updates the copy of x stored in its cache, if it also broadcasts this information across the bus, and if core 1 is “snooping” the bus, it will see that x has been updated, and it can mark its copy of x as invalid. This is more or less how snooping cache coherence works.
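The broadcast-and-invalidate behavior described above can be sketched as a toy simulation. The class names and structure here are illustrative only, not a real protocol: every write is broadcast on the "bus", and every other cache snoops it and marks its copy invalid.

```python
class Bus:
    def __init__(self):
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast(self, writer, var):
        # All caches on the bus "see" the write and snoop it.
        for c in self.caches:
            if c is not writer:
                c.snoop(var)

class Cache:
    def __init__(self, bus):
        self.data = {}  # var -> (value, valid_flag)
        bus.attach(self)

    def read(self, var, memory):
        entry = self.data.get(var)
        if entry is None or not entry[1]:
            entry = (memory[var], True)  # miss: fetch from memory
            self.data[var] = entry
        return entry[0]

    def write(self, var, value, memory, bus):
        memory[var] = value
        self.data[var] = (value, True)
        bus.broadcast(self, var)  # other caches snoop this write

    def snoop(self, var):
        if var in self.data:
            value, _ = self.data[var]
            self.data[var] = (value, False)  # mark our copy invalid

memory = {"x": 0}
bus = Bus()
c0, c1 = Cache(bus), Cache(bus)
c1.read("x", memory)           # core 1 caches x = 0
c0.write("x", 5, memory, bus)  # core 0 updates x; core 1's copy invalidated
print(c1.read("x", memory))    # 5 -- the invalid copy forces a re-fetch
```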
Directory Based Cache Coherence
- Directory-based cache coherence protocols attempt to solve the scalability problems of snooping (broadcasting every write to every core) through the use of a data structure called a directory.
- The directory stores the status of each cache line.
- Typically, this data structure is distributed; in our example, each core/memory pair might be responsible for storing the part of the structure that specifies the status of the cache lines in its local memory.
- Thus when a line is read into, say, core 0’s cache, the directory entry corresponding to that line would be updated, indicating that core 0 has a copy of the line.
- When a variable is updated, the directory is consulted, and the cache controllers of the cores that have that variable’s cache line in their caches will invalidate those lines.
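The steps above can be sketched as a toy directory. Names are illustrative: the directory records which cores hold each line, and a write invalidates only those cores rather than broadcasting to everyone.

```python
class Directory:
    def __init__(self):
        self.sharers = {}  # line -> set of core ids holding a copy

    def record_read(self, line, core):
        # Reading a line into a cache updates its directory entry.
        self.sharers.setdefault(line, set()).add(core)

    def invalidate_others(self, line, writer, caches):
        # On a write, consult the directory and invalidate only the
        # cores that actually hold the line.
        for core in self.sharers.get(line, set()) - {writer}:
            caches[core].pop(line, None)  # drop the stale copy
        self.sharers[line] = {writer}

directory = Directory()
caches = {0: {}, 1: {}}
memory = {"x": 0}

# core 1 reads x: its cache and the directory entry are updated
caches[1]["x"] = memory["x"]
directory.record_read("x", 1)

# core 0 writes x: the directory says core 1 must be invalidated
memory["x"] = 5
caches[0]["x"] = 5
directory.record_read("x", 0)
directory.invalidate_others("x", 0, caches)

print(caches[1])  # {} -- core 1's stale copy was invalidated
```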
Shared Memory vs Distributed Memory
Shared Memory:
Pros:
1. Implicit coordination of processors through shared data structures.
2. Appealing programming model for many programmers.
3. Generally suitable for systems with a small number of processors.
Cons:
1. Scaling interconnect can be costly.
2. Conflicts over access to the bus increase dramatically with more processors.
3. Large crossbars, while efficient, are expensive.
Distributed Memory:
Pros:
1. Relatively inexpensive interconnects like hypercube and toroidal mesh.
2. Well-suited for systems with thousands of processors.
3. Better for problems requiring vast amounts of data or computation.
Cons:
1. Requires explicit message passing for coordination.
2. More complex programming model for many programmers.
3. Not as suitable for small-scale systems with few processors.
MIMD
- MIMD (Multiple Instruction, Multiple Data) systems support multiple simultaneous instruction streams operating on multiple data streams.
- MIMD systems consist of fully independent processing units or cores, each with its own control unit and datapath.
- MIMD systems are asynchronous, meaning processors can operate at their own pace.
- Many MIMD systems lack a global clock and may have no relation between system times on different processors.
- Without synchronization imposed by the programmer, even if processors execute the same sequence of instructions, they may execute different statements at any given instant.
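The last point can be seen in miniature with threads: each thread is an independent instruction stream, and even though both run the same function, nothing fixes their relative pace unless the programmer synchronizes them.

```python
import threading

def worker(name, log, lock):
    # Both threads execute this same sequence of instructions.
    for i in range(3):
        with lock:  # the lock only protects appends to the shared list
            log.append((name, i))

log = []
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(n, log, lock))
           for n in ("t0", "t1")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 6 entries total, but their interleaving can differ from run to run.
print(len(log))
```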
SIMD
- SIMD (Single Instruction, Multiple Data) systems operate on multiple data streams by applying the same instruction to multiple data items simultaneously.
- Abstract SIMD systems have a single control unit and multiple datapaths.
- Instructions are broadcast from the control unit to the datapaths, where each datapath applies the instruction to a data item or remains idle.
- An example application is “vector addition,” where two arrays with n elements each are added element-wise.
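The vector-addition example can be written with NumPy, where one logical "add" is applied across all n elements at once (and on most CPUs is carried out by SIMD instructions under the hood):

```python
import numpy as np

n = 8
x = np.arange(n, dtype=np.float64)  # [0, 1, ..., 7]
y = np.ones(n, dtype=np.float64)    # [1, 1, ..., 1]
z = x + y                           # one instruction, n data items
print(z)  # [1. 2. 3. 4. 5. 6. 7. 8.]
```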
Modified State
- cache block has been updated and contains the current value
- all other copies are invalid (out of date)
Shared State
- cache block has not been updated
- cache block contains the current value
- all other copies also contain the current value
Invalid State
- cache block does not contain the most recent value of the memory block
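The three states above can be summarized as a simplified transition sketch. The rules here are deliberately minimal for illustration (no bus transactions or write-backs are modeled):

```python
MODIFIED, SHARED, INVALID = "M", "S", "I"

def on_local_write(state):
    # A local write always leaves this cache's copy Modified.
    return MODIFIED

def on_remote_write(state):
    # Another core wrote the line: our copy no longer holds
    # the most recent value, so it becomes Invalid.
    return INVALID

def on_local_read(state):
    # Reading an Invalid line re-fetches it into Shared;
    # Modified/Shared copies can be read in place.
    return SHARED if state == INVALID else state

s = SHARED
s = on_remote_write(s)  # another core updates the line -> "I"
s = on_local_read(s)    # re-fetch the current value -> "S"
s = on_local_write(s)   # local update -> "M"
print(s)                # M
```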