Lecture 2: The world of parallelism Flashcards
What is the trend in GPU and CPU design?
The number of cores is increasing
What is Flynn’s Taxonomy?
Classification system for computer architectures based on the number of instruction and data streams they can process simultaneously.
What are the categories of Flynn’s Taxonomy?
SISD: Single Instruction, Single Data
SIMD: Single Instruction, Multiple Data
MISD: Multiple Instruction, Single Data
MIMD: Multiple Instruction, Multiple Data (Chip MPs)
Define the four categories of Flynn’s Taxonomy
SISD: One instruction executed at a time, processing one data element at a time, e.g. traditional single processors
SIMD: One instruction executed at a time, operating on multiple independent streams of data, e.g. vector processors (1970s), vector units (MMX, SSE, AVX), GPUs
MISD: Multiple instruction streams operating on a single data stream; rarely used in practice
MIMD: Multiple sequences of instructions executed independently, each one operating on a different stream of data, e.g. Chip Multiprocessors
SPMD (variant): Multiple instruction streams but with the same code on multiple independent streams of data, e.g. Data Parallel machines built from independent processors
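To make the SISD/SIMD contrast concrete, here is a minimal C sketch (assuming an x86 CPU with AVX and a compiler flag such as -mavx; the function names and array size are illustrative): the scalar loop issues one add per data element, while the SIMD loop applies one instruction to eight floats at a time.

```c
#include <immintrin.h>  /* AVX intrinsics */

#define N 1024  /* illustrative size, multiple of 8 */

/* SISD style: one instruction, one data element per iteration */
void add_scalar(const float *a, const float *b, float *c) {
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];
}

/* SIMD style: one vector add operates on 8 floats at once */
void add_avx(const float *a, const float *b, float *c) {
    for (int i = 0; i < N; i += 8) {
        __m256 va = _mm256_loadu_ps(&a[i]);
        __m256 vb = _mm256_loadu_ps(&b[i]);
        _mm256_storeu_ps(&c[i], _mm256_add_ps(va, vb));
    }
}
```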
Describe interconnections and communication in parallelism
- Between cores
- Between cores and memory
The way cores and memory are connected affects the types of computation that can be performed efficiently
Interconnections in Chip MPs: advantages and disadvantages
Advantages: Faster and lower cost than traditional (off-chip) interconnects
Disadvantages: Limited silicon and power for network
What is a grid?
- Direct link to neighbours
- Private on-chip memory
- Staged communication with non-neighbours
- NxN grid worst case: 2*(N-1) steps
What is a torus?
- Direct link to neighbours
- Private on-chip memory
- More symmetrical, more paths, shorter paths
- NxN torus worst case: N steps (see the hop-count sketch below)
- More wires, complex routing
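A small C sketch of the hop-count arithmetic behind both worst-case figures (the coordinates and function names are illustrative): on an NxN grid a message travels the Manhattan distance, which peaks at 2*(N-1) between opposite corners; on an NxN torus each dimension can wrap around, so the per-dimension distance is at most N/2 and the worst case is about N.

```c
#include <stdlib.h>

/* Hops between cores (x1,y1) and (x2,y2) on an NxN grid:
 * Manhattan distance; worst case 2*(N-1) (opposite corners). */
int grid_hops(int x1, int y1, int x2, int y2) {
    return abs(x1 - x2) + abs(y1 - y2);
}

/* Same on an NxN torus: each dimension may wrap around,
 * so per-dimension distance is at most N/2; worst case ~N. */
int torus_hops(int x1, int y1, int x2, int y2, int n) {
    int dx = abs(x1 - x2), dy = abs(y1 - y2);
    if (dx > n - dx) dx = n - dx;   /* take the wrap-around path if shorter */
    if (dy > n - dy) dy = n - dy;
    return dx + dy;
}
```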
What is the smallest kind of on-chip interconnect?
A grid
In which interconnect does every core connect to four neighbours?
Torus
Which interconnects are suitable for smaller and which for larger systems?
Grid -> Smaller
Torus -> Larger
Bus -> Smaller
What's a key property of a torus?
They can be generalized further to multiple dimensions:
- 2D torus → 2D grid folded, 4 neighbours
- 3D torus → 3D grid folded, 6 neighbours
- 4D torus → 4D grid folded, 8 neighbours
CMPs rarely go above 2D
Which interconnect is relied on by many multiprocessors?
A bus, partially or fully
What is a bus?
- All cores to all cores
- Simple to build
- Constant latency
- Memory can be organized in any way: private to each core or shared between cores
Disadvantages:
- Time-shared bus → complexity, lower bandwidth (a fraction of a grid's)
- Very long wires (to connect all the cores) → area, routing, power, slow
For a large number of cores what would be the main bottleneck?
The bus
Which topologies are more suitable for larger systems (scalable)?
1. Trees
2. Hierarchical (Crossbars, Hypercubes, Rings, MIN, etc.)
Important for high core counts
What type of switching is used in scalable topologies? Describe it.
Packet switching: dividing data into small packets for efficient transmission across a network. Packets can follow different paths to the destination and may arrive out of order.
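A minimal, self-contained C sketch of the idea (the packet struct, its 64-byte payload and the helper names are assumptions for illustration, not a real network API): the sender splits a message into numbered packets, and the receiver places each packet by its sequence number, so arrival order does not matter.

```c
#include <string.h>

#define PACKET_PAYLOAD 64  /* illustrative packet size */

struct packet {
    int seq;                       /* position in the original message */
    int len;                       /* valid bytes in payload */
    char payload[PACKET_PAYLOAD];
};

/* Split a message into packets; returns the number of packets. */
int packetize(const char *msg, int msg_len, struct packet *out) {
    int n = 0;
    for (int off = 0; off < msg_len; off += PACKET_PAYLOAD, n++) {
        out[n].seq = n;
        out[n].len = (msg_len - off < PACKET_PAYLOAD) ? msg_len - off : PACKET_PAYLOAD;
        memcpy(out[n].payload, msg + off, out[n].len);
    }
    return n;
}

/* Reassemble: sequence numbers place each packet, so packets may arrive out of order. */
void reassemble(const struct packet *pkts, int count, char *msg_out) {
    for (int i = 0; i < count; i++)
        memcpy(msg_out + pkts[i].seq * PACKET_PAYLOAD, pkts[i].payload, pkts[i].len);
}
```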
What is a reason for discontinuity in core count growth?
Interconnection. Larger counts require more complex network topologies.
What is shared memory? Describe its hardware and software view.
Accessible from every part of the computation.
Hardware: Memory connected to all cores
Software: Global. Accessible from all threads (Reads/Writes)
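A minimal sketch of the data-sharing (shared memory) view, assuming POSIX threads; the array size, thread count and names are illustrative. The array is a global, so every thread reads and writes it directly, and the main thread sees all the writes after joining.

```c
#include <pthread.h>
#include <stdio.h>

#define N 8
#define NTHREADS 4

int data[N];   /* globally accessible: every thread can read/write it */

void *worker(void *arg) {
    int id = (int)(long)arg;
    int chunk = N / NTHREADS;
    /* each thread fills its own slice of the shared (global) array */
    for (int i = id * chunk; i < (id + 1) * chunk; i++)
        data[i] = i * i;
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    for (int i = 0; i < N; i++)
        printf("%d ", data[i]);   /* main thread sees the threads' writes */
    printf("\n");
    return 0;
}
```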
What is distributed memory? Describe its hardware and software view.
Accessible from only one part of the computation.
Hardware: Memory connected to only one core
Software: Local. Accessible only by the owning thread (Message passing)
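A minimal sketch of the message-passing (distributed memory) view, assuming MPI is available (run with e.g. mpirun -np 2): each rank owns its own memory, and the only way data moves between ranks is an explicit send/receive pair.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;   /* lives only in rank 0's memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* rank 1 cannot read rank 0's memory; it must receive a message */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```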
What is the software view referred to as?
Programming model
What are the types of programming model?
- Serial → globally accessible memory (SW restrictions may apply)
- Data Sharing → globally accessible memory (SW equivalent of Shared Memory)
- Message Passing → thread-owned (distributed) memory (SW equivalent of Distributed Memory)
True or False
Programming model = Memory organisation
False
Match the following memory organisations with a programming model
- Shared memory
- Distributed memory
A. Message passing
B. Data sharing
- Shared memory → B (Data sharing)
- Distributed memory → A (Message passing)
How efficient is simulating data sharing on distributed memory?
Slow
How efficient is message passing on shared memory?
- Fast but slower than Data Sharing
- Extra traffic might impact bandwidth
Which memory organisation is better from the HW perspective?
Distributed Memory.
1. Easier implementation
2. Higher Bandwidth
3. Scales better (e.g. supercomputers!)
Which memory organisation is better from the SW perspective?
Shared Memory (Data Sharing).
1. Easier programming
2. Works with irregular communication
What is one of the central conflicts of contemporary architecture?
Hardware is complex: either software is exposed to the complexity or hardware hides it, and both choices carry costs.
What are the software issues of complex hardware?
SW is exposed to it (distributed memory):
→Higher SW cost
→Complicated code
→Wasted energy & cycles
What are the hardware issues of complex hardware?
HW hides it (shared memory):
→Higher HW cost
→Complicated design
→Wasted energy & cycles
What type of memory is used in Chip MPs?
Shared
Where is distributed memory used?
Supercomputers
What is the NxN worst case for torus and grid?
Torus -> N
Grid -> 2*(N-1)