L17 Flashcards

Question 1

Q

What law limits the speedup that can be achieved by adding more cores?

Answer

A

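Amdahl’s law. It bounds the speedup that can be achieved by parallelizing a computation: the serial part of the work does not get faster, so the speedup can never exceed 1 / (serial fraction), however many cores are used.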
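As a rough illustration of the bound, here is a minimal Python sketch of the usual form of Amdahl’s law, speedup = 1 / ((1 - p) + p / N), where p is the parallel fraction and N the number of cores. The function names and the 95% figure are illustrative assumptions, not taken from the lecture.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given number of cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Only the parallel part is divided among the cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on the speedup as the core count grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical workload: 95% of the runtime can be parallelized.
    for n in (1, 2, 8, 64, 1024):
        print(n, round(amdahl_speedup(0.95, n), 2))
    print("limit", round(amdahl_limit(0.95), 2))
```

With 95% of the work parallel, the speedup approaches but never exceeds 20x, which is why adding cores beyond a certain point brings little benefit.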

Question 2

Q

What does the term dark silicon refer to?

Answer

A

Dark silicon is the portion of a chip’s transistors that must be left unpowered (dark) at any given time.

The whole chip cannot be kept powered on at once without exceeding its power and thermal limits and damaging it.
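A back-of-the-envelope sketch of the idea in Python, under a deliberately simplified assumption: the chip is modelled as identical cores that are either fully on or fully off, and all wattage figures are hypothetical.

```python
def dark_fraction(total_cores: int, watts_per_core: float, power_budget: float) -> float:
    """Fraction of cores that must stay off to respect the chip's power budget.

    Simplified model (an assumption for illustration): every core draws the
    same power when on and nothing when off.
    """
    if total_cores <= 0 or watts_per_core <= 0 or power_budget < 0:
        raise ValueError("invalid chip description")
    # Number of cores the budget can power, capped at the cores that exist.
    cores_on = min(total_cores, int(power_budget // watts_per_core))
    return 1.0 - cores_on / total_cores


if __name__ == "__main__":
    # Hypothetical chip: 16 cores at 5 W each, but only a 40 W budget.
    print(dark_fraction(16, 5.0, 40.0))
```

In this hypothetical configuration only half of the cores fit in the budget, so the other half has to stay dark.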

Question 3

Q

What are the three categories of heterogeneous CMPs?

Answer

A

Domain specific accelerators
* Accelerating one very specific domain/type of computation
* Uses specialized hardware

General purpose accelerators
* Accelerating a general class of workloads
* Fully programmable

Asymmetric multi-cores and many-cores
* Cores of different capabilities (heterogeneity in the CPU itself)
* Tightly coupled

Question 4

Q

What is an example of a domain specific accelerator in heterogeneous CMPs?

Answer

A

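The Snapdragon 8 Gen 2 SoC, which pairs its CPU cores with dedicated blocks such as an image signal processor and an AI engine.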

Question 5

Q

What are some examples of general purpose accelerators in heterogeneous CMPs?

Answer

A

* On-chip GPGPUs
* IBM Cell SPEs
* On-chip FPGAs
* Workloads
* Project Catapult: a reconfigurable fabric for accelerating large-scale datacenter services

Question 6

Q

What are some examples of asymmetric multi-cores and many-cores in heterogeneous CMPs?

Answer

A

ARM big.LITTLE (static asymmetry)
* Clusters of big and small cores
* OS sends high-load tasks to the big cores through cluster migration, CPU migration, or global task scheduling

Snapdragon 8 Gen 2: 3 clusters (prime, perf, efficiency)

Intel Alder Lake: high-end P-cores, low-end E-cores, and Thread Director to guide OS scheduling
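The big.LITTLE idea of sending high-load tasks to the big cores can be sketched roughly as follows. This is not how any real OS scheduler is implemented: the Task type, the load metric, the 0.7 threshold, and the core names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    load: float  # recent CPU utilisation in [0, 1] (hypothetical metric)


def place_tasks(tasks, big_cores, little_cores, up_threshold=0.7):
    """Map each core name to the task names placed on it.

    Tasks at or above up_threshold go to the big cores, the rest to the
    little cores. Both core lists must be non-empty.
    """
    placement = {core: [] for core in big_cores + little_cores}
    # Place the heaviest tasks first.
    for task in sorted(tasks, key=lambda t: t.load, reverse=True):
        pool = big_cores if task.load >= up_threshold else little_cores
        # Within the chosen pool, pick the core holding the fewest tasks so far.
        target = min(pool, key=lambda core: len(placement[core]))
        placement[target].append(task.name)
    return placement


if __name__ == "__main__":
    tasks = [Task("game", 0.9), Task("mail", 0.1), Task("sync", 0.2)]
    print(place_tasks(tasks, ["big0"], ["little0", "little1"]))
```

The sketch places each task individually on any core, which is closest in spirit to global task scheduling; cluster migration and CPU migration move work at a coarser granularity.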

Question 7

Q

What is the difference between dynamic and static asymmetry in asymmetric multi-cores and many-cores?

Answer

A

Static asymmetry is fixed at design time: the cores differ in capability, but all cores receive the same runtime resources.

Dynamic asymmetry is created at runtime: some cores are given more runtime resources than others.

Question 8

Q

Why are heterogeneous CMPs difficult to use?

Answer

A

Most properties are unknown until runtime.

Unknown capabilities, unknown availability, unknown relative benefit.

Question 9

Q

Are heterogeneous CMPs functionally portable? How can functional portability be achieved for low-level and high-level programs?

Answer

A

No, they are not because they use device-specific languages and have device-specific optimisations.

Functional portability can be achieved for low-level programs through OpenCL. A program can be distributed as OpenCL source code, forming a standard layer of compatibility.

For high-level languages,
- C++ with SYCL can be automatically compiled into other languages for different devices.
- Java with TornadoVM uses @Parallel and @Reduce annotations to mark parallel for loops and reductions.

Question 10

Q

True or False.

SYCL, OpenCL and TornadoVM have some performance portability.

Answer

A

False.

OpenCL has no performance portability: it guarantees that the code runs on another device, not that it runs fast there.

Question 11

Q

How can performance portability be achieved in heterogeneous CMPs?

Answer

A

* DSLs and library APIs as the new HW/SW contract
* De facto standards
* Intel oneAPI

Question 12

Q

What are the four factors that affect scheduling in heterogeneous CMPs?

Answer

A

Communication/Interference:
* Hard constraint: devices that communicate need to share memory
* Soft constraints: devices that communicate should be close together, but devices that compete for shared resources are suboptimal to schedule on the same cluster

Affinity:
* Different device types make it extremely difficult to predict scheduling requirements
* Different core types also make it difficult to decide which core should run which task

Thread/task criticality:
* Not all tasks/threads are equally important for overall progress

Energy and power:
* A limited power budget means core frequencies must be adjusted and some cores might need to be switched off
* A limited energy budget constrains both the power and the runtime of a core
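To make the interaction between criticality and the power budget concrete, here is a naive greedy sketch in Python. It is only one possible policy, chosen for illustration; the criticality scores, the two frequency levels, and their power costs are hypothetical.

```python
def assign_power(tasks, power_budget, high_power=4.0, low_power=1.0):
    """Give each task's core a state: "high", "low", or "off".

    tasks is a list of (name, criticality) pairs. More critical tasks are
    served first; a core runs at high frequency while the remaining budget
    allows, then at low frequency, and is switched off once the budget is spent.
    """
    states = {}
    remaining = power_budget
    for name, _criticality in sorted(tasks, key=lambda t: t[1], reverse=True):
        if remaining >= high_power:
            states[name] = "high"
            remaining -= high_power
        elif remaining >= low_power:
            states[name] = "low"
            remaining -= low_power
        else:
            states[name] = "off"
    return states


if __name__ == "__main__":
    # Hypothetical tasks with criticality scores and a budget of 5 power units.
    print(assign_power([("ui", 3), ("bg", 1), ("net", 2)], 5.0))
```

A real scheduler would also weigh affinity and communication cost; the sketch covers only the last two factors in the list above.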

(12 cards)