Lectures 12 & 13 - Chip Multiprocessors Flashcards
Flynn’s Taxonomy
SISD - Uniprocessor
SIMD - Vector
MISD - Multiple instruction streams operating on a single data stream; rarely built in practice.
MIMD - Each processor runs its own program and operates on its own data.
Shared-address-space platforms
All processors have access to a common data space via a shared address space, and all communication takes place through this shared memory. Each processor may also have an area of memory that is private.
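A minimal sketch of the shared-address-space model using Python threads (the names `worker` and `counter` are illustrative, not from the lectures). Communication is implicit: threads simply read and write the same variables, with a lock for synchronisation.

```python
# "Processors" as threads sharing one address space (illustrative sketch).
import threading

counter = 0                    # shared data, visible to every thread
lock = threading.Lock()        # synchronisation through the shared space itself

def worker(n):
    global counter
    for _ in range(n):
        with lock:             # communication is implicit: just read/write
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                 # 4000: all updates land in the shared space
```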
Message-passing platforms
Each processing element has its own exclusive address space, and communication is achieved by sending explicit messages between processing elements. Sending and receiving messages is used both to communicate between and to synchronise the actions of multiple processing elements.
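A hedged sketch of the message-passing model: here the "processing elements" are threads that share nothing except an explicit channel (a `queue.Queue`), so all communication and synchronisation goes through send/receive. The names (`producer`, `consumer`, the `None` sentinel) are illustrative.

```python
# Message passing mimicked with threads plus an explicit channel.
import threading
import queue

def producer(out_ch):
    for i in range(5):
        out_ch.put(i)          # explicit send
    out_ch.put(None)           # sentinel message: also synchronises shutdown

def consumer(in_ch, results):
    while True:
        msg = in_ch.get()      # explicit (blocking) receive
        if msg is None:
            break
        results.append(msg * 2)

channel = queue.Queue()
results = []
t1 = threading.Thread(target=producer, args=(channel,))
t2 = threading.Thread(target=consumer, args=(channel, results))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)                 # [0, 2, 4, 6, 8]
```

Note the blocking receive doubles as synchronisation: the consumer cannot run ahead of the producer.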
Why were uniprocessors difficult to develop further?
Exploiting greater levels of ILP became very expensive (transistor count, complexity and power).
Limits to pipelining
Designs limited by power consumption
Interconnects scale poorly compared to transistors.
Processor complexity limited by cost of design/verification and time to market constraints.
Bus Snooping
Exploit presence of shared bus:
Bus access is arbitrated, ensuring each bus transaction completes before the next starts. All bus transactions are broadcast and can be observed by all processors (in the same order). Coherence can be maintained by having every cache controller “snoop” on the bus and monitor the transactions. A cache controller may take action if a bus transaction involves a memory block of which it has a copy.
Write-Through Invalidation Protocol
Every write causes a write transaction on the bus.
For each write transaction:
Each snooping cache checks if it has a copy of the cache block associated with the write address.
If the cache has a copy of the block, invalidate it.
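The steps above can be sketched as a toy simulation (illustrative class and variable names, not the lectures' notation): every write goes through the bus, memory is updated (write-through), and every snooping cache drops its copy of the written block.

```python
# Toy write-through invalidation protocol: writes are broadcast on a
# "bus"; snooping caches invalidate their copy of the written block.
class Cache:
    def __init__(self):
        self.blocks = {}                   # address -> cached value

class Bus:
    def __init__(self, caches, memory):
        self.caches, self.memory = caches, memory

    def write(self, writer, addr, value):
        self.memory[addr] = value          # write-through: memory updated
        writer.blocks[addr] = value
        for c in self.caches:              # broadcast: everyone snoops
            if c is not writer and addr in c.blocks:
                del c.blocks[addr]         # invalidate the stale copy

memory = {}
c0, c1 = Cache(), Cache()
bus = Bus([c0, c1], memory)

c1.blocks[0x40] = 7                        # c1 holds a copy of block 0x40
bus.write(c0, 0x40, 9)                     # c0 writes the same block
print(0x40 in c1.blocks)                   # False: snooped and invalidated
print(memory[0x40])                        # 9
```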
Sequential Consistency
A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all processors were executed in some sequential order, and the operations of each individual processor occur in this sequence in the order specified by its program.
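One way to make this concrete is the classic store-buffering litmus test: P0 runs `x = 1; r1 = y` and P1 runs `y = 1; r2 = x`, with `x` and `y` initially 0. The sketch below (illustrative, not from the lectures) enumerates every interleaving that respects each processor's program order, i.e. every sequentially consistent execution, and shows that the outcome `(r1, r2) = (0, 0)` never occurs under SC, although real hardware with store buffers can produce it.

```python
# Enumerate all sequentially consistent executions of the
# store-buffering litmus test:
#   P0: x = 1; r1 = y        P1: y = 1; r2 = x
from itertools import permutations

P0 = [("st", "x"), ("ld", "y", "r1")]
P1 = [("st", "y"), ("ld", "x", "r2")]

outcomes = set()
for order in set(permutations([0, 0, 1, 1])):    # which processor goes next
    idx, mem, regs = [0, 0], {"x": 0, "y": 0}, {}
    for p in order:                              # program order is preserved
        prog = (P0, P1)[p]
        op = prog[idx[p]]
        idx[p] += 1
        if op[0] == "st":
            mem[op[1]] = 1
        else:
            regs[op[2]] = mem[op[1]]
    outcomes.add((regs["r1"], regs["r2"]))

print(sorted(outcomes))    # (0, 0) never appears under SC
```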
Relaxed Consistency
Reordering memory operations between synchronization operations does not typically affect correctness, because properly synchronized programs only observe shared state at synchronization points.
False sharing
The cache coherence mechanism may cause a block to be invalidated even when no communication is taking place: unrelated words just happen to be stored in the same unit of coherence (cache block).
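A toy model of the effect (illustrative names and a hypothetical 64-byte line size): two cores repeatedly write to different words that fall in the same cache line, so each write invalidates the other core's copy even though no data is actually shared.

```python
# Count coherence invalidations when cores write to the same cache line.
LINE = 64                            # assumed cache-line size in bytes

def line_of(addr):
    return addr // LINE

def count_invalidations(writes):
    """writes: list of (core, byte_address) pairs, in order."""
    holder = {}                      # line -> core holding it Modified
    invalidations = 0
    for core, addr in writes:
        line = line_of(addr)
        if line in holder and holder[line] != core:
            invalidations += 1       # coherence kicks the other core out
        holder[line] = core
    return invalidations

# Core 0 updates the word at byte 0, core 1 the word at byte 8:
# unrelated data, but the same unit of coherence -> the line ping-pongs.
false_shared = [(0, 0), (1, 8)] * 1000
print(count_invalidations(false_shared))       # 1999

# Pad the data onto separate lines and the invalidations vanish.
padded = [(0, 0), (1, 64)] * 1000
print(count_invalidations(padded))             # 0
```

The usual fix, as the second case shows, is to pad or align per-core data so each core's hot words live on their own cache line.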
Private L2 Caches
Low hit latency but low capacity. Cache capacity for shared data is reduced as each private L2 must keep its own copy of the shared data. Constrained by fixed partitioning of cache resources.
Cache Exclusion Policy
A block may be in the L1 but not the L2 (or vice versa), so independent snooping hardware is needed for each level of the cache hierarchy.
Cache Inclusion Policy
If a block is in the L1 then it is also in the L2, so it is sufficient to snoop only the L2.
Shared L2 Cache
Greater capacity but higher hit latency. No replication of data when data is shared between cores. The shared L2's larger capacity helps reduce capacity misses, and there is no need to worry about cache coherence at the L2 level. However, average hit latency is higher, as data may need to be retrieved from a remote L2 bank, and greater associativity is needed to control the conflict miss rate.
MSI - Cache Line States
Shared - Block is present in unmodified state in this cache and main memory is up-to-date. Copies may also exist in other caches.
Modified - Only this cache has a valid copy and copy in memory is stale.
Invalid - Self-explanatory: the block holds no valid data.
MESI Protocol
Adds an Exclusive state for a block that is held by only one cache and is unmodified (matches memory). This allows a silent transition to Modified (no bus transaction) when the same processor that read the block then writes it.
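A simplified sketch of the processor-side MESI transitions (bus-side snoop transitions omitted; event and transaction names like `BusRd`/`BusRdX` follow common textbook usage, and the function name is illustrative). It highlights the key MESI benefit: a write to an Exclusive block upgrades to Modified with no bus traffic, whereas a write to a Shared or Invalid block must broadcast an invalidating transaction.

```python
# Per-line MESI transitions for processor-side read/write events.
M, E, S, I = "Modified", "Exclusive", "Shared", "Invalid"

def on_processor_event(state, event, others_have_copy=False):
    """Return (next_state, bus_transaction_or_None)."""
    if event == "read":
        if state == I:
            # Read miss: snoop responses reveal whether another cache
            # holds the block, selecting Shared vs Exclusive.
            return (S, "BusRd") if others_have_copy else (E, "BusRd")
        return (state, None)           # M/E/S read hits are silent
    if event == "write":
        if state == E:
            return (M, None)           # the silent E -> M upgrade
        if state in (S, I):
            return (M, "BusRdX")       # must invalidate other copies
        return (M, None)               # already Modified: write hit
    raise ValueError(f"unknown event: {event}")

print(on_processor_event(E, "write"))  # ('Modified', None): no bus traffic
print(on_processor_event(S, "write"))  # ('Modified', 'BusRdX')
```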