7 - Communication & Synchronisation Flashcards

1
Q

How do work items/threads communicate?

A

Through memory

2
Q

What is the ideal case for memory?

A

One type that is large, cheap and fast

3
Q

What are the real-world trade-offs of large, cheap and fast memory?

A

Large = slow/expensive

Cheap = small/slow

Fast = small/expensive

4
Q

What are the 4 types of GPU memory?

A

Private memory, local memory, global memory, constant memory

5
Q

What are the attributes of private memory?

A

Very fast, only accessible by a single work item, registers, 10s/100s of bytes

6
Q

What are the attributes of local memory?

A

Fast, accessible by all work items within a single work group, user-accessible cache, KB/MB

7
Q

What are the attributes of global memory?

A

Slow, accessible by threads from all work groups, DRAM, GB

8
Q

What are the attributes of constant memory?

A

Fast, also accessible by all threads, part of global memory but cached, not writable, relatively small, KB

9
Q

What should you minimise time spent on?

A

Memory operations

10
Q

How do you minimise time spent on memory operations?

A

Move frequently accessed data to a faster memory

11
Q

What is the order of the memory types by access time, from slowest to fastest?

A

host >> global >> local >> private

12
Q

What doesn’t benefit from moving frequently accessed data to a faster memory?

A

Single or sporadic accesses

13
Q

When does data end up in global memory?

A

When it is transferred from host to device

14
Q

What is local memory used for?

A

Making a local copy of frequently accessed input data so that repeated accesses are faster
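
A minimal sketch of why the local copy helps, written as a CPU-side Python simulation rather than real GPU code. The function names, the 3-point averaging example and the read-count model are assumptions made for illustration: each work item needs its neighbours, so without a local copy every item re-reads overlapping global data.

```python
# CPU-side simulation only: it models read counts, it does not run on a GPU.

def stencil_global_reads(group_size, radius):
    """Global reads when every work item fetches its whole neighbourhood itself."""
    return group_size * (2 * radius + 1)

def stencil_tiled_reads(group_size, radius):
    """Global reads when the group first copies its tile (plus halo) once."""
    return group_size + 2 * radius

def blur_with_local_copy(data, group_size, radius=1):
    """Neighbourhood average computed per work group from a local tile."""
    n = len(data)
    out = [0.0] * n
    for start in range(0, n, group_size):            # one iteration per work group
        lo = max(0, start - radius)
        hi = min(n, start + group_size + radius)
        tile = data[lo:hi]                           # stands in for local memory
        for gid in range(start, min(n, start + group_size)):  # one per work item
            left = max(lo, gid - radius) - lo
            right = min(hi, gid + radius + 1) - lo
            window = tile[left:right]                # all reads hit the tile
            out[gid] = sum(window) / len(window)
    return out
```

In this model a group of 64 work items with radius 1 issues 192 global reads without the tile and 66 with it.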

15
Q

Why do you need synchronisation?

A

Accesses to shared locations need to be correctly synchronised/coordinated to avoid race conditions
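
A small illustration of a race condition, assuming two work items that each perform `counter += 1` as a separate read and write. The interleaving is written out by hand in plain Python so the lost update is reproducible; the function names are hypothetical.

```python
def interleaved_increment(counter):
    """Both work items read before either writes: one update is lost."""
    a = counter          # work item A reads the old value
    b = counter          # work item B reads the same old value
    counter = a + 1      # A writes
    counter = b + 1      # B overwrites A's update
    return counter

def serialised_increment(counter):
    """Each read-modify-write completes before the next one starts."""
    a = counter
    counter = a + 1      # A finishes first
    b = counter
    counter = b + 1      # B sees A's update
    return counter
```

Starting from 0, the interleaved version ends at 1 while the serialised version ends at 2.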

16
Q

What are 3 types of synchronisation mechanisms?

A

Barriers/memory fences
Atomic operations
Separate kernel launches

17
Q

What do barriers do?

A

Ensure that all work items within the same work group reach the same point
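
As a CPU-side analogy only, Python's `threading.Barrier` can stand in for a work-group barrier. In this sketch (the function name and the doubling step are assumptions for illustration) each thread plays one work item: it writes its own slot, waits at the barrier, then reads a neighbour's slot.

```python
import threading

def rotate_with_barrier(values):
    n = len(values)
    shared = [None] * n                  # stands in for local memory
    result = [None] * n
    barrier = threading.Barrier(n)       # one party per work item

    def work_item(lid):
        shared[lid] = values[lid] * 2            # phase 1: write own slot
        barrier.wait()                           # every write lands before any read
        result[lid] = shared[(lid + 1) % n]      # phase 2: read neighbour's slot

    threads = [threading.Thread(target=work_item, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Without the barrier, a work item could read its neighbour's slot before that neighbour had written it.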

18
Q

Which has lower overhead, global or local memory barriers?

A

Local

19
Q

Where should you avoid putting barriers?

A

Inside conditional statements. A barrier must be reached by all work items in the group; otherwise the group deadlocks.
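
A hedged CPU-side sketch of the failure mode, again using `threading.Barrier` as a stand-in for a work-group barrier. A timeout is added purely so the demonstration terminates; on a real device there is no such escape and the group would hang. The function name and the status labels are hypothetical.

```python
import threading

def run_group(n, skip_barrier_for):
    barrier = threading.Barrier(n)
    outcome = [None] * n

    def work_item(lid):
        if lid == skip_barrier_for:
            outcome[lid] = "skipped"         # divergent branch: barrier never reached
            return
        try:
            barrier.wait(timeout=0.5)        # timeout only so this demo can finish
            outcome[lid] = "passed"
        except threading.BrokenBarrierError:
            outcome[lid] = "stuck"           # would wait forever without the timeout

    threads = [threading.Thread(target=work_item, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome
```

When one work item takes the branch that skips the barrier, every other work item in the group is left waiting.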

20
Q

What is impossible in modern GPU/CPU hardware?

A

Synchronising work items from different work groups within a single kernel launch

21
Q

How do you synchronise different workgroups?

A

By writing and launching separate kernels
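
A minimal sketch of the idea, assuming a sum reduction split into two passes. Plain Python functions stand in for the two kernels and their names are hypothetical; the point is that the second launch starts only after every work group of the first has finished.

```python
def partial_sums_kernel(data, group_size):
    """First 'kernel': each work group reduces its own slice independently."""
    return [sum(data[i:i + group_size]) for i in range(0, len(data), group_size)]

def final_sum_kernel(partials):
    """Second 'kernel': combines the per-group results."""
    return sum(partials)

def reduce_two_pass(data, group_size):
    partials = partial_sums_kernel(data, group_size)   # launch 1
    # The gap between the two launches acts as the global synchronisation point.
    return final_sum_kernel(partials)                  # launch 2
```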

22
Q

What do Atomic functions do?

A

Provide a mechanism for atomic (without interruption) memory operations
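
A CPU-side sketch that emulates an atomic add with a lock, so the read-modify-write cannot be interrupted by another thread. `AtomicCounter` and `count_with_atomics` are hypothetical names; real GPU atomics are hardware operations, not locks.

```python
import threading

class AtomicCounter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def atomic_add(self, amount):
        """Read, add and write back as one uninterruptible step."""
        with self._lock:
            old = self.value
            self.value = old + amount
            return old               # returns the previous value

def count_with_atomics(num_items, adds_per_item):
    counter = AtomicCounter()

    def work_item():
        for _ in range(adds_per_item):
            counter.atomic_add(1)

    threads = [threading.Thread(target=work_item) for _ in range(num_items)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```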

23
Q

What do Atomic functions guarantee?

A

Race-free execution

24
Q

How are Atomic updates performed?

A

Serially, so there is a performance penalty

25
Q

What is the order in Atomic functions?

A

The order is unspecified, so they can only be used with associative and commutative operators
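
To illustrate why the operator matters, this sketch applies the same updates in every possible order and collects the distinct final values. The helper names are hypothetical: addition always gives one result, while an order-sensitive update (here, appending a decimal digit) does not.

```python
from itertools import permutations

def results_over_all_orders(op, start, updates):
    """Set of final values reachable when the updates run in any order."""
    outcomes = set()
    for order in permutations(updates):
        acc = start
        for u in order:
            acc = op(acc, u)
        outcomes.add(acc)
    return outcomes

def add(acc, u):
    return acc + u            # associative and commutative

def append_digit(acc, u):
    return acc * 10 + u       # result depends on the order of updates
```

With updates 1, 2 and 3 starting from 0, `add` always yields 6, while `append_digit` can yield any of six different numbers.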

26
Q

What are the limitations of Atomic functions?

A

Atomics are slower than normal accesses

Performance degrades with many simultaneous attempts to perform atomic operations on the same data

27
Q

What is the usage for Atomic functions?

A

For infrequent, sparse and/or unpredictable global communication

Where possible, use local (shared) memory and structure algorithms to avoid synchronisation
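
One common way to follow this advice is a histogram that each work group first builds locally and then merges into the global result. The sketch below is a CPU-side model with hypothetical function names that only counts how many global atomic updates each approach would issue. On a real device the local updates would still need local-memory atomics, which this model ignores.

```python
def histogram_global_atomics(data, num_bins):
    """Every element performs one atomic increment on the global histogram."""
    hist = [0] * num_bins
    global_atomics = 0
    for v in data:
        hist[v] += 1                  # stands in for a global atomic increment
        global_atomics += 1
    return hist, global_atomics

def histogram_local_then_merge(data, num_bins, group_size):
    """Each work group counts locally, then merges once per non-empty bin."""
    hist = [0] * num_bins
    global_atomics = 0
    for start in range(0, len(data), group_size):     # one iteration per work group
        local = [0] * num_bins                        # stands in for local memory
        for v in data[start:start + group_size]:
            local[v] += 1
        for b, count in enumerate(local):
            if count:
                hist[b] += count      # stands in for one global atomic add
                global_atomics += 1
    return hist, global_atomics
```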

28
Q

What do global memory reads by the GPU involve?

A

Reading entire blocks of data

29
Q

What is memory coalescing?

A

Sequential data access for better performance. When another requested value lies in a block that has already been read, no additional memory access is required
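
A simplified model of this behaviour, assuming the hardware keeps only the most recently read block available. The function name and the block size are illustrative assumptions, not a description of any particular GPU.

```python
def count_memory_accesses(addresses, block_size):
    """Count block reads when only the most recently read block is kept."""
    accesses = 0
    current_block = None
    for addr in addresses:
        block = addr // block_size
        if block != current_block:    # value is not in the block already read
            accesses += 1
            current_block = block
    return accesses

# 32 values read in order, versus the same 32 values read by hopping between blocks
sequential = list(range(32))
scattered = [b * 8 + i for i in range(8) for b in range(4)]
```

With a block size of 8, the sequential order needs 4 block reads, while the scattered order needs one read per value.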

30
Q

What are the effects of the Stride in Strided memory access?

A

The stride affects the access pattern: if the stride is larger than the block size, the benefits of block reads are lost
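
The effect can be sketched by counting how many blocks a group of work items touches when work item i reads element i * stride. The names and the block size of 16 elements are assumptions for illustration.

```python
def blocks_fetched(num_items, stride, block_size):
    """Number of distinct blocks touched when item i reads element i * stride."""
    return len({(i * stride) // block_size for i in range(num_items)})

def useful_fraction(num_items, stride, block_size):
    """Share of the fetched data that the work items actually use."""
    fetched_elements = blocks_fetched(num_items, stride, block_size) * block_size
    return num_items / fetched_elements
```

For 16 work items and 16-element blocks, stride 1 uses all of the single block fetched, stride 2 uses half of two blocks, and at stride 32 every work item needs its own block.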