9 - Algorithms 2 (Histogram) Flashcards
What is a histogram?
An estimator of data distribution. Uses bins (value intervals) to count the number of values falling into that interval
Name 2 applications of a Histogram
Histogram equalisation (contrast) and object detection
What do atomic operations deal with?
Race conditions.
But serialise the access to global memory and are slow
What is the extreme case in histogram distribution?
If all input values are the same ones, one bin will be updated by all threads
What is Histogram addition?
A combination of two histograms (their sum)
Associative and commutative operation
Variant of reduction
What is histogram privatisation?
Creating partial private histograms (stored in local memory) which are then combined into a single one
What are the two versions of histogram privatisation?
Local histogram per work group and local histogram per work item
What does local histogram per work group require?
Atomics, but on local memory so is much faster
What is local histogram per work item?
Assigning a local histogram for each work item/thread, then reducing to a single local histogram (i.e using local atomics)
Each work item processes a number of K elements serially and there is no need for synchronisation
What is the performance of local histogram per work group?
N atomic operations on local memory
B*WG atomic operations on global memory
What is the performance of local histogram per work item?
N/K atomic operations on local memory
B*WG/K atomic operations on global memory
What are limitations of histogram privatisation algorithms?
Size of local memory is a limiting factor - defines max number of bins
What is the procedure for a histogram sort-search algorithm?
Calculate bin indices for each element
Sort
Search for start of segments corresponding to each index - results in a cumulative histogram
Subtract the neighbouring values to receive a normal histogram
What are the characteristics of a sort and search histogram procedure?
Not effective for small problems. Good for large B and N
How does the input data distribution affect the performance of a parallel histogram implementation based on atomics?
If all values are the same then only one bin will be updated by all threads, lots of conflicts and serial operations