EXAM QUESTIONS Flashcards
Explain the difference between the CPU and GPU architectures.
Describe what types of applications can benefit more from each one of
these two systems.
Which architecture targets latency and
which targets throughput, and why?
x
Describe parallel implementations of binary and p-ary search.
List the advantages and limitations of these two methods.
Compare the step complexity of p-ary and binary search.
x
What is the difference between speedup and efficiency in the context of
parallel computing?
x
What values correspond to the ideal case of parallelisation and which factors
affect these ideal figures in real parallel applications?
x
Calculate the total speedup for a system running a program consisting of
serial and parallel parts of the code. Consider two cases where the parallel
part is executed on 2 and 20 parallel processors. Assume that the serial part
occupies 20% of the entire code.
x
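A minimal sketch of the calculation asked for above, assuming Amdahl's law applies and that "occupies 20% of the entire code" means 20% of the single-processor run time is serial. The function name `amdahl_speedup` is illustrative and not from the source.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's law: 1 / (s + (1 - s) / p)."""
    parallel_fraction = 1.0 - serial_fraction
    # The serial part is unchanged; only the parallel part is divided by p.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# The two cases from the question: 2 and 20 processors, 20% serial.
for p in (2, 20):
    print(f"p = {p:2d}: speedup = {amdahl_speedup(0.2, p):.2f}")
```

Under these assumptions the speedup is about 1.67 on 2 processors and about 4.17 on 20 processors. However many processors are added, it cannot exceed 1 / 0.2 = 5.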
What is the goal of parallelisation?
x
Explain Moore’s Law and its consequences on the development of parallel hardware.
The number of transistors on a chip doubles approximately every 2 years
Drove optimisation through increasing the speed and power of serial processors
Much easier/cheaper to wait a few years for technology to catch up rather than invest in complex and expensive architectures
More transistors allow deeper instruction pipelining, more operations per time period and more complicated instructions
Explain the fact that while the transistor count in the processors is still rising, the clock rate trend has flattened since 2004/2005.
x
Can we still improve the performance of our processors using frequency-scaling? Justify your answer.
x
Provide two examples of applications which benefit from parallel computation. Could these problems be solved with standard serial hardware?
x
Name the two independent dimensions used in Flynn’s taxonomy to classify multi-processor computer architectures.
Instruction Streams and Data Streams
Provide a brief definition of SIMD and explain what types of problems would benefit the most from SIMD implementations.
x
Compare the pros and cons of shared and distributed memory architectures.
x
Which category of Flynn’s taxonomy do GPUs fall in?
x
Is the GPU architecture optimised for low-latency applications or high-throughput applications, and why?
x
What is heterogeneous computing?
Combining a latency-oriented processor (CPU) with a throughput-oriented processor (GPU) in the same system
What is a parallel pattern?
x
What is a map pattern and its main characteristics? What is the theoretical and
practical complexity of the map pattern?
x
What is the strategy to perform parallel computation on large data inputs with a
limited number of processing units?
x
What is a stencil pattern and how can it be optimised?
x
Explain the reduce pattern and provide two example combiner functions for it.
x
What are the requirements for the combiner functions in the reduce pattern?
x