High Performance Computing Flashcards

1
Q

Week 1

A
2
Q

What new science is a major user of high performance computing?

A

Life sciences for applications such as genome processing

3
Q

How can we determine the performance of a high-performance computer using floating-point mathematics?

A
  • Linpack is a performance benchmark which measures floating-point operations per second (flops) using a dense linear algebra workload
  • A widely used performance benchmark for HPC systems is a parallel version of Linpack called HPL (High-Performance Linpack)
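
The idea can be illustrated with a small timing sketch. This is not HPL itself, just a minimal C example (the problem size N and the matrix-vector kernel are illustrative assumptions): time a dense kernel with a known operation count and divide by the elapsed time.

  #include <stdio.h>
  #include <time.h>

  #define N 2000                        /* illustrative problem size */

  static double A[N][N], x[N], y[N];

  int main(void) {
      for (int i = 0; i < N; i++) {     /* fill the inputs */
          x[i] = 1.0 / (i + 1);
          for (int j = 0; j < N; j++)
              A[i][j] = (double)(i + j) / N;
      }

      struct timespec t0, t1;
      clock_gettime(CLOCK_MONOTONIC, &t0);

      /* dense matrix-vector product: 2*N*N floating point operations */
      for (int i = 0; i < N; i++) {
          double sum = 0.0;
          for (int j = 0; j < N; j++)
              sum += A[i][j] * x[j];
          y[i] = sum;
      }

      clock_gettime(CLOCK_MONOTONIC, &t1);
      double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
      /* printing y[0] also stops the compiler eliminating the loop */
      printf("%.2f Mflop/s (y[0] = %f)\n", 2.0 * N * N / secs / 1e6, y[0]);
      return 0;
  }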
4
Q

Why is parallelization important in HPC?

A

Demanding floating-point calculations can be run in parallel to make use of multiple cores

5
Q

What is Dennard scaling?

A

Dennard scaling is a recipe for keeping power per unit area (power density) constant as transistors are scaled to smaller sizes.

As transistors became smaller they also became faster (reduced delay) and more energy efficient (reduced threshold voltage).

With very small features, limits associated with the physics of the device (e.g. leakage current) are reached.

Dennard scaling has broken down, and processor clock speeds are no longer increasing.

6
Q

What is currently the most common supercomputer architecture?

A

Current systems are all based on integrating many multi-core processors.

  • The dominant architecture is now the “commodity cluster”
  • Commodity clusters integrate off-the-shelf (OTS) components to make an HPC system (cluster)
7
Q

Give the proper definition of a commodity cluster

A

A commodity cluster is a cluster in which both the network and the compute nodes are commercial products available for procurement and independent application by organisations (end users or separate vendors) other than the original equipment manufacturer.

8
Q

Give four components of a cluster architecture

A

* Compute nodes: provide the processor cores and memory required to run the workload
* Interconnect: cluster-internal network enabling compute nodes to communicate and access storage
* Mass storage: disk arrays and storage nodes which provide user filesystems
* Login nodes: provide access (e.g. ssh) for users and administrators via the external network

9
Q

Why do high-performance computers use compiled languages?

A

To maximize performance.
Compilers parse code and generate executables with optimizations applied.
Optimizations performed at compile time are less costly than at runtime.
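
As a concrete illustration (flags assume GCC; the file name myprog.c is hypothetical), optimization is requested when the executable is generated:

  gcc -O0 -o myprog myprog.c    # no optimization
  gcc -O2 -o myprog myprog.c    # optimize at compile time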

10
Q

What are common languages for high-performance computers?

A

C, C++ and Fortran

11
Q

Why must parallelisation be done manually?

A

Parallelization is too complex for compilers to handle automatically.
Programmers add parallel features.

12
Q

Week 2

A
13
Q

What can we use OpenMP for?

A

OpenMP provides extensions to C, C++ and Fortran
* These extensions enable the programmer to specify where parallelism should be added and how to add it
* The extensions provided by OpenMP are:
- Compiler directives
- Environment variables
- Runtime library routines
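
A minimal sketch showing all three kinds of extension together (illustrative, not from the course notes): a compiler directive starts a parallel region, a runtime library routine reports each thread’s number, and an environment variable controls how many threads are forked.

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
      /* compiler directive: fork a team of threads */
      #pragma omp parallel
      {
          /* runtime library routine: which thread am I? */
          printf("Hello from thread %d\n", omp_get_thread_num());
      }
      return 0;
  }

Compile with OpenMP enabled (e.g. gcc -fopenmp hello.c, file name hypothetical) and set the environment variable before running, e.g. export OMP_NUM_THREADS=4.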

14
Q

What does it mean to say that OpenMP uses a fork join execution model?

A

Execution starts with a single thread (master thread)
- Worker threads start (fork) on entry to a parallel region
- Worker threads exit (join) at the end of the parallel region
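
A sketch of the pattern (illustrative): code before the parallel region runs on the master thread alone, the region forks the workers, and execution joins back to a single thread at the closing brace.

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
      printf("serial: master thread only\n");   /* before the fork */

      #pragma omp parallel                      /* fork: workers start */
      {
          printf("parallel: thread %d of %d\n",
                 omp_get_thread_num(), omp_get_num_threads());
      }                                         /* join: workers exit */

      printf("serial again: master thread only\n");
      return 0;
  }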

15
Q

What can we use the OMP_NUM_THREADS environment variable for?

A

We can use the OMP_NUM_THREADS environment variable to control the number of threads forked in a parallel region e.g.

  • export OMP_NUM_THREADS=4
  • OMP_NUM_THREADS is one of the environment variables defined in the standard
  • If you don’t specify the number of threads the default value is implementation defined (i.e. the standard doesn’t say what it has to be)
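
For example (program name hypothetical), the same executable can fork different numbers of threads without being recompiled:

  export OMP_NUM_THREADS=2
  ./myprog          # parallel regions fork 2 threads
  export OMP_NUM_THREADS=8
  ./myprog          # same binary, now 8 threads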
16
Q

What does OpenMP provide that we can call directly from our programs?

A

OpenMP provides compiler directives, environment variables and a runtime library with functions we can call directly from our programs
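
A short sketch using some common runtime library routines (omp_get_thread_num, omp_get_num_threads and omp_get_wtime are all standard OpenMP routines; the program itself is illustrative):

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
      double t0 = omp_get_wtime();           /* wall-clock timer */

      #pragma omp parallel
      {
          int id = omp_get_thread_num();     /* this thread's id */
          int n  = omp_get_num_threads();    /* size of the thread team */
          printf("thread %d of %d\n", id, n);
      }

      printf("elapsed: %f s\n", omp_get_wtime() - t0);
      return 0;
  }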

17
Q

What header file must be included to use the OpenMP library?

A

#include <omp.h>

18
Q

Why is conditional compilation useful in programs that use OpenMP?

A

It ensures that the program can compile and run as a serial version when OpenMP is not enabled, avoiding errors caused by missing OpenMP compiler flags.
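
A sketch of the guard (illustrative): the same file compiles with or without an OpenMP compiler flag such as gcc’s -fopenmp.

  #include <stdio.h>
  #ifdef _OPENMP
  #include <omp.h>
  #endif

  int main(void) {
  #ifdef _OPENMP
      printf("OpenMP enabled, up to %d threads\n", omp_get_max_threads());
  #else
      printf("serial build: OpenMP not enabled\n");
  #endif
      return 0;
  }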

19
Q

What is the role of the C pre-processor in the compilation process?

A

The C pre-processor processes source code before it is passed to the compiler, handling directives such as #include and #ifdef.

20
Q

What does the _OPENMP macro indicate when it is defined?

A

It indicates that OpenMP is enabled and supported by the compiler.
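
The macro expands to an integer of the form yyyymm giving the date of the OpenMP specification the compiler supports, so it can be printed directly (a tiny illustrative check):

  #include <stdio.h>

  int main(void) {
  #ifdef _OPENMP
      printf("_OPENMP = %d\n", _OPENMP);   /* e.g. 201511 for OpenMP 4.5 */
  #else
      printf("OpenMP not enabled\n");
  #endif
      return 0;
  }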

21
Q

What is the syntax of the #ifdef directive used for conditional compilation?

A

#ifdef MACRO
// Code included if MACRO is defined
#else
// Code included if MACRO is not defined
#endif
22
Q

What is the main benefit of using conditional compilation with OpenMP programs?

A

It allows the same source code to support both serial and parallel execution by enabling or disabling OpenMP-related code.

23
Q

What is one good way to distribute the workload when working in parallel?

A

One way to do this in OpenMP is to parallelise loops

  • Different threads carry out different iterations of the loop
  • We can parallelise a for loop from inside a parallel region:
    #pragma omp for
  • We can start a parallel region and parallelise a for loop:
    #pragma omp parallel for
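
A minimal sketch of a parallelised loop (array names illustrative): the iterations are divided among the threads of the team.

  #define N 1000

  int main(void) {
      static double x[N], y[N];

      /* start a parallel region and share out the loop iterations */
      #pragma omp parallel for
      for (int i = 0; i < N; i++) {
          x[i] = (double)i;
          y[i] = 2.0 * x[i];
      }
      return 0;
  }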
24
Q

What changes about the order of loop iterations when they are executed in parallel?

A

When the loop is parallelised the iterations will not take place in the order specified by the loop iterator
* We can’t rely on the loop iterations taking place in any particular order
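
A small demonstration (illustrative): the output order typically differs from 0, 1, 2, ... because the threads interleave.

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
      #pragma omp parallel for
      for (int i = 0; i < 8; i++)
          printf("iteration %d on thread %d\n", i, omp_get_thread_num());
      return 0;
  }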

25
Q

How do we solve the issue of loop iterations occurring out of order when executed in parallel?

A

The results are stored in arrays which do hold the results in the order we require
* If we want to print out the results of our calculation in order we will need a second sequential (not parallelised) loop
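
A sketch of the pattern (array name illustrative): the parallel loop may fill results in any order, but each iteration writes to its own index, so a second sequential loop prints them in order.

  #include <stdio.h>

  #define N 8

  int main(void) {
      double results[N];

      #pragma omp parallel for          /* iterations in any order */
      for (int i = 0; i < N; i++)
          results[i] = (double)i * i;

      for (int i = 0; i < N; i++)       /* sequential: in-order output */
          printf("results[%d] = %f\n", i, results[i]);
      return 0;
  }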

26
Q

What is the correct way to define variable scope for OpenMP loops?

A

In the examples in the previous unit different threads accessed the same copies of the arrays x and y (shared), while the loop index i was different for each thread (private).
* The correct functioning of an OpenMP loop requires correct variable scoping
* In this case the default scoping rules did the right thing
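
A sketch making the scoping visible (names illustrative): one copy of x and y is shared by all threads, while each thread gets its own private copy of the loop index i.

  #define N 1000

  int main(void) {
      static double x[N], y[N];

      /* x and y: shared (all threads see the same arrays)
         i: private (each thread has its own loop index)  */
      #pragma omp parallel for
      for (int i = 0; i < N; i++) {
          x[i] = (double)i;
          y[i] = x[i] + 1.0;
      }
      return 0;
  }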

27
Q

Why might we want to define the variable scope before running parallelised code?

A

Explicitly declaring variable scope makes the code much easier to understand and more likely to be correct.

  • We can specify the default scope by adding a clause to the directive which starts the parallel region:
    - default (shared)
    - default (none)
  • The default clause can be followed by a list of private and/or shared variables
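
A sketch with explicit scoping (variable names illustrative): default(none) forces every variable used inside the region to be listed, so scoping mistakes become compile errors.

  #include <stdio.h>

  #define N 1000

  int main(void) {
      static double x[N], y[N];
      double a = 2.0;

      for (int i = 0; i < N; i++)
          x[i] = (double)i;

      /* default(none): x, y and a must be scoped explicitly;
         i, declared in the for statement, is automatically private */
      #pragma omp parallel for default(none) shared(x, y, a)
      for (int i = 0; i < N; i++)
          y[i] = a * x[i];

      printf("y[N-1] = %f\n", y[N - 1]);
      return 0;
  }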