High Performance Computing Flashcards
Week 1
What new science is a major user of high performance computing?
Life sciences for applications such as genome processing
How can we determine the performance of a high performance computer using floating point mathematics?
- Linpack is a performance benchmark which measures floating point operations per second (flops) using a dense linear algebra workload
- A widely used performance benchmark for HPC systems is a parallel version of Linpack called HPL (High-Performance Linpack)
Why is parallelization important in HPC?
Computationally demanding floating point workloads can be split into parts that run in parallel, making use of multiple cores
What is Dennard scaling?
Dennard scaling is a recipe for keeping power per unit area (power density) constant as transistors are scaled to smaller sizes
As transistors became smaller they also became faster (reduced delay) and more energy efficient (reduced threshold voltage)
With very small features, limits associated with the physics of the device (e.g. leakage current) are reached
Dennard scaling has broken down and processor clock speeds are no longer increasing
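A short worked check of why power density stays constant, assuming the usual dynamic power relation P = C V^2 f (capacitance C, supply voltage V, clock frequency f) and a scaling factor k > 1 for the linear dimensions. Neither the relation nor the symbol k appears in the notes above; they are introduced here for illustration.

```latex
\begin{align*}
P &= C V^2 f \\
C \to \frac{C}{k},\quad V \to \frac{V}{k},\quad f \to k f
  \;&\Rightarrow\; P' = \frac{C}{k}\cdot\frac{V^2}{k^2}\cdot k f = \frac{P}{k^2} \\
A \to \frac{A}{k^2}
  \;&\Rightarrow\; \frac{P'}{A'} = \frac{P/k^2}{A/k^2} = \frac{P}{A}
\end{align*}
```

Once the voltage can no longer be reduced in step with the feature size (for example because of leakage current), the first scaling step fails and power density rises instead of staying constant.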
What is currently the most common supercomputer architecture?
- Current systems are all based on integrating many multi-core processors
- The dominant architecture is now the “commodity cluster”
- Commodity clusters integrate off-the-shelf (OTS) components to make an HPC system (cluster)
Give the proper definition for a Commodity cluster
A commodity cluster is a cluster in which both the network and the compute nodes are commercial products available for procurement and independent application by organisations (end users or separate vendors) other than the original equipment manufacturer.
Give four components of a cluster
* Compute nodes: provide the processor cores and memory required to run the workload
* Interconnect: cluster internal network enabling compute nodes to communicate and access storage
* Mass storage: disk arrays and storage nodes which provide user filesystems
* Login nodes: provide access (e.g. ssh) for users and administrators via external network
Why do high performance computers use compiled languages?
Maximizes performance.
Compilers parse code and generate executables with optimizations.
Optimizations at compile-time are less costly than at runtime.
What are common languages for high performance computers?
C, C++ and Fortran
Why must parallelization be done manually?
In general, parallelization is too complex for compilers to handle automatically.
Programmers add the parallel features to the code themselves.
Week 2
What can we use OpenMP for?
OpenMP provides extensions to C, C++ and Fortran
* These extensions enable the programmer to specify where parallelism should be added and how to add it
* The extensions provided by OpenMP are:
- Compiler directives
- Environment variables
- Runtime library routines
What does it mean to say that OpenMP uses a fork join execution model?
- Execution starts with a single thread (master thread)
- Worker threads start (fork) on entry to a parallel region
- Worker threads exit (join) at the end of the parallel region
What can we use the OMP_NUM_THREADS environment variable for?
We can use the OMP_NUM_THREADS environment variable to control the number of threads forked in a parallel region e.g.
- export OMP_NUM_THREADS=4
- OMP_NUM_THREADS is one of the environment variables defined in the standard
- If you don’t specify the number of threads the default value is implementation defined (i.e. the standard doesn’t say what it has to be)