L2 - Base Technologies Flashcards
What are process technologies?
They describe the manufacturing processes used to build chips. Silicon wafers are used to replicate chips.
What is 10nm?
The width of the fins of the transistors on the chip.
What is the aim concerning computer chips?
Put as many transistors as possible on it.
What did the increase in transistors lead to?
The increase in clock speed stopped, because higher frequencies produce too much heat –> single-thread performance is limited by the clock speed.
–> need for other ways to increase performance and reduce energy consumption
Why are more cores used in CPUs?
Because each core is like a little CPU that can work independently –> increase in performance
Trends concerning CPU
- multi-core processors
- SIMD support
- combination of core-private and shared caches
- heterogeneity
- hardware support for energy control
What is SIMD?
Single Instruction Multiple Data. SIMD describes computers with multiple processing elements that simultaneously perform the same operation on multiple data points. –> data parallelism
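A minimal sketch of the idea (not from the lecture), assuming an x86 CPU with SSE intrinsics; one instruction here adds four floats at once:

```cpp
#include <immintrin.h>  // x86 SSE intrinsics (assumes an x86 CPU with SSE support)

// Add two float arrays; one SIMD instruction processes 4 elements at a time.
void add_simd(const float* a, const float* b, float* out, int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);             // load 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  // 4 additions in one instruction
    }
    for (; i < n; ++i)                               // scalar tail for leftover elements
        out[i] = a[i] + b[i];
}
```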
What is the combination of core-private and shared caches about?
It keeps the data close to the cores.
What is heterogeneity about when talking about CPU?
A processor can contain different kinds of cores. The hardware is specialized for the task it runs –> lower energy consumption.
What is hardware support for energy control about?
You can adapt the processor to the needs of the application, e.g. by switching off unused cores or lowering the clock frequency.
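As a hedged illustration only: on Linux this kind of control is exposed through the cpufreq sysfs interface (the paths below assume such a system and root privileges); the sketch caps core 0 at its lowest supported frequency:

```cpp
#include <fstream>
#include <string>

// Sketch: lower the maximum clock frequency of CPU core 0 via the Linux cpufreq sysfs.
// Assumes a Linux system with a cpufreq driver loaded and root privileges.
int main() {
    // Read the lowest supported frequency (in kHz) ...
    std::ifstream in("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq");
    std::string min_khz;
    in >> min_khz;

    // ... and set it as the new maximum, forcing the core to run slowly.
    std::ofstream out("/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq");
    out << min_khz;
    return 0;
}
```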
What is a challenge when it comes to CPU?
Feeding the processor with data: the memory hierarchy
-> Use the memory hierarchy efficiently to keep the data as close to the CPU as possible
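A common illustration of using the memory hierarchy well (example code, not from the lecture): traversing a row-major matrix in memory order keeps data in the caches, while the transposed traversal keeps missing them:

```cpp
// Cache-friendly vs cache-unfriendly access to a row-major matrix.
const int N = 4096;
static float m[N][N];

float sum_row_major() {          // walks memory contiguously -> cache lines are reused
    float s = 0;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            s += m[i][j];
    return s;
}

float sum_column_major() {       // jumps N*4 bytes per access -> frequent cache misses
    float s = 0;
    for (int j = 0; j < N; ++j)
        for (int i = 0; i < N; ++i)
            s += m[i][j];
    return s;
}
```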
What is special about the Intel Kaby Lake Processor?
- system agent manages CPU energy control
- a ring interconnect runs around the cores and enables them to communicate
- the level 3 cache sits close to the cores
- the GPU is already integrated into the processor
What is special about the Skylake XP Socket?
- a multi-dimensional mesh network serves as the interconnect between the cores
What is special about ARM processor designs?
These processors are integrated into Systems on a Chip (SoC).
SoC (from the internet)
A System-on-a-Chip (SoC) is an integrated circuit (IC) that integrates all the components of a computer or electronic system into a single chip: a microprocessor, memory, and other components such as input/output interfaces, power management, and communication interfaces, all on a single piece of silicon.
What is ARM big.LITTLE?
- you have clusters of processors
- with cluster switching you can use one cluster at a time (either the high-performance cluster or the low-performance cluster)
What is an alternative to cluster switching?
Global task scheduling. Here you distribute computation more flexibly among big and little processors.
What is a GPGPU?
General-purpose computing on graphics processing units (GPGPU) is the use of a graphics processing unit (GPU) to perform computation in applications traditionally handled by the CPU.
What is an accelerator? (From the internet)
An accelerator for a CPU is a device or system that is designed to offload certain computational tasks from the central processing unit (CPU) in order to improve performance. These tasks can include things like data compression, encryption, machine learning inferencing, and image processing. Examples of accelerators include graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs).
What is the aim of accelerator programming?
Increasing the computational speed.
What is the motivation for accelerators?
- increase computational speed
- reduce energy consumption
3 ways to increase computational speed and reduce energy consumption.
Through specialization in:
- operations
- on-chip communication
- memory access
3 types of accelerators
- GPGPUs
- Many standard cores
- FPGAs
Are costs for FPGA decreasing? What about ASIC?
Yes. ASIC costs increase.
What can FPGA be used for?
–> can be used for big data scenarios.
What is an FPGA?
Field Programmable Gate Array: Designed to be configured by a customer or a designer after manufacturing.
- an array of logic gates used to implement special functions in hardware (you can program algorithms into the hardware)
3 system integration designs for accelerators
- nodes with attached accelerators (CPU and accelerator on the same board)
- accelerator only design (no standard core (CPU) anymore)
- accelerator booster (dynamically allocate a fraction of the booster for certain computations)
Challenges connected to GPGPUs
- programming a GPGPU
- coordinating the scheduling of computation on the system processor (CPU) and the GPU
- managing the transfer of data between system memory (RAM) and GPU memory
How to manage the transfer of data between system memory (RAM) and GPU memory
NVIDIA developed the CUDA programming language for this; it provides explicit operations for allocating GPU memory and copying data between RAM and GPU memory.
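A minimal sketch of such a transfer with the CUDA runtime API (names and sizes are illustrative, error checking omitted):

```cpp
#include <cuda_runtime.h>   // CUDA runtime API (host-side calls, plain C++)

// Sketch: move data from system RAM to GPU memory and back again.
void round_trip(float* host_data, int n) {
    float* dev_data = nullptr;
    size_t bytes = n * sizeof(float);

    cudaMalloc(&dev_data, bytes);                                    // allocate GPU memory
    cudaMemcpy(dev_data, host_data, bytes, cudaMemcpyHostToDevice);  // RAM -> GPU

    // ... a kernel would be launched here to compute on dev_data ...

    cudaMemcpy(host_data, dev_data, bytes, cudaMemcpyDeviceToHost);  // GPU -> RAM
    cudaFree(dev_data);
}
```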
What is multithreading?
The ability of a CPU to provide multiple threads of execution concurrently, supported by the operating system. The threads share the resources of a single core or of multiple cores.
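A small sketch with C++ std::thread (values assumed): two threads of one process concurrently update a counter in the memory they share:

```cpp
#include <atomic>
#include <thread>
#include <cstdio>

std::atomic<long> counter{0};          // shared by all threads of the process

void work(int iterations) {
    for (int i = 0; i < iterations; ++i)
        counter.fetch_add(1);          // safe concurrent update of shared memory
}

int main() {
    std::thread t1(work, 1000000);     // two threads run concurrently,
    std::thread t2(work, 1000000);     // scheduled by the OS onto the cores
    t1.join();
    t2.join();
    std::printf("counter = %ld\n", counter.load());   // prints 2000000
}
```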
MIMD
Multiple instruction, multiple data (MIMD) is a technique employed to achieve parallelism. Machines using MIMD have a number of processors that function asynchronously and independently.
SIMD
Single instruction, multiple data is a type of parallel processing. SIMD describes computers with multiple processing elements that simultaneously perform the same operation on multiple data points.
Difference Multithreading and MIMD
Multithreading:
- multiple threads single process
- threads share same memory space (direct communication)
- single processor (multicore) systems
MIMD:
- the processors can run different programs/processes
- each processor can have its own memory space and runs independently
- multiple processor systems connected by network
- used in high-performance computing
ILP
Instruction-level parallelism (ILP) is a family of processor and compiler design techniques that speed up execution by causing individual machine operations to execute in parallel.
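An assumed illustration (not from the lecture): in the first function every addition depends on the previous result, so the hardware must serialize them; in the second the additions are independent, and a pipelined/superscalar core can execute them in parallel:

```cpp
// Dependent chain: each add needs the result of the one before -> little ILP.
int chain(int a, int b, int c, int d) {
    int t = a + b;
    t = t + c;
    t = t + d;
    return t;
}

// Independent operations: (a + b) and (c + d) can execute in the same cycle -> more ILP.
int tree(int a, int b, int c, int d) {
    int t1 = a + b;
    int t2 = c + d;
    return t1 + t2;
}
```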
HBM
High Bandwidth Memory (180 GB/s)
Multi-instance GPU (MIG)
- with MIG you can partition the GPU in hardware
- the dynamic random access memory (DRAM), caches, etc. are sliced as well
- each virtual machine can get its own partition
–> MIG provides data and performance isolation
- in cloud computing, when you ask for a server you get a virtual server running on a physical server
- multiple virtual servers run on one physical server
- the question is which of the virtual servers gets the actual GPU that is attached to the physical server
NUMA
Non-Uniform Memory Access
multiple CPUs with multiple cores
- share the same memory
- can communicate with each other by reading and writing data in the memory
- single physical address space
- access times are non-uniform: memory attached to another CPU takes longer to reach than local memory
What is a distributed memory system?
A distributed memory system is a type of computer system where multiple processors are connected by a network, each with its own local memory. The processors in a distributed memory system can communicate and coordinate with each other, but they do not share a common memory. This is in contrast to a shared memory system, where all processors have access to a common memory space, and can directly read and write memory locations.
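A minimal MPI sketch of this model (assuming an MPI installation; the message content is made up): two processes, each with its own local memory, exchange data by explicit messages instead of reading each other's memory:

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // each process has its own rank and memory

    int value = 0;
    if (rank == 0) {
        value = 42;                          // lives only in process 0's local memory
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   // explicit message over the network
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}
```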
Difference of distributed memory systems or clusters compared to NUMA
Distributed memory systems:
- multiple processors are connected by a network, each with its own local memory
- often used in high-performance computing applications
- programming is more complex because data and computation are distributed across the processors, which have to communicate explicitly
- allows for the use of a larger number of processors and the ability to scale up the system by adding more processors
NUMA:
- all processors have access to a common memory space, and can directly read and write memory locations
PUE
Power usage effectiveness (PUE)
- measure of how efficiently a computer data center uses its power
PUE = total facility power / IT equipment power
-> ideally a ratio of 1
-> can be reduced e.g. by placing data centers in cold climates such as Iceland (cooling needs less power)
ERE
Energy Reuse Effectiveness
- measure of how efficiently a data center reuses the power dissipated by the computers
ERE = (total facility power - reused energy) / IT equipment power
–> ideally a ratio of 0
–> e.g. use the heat generated by the computers to warm offices
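A worked example with assumed numbers applying both formulas above: a facility drawing 1.5 MW in total, of which 1.0 MW goes to the IT equipment and 0.3 MW worth of heat is reused:

```cpp
#include <cstdio>

int main() {
    // Assumed example values (MW), not real measurements.
    double total_facility_power = 1.5;
    double it_equipment_power   = 1.0;
    double reused_energy        = 0.3;

    double pue = total_facility_power / it_equipment_power;                   // 1.5
    double ere = (total_facility_power - reused_energy) / it_equipment_power; // 1.2

    std::printf("PUE = %.2f (ideal: 1)\n", pue);
    std::printf("ERE = %.2f (ideal: 0)\n", ere);
}
```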