Parallel Processing Flashcards
Define out of order execution
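Some CPUs execute instructions in an order determined by when their operands and execution units become available, rather than strictly in program order. Results are still committed in program order, so the program behaves as if it had run sequentially.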
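The idea can be sketched with a toy scheduler. This is only an illustration under stated assumptions: the instruction tuples, register names, and latencies are hypothetical, at most one instruction issues per cycle, and hazards other than "source not yet produced" are ignored.

```python
# Toy model of out-of-order issue: an instruction may start as soon as its
# source registers are available, even if an older instruction is still waiting.
# Each instruction is (name, destination register, source registers, latency).

def issue_order(program):
    """Return instruction names in the order they begin executing."""
    ready_at = {}            # register -> cycle at which its value is available
    pending = list(program)  # not-yet-issued instructions, oldest first
    order = []
    cycle = 0
    while pending:
        for i, (name, dest, sources, latency) in enumerate(pending):
            # Registers that older, still-pending instructions will write.
            blocked = {older[1] for older in pending[:i]}
            # Registers never written by the program count as available at cycle 0.
            if all(s not in blocked and ready_at.get(s, 0) <= cycle for s in sources):
                ready_at[dest] = cycle + latency
                order.append(name)
                del pending[i]
                break  # issue at most one instruction per cycle
        cycle += 1
    return order

program = [
    ("load", "r1", ["r0"], 3),  # slow load
    ("add", "r2", ["r1"], 1),   # must wait for the load
    ("mul", "r3", ["r4"], 1),   # independent of both
]
order = issue_order(program)
```

Here the independent "mul" is issued while "add" is still waiting for the load, so it runs ahead of its position in the program.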
Define superscalar CPUs
A CPU that can execute more than one instruction per clock cycle by using multiple execution units (e.g. ALUs) within one core.
Define multi-core CPUs
A multi-core processor contains multiple independent cores (processing units) on a single chip.
Each core can fetch, decode, and execute its own instructions independently.
Tasks or threads are distributed across cores by the operating system or software.
Limitations in increasing processor speed
Why not just make a processor faster?
A faster processor needs more power, and that power is dissipated as heat.
We’ve reached the limit on the amount of heat we can easily dissipate from the CPU.
So rather than increasing the speed of processing, we increase the amount of data that can be processed at each step. This is known as increasing bandwidth.
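A rough way to see why speed costs power is the common approximation that switching power scales as capacitance times voltage squared times frequency. The sketch below uses normalised values, and the assumption that doubling the clock needs 20% more voltage is purely hypothetical.

```python
# Approximate dynamic (switching) power of a CPU: P = C * V^2 * f.
# All values are normalised; the 1.2x voltage increase is a hypothetical figure.

def dynamic_power(capacitance, voltage, frequency):
    """Return the approximate switching power for the given parameters."""
    return capacitance * voltage ** 2 * frequency

base = dynamic_power(capacitance=1.0, voltage=1.0, frequency=1.0)
faster = dynamic_power(capacitance=1.0, voltage=1.2, frequency=2.0)

# How many times more power the faster configuration draws.
ratio = faster / base
```

Under these assumptions, doubling the clock speed nearly triples the power that has to be removed as heat.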
Difference between multi-core and multi-processor CPUs
Multi-core: A single CPU contains multiple cores, which are essentially independent processing units. The cores share the same RAM and, typically, the L3 cache.
Multi-processor: A computer system uses multiple separate physical CPUs. Each has its own caches and, in some designs, its own local memory.
Define cache
A cache is a small, high-speed memory storage layer that temporarily stores frequently accessed data or instructions, enabling faster access compared to retrieving the same data from a slower memory source.
Because it is in the CPU, the CPU can access it extremely quickly.
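As an illustration, a cache can be modelled as a small store placed in front of a slower one. This is a minimal sketch: the class name TinyCache, the capacity of two entries, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict

class TinyCache:
    """A small cache in front of a slower memory (modelled as a dict)."""

    def __init__(self, memory, capacity=2):
        self.memory = memory
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            # Hit: serve from the cache and mark the entry as recently used.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Miss: fetch from the slower memory and keep a copy.
        self.misses += 1
        value = self.memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used entry
        return value

memory = {0: "a", 1: "b", 2: "c"}
cache = TinyCache(memory, capacity=2)
for address in [0, 0, 1, 0, 2, 1]:
    cache.read(address)
```

Repeated reads of address 0 are served from the cache, while the first read of each address, and any read of an evicted address, has to go to the slower memory.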
Define hardware multi-threading
Hardware multi-threading is a technique where a CPU executes multiple threads concurrently on a single core by utilizing idle resources during instruction execution, improving efficiency and performance.
Define symmetric multiprocessor (SMP)
A type of computing architecture where two or more identical processors share a single memory and operate under a unified operating system, with equal access to I/O devices and resources, enabling parallel task execution.
Define asymmetric multiprocessor (AMP)
A computing architecture where processors are not identical and have different roles or capabilities. Typically, a master processor controls the system, assigning tasks to one or more subordinate processors, which may be specialized for certain functions (e.g., I/O, computation, or real-time tasks).
Define shared memory model
Most common systems today use a shared memory model, in which each processor or core is connected to the same memory via a system bus.
A bus arbiter coordinates which CPU can access memory at which time, because more than one CPU cannot access shared memory at the same time. Because of this, each processor/core often has its own L1 cache, so that it does not need the bus for every access.
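One way an arbiter might choose between competing cores is round-robin, sketched below. The policy and the function name arbitrate are assumptions for illustration only; real arbiters may use priorities or other schemes.

```python
# Round-robin bus arbitration sketch: among the cores currently requesting the
# bus, grant the one that comes next after the core that was granted last time.

def arbitrate(requests, num_cores, last_granted):
    """Return the id of the core granted the bus, or None if nobody asked."""
    for offset in range(1, num_cores + 1):
        candidate = (last_granted + offset) % num_cores
        if candidate in requests:
            return candidate
    return None

# Cores 0 and 2 both want the bus; core 0 had it last, so core 2 goes first.
first = arbitrate({0, 2}, num_cores=4, last_granted=0)
# Next round, core 2 had it last, so core 0 gets its turn.
second = arbitrate({0, 2}, num_cores=4, last_granted=first)
```

Only one core is granted the bus per round, and no requesting core is starved.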
The problem with cache coherency in the shared memory model
Suppose one core stores a copy of the data from shared memory in its cache. If another core changes that data in the shared memory, the first core will operate on old data from its cache, not the updated values.
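The problem can be reproduced in a small simulation. This is a sketch under simplifying assumptions: the Core class is hypothetical, memory is a dict, and writes go straight through to shared memory.

```python
class Core:
    """A core with a private cache and no coherency mechanism."""

    def __init__(self, shared_memory):
        self.shared_memory = shared_memory
        self.cache = {}

    def read(self, address):
        # Only go to shared memory if the address is not cached yet.
        if address not in self.cache:
            self.cache[address] = self.shared_memory[address]
        return self.cache[address]

    def write(self, address, value):
        # Update the local copy and shared memory, but tell nobody else.
        self.cache[address] = value
        self.shared_memory[address] = value

shared_memory = {0x10: 1}
core_a = Core(shared_memory)
core_b = Core(shared_memory)

first_read = core_a.read(0x10)   # core A caches the value 1
core_b.write(0x10, 2)            # core B changes shared memory to 2
stale_read = core_a.read(0x10)   # core A still sees its cached copy
```

After core B's write, shared memory holds 2 but core A keeps reading 1 from its own cache.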
Simple solution to the cache coherency problem
When one core communicates via the system bus that it wants to write data to the shared memory, other cores can see what is happening and they can check if the address where data is being written exists in their caches. If it does, they can update their values accordingly.
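However, this solution does not scale: every core has to watch every write on the shared bus, so the checking work grows with the number of cores.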
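The bus-watching scheme described in this card can be sketched as follows. The Bus and SnoopingCore classes are hypothetical, and the model assumes every write goes through to shared memory and that other cores update (rather than discard) their cached copies.

```python
class Bus:
    """Shared bus that lets every attached core observe writes."""

    def __init__(self, shared_memory):
        self.shared_memory = shared_memory
        self.cores = []
        self.snoop_checks = 0  # how many cache look-ups the writes have caused

    def attach(self, core):
        self.cores.append(core)

    def write(self, writer, address, value):
        self.shared_memory[address] = value
        # Every other core sees the write and checks its own cache.
        for core in self.cores:
            if core is not writer:
                self.snoop_checks += 1
                core.snoop(address, value)

class SnoopingCore:
    def __init__(self, bus):
        self.bus = bus
        self.cache = {}
        bus.attach(self)

    def read(self, address):
        if address not in self.cache:
            self.cache[address] = self.bus.shared_memory[address]
        return self.cache[address]

    def write(self, address, value):
        self.cache[address] = value
        self.bus.write(self, address, value)

    def snoop(self, address, value):
        # Update the local copy only if this address is actually cached.
        if address in self.cache:
            self.cache[address] = value

shared_memory = {0x10: 1}
bus = Bus(shared_memory)
core_a, core_b, core_c = SnoopingCore(bus), SnoopingCore(bus), SnoopingCore(bus)

core_a.read(0x10)               # core A caches the value 1
core_b.write(0x10, 2)           # cores A and C both check their caches
fresh_read = core_a.read(0x10)  # core A now sees the updated value
```

Core A now reads the new value, but the single write caused a check in every other core, which is where the scaling cost comes from.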