Terms Flashcards
HPC (High Performance Computing)
A computing model that focuses on processing power over data throughput.
Handles large-scale problems and is optimized for speed.
e.g. Weather Forecasting, Scientific Modelling
HTC (High Throughput Computing)
A computing model that focuses on maximizing data throughput over processing power.
Handles large numbers of smaller, independent tasks.
e.g. Web services that handle user requests
Gustafson-Barsis’s Law
States that as the number of processors increases, the problem size or workload can be increased proportionally to keep the total execution time of the program constant, so the achievable speedup grows with the number of processors.
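A minimal sketch of the usual scaled-speedup formula for this law, S = N - s(N - 1), where s is the serial fraction of the scaled workload and N is the number of processors. The class and method names are hypothetical, chosen only for illustration.

```java
// Hypothetical helper for illustration; names are not from the notes.
final class Gustafson {
    /**
     * Scaled speedup S = N - s * (N - 1).
     * serialFraction (s) is the share of the scaled run that cannot be parallelised.
     */
    static double scaledSpeedup(double serialFraction, int processors) {
        return processors - serialFraction * (processors - 1);
    }

    public static void main(String[] args) {
        // With a 10% serial fraction the speedup keeps growing as N grows.
        for (int n : new int[] {1, 8, 64}) {
            System.out.printf("N=%d -> scaled speedup %.1f%n", n, scaledSpeedup(0.1, n));
        }
    }
}
```

Unlike Amdahl’s Law, the workload here is assumed to grow with N, which is why the speedup is not capped by the serial part.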
Little’s Law
States that the average number of tasks in a system is equal to the arrival rate of tasks multiplied by the average time a task spends in the system (L = λW).
Buffer
A temporary storage space used to hold data while it’s being moved from one place to another.
Enables efficient data transfer and controls the flow of data between systems
What are the requirements for deadlock to occur in concurrent code?
Deadlock occurs when two or more threads are blocked because they are waiting on each other to release resources they hold.
The 4 conditions for deadlock to occur are Mutual Exclusion, Hold & Wait, No Preemption and Circular Wait.
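One common way to avoid deadlock is to break the Circular Wait condition by always acquiring locks in the same global order. The sketch below assumes a hypothetical Account class with a unique id used only for ordering; it is an illustration, not the only fix.

```java
// Hypothetical classes for illustration only.
final class Account {
    final int id;      // unique id, used only to impose a global lock order
    int balance;

    Account(int id, int balance) {
        this.id = id;
        this.balance = balance;
    }
}

final class OrderedTransfer {
    /** Always locks the account with the smaller id first, so no cycle of waiting can form. */
    static void transfer(Account from, Account to, int amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    /** Two threads transfer in opposite directions; returns the combined balance afterwards. */
    static int runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.balance + b.balance;
    }
}
```

If each thread instead locked "from" first and "to" second, the two threads could each hold one lock and wait forever for the other.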
Atomicity
When a concurrent programme is completing a task, it treats all of the task’s actions as a single, indivisible unit of work: either all of them take effect or none do.
It helps prevent race conditions and inconsistent data states.
Monitor
High-level synchronization construct that provides a way to control access to shared resources in a multi-thread environment.
Uses mutual exclusion
What does the synchronized keyword in Java do? Give 3 limitations of this keyword.
Java’s synchronized keyword makes a thread acquire an object’s monitor (intrinsic lock) before running a method or block, so only one thread at a time can execute code guarded by that monitor.
3 limitations of this are its potential for deadlock, limited granularity and potential performance overhead.
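A small sketch of synchronized in use, assuming a hypothetical counter class: both methods lock on the same object, so concurrent increments are not lost.

```java
// Hypothetical class for illustration only.
final class SynchronizedCounter {
    private int count;

    // Only one thread at a time can run a synchronized method on the same instance.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    /** Starts the given number of threads, each incrementing a shared counter. */
    static int countWithThreads(int threads, int incrementsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

Without synchronized on increment(), count++ is a read-modify-write that can interleave and drop updates.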
Give an example of a commercial distributed system
Amazon Web Services (AWS).
(Cloud computing)
Provide two examples of system designs for distributed systems
Client-Server Architecture and Peer-to-Peer architecture
What is Client-Server architecture?
The client sends requests to the server, the server processes these requests and sends responses to the client.
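Limitation: the server is a single point of failure and can become a bottleneck, so clients see high latency or no service if it is overloaded or goes down.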
This is a centralized system
What is Peer-to-Peer architecture?
All nodes act as both clients and servers and they communicate with each other directly.
Provides a more decentralized and fault-tolerant system.
What is a centralized system?
One where all nodes in the network communicate with a central coordinator node. E.g. Client-Server architecture.
What is a decentralized system?
One where all nodes communicate directly with each other with no central node.
E.g. Peer-to-Peer architecture, Blockchain
What is Safety in concurrent programming?
The property that guarantees that, no matter how the system’s threads interleave, certain invariants of the system will always hold (nothing bad ever happens).
e.g. If two owners of a bank account attempt to withdraw funds simultaneously, there’s no danger of the system giving the users more than the account has available.
What is liveness in a concurrent system?
The property that guarantees that some desirable behavior will eventually happen.
e.g. All incoming requests to a web server should eventually be processed and responded to, no matter what.
What is Dekker’s Algorithm?
A mutual exclusion algorithm.
It uses flags to signal that a thread wants to enter a critical section.
It also uses a turn variable to indicate whether it is the thread’s turn to enter the critical section.
There are two parts to Dekker’s algorithm:
Entry Section –> Thread sets its flag to true. While the other thread’s flag is also true, it checks the turn variable: if it is the other thread’s turn, it sets its own flag to false, waits until the turn changes, then sets its flag to true again. It enters the critical section once the other thread’s flag is false.
Exit Section –> Thread sets the turn variable to the other thread and sets its own flag to false when finished in the critical section.
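The entry and exit sections can be sketched in Java for two threads with ids 0 and 1. This is a teaching sketch, not production code: the class and method names are hypothetical, and it relies on AtomicBoolean and a volatile field so that reads and writes are seen in a consistent order by both threads.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical two-thread lock for illustration; real code would use java.util.concurrent locks.
final class DekkerLock {
    private final AtomicBoolean[] wantsToEnter = {
        new AtomicBoolean(false), new AtomicBoolean(false)
    };
    private volatile int turn = 0;

    /** Entry section for thread "self" (0 or 1). */
    void lock(int self) {
        int other = 1 - self;
        wantsToEnter[self].set(true);
        while (wantsToEnter[other].get()) {
            if (turn != self) {
                // Not our turn: back off so the other thread can proceed.
                wantsToEnter[self].set(false);
                while (turn != self) {
                    Thread.yield(); // busy-wait until the other thread hands over the turn
                }
                wantsToEnter[self].set(true);
            }
        }
    }

    /** Exit section: hand the turn to the other thread and lower our flag. */
    void unlock(int self) {
        turn = 1 - self;
        wantsToEnter[self].set(false);
    }

    /** Two threads increment a plain int inside the critical section. */
    static int demo(int incrementsPerThread) {
        DekkerLock lock = new DekkerLock();
        int[] counter = {0};
        Thread[] workers = new Thread[2];
        for (int id = 0; id < 2; id++) {
            final int self = id;
            workers[id] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    lock.lock(self);
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock(self);
                    }
                }
            });
            workers[id].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

With plain (non-volatile) fields the Java memory model would not guarantee that one thread sees the other’s flag in time, so mutual exclusion could break.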
What is the difference between monitors and semaphores?
Monitors –> High-level synchronization constructs that bundle shared data with the mutual exclusion (and condition waiting) needed to access it in multi-thread systems.
Semaphores –> Integer counters used to control access to shared resources. They can be binary (0 or 1) or counting.
What is the Producer Consumer problem?
A synchronization problem where producers produce data items and put them into a buffer, while consumers remove said items from the buffer and process them. The problem arises when producers produce items faster than consumers can consume them (the buffer fills up and can overflow) or when consumers run faster than producers (they try to read from an empty buffer).
How do you solve the producer consumer problem?
Use two counting semaphores: one to represent empty slots available in the buffer and another to represent filled slots in the buffer. A mutex (or binary semaphore) protects the buffer itself.
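A minimal sketch of the semaphore solution using java.util.concurrent.Semaphore. The BoundedBuffer class and its method names are hypothetical; a third binary semaphore is used here as the mutex around the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// Hypothetical bounded buffer for illustration only.
final class BoundedBuffer<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final Semaphore emptySlots;                    // counts free slots
    private final Semaphore filledSlots = new Semaphore(0); // counts items ready to consume
    private final Semaphore mutex = new Semaphore(1);       // guards the queue itself

    BoundedBuffer(int capacity) {
        emptySlots = new Semaphore(capacity);
    }

    void put(T item) {
        emptySlots.acquireUninterruptibly(); // producer blocks while the buffer is full
        mutex.acquireUninterruptibly();
        try {
            items.add(item);
        } finally {
            mutex.release();
        }
        filledSlots.release();               // signal that one more item is available
    }

    T take() {
        filledSlots.acquireUninterruptibly(); // consumer blocks while the buffer is empty
        mutex.acquireUninterruptibly();
        T item;
        try {
            item = items.remove();
        } finally {
            mutex.release();
        }
        emptySlots.release();                 // signal that one more slot is free
        return item;
    }

    /** One producer thread sends 1..itemCount; the calling thread consumes and sums them. */
    static long sumThroughBuffer(int itemCount, int capacity) {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(capacity);
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= itemCount; i++) {
                buffer.put(i);
            }
        });
        producer.start();
        long sum = 0;
        for (int i = 0; i < itemCount; i++) {
            sum += buffer.take();
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

In practice java.util.concurrent.ArrayBlockingQueue already provides this behaviour; the sketch only shows how the semaphores fit together.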
Give an example of a data item in Java that is always thread-safe
Immutable objects
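A short sketch of an immutable class, using a hypothetical ImmutablePoint: all fields are final, there are no setters, and "changes" return a new object, so threads can share instances without synchronization.

```java
// Hypothetical class for illustration only.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    /** Returns a new point; the original is never modified. */
    ImmutablePoint translate(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);
    }
}
```

String and the boxed types such as Integer follow the same pattern in the standard library.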
How do you ensure that a class is thread safe?
Synchronize access to shared mutable state (the critical sections of the code), use atomic / volatile variables where they fit, or make the class immutable.
What are stubs and skeletons in RMI (Remote Method Invocation)?
Stubs –> Client-side proxy that represents a remote object in the local JVM, responsible for marshalling the method arguments and sending them to the remote object, then unmarshalling the method results.
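Skeletons –> Server-side object that acts as a mediator between the remote object and the network: it unmarshals the incoming arguments, invokes the method on the remote object and marshals the result back to the stub.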
In simple terms they both work together to enable communication between clients and servers.
Define marshalling
The process of converting method arguments (or return values) into a format that can be transmitted over a network.
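As a rough illustration, Java’s built-in object serialization can stand in for marshalling: an object is turned into bytes that could be sent over a network and rebuilt on the other side. The MarshalDemo class and its method names are hypothetical; RMI does something similar internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class MarshalDemo {
    /** Marshal: convert a serializable value into a byte array. */
    static byte[] marshal(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush(); // make sure buffered data reaches the byte array
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Unmarshal: rebuild the object from the bytes. */
    static Object unmarshal(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, MarshalDemo.unmarshal(MarshalDemo.marshal("hello")) gives back a String equal to "hello".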
Define IDLs (Interface Definition Languages)
They provide a standard, language-neutral way of defining interfaces and data types which can be used across different programming languages.
Name some common services of distributed systems
Communication, Security, Resource Sharing, Scalability, Load Balancing
Define RPC (Remote Procedure Call)
Used in distributed computing, it allows a programme running on one computer to call a procedure located on another computer over a network.
Stubs and Skeletons hide the details of RPC in RMI.
Mainly used in client-server architectures
Define Task Partitioning
Process of dividing a complex task into smaller sub-tasks to be executed in parallel. Two common types of this are Coarse Grain Partitioning and Fine Grain Partitioning.
Define Coarse Grain Partitioning
Involves dividing a large task into a few large sub-tasks which are executed in parallel by different processors. Easy to implement
E.g. a web app that processes different user requests
Define Fine Grain Partitioning
Divides large tasks into many small sub-tasks which are executed in parallel by different processors.
Can give fast processing times through more parallelism, but with higher communication and synchronization overhead.
What is a decomposition strategy?
Strategy used in computing to break down a task into smaller more manageable parts.
What size buffer would you need to handle up to 7,000 transactions in a web app with a latency of 100ms?
It has to be at least big enough to handle the 7,000 transactions, with additional headroom for traffic bursts, network latency etc.
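Little’s Law gives a way to put a number on this. The sketch below ASSUMES the 7,000 figure is an arrival rate per second (the card does not say), and the 1.5 headroom factor is an arbitrary illustrative choice; class and method names are hypothetical.

```java
// Hypothetical helper for illustration; the per-second rate and headroom factor are assumptions.
final class BufferSizing {
    /** Little's Law: average items in the system L = arrival rate * time in system. */
    static double averageInFlight(double arrivalsPerSecond, double latencyMillis) {
        return arrivalsPerSecond * latencyMillis / 1000.0;
    }

    /** Average in-flight count scaled by a headroom factor and rounded up to whole slots. */
    static int bufferSlots(double arrivalsPerSecond, double latencyMillis, double headroomFactor) {
        return (int) Math.ceil(averageInFlight(arrivalsPerSecond, latencyMillis) * headroomFactor);
    }

    public static void main(String[] args) {
        // Assumed: 7,000 transactions per second, each spending 100 ms in the system.
        System.out.println("Average in flight: " + averageInFlight(7000, 100));
        System.out.println("Slots with 1.5x headroom: " + bufferSlots(7000, 100, 1.5));
    }
}
```

Under that assumption about 700 transactions are in the system on average, so the headroom for bursts matters more than the raw 7,000 figure. If 7,000 instead means transactions that can arrive at the same instant, the buffer needs at least 7,000 slots as the card says.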
Amdahl’s Law?
Used in parallel computing, it describes the theoretical maximum speedup that can be achieved by using multiple processors to execute a programme.
Speedup = 1/[(1-P) + (P/N)] where P is the parallel portion of the code as a decimal and N is the number of processors.
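The formula can be checked with a few lines of Java. The class and method names are hypothetical; P and N are as defined above.

```java
// Hypothetical helper for illustration only.
final class Amdahl {
    /** Speedup = 1 / ((1 - P) + P / N). */
    static double speedup(double parallelFraction, int processors) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / processors);
    }

    /** Upper bound as N grows without limit: 1 / (1 - P). */
    static double maxSpeedup(double parallelFraction) {
        return 1.0 / (1.0 - parallelFraction);
    }

    public static void main(String[] args) {
        // With P = 0.9, even many processors cannot push the speedup past 10x.
        for (int n : new int[] {2, 10, 100, 1000}) {
            System.out.printf("N=%d -> speedup %.2f%n", n, speedup(0.9, n));
        }
        System.out.printf("Limit -> %.2f%n", maxSpeedup(0.9));
    }
}
```

The serial portion (1 - P) dominates as N grows, which is why adding processors gives diminishing returns.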
What are the correctness properties that apply to programmes that are not supposed to terminate?
Safety and Liveness