Terms Flashcards

Question 1

Q

HPC (High Performance Computing)

Answer

A

A computing model that focuses on processing power over data throughput.
Handles large-scale problems and is maximized for speed.
e.g. Weather Forcasting, Scientific Modelling

Question 2

Q

HTC (High Throughput Computing)

Answer

A

A computing model that focuses on maximizing data throughput over processing power.
handles large amounts of smaller, independent processes.
e.g. Web services that handle user requests

Question 3

Q

Gustafson-Barsis’s Law

Answer

A

as the number of processors increases, the problem size or workload can be increased proportionally to keep the total execution time of the program constant.

Question 4

Q

Little’s Law

Answer

A

States that the average number of tasks in a system is equal to the arrival rate of tasks multiplied by the average time a task spends in the system

Question 5

Q

Buffer

Answer

A

A temporary storage space used to hold data while it’s being moved from one place to another.
Enables efficient data transfer and controls the flow of data between systems

Question 6

Q

What are the requirements for deadlock to occur in concurrent code?

Answer

A

Deadlock occurs when two or more threads are blocked because they are waiting on each other to release resources they hold.
The 4 conditions for Deadlocking to occur are mutual exclusion, Hold & Wait, No Preemption and Circular Wait

Question 7

Q

Atomocity

Answer

A

When a concurrent programme is completing a task, it treats all actions as a single unit of work.
It prevents race conditions, deadlocks and inconsistent data states.

Question 8

Q

Monitor

Answer

A

High-level synchronization construct that provides a way to control access to shared resources in a multi-thread environment.
Uses mutual exclusion

Question 9

Q

What does the synchronized keyword in java do and give 3 limitations of this keyword?

Answer

A

Java’s synchronized keyword creates monitors that control access to shared resources in a multi-thread environment.
3 limitations of this are it’s potential for deadlock, limited granularity and potential performance overhead.

Question 10

Q

Give an example of a commercial distributed system

Answer

A

Amazon’s Web Services.
(Cloud computing)

Question 11

Q

Provide two examples of system desgins for distributed systems

Answer

A

Client-Server Architecture and Peer-to-Peer architecture

Question 12

Q

What is Client-Server architecture?

Answer

A

The client sends requests to the server, the server processes these requests and sends responses to the client.
Limitation: latency issues if the server goes down
This is a centralized system

Question 13

Q

What is Peer-to-Peer architecture?

Answer

A

All nodes act as both clients and servers and they communicate with eachother directly.
Provides a more decentralized
and fault-tolerant system.

Question 14

Q

What is a centralized system?

Answer

A

One where all nodes in the network communicate with a central coordinator node. E.g. Client-Server architecture.

Question 15

Q

What is a decentralized system?

Answer

A

One where all nodes communicate directly with eachother with no central node.
E.g. Peer-to-Peer architecture, Blockchain

Question 16

Q

What is Safety in concurrent programming?

Answer

A

The property that guarantees that no matter how the concurrent system processes thread interleave, certain aspects of the system will always hold.
e.g. If two owners of bank account attempt to withdraw funds simultaneously, there’s no danger of the system messing up and giving the user’s more than what the account has available.

Question 17

Q

What is liveness in a concurrent system?

Answer

A

The property that guarantees that some desirable behavior will eventually happen.
e.g. all incoming requests to a web server should be eventually processed and responded no matter what.

Question 18

Q

What is Dekker’s Algorithm?

Answer

A

A mutual exclusion algorithm.
It uses flags to signal that a thread wants to enter a critical section.
it also uses a turn variable to indicate whether it is the threads turn to enter the critical section.
There are two parts to Dekker’s algorithm:
Entry Section –> Processor sets its flag to true and waits until the other processor has its flag set to false, then the processor checks the turn variable to see if it can enter the critical section.
Exit Section –> Processor sets its flag to false when finished in the critical section and the turn variable is set to the other number

Question 19

Q

What is the difference between monitors and semaphores?

Answer

A

Monitors –> use High-level synchronization constructs to control access to shared resources in multi-thread systems.
Semaphores –> Integer variables used to control access to shared resources. (generally in binary)

Question 20

Q

What is the Producer Consumer problem?

Answer

A

A synchronization problem where producers produce data items and put them into a buffer, while consumers remove said items from the buffer and process them. The problem arises when producers produce data items faster than the consumer can consume them, leading to buffer overflows and starvation

Question 21

Q

How do you solve the producer consumer problem?

Answer

A

Use two semaphores: one to represent empty slots available in the buffer and another to represent filled slots in the buffer

Question 22

Q

Give an example of a data item in Java that is always thread-safe

Answer

A

Immutable objects

Question 23

Q

How do you ensure that a class is thread safe?

Answer

A

Synchronize critical sections of the code and use atomic / volatile variables

Question 24

Q

What are stubs and skeletons in RMI (Remote method Invocation)?

Answer

A

Stubs –> Client-side proxy that represents a remote object in the local JVM, responsible for marshalling the method arguments and sending them to the remote object, then unmarshalling the method results.
Skeletons –> Server-side object that acts as a mediator between the remote object and the network.
In simple terms they both work together to enable communication between clients and servers.

Question 25

Q

Define marshalling

Answer

A

the process of converting a method argument into a format that can be transmitted over a network.

Question 26

Q

Define IDLs (Interface Definition Languages)

Answer

A

They define a standard way of defining interfaces and data types which can be used accross different programming languages

Question 27

Q

Name some common services of distributed systems

Answer

A

Communication, Security, Resource Sharing, Scalability, Load Balancing

Question 28

Q

Define RPC (Remote Procedure Call)

Answer

A

Used in distributed computing, it allows a programme running on one computer to call a procedure located on another computer over a network.
Stubs and Skeletons hide the details of RPC in RMI.
Mainly used in client-server architectures

Question 29

Q

Define Task Partitioning

Answer

A

Process of dividing a complex task into smaller sub-tasks to be executed in parallel. Two common types of this are Corse Grain Partitioning and Fine Grain Partitioning

Question 30

Q

Define Corse Grain Partitioning

Answer

A

Involves dividing a large task into a few large sub-tasks which are executed in parallel by different processors. Easy to implement
E.g. a web app that processes different user requests

Question 31

Q

Define Fine Grain Partitioning

Answer

A

Divides large tasks into many small sub-tasks which are executed in parallel by different processors.
Fast processing times.

Question 32

Q

What is a decomposition strategy

Answer

A

Strategy used in computing to break down a task into smaller more manageable parts.

Question 33

Q

What size buffer would you need to handle up to 7,000 transactions in a web app with a latency of 100ms

Answer

A

It has to be at least big enough to handle the 7,000 transactions with additional headroom for traffic bursts , network latency etc.

Question 34

Q

Amdahl’s Law?

Answer

A

Used in parallel computing, it describes the theoretical maximum speedup that can be achieved by using multiple processors to execute a programme.
Speedup = 1/[(1-P) + (P/N)] where P is the parallel portion of the code as a decimal and N is the number of processors.

Question 35

Q

What are the correctness properties that apply programmes that are not supposed to terminate\/

Answer

A

Safety and Liveness

Question 36

Q

Define Latency

Answer

A

The time taken for a request to get sent to a server and the corresponding response to be received.
Minimizing latency usually involves using HPC and low-latency networking.

Question 37

Q

Define Throughput

Answer

A

The measure of how many requests a server can handle at a given time.
Maximizing throughput usually involves using distributed computing.

Question 38

Q

What are Spinlocks?

Answer

A

Low-level synchronization constructs used to control access to shared resources in multi-thread systems.
When the calling thread requests a spin lock that is already held by another thread, the other thread spins in a loop to check of the lock has become available.

Question 39

Q

What are Spinlocks?

Answer

A

Low-level synchronization constructs used to control access to shared resources in multi-thread systems.
When the calling thread requests a spin lock that is already held by another thread, the other thread spins in a loop to check of the lock has become available.

Question 40

Q

What are some advantages of Monitors over spinlocks?

Answer

A

Monitors are more efficient as they do not waste CPU cycles while waiting for a lock.
Easier to use.

Question 41

Q

What happens when a thread enters a synchronized method?

Answer

A

It acquires the monitor associated with the object on which the method is called

Question 42

Q

What does a synchronized block do?

Answer

A

It allows a thread to acquire and release a monitor explicitly.
(Allows for Fine-Grain control over the locking mechanism).

Question 43

Q

What is a monitor invariant?

Answer

A

Defines the state the monitor is expected to maintain and any operations performed on the monitor should preserve this invariant.
E.g. for a monitor that models a bank account with a balance, the model invariant may be that the balance can never drop below 0.

Question 44

Q

what does it mean when a monitor invariant is relaxed?

Answer

A

The monitor’s internal consistency check is temporarily suspended.
This is good for cases like the producer consumer scenario, where the producer may temporarily violate the monitor invariant by adding items to a full buffer.

Question 45

Q

What are race conditions?

Answer

A

A situation that can occur when two or more threads access shared resources in an unsynchronized manner.

Question 46

Q

What does the notify() synchronization primitive do?

Answer

A

Wakes up a thread that is waiting for an object’s monitor

Question 47

Q

What does the wait() synchronization primitive do?

Answer

A

Causes the thread to wait until it’s notified by another thread using the object’s monitor with the notify() method.

Question 48

Q

What is the Epidemic Algorithm?

Answer

A

A distributed computing algorithm that enables information to be disseminated and replicated accross a system of nodes in a decentralized and fault-tolerant manner.
Nodes in a network randomly exchange data with eachother, and this randomness ensures that all nodes in the network eventually receive the data even if there are network failures.
The “source node” is the first node to start the exchanges and every node that receives the data becomes a “carrier”

Question 49

Q

Name two types of epidemic algorithms

Answer

A

Push model and the Pull model

Question 50

Q

Define a push model

Answer

A

A type of epidemic propagation model, a replica sends updates to all other replicas it encounters, whether they need them or not
Not very efficient

Question 51

Q

Define a pull model

Answer

A

A type of epidemic propagation model, a replica requests updates from other replicas when they need it.
More efficient than a push model
e.g. Gitlab

Question 52

Q

What is location transparency in distributed computing?

Answer

A

The ability to access files regardless of a machines physical location (User doesn’t need to know where the files are stored to access them).
E.g. DNS (Domain Name System), a distributed system that maps domain names to IP addresses. The client doesn’t need to know the location of the DNS server to resolve the domain name to an IP address.

Question 53

Q

What is location independence in distributed computing?

Answer

A

Ability to move files around the system without disrupting the user’s ability to access them.
E.g. cloud services like Google Drive.

Question 54

Q

What is a cache?

Answer

A

A temporary storage space to speed up access to frequently accessed data.
Reduces latency and network traffic

Question 55

Q

What is the cache-consistency problem?

Answer

A

Challenge of ensuring that all copies of a file are consistent and up to date when changes are made to one copy.

Question 56

Q

How do you solve the cache-consistency problem?

Answer

A

Write-through Policy
Invalidate Policy
Update Policy
Hybrid Policy

Question 57

Q

Define the write-through policy

Answer

A

Any update to a file is written to all copies immediately

Question 58

Q

Define the invalidate policy

Answer

A

When an update is made to a file, all copies are invalidated and the next operation will retrieve the updated file.

Question 59

Q

Define the update policy

Answer

A

Changes to an updated file are propagated to other copies ASAP.

Question 60

Q

Define the Hybrid Policy

Answer

A

A combination of the write-through policy, the invalidate policy and the update policy

Question 61

Q

What is Peterson’s Algorithm?

Answer

A

Synchronization algorithm used to solve the critical section of a two processor system, similar to Dekker’s algorithm.

Question 62

Q

What is the Bakery Algorithm?

Answer

A

Mutual Exclusion Algorithm designed for n processes.
Works by assigning a unique ticket number to each process, and allowing processes to enter the critical section based on their ticket number

Question 63

Q

Is the Bakery Algorithm suitable for n processes?

Answer

A

It depends, this algorithm works well for a small number of threads (e.g. 5), but it’s efficiency worsens for a larger number of processes for two main reasons:
Bust-Waiting –> System resources get used while threads wait in the loop waiting for other processes to finish their work. this can lead to contention for shared resources. e.g. CPU time / Memory
Non-Scalable –> Doesn’t scale well as the number of processes increases. (0(n^2) complexity).

Question 64

Q

What are lock objects in the Java.util.concurrent package?

Answer

A

Mechanisms for controlling access to shared resources in multi-thread environments.

Answer 65

A

ReentrantLock lock–> Allows a thread to acquire the same lock multiple of times without deadlocking itself.
Used in situations where more control over locking and unlocking is required as it provides greater concurrency than synchronized blocks of code.

Lock.tryLock() –> returns a boolean value indicating whether the current thread has acquired the lock or not

Answer 66

A

Mutual Exclusion Algorithm where processes are organised in a logical ring. A token is then passed around the ring and whichever process holds the token can access the shared resource / critical section.

Answer 67

A

Used to synchronize the clocks of multiple nodes in a distributed system

Answer 68

A

Used to order events and messages that occur within a node, as well as to synchronize the clocks of other nodes in the system

Answer 69

A

Used to synchronize the clocks of multiple nodes in a distributed system.

Answer 70

A

A clock synchronization algorithm, it synchronizes the clocks of nodes in a system.
Works by adding a timestamp to each message sent between processes, based on the sender’s logical clock value, and the receiving node updates its own logical clock based on this timestamp and then processes the message.
Ensures that messages are partially ordered.

Answer 71

A

It assumes that messages are sent in a timely and reliable manner, which isn’t always the case.

Answer 72

A

Used to elect a coordinator in a decentralized distributed system.
It assumes that each process has a unique ID / Priority and the thread with the highest ID is elected as the coordinator.
Does not scale well when used in large scale systems.

Answer 73

A

A deployable unit of code that is packaged and used for distribution and use.
e.g. Java.util.concurrent

Answer 74

A

Contract-First approach –> The web service description language (WSDL) contract is first created, and the java artifact is generated from this.
Code-First approach –> Java code is written first, and then the WSDL contract i generated from this java code.
Hybrid Approach –> Java code and WSDL contract are developed simultaneously

Answer 75

A

XML-based language used to describe the functionality offered by a web service.
Typically consists of parts:
Types
Messages
Port Type
Binding

Answer 76

A

Ability of the programme to execute a fixed task quicker as more processors work on it.
E.g. If a task takes 10 minutes to complete on one processor but 5 minutes on another processor, then the programme has strong scalability

Answer 77

A

A programme has weak scalability if when the size of the task increases proportionally to the number of processors working on it, the execution time should remain roughly the same. E.g. if a task running on one processor takes 10 minutes to execute, but the same task that is twice as big also takes roughly 10 minutes to execute on two processors, then the programme has weak scalability.

Answer 78

A

Remote Interface –> Defines the remote methods that can be called by the clients of the distributed system to access and modify shared resources.
Remote Object –> Object that implements the remote interface and provides the implementation of the methods declared in the interface. Responsible for receiving method calls from the client and invoking the method
Stubs and Skeletons –> respsonsibe for marshalling and unmarshalling remote method arguments and sending them to the remote object over the network

Answer 79

A

Remote Interface –> Defines the remote methods that can be called by the clients of the distributed system to access and modify shared resources.
Remote Object –> Object that implements the remote interface and provides the implementation of the methods declared in the interface. Responsible for receiving method calls from the client and invoking the method
Stubs and Skeletons –> respsonsibe for marshalling and unmarshalling remote method arguments and sending them to the remote object over the network

Answer 80

A

Objects that intercept and process incoming and outgoing remote method calls to a remote object.
Necessary to perform tasks like authentication and authorization, performance monitoring and error handling

Answer 81

A

An inceptor gets registered to a remote object, and if a remote method is called on this object the RMI system calls the inceptor invoke method which allows the inceptor to perform tasks like authentication and authorization checks before the remote method is actually invoked on the object.

Answer 82

A

A centralized system requires fewer messages to be exchanged as all nodes only have to deal with the central server, while in a token ring system all nodes must communicate with eachother as the token gets passed around.
Also, the centralized system is more susceptible to failure than a token ring system due to the single point of potential failure which is the central server.

Answer 83

A

Used in inter-process communication (IPC) and message passing in parallel computing.
Provides a standard API for developing parallel applications that run on many different architectures.
Advantages:
Portable, scalable

Answer 84

A

An API for shared memory multiprocessing in C, it enables developers to write parallel programmes that can run on multi-core processors and shared memory systems.
Advantages:
Easy to use, scalable

Answer 85

A

Using both can be advantageous when dealing with apps that require parallelism at both the shared memory and distributed memory levels

Answer 86

A

Open-source web server software which serves web pages over the internet.
It receives incoming HTTP requests from the client web browser and passes them to the appropriate handler, the dynamic content generator generates content based on the user’s request.

Answer 87

A

Use C’s synchronization primitives mutexes and condition variables

Answer 88

A

Mutex (standing for mutual exclusion) is a synchronization primitive used in c to allow for mutual exclusion when threads access a shared resource.
Generally used alongside a condition variable.
Represented as pthread_mutex_t
Process must acquire a mutex using pthread_mutex_lock before accessing a shared resource, and then must release it using pthread_mutex_unlock

Answer 89

A

a synchronization primitive used in c to allow for mutual exclusion when threads access a shared resource.
Generally used alongside a mutex.
represented as pthread_cond_t
Thread waits for a certain condition to be true using pthread_cond_wait which blocks the thread until it is signaled by another thread that the shared resource is no longer being modified by using pthread_cond_signal.