CS3524 - Distributed Systems Flashcards

Question 1

Q

How is database integrity achieved?

Answer

A

Consistency of Data - safeguard against undesired manipulation of data
Durability of Data - safeguard against loss or corruption of data

Question 2

Q

Describe the ACID properties?

Answer

A

Atomicity - either all or no operation of a transaction are committed
Consistency - a transaction transforms a DB from a consistent state to another
Isolation - transactions do not reveal their intermediate state to other transactions until they are committed
Durability - after commit, system ensures that results of a transaction will persist, even after system failure

Question 3

Q

What are 2 problems associated with concurrency?

Answer

A

Lost Update - transactions overwrite each other’s updates
Inconsistent Retrieval - transaction base their calculations off of uncommitted data, i.e. a “dirty read”

Question 4

Q

Define Serial Equivalence of two transactions?

Answer

A

Two or more transactions are serial equivalent if they produce the same result operating in an interleaving fashion as if they would operate in a completely serialized fashion.

Question 5

Q

Define Serialisability of two transactions?

Answer

A

For two transactions to be serializable, it is necessary and sufficient that all pairs of conflicting operations are executed on the same order on all data objects accessed by these operations.

Question 6

Q

What are the 2 types of concurrency control behaviors?

Answer

A

Pessimistic - Transactions wait to avoid conflict and apply early locking of resources
Optimistic - Transactions are restarted after conflict detection

Question 7

Q

Explain how Simple Exclusive Locking works?

Answer

A

Any transaction must acquire the lock to a shared object before manipulating it.

This guarantees serialisability of transactions since conflicting pairs of operations cannot execute simultaneously on a locked object.

Not efficient, as it unnecessarily reduces concurrency of transactions

Question 8

Q

Explain how 2-Phased Locking and Strict 2PL work?

Answer

A

A transaction performs all locking operations after the first unlocking operation.

Strict 2PL works by releasing all locks held by the transaction right before committing or aborting

Question 9

Q

Explain Optimistic Concurrency Control - what is does, how it works, and what issues it solves?

Answer

A

It works on the assumption that the likelihood of two transactions accessing a shared object is low.

Before a transaction is committed, conflicts are checked for. If found, conflicting transactions are aborted and restarted by the client.

Overcomes disadvantages of locking schemes by :
1. Reducing the computational cost of lock management
2. Avoid the need of read() locks
3. Avoids deadlocks and improves concurrency, however a deadlock detection or time-out mechanism must be in place.

Question 10

Q

What are tentative copies of shared objects?

Answer

A

Tentative copies are created by a transaction which contains a write() operation. The transaction can then abort without affecting other transactions or the “visible” state of the object itself. Once the transaction is committed, the tentative copy of the object is written to persistent storage and becomes the most up-to-date version of the shared object.

Question 11

Q

What are the 3 phases transactions in optimistic concurrency control?

Answer

A

Working phase - where all the operations occur.
Validation phase - where conflicts with other transactions take place.
Update phase - once the transaction has been validated, it is committed and its results made permanent.

Question 12

Q

Explain how the Working phase works in optimistic concurrency control?

Answer

A

The transaction creates a tentative copy
read() operations are performed immediately.
The Read set and Write sets are instantiated for the current transaction.

Question 13

Q

What is the relevant set in transaction validation (optimistic concurrency control)?

Answer

A

It is computed using the transaction number assigned to any transaction in the validation phase.

It’s made up of all transactions whose Working phase overlapped with Tv, but were committed after Tv started and before Tv entered the Validation phase.

Question 14

Q

Explain how Backward validation works in optimistic concurrency control?

Answer

A

Consider a transaction Tv, and Ti from the relevant set of Tv.
Backward validation checks whether inconsistent retrieval occurred. This happens when Tv’s read() operations conflict with Ti’s write() operations. At this point, Tv is aborted.

Question 15

Q

Explain how Forward validation works in optimistic concurrency control?

Answer

A

Consider a transaction Tv, and Ti from the relevant set of Tv.
Forward validation checks whether any of Tv’s write() operations conflict with operations of active transactions. At this point, either Tv’s commit is delayed until the conflict resolves itself or all conflicting transactions are aborted and Tv is committed.

Question 16

Q

What is a method to test for conflicts in optimistic concurrency control?

Answer

A

For each transaction pair Tv, Ti we compute their respective read and write sets, then we check for the intersection of their read ^ write sets and write ^ write sets. If these are non-empty then there is a conflict.

Question 17

Q

What is a commit protocol in distributed transactions?

Answer

A

A commit protocol is a procedure that allows servers to coordinate their actions for committing a distributed transaction.

It should guarantee Atomicity, Consistency, and Durability, as well as be non-blocking and support unilateral aborts.

Question 18

Q

Explain how the 2PC protocol works in distributed transactions?

Answer

A

The Voting phase
The coordinator asks participants if they are ready to commit, and submit their vote.
If any participant votes “no”, they will unilaterally abort, otherwise they will move to the next stage.
The Completion phase
The coordinator asks participants to commit. If one participant aborts, the coordinator will ask all other participants to abort

Question 19

Q

How is durability ensured in 2PC protocol?

Answer

A

Each server stores intermediate states of the objects they hold a lock for and use these states and local log files to “roll forward” after a participant crash

Question 20

Q

What are the 3 possible failure situations during the 2PC protocol voting phase, and how do coordinator and participant behave?

Answer

A

Coordinator timeout - participant crashes before sending its vote back
- coordinator order all participants to abort
- crashed participant will recover then abort
- non-blocking scenario
Participant timeout - the coordinator crashes before sending vote request
- participants will timeout and unilaterally abort
- non-blocking scenario
Participant timeout - the coordinator crashes after sending vote request
- participants vote and wait for command
- coordinator doesn’t respond
- participants left in uncertain state
- blocking scenario

Question 21

Q

What are the 3 possible failure situations during the 2PC protocol commit phase, and how do coordinator and participant behave?

Answer

A

Coordinator fails, but managed to send its decision to at least one participant.
This participant will broadcast the decision to all other participants, hence it’s a non-blocking scenario
Coordinator fails, but did not send a decision to any live participant
Participants will elect a new coordinator and restart the 2PC protocol
Coordinator and at least on participant fail, other participants do not know if the failed one received any message.
This is a blocking scenario, and participants must wait for coordinator to recover. If a new coordinator is elected and send abort command, then there could be a contradiction when the failed participant restarts, as it might have received a “commit” command before going down.

Question 22

Q

What are the 2 recovery strategies adopted by participants during blocking coordinator failure in 2PC protocol?

Answer

A

Ask for coordinator decision
participant will call getDecision() on the coordinator at set intervals until a return message is obtained
Elect a new coordinator
participants will elect a new coordinator server who will order a mass abort and restart the 2PC protocol.
In case of another coordinator failure, this election can be repeated again.

Question 23

Q

Explain how the 3PC protocol works in distributed transactions? Answer in terms of what coordinator and participants do.

Answer

A

It is non-blocking in failure situations, but there is a greater cost due to more messages being handled.

Coordinator and participants can independently commit their local transactions in case of failure

No need for participants to wait for recovery of crashed coordinator

There are 3 phases
1. The voting phase
Coordinator sends out vote request. If any participant votes “no”, coordinator will order mass abort, otherwise participant moves to READY state.

The pre-commit phase
The coordinator asks participants who voted yes to send acknowledgment (ACK), if they do then they move to PREPARED state
The commit phase
participant commits its part and send a notification back to the coordinator.

Question 24

Q

How are timeout and failure recovery handled in 3PC protocol? Answer in terms of Coordinator and Participant

Answer

A

Coordinator
1. Failure during the voting phase - participants are late to send vote so coordinator orders all to abort
2. Failure before sending the ACK request - everyone will unilaterally abort
3. Coordinator fails just before sending a commit() message - after recovery, coordinator will commit, and after timeout participants will commit as well

Participants
1. Failure due to no vote request received - unilateral abort
2. Failure due to late acknowledgement request - unilateral abort
3. Participant fails to send ACK back to coordinator, unilateral abort

NOTE : any failure after the ACK message has been send will result in all parties to timeout and unilaterally COMMIT

Question 25

Q

What are the 4 Coffman conditions for a deadlock to occur?

Answer

A

Mutual Exclusion - a resource is held by at most one process. This guarantees Consistency
Hold & Wait - processes already holding resources might have to wait for another process to complete
Non-preemption - once a resource has been granted to a process, it cannot be taken away. This guarantees Consistency
Circular wait - process A holds resource X, process B holds resource Y. A needs Y to complete and B needs X to complete.

Question 26

Q

What is one way to detect deadlocks?

Answer

A

Construct a wait-for graph, where nodes are transactions, edges represent the “waiting for” relationship between transactions, and a cycle in the graph represents a deadlock

Question 27

Q

What are two ways to handle deadlocks?

Answer

A

Choose a transaction to abort either by its age (oldest will be aborted) or by how many resources it’s currently locking.
Deadlock timeout - each lock held by a transaction has a period of time where it cannot be broken. If this resource is causing a deadlock, the lock will be released after this time period.

Question 28

Q

What is one method to prevent deadlocks?

Answer

A

Locking all objects at the very start of a transaction in one atomic action. While this prevents deadlocks from arising, it drastically reduces concurrency.

Question 29

Q

Explain centralized detection of distributed deadlocks?

Answer

A

A server is elected as the deadlock detector.

Each other server intermittently submits its
local wait-for graph.

The detector then constructs a global wait-for graph and checks for distributed deadlocks.

This solution doesn’t scale well, and in addition if the detector server fails there is no way of detecting and handling distributed deadlocks.

Question 30

Q

Explain what phantom deadlocks are?

Answer

A

These occur when a deadlock has been resolved locally, but the global wait-for graph has not been updated on time due to a delay. In this case, the detector server might unnecessarily abort transactions trying to solve a deadlock which isn’t there.

Question 31

Q

Explain how edge chasing works to detect distributed deadlocks?

Answer

A

It does not require the construction of a global wait-for graph.

It uses probes, which are messages sent across virtual paths along a virtual wait-for graph.

At each server, the probe records which transaction is using which resources. If a reference appears twice in the probe then a loop (deadlock) is detected.

Edge chasing consists of the Initiation, Detection, and Resolution steps.

Question 32

Q

What is one technique to provide reliability and availability?

Answer

A

Replication is the maintenance of copies of data objects on multiple servers

it provides performances enhancement via client side caching and maximizes the availability of service.

Assume there are n servers holding copies of data due to replication. If p is the probability of one of them failing, the overall availability of the system is a = 1 - p^n.

Question 33

Q

What are Byzantine failures in distributed systems?

Answer

A

These are failures whose source cannot be identified. It can be mitigated against using a voting scheme :

If up to X servers can exhibit a byzantine failure, we need a voting scheme where the majority (2X + 1) servers validate the correctness of the service provided by X

Question 34

Q

Explain what Passive Replication is?

Answer

A

There is a single primary replica manager, and multiple secondary backup managers.

The primary manager then performs operations on data and updates the backup managers.

Client requests are handled in the following way :
1. Request
2. Coordination
3. Execution
4. Agreement
5. Response

Question 35

Q

Explain what Active Replication is?

Answer

A

All replica managers act on the same level. Client requests are multicast to these managers as a group. All managers process request and all respond to the client.

Question 36

Q

How does 2PC protocol behave in Passive Replication?

Answer

A

A single primary replica manager can take an eager or lazy strategy :

Eager - it involves all its backup managers
Lazy - it does the commit itself and then communicates the result to the backup managers

Question 37

Q

How does 2PC protocol behave in Active Replication?

Answer

A

transactions are serialized :

Read() operations are performed by a single replica manager.

Write() operations are performed by all replica managers.

The front-end server may communicate with any replica in the group.

Question 38

Q

Explain what buffer overflow is?

Answer

A

It’s a pointer manipulation attack by which a user will enter a longer than expected string, which will then be copied onto a fixed length buffer onto the stack, and return a different pointer than expected.

It is mostly an issue in the C language, and can be mitigated against by checking for the length of the output before copying it to the buffer

Question 39

Q

Name and explain 3 possible security issues with Java programming?

Answer

A

Possible answers include

Exception Handling - ensure the program is left in a correct state when handling exceptions
Passwords - do not store text passwords since they can be retrieved from compiled files
Inheritance - ensure that you know why methods are contained within a class you are inheriting
Mutability - use decorators to wrap class instances and make them immutable. You can also override the clone() method to prevent any attack
Avoid inner classes - these can mess up private declarations, as well as make initialisation, serialisation, and cloning possible attack vessels

Question 40

Q

Explain what SQL injection is and how to mitigate against it?

Answer

A

It’s a code injection attack by which the user will insert SQL statements into input fields which will execute on the database.

This can be mitigated against by using prepared statements and query batching

Question 41

Q

Name and explain 2 web issues concerning secure programming?

Answer

A

Naming issues - blacklisted names can be bypassed using font encodings. Can be mitigated against by using a list of allowed names instead
Cross Site Scripting (XSS) - works by injecting malicious JS payload onto a server which will then run on a victim’s browser. Attacker usually aims to steal victim’s session cookies. Can be mitigated against by HTML encoding/escaping and selective tag filtering

Question 42

Q

What are threat trees? Explain their purpose, and their structure.

Answer

A

These are used by companies to describe the security of a system in terms of its vulnerabilities.

Root nodes represent the attacker’s main goal.

Subsequent non-leaf nodes represent sub goals the attacker might need to achieve to achieve the root goal

Leaf nodes are possible attacks required to achieve parent node’s goal

Nodes can be :
OR - any child node must be achieved
AND - all child nodes must be achieved

Question 43

Q

What is the most common label that threat tree nodes tend to have? Answer in terms of OR and AND nodes.

Answer

A

The most common annotation is Possible or Impossible. This is used by companies to save money by investing only in threats which are more likely to occur.

For OR nodes
- attack is possible if any child node is possible
- attack is impossible if all child nodes are impossible

For AND nodes
- attack is possible if all child nodes are possible
- attack is impossible if any child node is impossible

Question 44

Q

What is ALE in security and how is it used?

Answer

A

Annual Loss Expectancy = Damage * Likelihood

It provides insight into which threats are worth investing against

Question 45

Q

What is a cipher? Name two types of ciphering algorithms?

Answer

A

Ciphers are algorithms that scramble plain text using a key in a form which hides the original plaintext meaning.

The two types of ciphers are :

Transposition ciphers - plaintext letters are reordered according to a particular scheme.

Substitution ciphers - plaintext letters are mapped to other letters or group of letters in the same alphabet

Question 46

Q

This question is about Columnar Transposition Ciphers.

Explain how the encoding process works by encoding the plaintext “I love secure messaging” using the key “secret”.

Explain how the decoding process works by decoding the cipher text “ous lcsg eeg vra ieen smi” using the same key.

Answer

A

Encoding
1. layout one column per letter of the key, and number them by their alphabetical order
2. write down message along the rows, wrapping to next row when overflow
3. list and concatenate the columns based upon their ordering
Answer is - ous lcsg eeg vra ieen smi

Decoding
1. Calculate number of rows by rounding up the value of cipherLength/keyLength
2. Calculate number of chars in last row with (1) * keyLength - cipherLength
3. Populate the column according to key letter ordering

Answer is - I love secure messaging

Question 47

Q

Explain the 3 types of mono-alphabetic substitution ciphers : Cesar, key-phrase sub, random sub.

Answer

A

Cesar cipher takes in a key “n” and transposes the alphabet to the left by n positions. This creates a one-to-one mapping between the letters. Can be decoded by right shifting the alphabet by “n”. For n letters, there are n-1 possible cipher alphabets
Key-phrase substitution tables - these prepend the keyword to the alphabet and ensure each letter only appears once. Then a mapping is created like before.
Random substitution table - these work by mapping the alphabet to a random permutation of the original one. For n letters, there are n! Possible mappings

Question 48

Q

What is a technique used to break mono-alphabetic substitution ciphers?

Answer

A

Frequency analysis, it works by exploiting the fact that the frequency of letters is preserved by mono-alphabetic encoding. Attackers can iteratively guess the encoding of each letter given their frequency.

Question 49

Q

The Vigenere Cipher is a well known poly-alphabetic substitution cipher. Explain how the encoding process works. Explain how the decoding process works as well.

Answer

A

The the plaintext alphabet and create a table where the rows are gradually shifting the alphabet left.
Create a keystream using the key. This is done by appending the letters of the key until keystreamLength = plaintextLenght
For each letter in the plain text, use the letter to get the column index of the table, and the letter in the same keystream position to get the row. The letter in that cell will be the mapped cipher letter.

The decoding process is similar, just reverse the look up indices for each letter in the ciphertext. if “D” is in the cipher text and its corresponding keystream value is “F”, find row D, then travel alongside it until you reach F -> its plaintext letter is the column in which F resides.

Note - this technique is better understood via an example

Question 50

Q

Explain what Digital Signatures are? What is a Message Digest?

Answer

A

A digital signature is a mathematical scheme which demonstrates the authenticity of a digital message

A message digest is the hashed value which maps to a hashed block of data (from the plaintext), and is an application of cryptographic has functions.

Question 51

Q

Explain briefly how symmetric key cryptography works? What is the main problem with this method?

Answer

A

A single key is used for both encryption and decryption, hence why it’s called symmetric.

A problem arises when this key has to be securely transmitted between sender and receiver.

Question 52

Q

Explain how the Diffie-Hellman Key Exchange mechanism works?

Answer

A

Allows sender and receiver to calculate key through information they can securely exchange. Here is an example

let p be a large prime > 2, and g be an integer such that g < p. Both sender and receiver agree on what these numbers are

The sender chooses a random integer a, and the receiver chooses a random integer b.

The sender calculates A = g^a mod p, and the receiver calculates B = g^b mod p.

The sender then uses B to calculate the key through Key = B^a mod p, and the receiver uses A in the same way - Key = A^b mod p.

Question 53

Q

What are the 3 Shannon Design Characteristics for a high quality cipher?

Answer

A

Confusion - the relationship between the key and the ciphertext should be as complicated as possible.
Diffusion - the relationship between the plaintext and the ciphertext should be as complicated as possible
Avalanche Effect - a change in one bit in the input to a cipher should change at least half the bits in the output in a pseudo random way.

Question 54

Q

How does the Mixing Transformation Cipher proposed by Shannon work?

Answer

A

It alternates ciphers which implement confusion (substitution) with ciphers which implement diffusion (transpositions/permutation). These are implemented with layers of S-Boxes alternating with a P-Box.

Question 55

Q

The Feistel Network is a type of iterated block cipher. Explain how it works?

Answer

A

Input is a block of plaintext of size 2w bits.

The key is made up of n subkeys, K = k1, k2, …, kn.

For n rounds, perform the following.
- split the input into left and right blocks.
- apply a function F( ki, Ri )
- perform XOR between Li and Fi.
- Swap and concatenate them to form the input to be fed into the next layer.

Question 56

Q

Explain how asymmetric key cryptography works?

Answer

A

Both senders have a public key, and both have their own respective private key.

A message encrypted with a public key can only be decrypted using the private key, and vice versa.

Question 57

Q

Explain how RSA encryption works? Answer in terms of what keys are needed and how they can be used for encryption and decryption.

Answer

A

A ciphertext C is the result of encrypting a plaintext block M using a public key { n, e }.
C = M^e mod n

A plaintext block M is obtained by decrypting a ciphertext block C using a private key { d }
M = C^d mod n

Question 58

Q

Explain how Key Generation is carried out in RSA? Hint : p, q, n, e, d.

Answer

A

Public key is composed of n and e
Private key is composed of n and d

n = p * q, where p and b are large primes
compute %(n) = (p-1)(q-1) - Euler totient.
choose e such that gcd(e, %(n)) = 1 and e < %(n)
choose d such that d*e = 1 mod %(n)

Question 59

Q

What are the two methods to do RSA cryptanalysis?

Answer

A

Brute force the secret key, since the public key can be easily accessed.
Factorize the public key n such that to obtain primes p and q. Extremely difficult for large values.

Question 60

Q

Explain how Digital Signatures work?

Answer

A

It’s an application of asymmetric encryption.

A sender signs a message using its own private key.

A receiver uses the sender’s public key to authenticate the message.

It works since only the owner of the public key could have encrypted the message.

Question 61

Q

Explain how the Creation of a Message and Verification steps work in Digital Signatures?

Answer

A

Message Creation
1. A digital signature for a particular message is created by encoding a message with the sender’s private key.

Verification
1. A transmitted digital signature can be verified using a public key, since it uniquely identifies the sender.

Question 62

Q

Digital Signatures provide a way to prevent against message interception by eavesdroppers as well as providing a way to authenticate the origin of a message. True of False?

Answer

A

False - Digital signatures do not prevent messages from being intercepted. They are only concerned with proving the authenticity of a message, and that it came from the expected sender.

Question 63

Q

Explain what a Message Digest is in Digital Signatures? Answer in terms of Sender and Receiver.

Answer

A

A message digest is encrypted with the private key to create a unique signature for that message.

Sender
1. Creates signature from message digest
2. Appends signature to message and sends it.

Receiver
1. Recreates message digest from the received message.
2. Decrypts the signature using the sender’s public key.
3. Compares the two results to check the origin of the message.

Brainscape's Knowledge GenomeTM

CS3524 - Distributed Systems Flashcards

Brainscape's Knowledge Genome^TM