CS3524 - Distributed Systems Flashcards

1
Q

How is database integrity achieved?

A
  1. Consistency of Data - safeguard against undesired manipulation of data
  2. Durability of Data - safeguard against loss or corruption of data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Describe the ACID properties?

A
  1. Atomicity - either all or no operation of a transaction are committed
  2. Consistency - a transaction transforms a DB from a consistent state to another
  3. Isolation - transactions do not reveal their intermediate state to other transactions until they are committed
  4. Durability - after commit, system ensures that results of a transaction will persist, even after system failure
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are 2 problems associated with concurrency?

A
  1. Lost Update - transactions overwrite each other’s updates
  2. Inconsistent Retrieval - transaction base their calculations off of uncommitted data, i.e. a “dirty read”
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Define Serial Equivalence of two transactions?

A

Two or more transactions are serial equivalent if they produce the same result operating in an interleaving fashion as if they would operate in a completely serialized fashion.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Define Serialisability of two transactions?

A

For two transactions to be serializable, it is necessary and sufficient that all pairs of conflicting operations are executed on the same order on all data objects accessed by these operations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are the 2 types of concurrency control behaviors?

A
  1. Pessimistic - Transactions wait to avoid conflict and apply early locking of resources
  2. Optimistic - Transactions are restarted after conflict detection
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Explain how Simple Exclusive Locking works?

A

Any transaction must acquire the lock to a shared object before manipulating it.

This guarantees serialisability of transactions since conflicting pairs of operations cannot execute simultaneously on a locked object.

Not efficient, as it unnecessarily reduces concurrency of transactions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Explain how 2-Phased Locking and Strict 2PL work?

A

A transaction performs all locking operations after the first unlocking operation.

Strict 2PL works by releasing all locks held by the transaction right before committing or aborting

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Explain Optimistic Concurrency Control - what is does, how it works, and what issues it solves?

A

It works on the assumption that the likelihood of two transactions accessing a shared object is low.

Before a transaction is committed, conflicts are checked for. If found, conflicting transactions are aborted and restarted by the client.

Overcomes disadvantages of locking schemes by :
1. Reducing the computational cost of lock management
2. Avoid the need of read() locks
3. Avoids deadlocks and improves concurrency, however a deadlock detection or time-out mechanism must be in place.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are tentative copies of shared objects?

A

Tentative copies are created by a transaction which contains a write() operation. The transaction can then abort without affecting other transactions or the “visible” state of the object itself. Once the transaction is committed, the tentative copy of the object is written to persistent storage and becomes the most up-to-date version of the shared object.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are the 3 phases transactions in optimistic concurrency control?

A
  1. Working phase - where all the operations occur.
  2. Validation phase - where conflicts with other transactions take place.
  3. Update phase - once the transaction has been validated, it is committed and its results made permanent.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Explain how the Working phase works in optimistic concurrency control?

A
  1. The transaction creates a tentative copy
  2. read() operations are performed immediately.
  3. The Read set and Write sets are instantiated for the current transaction.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the relevant set in transaction validation (optimistic concurrency control)?

A

It is computed using the transaction number assigned to any transaction in the validation phase.

It’s made up of all transactions whose Working phase overlapped with Tv, but were committed after Tv started and before Tv entered the Validation phase.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Explain how Backward validation works in optimistic concurrency control?

A

Consider a transaction Tv, and Ti from the relevant set of Tv.
Backward validation checks whether inconsistent retrieval occurred. This happens when Tv’s read() operations conflict with Ti’s write() operations. At this point, Tv is aborted.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Explain how Forward validation works in optimistic concurrency control?

A

Consider a transaction Tv, and Ti from the relevant set of Tv.
Forward validation checks whether any of Tv’s write() operations conflict with operations of active transactions. At this point, either Tv’s commit is delayed until the conflict resolves itself or all conflicting transactions are aborted and Tv is committed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a method to test for conflicts in optimistic concurrency control?

A

For each transaction pair Tv, Ti we compute their respective read and write sets, then we check for the intersection of their read ^ write sets and write ^ write sets. If these are non-empty then there is a conflict.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is a commit protocol in distributed transactions?

A

A commit protocol is a procedure that allows servers to coordinate their actions for committing a distributed transaction.

It should guarantee Atomicity, Consistency, and Durability, as well as be non-blocking and support unilateral aborts.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Explain how the 2PC protocol works in distributed transactions?

A
  1. The Voting phase
    The coordinator asks participants if they are ready to commit, and submit their vote.
    If any participant votes “no”, they will unilaterally abort, otherwise they will move to the next stage.
  2. The Completion phase
    The coordinator asks participants to commit. If one participant aborts, the coordinator will ask all other participants to abort
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

How is durability ensured in 2PC protocol?

A

Each server stores intermediate states of the objects they hold a lock for and use these states and local log files to “roll forward” after a participant crash

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What are the 3 possible failure situations during the 2PC protocol voting phase, and how do coordinator and participant behave?

A
  1. Coordinator timeout - participant crashes before sending its vote back
    - coordinator order all participants to abort
    - crashed participant will recover then abort
    - non-blocking scenario
  2. Participant timeout - the coordinator crashes before sending vote request
    - participants will timeout and unilaterally abort
    - non-blocking scenario
  3. Participant timeout - the coordinator crashes after sending vote request
    - participants vote and wait for command
    - coordinator doesn’t respond
    - participants left in uncertain state
    - blocking scenario
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What are the 3 possible failure situations during the 2PC protocol commit phase, and how do coordinator and participant behave?

A
  1. Coordinator fails, but managed to send its decision to at least one participant.
    This participant will broadcast the decision to all other participants, hence it’s a non-blocking scenario
  2. Coordinator fails, but did not send a decision to any live participant
    Participants will elect a new coordinator and restart the 2PC protocol
  3. Coordinator and at least on participant fail, other participants do not know if the failed one received any message.
    This is a blocking scenario, and participants must wait for coordinator to recover. If a new coordinator is elected and send abort command, then there could be a contradiction when the failed participant restarts, as it might have received a “commit” command before going down.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What are the 2 recovery strategies adopted by participants during blocking coordinator failure in 2PC protocol?

A
  1. Ask for coordinator decision
    participant will call getDecision() on the coordinator at set intervals until a return message is obtained
  2. Elect a new coordinator
    participants will elect a new coordinator server who will order a mass abort and restart the 2PC protocol.
    In case of another coordinator failure, this election can be repeated again.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Explain how the 3PC protocol works in distributed transactions? Answer in terms of what coordinator and participants do.

A

It is non-blocking in failure situations, but there is a greater cost due to more messages being handled.

Coordinator and participants can independently commit their local transactions in case of failure

No need for participants to wait for recovery of crashed coordinator

There are 3 phases
1. The voting phase
Coordinator sends out vote request. If any participant votes “no”, coordinator will order mass abort, otherwise participant moves to READY state.

  1. The pre-commit phase
    The coordinator asks participants who voted yes to send acknowledgment (ACK), if they do then they move to PREPARED state
  2. The commit phase
    participant commits its part and send a notification back to the coordinator.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

How are timeout and failure recovery handled in 3PC protocol? Answer in terms of Coordinator and Participant

A

Coordinator
1. Failure during the voting phase - participants are late to send vote so coordinator orders all to abort
2. Failure before sending the ACK request - everyone will unilaterally abort
3. Coordinator fails just before sending a commit() message - after recovery, coordinator will commit, and after timeout participants will commit as well

Participants
1. Failure due to no vote request received - unilateral abort
2. Failure due to late acknowledgement request - unilateral abort
3. Participant fails to send ACK back to coordinator, unilateral abort

NOTE : any failure after the ACK message has been send will result in all parties to timeout and unilaterally COMMIT

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What are the 4 Coffman conditions for a deadlock to occur?

A
  1. Mutual Exclusion - a resource is held by at most one process. This guarantees Consistency
  2. Hold & Wait - processes already holding resources might have to wait for another process to complete
  3. Non-preemption - once a resource has been granted to a process, it cannot be taken away. This guarantees Consistency
  4. Circular wait - process A holds resource X, process B holds resource Y. A needs Y to complete and B needs X to complete.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

What is one way to detect deadlocks?

A

Construct a wait-for graph, where nodes are transactions, edges represent the “waiting for” relationship between transactions, and a cycle in the graph represents a deadlock

27
Q

What are two ways to handle deadlocks?

A
  1. Choose a transaction to abort either by its age (oldest will be aborted) or by how many resources it’s currently locking.
  2. Deadlock timeout - each lock held by a transaction has a period of time where it cannot be broken. If this resource is causing a deadlock, the lock will be released after this time period.
28
Q

What is one method to prevent deadlocks?

A

Locking all objects at the very start of a transaction in one atomic action. While this prevents deadlocks from arising, it drastically reduces concurrency.

29
Q

Explain centralized detection of distributed deadlocks?

A

A server is elected as the deadlock detector.

Each other server intermittently submits its
local wait-for graph.

The detector then constructs a global wait-for graph and checks for distributed deadlocks.

This solution doesn’t scale well, and in addition if the detector server fails there is no way of detecting and handling distributed deadlocks.

30
Q

Explain what phantom deadlocks are?

A

These occur when a deadlock has been resolved locally, but the global wait-for graph has not been updated on time due to a delay. In this case, the detector server might unnecessarily abort transactions trying to solve a deadlock which isn’t there.

31
Q

Explain how edge chasing works to detect distributed deadlocks?

A

It does not require the construction of a global wait-for graph.

It uses probes, which are messages sent across virtual paths along a virtual wait-for graph.

At each server, the probe records which transaction is using which resources. If a reference appears twice in the probe then a loop (deadlock) is detected.

Edge chasing consists of the Initiation, Detection, and Resolution steps.

32
Q

What is one technique to provide reliability and availability?

A

Replication is the maintenance of copies of data objects on multiple servers

it provides performances enhancement via client side caching and maximizes the availability of service.

Assume there are n servers holding copies of data due to replication. If p is the probability of one of them failing, the overall availability of the system is a = 1 - p^n.

33
Q

What are Byzantine failures in distributed systems?

A

These are failures whose source cannot be identified. It can be mitigated against using a voting scheme :

If up to X servers can exhibit a byzantine failure, we need a voting scheme where the majority (2X + 1) servers validate the correctness of the service provided by X

34
Q

Explain what Passive Replication is?

A

There is a single primary replica manager, and multiple secondary backup managers.

The primary manager then performs operations on data and updates the backup managers.

Client requests are handled in the following way :
1. Request
2. Coordination
3. Execution
4. Agreement
5. Response

35
Q

Explain what Active Replication is?

A

All replica managers act on the same level. Client requests are multicast to these managers as a group. All managers process request and all respond to the client.

36
Q

How does 2PC protocol behave in Passive Replication?

A

A single primary replica manager can take an eager or lazy strategy :

  1. Eager - it involves all its backup managers
  2. Lazy - it does the commit itself and then communicates the result to the backup managers
37
Q

How does 2PC protocol behave in Active Replication?

A

transactions are serialized :

Read() operations are performed by a single replica manager.

Write() operations are performed by all replica managers.

The front-end server may communicate with any replica in the group.

38
Q

Explain what buffer overflow is?

A

It’s a pointer manipulation attack by which a user will enter a longer than expected string, which will then be copied onto a fixed length buffer onto the stack, and return a different pointer than expected.

It is mostly an issue in the C language, and can be mitigated against by checking for the length of the output before copying it to the buffer

39
Q

Name and explain 3 possible security issues with Java programming?

A

Possible answers include

  1. Exception Handling - ensure the program is left in a correct state when handling exceptions
  2. Passwords - do not store text passwords since they can be retrieved from compiled files
  3. Inheritance - ensure that you know why methods are contained within a class you are inheriting
  4. Mutability - use decorators to wrap class instances and make them immutable. You can also override the clone() method to prevent any attack
  5. Avoid inner classes - these can mess up private declarations, as well as make initialisation, serialisation, and cloning possible attack vessels
40
Q

Explain what SQL injection is and how to mitigate against it?

A

It’s a code injection attack by which the user will insert SQL statements into input fields which will execute on the database.

This can be mitigated against by using prepared statements and query batching

41
Q

Name and explain 2 web issues concerning secure programming?

A
  1. Naming issues - blacklisted names can be bypassed using font encodings. Can be mitigated against by using a list of allowed names instead
  2. Cross Site Scripting (XSS) - works by injecting malicious JS payload onto a server which will then run on a victim’s browser. Attacker usually aims to steal victim’s session cookies. Can be mitigated against by HTML encoding/escaping and selective tag filtering
42
Q

What are threat trees? Explain their purpose, and their structure.

A

These are used by companies to describe the security of a system in terms of its vulnerabilities.

Root nodes represent the attacker’s main goal.

Subsequent non-leaf nodes represent sub goals the attacker might need to achieve to achieve the root goal

Leaf nodes are possible attacks required to achieve parent node’s goal

Nodes can be :
OR - any child node must be achieved
AND - all child nodes must be achieved

43
Q

What is the most common label that threat tree nodes tend to have? Answer in terms of OR and AND nodes.

A

The most common annotation is Possible or Impossible. This is used by companies to save money by investing only in threats which are more likely to occur.

For OR nodes
- attack is possible if any child node is possible
- attack is impossible if all child nodes are impossible

For AND nodes
- attack is possible if all child nodes are possible
- attack is impossible if any child node is impossible

44
Q

What is ALE in security and how is it used?

A

Annual Loss Expectancy = Damage * Likelihood

It provides insight into which threats are worth investing against

45
Q

What is a cipher? Name two types of ciphering algorithms?

A

Ciphers are algorithms that scramble plain text using a key in a form which hides the original plaintext meaning.

The two types of ciphers are :

Transposition ciphers - plaintext letters are reordered according to a particular scheme.

Substitution ciphers - plaintext letters are mapped to other letters or group of letters in the same alphabet

46
Q

This question is about Columnar Transposition Ciphers.

Explain how the encoding process works by encoding the plaintext “I love secure messaging” using the key “secret”.

Explain how the decoding process works by decoding the cipher text “ous lcsg eeg vra ieen smi” using the same key.

A

Encoding
1. layout one column per letter of the key, and number them by their alphabetical order
2. write down message along the rows, wrapping to next row when overflow
3. list and concatenate the columns based upon their ordering
Answer is - ous lcsg eeg vra ieen smi

Decoding
1. Calculate number of rows by rounding up the value of cipherLength/keyLength
2. Calculate number of chars in last row with (1) * keyLength - cipherLength
3. Populate the column according to key letter ordering

Answer is - I love secure messaging

47
Q

Explain the 3 types of mono-alphabetic substitution ciphers : Cesar, key-phrase sub, random sub.

A
  1. Cesar cipher takes in a key “n” and transposes the alphabet to the left by n positions. This creates a one-to-one mapping between the letters. Can be decoded by right shifting the alphabet by “n”. For n letters, there are n-1 possible cipher alphabets
  2. Key-phrase substitution tables - these prepend the keyword to the alphabet and ensure each letter only appears once. Then a mapping is created like before.
  3. Random substitution table - these work by mapping the alphabet to a random permutation of the original one. For n letters, there are n! Possible mappings
48
Q

What is a technique used to break mono-alphabetic substitution ciphers?

A

Frequency analysis, it works by exploiting the fact that the frequency of letters is preserved by mono-alphabetic encoding. Attackers can iteratively guess the encoding of each letter given their frequency.

49
Q

The Vigenere Cipher is a well known poly-alphabetic substitution cipher. Explain how the encoding process works. Explain how the decoding process works as well.

A
  1. The the plaintext alphabet and create a table where the rows are gradually shifting the alphabet left.
  2. Create a keystream using the key. This is done by appending the letters of the key until keystreamLength = plaintextLenght
  3. For each letter in the plain text, use the letter to get the column index of the table, and the letter in the same keystream position to get the row. The letter in that cell will be the mapped cipher letter.

The decoding process is similar, just reverse the look up indices for each letter in the ciphertext. if “D” is in the cipher text and its corresponding keystream value is “F”, find row D, then travel alongside it until you reach F -> its plaintext letter is the column in which F resides.

Note - this technique is better understood via an example

50
Q

Explain what Digital Signatures are? What is a Message Digest?

A

A digital signature is a mathematical scheme which demonstrates the authenticity of a digital message

A message digest is the hashed value which maps to a hashed block of data (from the plaintext), and is an application of cryptographic has functions.

51
Q

Explain briefly how symmetric key cryptography works? What is the main problem with this method?

A

A single key is used for both encryption and decryption, hence why it’s called symmetric.

A problem arises when this key has to be securely transmitted between sender and receiver.

52
Q

Explain how the Diffie-Hellman Key Exchange mechanism works?

A

Allows sender and receiver to calculate key through information they can securely exchange. Here is an example

let p be a large prime > 2, and g be an integer such that g < p. Both sender and receiver agree on what these numbers are

The sender chooses a random integer a, and the receiver chooses a random integer b.

The sender calculates A = g^a mod p, and the receiver calculates B = g^b mod p.

The sender then uses B to calculate the key through Key = B^a mod p, and the receiver uses A in the same way - Key = A^b mod p.

53
Q

What are the 3 Shannon Design Characteristics for a high quality cipher?

A
  1. Confusion - the relationship between the key and the ciphertext should be as complicated as possible.
  2. Diffusion - the relationship between the plaintext and the ciphertext should be as complicated as possible
  3. Avalanche Effect - a change in one bit in the input to a cipher should change at least half the bits in the output in a pseudo random way.
54
Q

How does the Mixing Transformation Cipher proposed by Shannon work?

A

It alternates ciphers which implement confusion (substitution) with ciphers which implement diffusion (transpositions/permutation). These are implemented with layers of S-Boxes alternating with a P-Box.

55
Q

The Feistel Network is a type of iterated block cipher. Explain how it works?

A

Input is a block of plaintext of size 2w bits.

The key is made up of n subkeys, K = k1, k2, …, kn.

For n rounds, perform the following.
- split the input into left and right blocks.
- apply a function F( ki, Ri )
- perform XOR between Li and Fi.
- Swap and concatenate them to form the input to be fed into the next layer.

56
Q

Explain how asymmetric key cryptography works?

A

Both senders have a public key, and both have their own respective private key.

A message encrypted with a public key can only be decrypted using the private key, and vice versa.

57
Q

Explain how RSA encryption works? Answer in terms of what keys are needed and how they can be used for encryption and decryption.

A

A ciphertext C is the result of encrypting a plaintext block M using a public key { n, e }.
C = M^e mod n

A plaintext block M is obtained by decrypting a ciphertext block C using a private key { d }
M = C^d mod n

58
Q

Explain how Key Generation is carried out in RSA? Hint : p, q, n, e, d.

A

Public key is composed of n and e
Private key is composed of n and d

  1. n = p * q, where p and b are large primes
  2. compute %(n) = (p-1)(q-1) - Euler totient.
  3. choose e such that gcd(e, %(n)) = 1 and e < %(n)
  4. choose d such that d*e = 1 mod %(n)
59
Q

What are the two methods to do RSA cryptanalysis?

A
  1. Brute force the secret key, since the public key can be easily accessed.
  2. Factorize the public key n such that to obtain primes p and q. Extremely difficult for large values.
60
Q

Explain how Digital Signatures work?

A

It’s an application of asymmetric encryption.

A sender signs a message using its own private key.

A receiver uses the sender’s public key to authenticate the message.

It works since only the owner of the public key could have encrypted the message.

61
Q

Explain how the Creation of a Message and Verification steps work in Digital Signatures?

A

Message Creation
1. A digital signature for a particular message is created by encoding a message with the sender’s private key.

Verification
1. A transmitted digital signature can be verified using a public key, since it uniquely identifies the sender.

62
Q

Digital Signatures provide a way to prevent against message interception by eavesdroppers as well as providing a way to authenticate the origin of a message. True of False?

A

False - Digital signatures do not prevent messages from being intercepted. They are only concerned with proving the authenticity of a message, and that it came from the expected sender.

63
Q

Explain what a Message Digest is in Digital Signatures? Answer in terms of Sender and Receiver.

A

A message digest is encrypted with the private key to create a unique signature for that message.

Sender
1. Creates signature from message digest
2. Appends signature to message and sends it.

Receiver
1. Recreates message digest from the received message.
2. Decrypts the signature using the sender’s public key.
3. Compares the two results to check the origin of the message.