2. Cloud Flashcards

Question 1

Q

What is cloud computing?

Answer

A

Cloud computing is the on-demand delivery of computing resources (compute, storage, networking) over the Internet

Question 2

Q

What is the fat tree design?

Answer

A

Network design for datacenters:
- Three-tier design: edge, aggregation, core
- Defined by a single parameter k = number of ports per switch
- All layers use identical k-port switches
- Supports k³/4 hosts
- High redundancy: k²/4 shortest paths between two hosts in different pods
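The sizes above can be checked with a small calculation. This is a minimal sketch, assuming the standard k-ary fat-tree construction (k pods, each with k/2 edge and k/2 aggregation switches, plus (k/2)² core switches); the function name `fat_tree_stats` and the per-layer counts are illustrative and not part of the card.

```python
def fat_tree_stats(k: int) -> dict:
    """Sizes of a k-ary fat tree built from identical k-port switches."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even number")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 edge switches per pod
        "aggregation_switches": k * half,  # k/2 aggregation switches per pod
        "core_switches": half * half,      # (k/2)^2 core switches
        "hosts": k ** 3 // 4,              # k/2 hosts per edge switch
        "paths_between_pods": half * half, # one shortest path per core switch
    }


if __name__ == "__main__":
    print(fat_tree_stats(4))
    print(fat_tree_stats(48)["hosts"])
```

For k = 4 this gives 16 hosts and 4 paths; for k = 48 it gives 27648 hosts.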

Question 3

Q

What is the jellyfish network design?

Answer

A

Abandon a fixed network structure and connect switches at random:
- Each switch with 4L ports connects to
– L hosts
– 3L other randomly chosen switches
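The random wiring can be sketched as follows. This is a simplified illustration, not the published construction: it keeps joining random pairs of switches that still have free ports and are not yet linked, and it simply leaves any leftover ports unused. The function name `build_jellyfish` and the `seed` parameter are assumptions made for the example.

```python
import random


def build_jellyfish(num_switches: int, L: int, seed: int = 0):
    """Randomly wire switches that each have 4L ports: L for hosts, 3L for switches."""
    rng = random.Random(seed)
    free_ports = {s: 3 * L for s in range(num_switches)}
    links = set()
    while True:
        # All unlinked switch pairs where both ends still have a free port.
        candidates = [
            (a, b)
            for a in range(num_switches)
            for b in range(a + 1, num_switches)
            if free_ports[a] > 0 and free_ports[b] > 0 and (a, b) not in links
        ]
        if not candidates:
            break
        a, b = rng.choice(candidates)
        links.add((a, b))
        free_ports[a] -= 1
        free_ports[b] -= 1
    # The remaining L ports of every switch go to hosts.
    hosts = {s: [f"host-{s}-{i}" for i in range(L)] for s in range(num_switches)}
    return links, hosts


if __name__ == "__main__":
    links, hosts = build_jellyfish(num_switches=8, L=1, seed=1)
    print(sorted(links))
```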

Question 4

Q

What is the CAP theorem?

Answer

A

In a distributed system, at most two of the following three properties can be satisfied at the same time:
1. Consistency: all nodes see the same data at any time
2. Availability: the system accepts operations at all times
3. Partition tolerance: the system continues to work in spite of network partitions

Question 5

Q

How does Cassandra handle the CAP theorem?

Answer

A

Cassandra favors availability and partition tolerance and accepts weak (eventual) consistency

Question 6

Q

What are the characteristics of Cassandra?

Answer

A

Key-value store
NoSQL
Supports get(key) and put(key, value) operations

Question 7

Q

How is data stored in Cassandra?

Answer

A

Data is stored as key-value pairs
Nodes form a ring; the key is hashed to determine its position on the ring (DHT)
Similar to Chord
Each item is replicated on n nodes
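The ring lookup can be sketched as below. This is a minimal illustration, assuming MD5 as the hash function and simple successor placement (the first node clockwise from the key's hash, followed by the next n-1 nodes on the ring); the names `token` and `Ring` are hypothetical and not Cassandra's API.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a node name or key to a position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, n_replicas):
        self.n_replicas = n_replicas
        # Nodes sorted by their position on the ring.
        self.entries = sorted((token(node), node) for node in nodes)

    def replicas(self, key: str):
        """First node clockwise from the key's hash, plus its successors."""
        positions = [pos for pos, _ in self.entries]
        start = bisect.bisect_right(positions, token(key)) % len(self.entries)
        count = min(self.n_replicas, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


if __name__ == "__main__":
    ring = Ring(["n1", "n2", "n3", "n4", "n5"], n_replicas=3)
    print(ring.replicas("user:42"))
```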

Question 8

Q

What are the replica policies in Cassandra?

Answer

A

Rack Unaware: replicate data on the n-1 successor nodes on the ring
Rack Aware: the coordinator tells each node the ranges it is a replica for
Datacenter Aware: same as Rack Aware, but at the datacenter level

Question 9

Q

How does a write operation in Cassandra work?

Answer

A

The node's partitioner (hash function) determines the node responsible for the key
The write is appended to the on-disk commit log
The in-memory memtable is updated
When a memtable is old or full, it is flushed to disk
– Data file, index file
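The write path on a single node can be sketched as follows. This is a toy model under stated assumptions: Python lists and dicts stand in for the on-disk commit log and the flushed data and index files, only the "full" trigger is modeled (not memtable age), and the names `Node`, `memtable_limit` and `flushed_tables` are hypothetical.

```python
class Node:
    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []      # stands in for the on-disk commit log
        self.memtable = {}        # in-memory table of recent writes
        self.flushed_tables = []  # each entry: (data file, index file)
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. flush when the memtable is full

    def flush(self):
        data = sorted(self.memtable.items())                   # data file, sorted by key
        index = {key: pos for pos, (key, _) in enumerate(data)}  # index file: key -> offset
        self.flushed_tables.append((data, index))
        self.memtable = {}


if __name__ == "__main__":
    node = Node(memtable_limit=2)
    node.write("b", 1)
    node.write("a", 2)
    print(node.flushed_tables)
```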

Question 10

Q

How do Bloom filters work and what are they used for in Cassandra?

Answer

A

Bloom filter: a bit map plus a set of hash functions.
- Each hash function maps a given key to a bit position; together the positions form the key's fingerprint:
– h(x) = y -> BIT[y] = 1
- Used to check whether data may be present on a node
- Can produce false positives, but never false negatives
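A Bloom filter along these lines can be sketched in a few lines. This is a minimal illustration, assuming the set of hash functions is derived from SHA-256 with different prefixes; the bit-map size, the number of hash functions and the class name `BloomFilter` are arbitrary choices for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size  # the bit map

    def _positions(self, key: str):
        # One position per hash function: h_i(x) = y
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key: str):
        for y in self._positions(key):
            self.bits[y] = 1  # BIT[y] = 1

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True may be a false positive.
        return all(self.bits[y] for y in self._positions(key))


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add("alice")
    print(bf.might_contain("alice"))
```

A key that was added always passes the check, so a negative answer lets a node skip the disk lookup entirely.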

Question 11

Q

How is a delete operation done in Cassandra?

Answer

A

The item is not deleted right away
A tombstone is added to the item instead
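Tombstone-based deletion can be sketched as below. This is a toy model, not Cassandra's implementation: the `compact` step that finally drops tombstoned items is an assumption added for illustration, and the names `Store` and `TOMBSTONE` are hypothetical.

```python
TOMBSTONE = object()  # marker meaning "this item was deleted"


class Store:
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def delete(self, key):
        # Do not remove the item; mark it with a tombstone instead.
        self.data[key] = TOMBSTONE

    def get(self, key):
        value = self.data.get(key)
        return None if value is TOMBSTONE else value

    def compact(self):
        # Assumed cleanup step: physically drop tombstoned items later.
        self.data = {k: v for k, v in self.data.items() if v is not TOMBSTONE}


if __name__ == "__main__":
    store = Store()
    store.put("k", "v")
    store.delete("k")
    print(store.get("k"), "k" in store.data)
```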

Question 12

Q

How is a read operation done in Cassandra?

Answer

A

Fetch the data from the closest replica
Also fetch from multiple other replicas
– If the data differs, initiate a read repair
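The read-repair step can be sketched as follows. This is a minimal illustration, assuming each replica is a dict that maps a key to a (timestamp, value) pair and that the newest timestamp wins; replica closeness is not modeled, and the function name `read_with_repair` is hypothetical.

```python
def read_with_repair(replicas, key):
    """Read a key from all replicas and bring stale replicas up to date.

    replicas: list of dicts mapping key -> (timestamp, value).
    """
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # read repair: overwrite stale or missing data
    return newest[1]


if __name__ == "__main__":
    r1 = {"k": (1, "old")}
    r2 = {"k": (2, "new")}
    r3 = {}
    print(read_with_repair([r1, r2, r3], "k"))
    print(r1, r3)
```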

Question 13

Q

How is the potential speed-up of parallelization computed?

Answer

A

Amdahl's law (upper bound):
n = number of processors
p = portion of the program that is parallelizable

S = 1 / ((1-p) + p/n)
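As a worked example, the formula can be evaluated directly. This is a small sketch; the function name `amdahl_speedup` and the sample values are chosen for illustration only.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Upper bound on speed-up: S = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


if __name__ == "__main__":
    # 90% parallelizable on 10 processors: 1 / (0.1 + 0.09)
    print(round(amdahl_speedup(0.9, 10), 2))
    # Even with very many processors the bound approaches 1 / (1 - p) = 10
    print(round(amdahl_speedup(0.9, 1_000_000), 2))
```

With p = 0.9 and n = 10 the bound is about 5.26; the serial 10% caps the speed-up at 10 no matter how many processors are added.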

Question 14

Q

Describe the two methods of parallelization in cloud computing

Answer

A

Request Level Parallelism (RLP):
- Concurrent processing of multiple requests: e.g. Google
– Distribute indexing, images, documents, ads, … to multiple nodes
Data Level Parallelism (DLP):
- Concurrent processing of multiple data items: e.g. MapReduce
– Distribute data across map and reduce nodes

Question 15

Q

Explain the main principle of MapReduce

Answer

A

Data is in key-value format
A chunk of data is processed by a Mapper (map function) into intermediate output
Intermediate output is assigned by the Partitioner to a Reducer (reduce function)
– Same intermediate key -> same reducer
The Reducer produces the final output
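The flow can be sketched with a word count that runs in a single process. This is a minimal illustration and not a real MapReduce framework: the function names and the simple byte-sum partitioner are assumptions made for the example.

```python
from collections import defaultdict


def mapper(chunk: str):
    """Map function: emit an intermediate (word, 1) pair per word."""
    for word in chunk.split():
        yield word.lower(), 1


def partition(key: str, num_reducers: int) -> int:
    """Partitioner: the same intermediate key always goes to the same reducer."""
    return sum(key.encode()) % num_reducers


def reducer(key: str, values):
    """Reduce function: combine all values of one key into the final output."""
    return key, sum(values)


def run_word_count(chunks, num_reducers: int = 2) -> dict:
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for chunk in chunks:  # each chunk would go to a separate mapper
        for key, value in mapper(chunk):
            buckets[partition(key, num_reducers)][key].append(value)
    output = {}
    for bucket in buckets:  # each bucket would go to a separate reducer
        for key, values in bucket.items():
            word, total = reducer(key, values)
            output[word] = total
    return output


if __name__ == "__main__":
    print(run_word_count(["a b a", "b c"]))
```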

Question 16

Q

What is the architecture of MapReduce?

Answer

A

Master-Worker architecture:
Master = Job Tracker (JT), Worker = Task Tracker (TT)
- TT pulls map or reduce tasks from JT
- TT periodically sends heartbeat to JT

Question 17

Q

How is fault tolerance implemented in MapReduce?

Answer

A

JT restarts a task if it doesn’t receive a heartbeat from the task's TT
JT reassigns all map or reduce tasks of the failed node to another node
JT identifies slow tasks (stragglers) by tracking their progress and runs them redundantly on a second node
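Heartbeat-based failure detection and task reassignment can be sketched as below. This is a toy model under stated assumptions: time is passed in explicitly as a number, straggler handling is left out, and the class and method names (`JobTracker`, `heartbeat`, `request_task`, `check_failures`) are hypothetical rather than Hadoop's API.

```python
class JobTracker:
    def __init__(self, tasks, timeout: float = 10.0):
        self.pending = list(tasks)  # tasks waiting to be pulled
        self.timeout = timeout      # max allowed silence between heartbeats
        self.last_heartbeat = {}    # task tracker -> time of last heartbeat
        self.assignments = {}       # task tracker -> set of running tasks

    def heartbeat(self, tracker: str, now: float):
        self.last_heartbeat[tracker] = now
        self.assignments.setdefault(tracker, set())

    def request_task(self, tracker: str):
        """A task tracker pulls the next pending task, if any."""
        if not self.pending:
            return None
        task = self.pending.pop(0)
        self.assignments.setdefault(tracker, set()).add(task)
        return task

    def check_failures(self, now: float):
        """Put the tasks of silent task trackers back into the pending queue."""
        for tracker, seen in list(self.last_heartbeat.items()):
            if now - seen > self.timeout:
                self.pending.extend(sorted(self.assignments.pop(tracker, set())))
                del self.last_heartbeat[tracker]


if __name__ == "__main__":
    jt = JobTracker(["m1", "m2"], timeout=10.0)
    jt.heartbeat("tt1", now=0.0)
    jt.request_task("tt1")
    jt.heartbeat("tt2", now=8.0)
    jt.check_failures(now=12.0)  # tt1 has been silent for 12 > 10
    print(jt.pending)
```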
