VL 9 Flashcards

1
Q

Storage devices for VMs

A

• Instance volumes: disks/SSDs attached to the physical server
• Optimized for high IOPS rates
• Lost when the VM is stopped
• EBS volumes: service providing volumes (storage area network)
• Can be mounted by only a single VM at a time, thus not usable for sharing information
• Maximum size 16 TB
• Survive stopping or termination of the VM
• Boot device is lost when the VM is terminated (but it can be configured to be kept)

2
Q

Cloud Storage types

A

• Object store (S3)
• Shared file system (NAS) (EFS)
• Relational database (RDS)
• NoSQL database (DynamoDB)
• Data warehouse (Redshift)
• Time series, ledger, graph, … databases

3
Q

Characteristics of cloud storage systems

A

Voluminous data
Commodity hardware
Distributed data
Expect failures
Processing by application
Optimization for dominant usage

4
Q

CAP Theorem

A

Consistency, availability, and partition tolerance cannot all be achieved together in a distributed system
Consistency: a read returns the last written value
Availability: all requests are answered in acceptable time
Partition tolerance: the system continues working even if some nodes are separated
Practical systems choose AP or CP
AP systems apply eventual consistency: consistency is provided only after a certain time

5
Q

Object storage: AWS S3

A

Simple Storage Service (S3):
Data is spread across at least three data centers in a region
Mostly used for backup
Data management: two-level hierarchy of buckets and data objects
Data objects can be searched by name, bucket name, and metadata, but not by content

6
Q

AWS S3

A

Storage classes:
Standard
Reduced Redundancy
Intelligent-Tiering
Glacier
Deep Archive
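A minimal boto3 sketch of writing an object directly into one of these classes; the bucket name, key, and local file are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Upload an object straight into the Glacier storage class instead of
# the default Standard class (bucket and key are hypothetical).
with open("backup.tar.gz", "rb") as f:
    s3.put_object(
        Bucket="my-backup-bucket",
        Key="backups/2024-01-01.tar.gz",
        Body=f,
        StorageClass="GLACIER",
    )
```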

7
Q

AWS S3

A

Data access: data objects can’t be modified
Versioning: object uploaded -> new version created
Object deleted -> only marked as deleted
Lifecycle: consists of rules that trigger two types of actions
Transition actions: migration of objects to another storage class
Expiration actions: define when objects expire and can be deleted by S3
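A minimal boto3 sketch of such a lifecycle configuration, combining one transition action and one expiration action (bucket name, prefix, and day counts are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                # Transition action: migrate objects to Glacier after 90 days.
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                # Expiration action: objects expire after 365 days.
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```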

8
Q

Consistency AWS S3

A

Create new object: the key becomes visible only after all replicas have been written
Updating/deleting: read operations return the latest version of the object
Simultaneous puts: last write wins
Atomic puts to multiple keys are not supported

9
Q

Security AWS S3

A

Authentication via PKI
Access Control Lists on bucket
Contents can be encrypted

10
Q

Google File System requirements

A

Survive failures of components
Files are huge
Most writes are appends at the end of files
Optimized to support all common operations
Support for concurrent modifications

11
Q

Google File System Architecture

A

Single master server and many chunk servers
Master holds metadata in main memory
Multiple shadow masters handle client reads

Directory structure is implemented as a lookup table mapping pathnames to metadata; no inodes

12
Q

Google File System replication

A

3 replicas by default, but the replication factor can be adapted

13
Q

Google File System failure detection

A

Master exchanges heartbeat messages with the chunk servers

14
Q

Google File System Data access

A

Clients first contact the master but then interact directly with the chunk servers
One of the 3 chunk servers is selected as primary and is responsible for updating the replicas

15
Q

Google File System data integrity

A

Data integrity: each chunk server keeps a checksum
Consistency: system allows concurrent writes and appends to chunks

16
Q

Google File System Consistency

A

System allows concurrent writes and appends to chunks
Data pushed to all replica servers

17
Q

Google File System metadata

A

Master server contains metadata about all chunks
Each chunk server stores metadata and checksums for its chunks

18
Q

Google File System interactions for writes

A
  1. Client asks the master for all chunk servers holding the chunk
  2. Master grants a new lease on the chunk, increases the chunk version number, and tells all replicas to do the same; it then replies to the client. The client no longer has to talk to the master
  3. Client pushes data to all servers, not necessarily to the primary first
  4. Once the data is acknowledged, the client sends the write request to the primary. The primary decides the serialization order for all incoming modifications and applies them to the chunk
19
Q

Google File System interactions for writes (continued)

A
  5. After finishing the modification, the primary forwards the write request and serialization order to the secondaries, so they can apply the modifications in the same order. (If the primary fails, this step is never reached.)
  6. All secondaries reply back to the primary once they finish the modifications
  7. Primary replies back to the client, either with success or an error (the full write path is sketched below)
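Putting cards 18 and 19 together, an illustrative pseudocode sketch of the whole write path; every name here is invented for the sketch, not a real GFS API:

```python
def gfs_write(master, chunk_handle, data):
    # Steps 1-2: one round trip to the master, which grants a lease to a
    # primary, bumps the chunk version number on all replicas, and
    # returns the replica locations.
    primary, secondaries = master.grant_lease(chunk_handle)

    # Step 3: push the data to all chunk servers, in any order; the
    # servers only buffer it, nothing is applied yet.
    for server in [primary] + secondaries:
        server.buffer_data(chunk_handle, data)

    # Step 4: the primary picks a serialization order for all concurrent
    # modifications and applies them to its own chunk.
    order = primary.apply_buffered(chunk_handle)

    # Steps 5-6: the primary forwards the request and the serialization
    # order to the secondaries and waits for their acknowledgements.
    acks = [s.apply_buffered(chunk_handle, order) for s in secondaries]

    # Step 7: report success only if every secondary applied the change.
    return all(acks)
```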
20
Q

Google File System interactions for appends

A

In step 4, the primary checks whether appending the record to the current chunk would exceed the maximum chunk size (64 MB)
If yes: it pads the chunk, notifies the secondaries to do the same, and tells the client to retry the request on the next chunk
A record append is restricted to 1/4 of the maximum chunk size (so padding wastes at most 16 MB)

If a record append fails at any of the replicas, the client must retry
Successful record append: the data must have been written at the same offset on all replicas of the chunk
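A small pseudocode sketch of the padding check the primary performs in step 4 (all names invented for illustration):

```python
CHUNK_SIZE = 64 * 2**20        # maximum chunk size: 64 MB
MAX_APPEND = CHUNK_SIZE // 4   # record appends limited to 16 MB

def handle_record_append(chunk, record, secondaries):
    assert len(record) <= MAX_APPEND
    # Would this record overflow the current chunk?
    if chunk.used + len(record) > CHUNK_SIZE:
        chunk.pad_to_end()                # waste at most 16 MB of padding
        for s in secondaries:
            s.pad_to_end(chunk.handle)    # replicas pad identically
        return "RETRY_ON_NEXT_CHUNK"      # client retries on a fresh chunk
    offset = chunk.used                   # same offset on all replicas
    chunk.write_at(offset, record)
    return offset
```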

21
Q

Google File System: limitations

A

Scalability of the single master
Solutions: partitioning of the file system and development of a distributed master
64 MB chunk size
No latency guarantees

22
Q

AWS Elastic File System

A

Distributed file system
Capacity: unlimited file system size, individual files up to 48 TB
Throughput scales with file system size

23
Q

AWS EFS

A

POSIX-compliant shared file storage
Automatic provisioning of storage capacity
Integrated with lifecycle management
Security: access control through POSIX permissions, Amazon VPC, and AWS IAM
Close-to-open consistency

24
Q

Relational Database

A

Designed for vertical scaling
ACID properties of transactions:
Atomicity: a set of operations either completes successfully or changes nothing
Consistency
Isolation
Durability
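A minimal sketch of atomicity using Python's built-in sqlite3: a transfer either applies both updates or neither (table and values are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(amount, fail=False):
    with conn:  # transaction: commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - ? "
                     "WHERE name = 'alice'", (amount,))
        if fail:
            raise RuntimeError("simulated crash mid-transaction")
        conn.execute("UPDATE accounts SET balance = balance + ? "
                     "WHERE name = 'bob'", (amount,))

try:
    transfer(50, fail=True)
except RuntimeError:
    pass

# The first update was rolled back too: balances are unchanged.
print(conn.execute("SELECT name, balance FROM accounts").fetchall())
```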

25
Q

AWS Relational Database Service

A

Provides standard relational DBs (PostgreSQL, MySQL)
Configure multi-AZ installation for automatic failover
Configure multiple read replicas

26
Q

Amazon Aurora

A

Relational DB
Replication: 6 copies of the data replicated across 3 availability zones
Up to 15 read replicas can be configured
Automatic backups to S3
Automatic storage scaling

27
Q

Features of NoSQL databases

A

Schema free
Support for non-relational data
Designed for horizontal scaling: automatic distribution
Auto-replication and caching

28
Q

Types of NoSQL DBs

A

Key-value database
Document oriented
Column family database
Graph database

29
Q

Amazon DynamoDB

A

Key-value database
Optimized for small requests, quick access, high availability
Serverless service
Fault tolerant
Automatic scaling of tables
Support for ACID transactions
Encryption by default
Fine-grained access control for tables

30
Q

DynamoDB

A

Decentralized architecture and eventual consistency semantics
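In DynamoDB's API this surfaces as two read modes: reads are eventually consistent by default, and a strongly consistent read can be requested per call. A minimal boto3 sketch; the table name "users" and its key "user_id" are hypothetical:

```python
import boto3

table = boto3.resource("dynamodb").Table("users")

table.put_item(Item={"user_id": "42", "name": "alice"})

# Default read: eventually consistent, may not reflect the write yet.
maybe_stale = table.get_item(Key={"user_id": "42"})

# Strongly consistent read: returns the most recent write.
fresh = table.get_item(Key={"user_id": "42"}, ConsistentRead=True)
print(fresh.get("Item"))
```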

31
Q

DynamoDB partitions

A

Tables are stored in partitions

32
Q

Management of partitions

A

Mapping keys to partitions
Mapping partitions to nodes
Virtual nodes are assigned to physical nodes (see the sketch below)
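A minimal consistent-hashing sketch of these mappings, assuming a hash ring with virtual nodes (node names and parameters are invented; this is not the actual DynamoDB implementation):

```python
import bisect
import hashlib

def ring_pos(s: str) -> int:
    # Hash a string to a position on the ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

VNODES_PER_NODE = 8
ring = sorted(
    (ring_pos(f"{node}#{v}"), node)            # virtual node -> physical node
    for node in ["node-a", "node-b", "node-c"]
    for v in range(VNODES_PER_NODE)
)
positions = [pos for pos, _ in ring]

def node_for_key(key: str) -> str:
    # A key belongs to the first virtual node clockwise from its position.
    i = bisect.bisect(positions, ring_pos(key)) % len(ring)
    return ring[i][1]

print(node_for_key("user#42"))
```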

33
Q

Dynamo: Replication

A

Replication (N, R, W):
Data is replicated on N consecutive nodes
A read is successful if it succeeds on R copies
Likewise, a write is successful if it succeeds on W copies
Typical configuration: (3, 2, 2)
R + W > N ensures the most recent information is returned

Can be used to configure the SLA requirements of the service:
N determines durability
R and W determine latency
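A toy worked example of a quorum read under the typical (3, 2, 2) configuration, using in-memory (version, value) pairs instead of a real Dynamo client:

```python
N, R, W = 3, 2, 2
assert R + W > N  # read and write quorums overlap in at least one node

# Each replica holds (version, value); replica 0 missed the latest write.
replicas = [(1, "old"), (2, "new"), (2, "new")]

def quorum_read(replies):
    # Because R + W > N, at least one of the R replies carries the latest
    # version; return the value with the highest version number.
    version, value = max(replies)
    return value

print(quorum_read(replicas[:R]))  # reads replicas 0 and 1 -> "new"
```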

34
Q

DynamoDB: Failures

A

Failure detection via a gossip protocol: nodes periodically exchange state information with random peers
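A toy simulation of such gossip (invented node names, not DynamoDB internals): each node merges failure information with one random peer per round, so the news spreads epidemically.

```python
import random

nodes = ["a", "b", "c", "d"]
# Each node's set of peers it believes have failed; only "a" knows so far.
suspected = {n: set() for n in nodes}
suspected["a"].add("d")

for _ in range(3):  # a few gossip rounds
    for node in nodes:
        peer = random.choice([p for p in nodes if p != node])
        # Exchange and union failure information with one random peer.
        merged = suspected[node] | suspected[peer]
        suspected[node] = suspected[peer] = merged

print(suspected["c"])  # very likely {"d"} after a few rounds
```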