AWS Hello, Storage Concepts Flashcards
1
Q
Data Dimension
A
- The 3 V’s of big data: volume, velocity, and variety.
- Consider the storage mechanism most suitable for a particular workload. NOT a single data store for the entire system.
- Right tool for the right job
2
Q
Highly structured data
A
- Has a pre-defined schema.
- Ex: Relational database
- Each entity of the same type has the same number of attributes and the domain of allowed values for an attribute can be further constrained.
- Advantages: self-described nature
3
Q
Loosely structured data
A
- Has entities, which have attributes / fields
- A field uniquely identifies an entity
- However, attributes are not required to be the same in every entity
- Result: data is more difficult to analyze and process in an automated fashion. A higher burden of reasoning about the data falls on the consumer or application.
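A minimal Python sketch of the contrast between highly structured and loosely structured data. The schema, field names, and records are hypothetical and only illustrate the idea: a fixed schema can be checked automatically, while loosely structured records leave the handling of missing attributes to the consumer.

```python
# Highly structured: every record must match a predefined schema.
SCHEMA = {"id": int, "name": str, "age": int}  # hypothetical schema

def conforms(record, schema=SCHEMA):
    """Return True if the record has exactly the schema's fields with the right types."""
    return (record.keys() == schema.keys()
            and all(isinstance(record[k], t) for k, t in schema.items()))

# Loosely structured: each entity has an identifying field ("id"),
# but the other attributes differ from entity to entity.
loose_records = [
    {"id": 1, "name": "Ana", "age": 30},
    {"id": 2, "name": "Bo"},               # no "age" attribute
    {"id": 3, "email": "c@example.com"},   # different attributes altogether
]

def ages(records):
    """The consumer has to decide what a missing attribute means (here: None)."""
    return [r.get("age") for r in records]
```

Only the first record passes `conforms`; the other two are still valid loosely structured entities, but any code that reads them has to cope with the gaps.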
4
Q
Unstructured data
A
- Does not have any sense of structure.
- No entities or attributes
- Can contain useful information.
- Result: any useful information must be extracted by the consumer
5
Q
BLOB data
A
- Useful as a whole
- But there is little benefit in trying to extract value from a piece or attribute.
- Result: systems that store BLOBs treat each one as a “black box” to store/retrieve as a whole.
6
Q
Data Temperature
A
- Another useful way to look at data to determine the right storage for application
- Helps us to understand how “lively” data is (how much is being written/read and how soon it needs to be available)
- Ex: Hot, Warm, Cold, Frozen
- The same data can start hot and gradually cool.
- When this happens, tolerance of read latency increases as does data set size.
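As a rough illustration, data temperature could be estimated from how recently the data was accessed. The function name and the day thresholds below are assumptions made up for this sketch; real boundaries depend on the workload.

```python
def temperature(days_since_last_access):
    """Classify data by how recently it was accessed.

    The thresholds are illustrative assumptions, not AWS definitions.
    """
    if days_since_last_access < 1:
        return "hot"
    if days_since_last_access < 30:
        return "warm"
    if days_since_last_access < 365:
        return "cold"
    return "frozen"
```

The same object moves through these labels as it ages, which is why one data set may warrant different storage options over its lifetime.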
7
Q
Data value
A
- Some data must be preserved at all costs, other data can be easily regenerated or even lost without significant impact.
- Value of data will impact the investment in durability.
8
Q
Data value tip!
A
- To optimize cost and/or performance further, segment data within each workload by value and temperature, and consider different data storage options for different segments.
9
Q
Data dimensions tip!
A
- Think in terms of a data storage mechanism that is most suitable for a particular workload - not a single data store for the entire system. Choose the right tool for the job.
10
Q
Storage tip - One size does not fit all!
A
- Know the availability, level of durability, and cost factors for each storage option and how they compare.
11
Q
AWS Shared Responsibility Model and Storage
A
- AWS: responsible for securing the storage services
- Developer/customer: responsible for securing access to and using encryption on artifacts you create/store.
- Best practice to always use principle of least privilege.
12
Q
CIA model
A
- Confidentiality, Integrity, and Availability form the fundamentals of information security. These should be applied to AWS storage.
- Availability (1) sits on top of Integrity (2) and Confidentiality (3) to form “Information Security”
13
Q
EBS characteristics
A
- EBS presents data to EC2 instance as a disk volume.
- Provides lowest-latency access to your data from single EC2 instances.
- EBS provides durable, persistent block storage volume for use with EC2 instances.
- Automatically replicated within its AZ (offering high availability and durability)
- Offers consistent low-latency performance.
- Can scale up and down within minutes. Pay for what you provision
14
Q
Typical use cases for EBS
A
- Boot volumes on EC2 instances
- Relational / NoSQL databases
- Stream and log processing applications
- Data warehousing applications
- Big data analytics engines (Hadoop) and Amazon EMR clusters.
15
Q
EBS designed to achieve:
A
- Availability of 99.999%
- Durability of replication within a single AZ.
- Annual failure rate (AFR) between 0.1 and 0.2 percent
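To make these design targets concrete, the arithmetic can be sketched in Python. The fleet size of 1,000 volumes is an assumption chosen only for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years

def max_downtime_minutes(availability_percent):
    """Minutes per year a service may be unavailable at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

def expected_failures(volume_count, afr_percent):
    """Expected number of volume failures per year for a fleet of volumes."""
    return volume_count * afr_percent / 100

downtime = max_downtime_minutes(99.999)   # about 5.26 minutes per year
low = expected_failures(1000, 0.1)        # about 1 volume per year
high = expected_failures(1000, 0.2)       # about 2 volumes per year
```

So 99.999% availability allows roughly five minutes of downtime a year, and an AFR of 0.1 to 0.2 percent means about one or two failed volumes per year in a fleet of 1,000.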
16
Q
EBS Volume attributes
A
- Persist independently from the running life of an EC2 instance. (After EBS is attached to an instance, use it like any other physical hard drive.)
- Very flexible. (For current-generation volumes attached to current-generation instance types, you can dynamically increase size, modify provisioned input/output operations per second (IOPS) capacity, and change the volume type on live production volumes.)
17
Q
EBS Volume types
A
- SSD-backed volumes
- HDD-backed volumes
18
Q
SSD Use Cases
A
- GENERAL PURPOSE: recommended for most workloads.
  - System boot volumes
  - Virtual desktops
  - Low-latency interactive apps
  - Development and test environments
- PROVISIONED IOPS:
  - I/O-intensive workloads
  - Relational DBs
  - NoSQL DBs
19
Q
HDD Use Cases
A
- THROUGHPUT-OPTIMIZED:
  - Streaming workloads requiring consistent, fast throughput at a low price
  - Big data
  - Data warehouse
  - Log processing
  - Cannot be a boot volume
- COLD:
  - Throughput-oriented storage for large volumes of data that is infrequently accessed
  - Scenarios where the lowest storage cost is important
  - Cannot be a boot volume
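The SSD and HDD use cases above can be read as a simple decision rule. The function below is a hypothetical sketch of that rule, not an AWS API; its parameter names and the order of the checks are assumptions.

```python
def suggest_volume_type(boot=False, io_intensive=False,
                        streaming=False, infrequent_access=False):
    """Suggest an EBS volume category from coarse workload traits."""
    if io_intensive:
        return "Provisioned IOPS SSD"
    if boot:
        # HDD-backed volumes cannot be boot volumes, so stay on SSD.
        return "General Purpose SSD"
    if infrequent_access:
        return "Cold HDD"
    if streaming:
        return "Throughput Optimized HDD"
    # Recommended for most workloads.
    return "General Purpose SSD"
```

For example, `suggest_volume_type(streaming=True)` suggests a Throughput Optimized HDD, while `suggest_volume_type(boot=True, streaming=True)` falls back to General Purpose SSD because of the boot restriction.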
20
Q
Elastic Volume benefits
A
- Can be done with no downtime, no performance impact, and no changes to the application.
- Create the volume with the capacity/performance needed at deployment, because you can always change it later.
- Saves hours of planning cycles and prevents overprovisioning.
21
Q
EBS Snapshot
A
- Point in time snapshot of EBS volumes
- Backed up to S3 for long-term durability.
- Volume does not need to be attached to a running instance to take a snapshot.
- Snapshots are incremental backups: only the blocks that have changed since the last snapshot are saved, making this a much more cost-effective way to store block data
- When deleting snapshots, you need to retain only the most recent snapshot to restore the volume.
- EBS determines which dependent snapshots can be deleted to ensure that all other snapshots will still work.
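The incremental idea can be illustrated with a toy model in Python. This is not how EBS is implemented; the `SnapshotChain` class and its methods are hypothetical and only show why storing changed blocks is cheap, and why deleting an older snapshot does not have to break newer ones.

```python
class SnapshotChain:
    """Toy model of incremental snapshots (illustration only, not EBS internals)."""

    def __init__(self):
        # Each entry holds only the blocks that changed since the previous snapshot.
        self.snapshots = []

    def restore(self, index):
        """Rebuild the full volume contents as of snapshot `index`."""
        state = {}
        for snap in self.snapshots[: index + 1]:
            state.update(snap)
        return state

    def take(self, volume):
        """Store only blocks that differ from the latest snapshot's state."""
        previous = self.restore(len(self.snapshots) - 1)
        changed = {b: d for b, d in volume.items()
                   if b not in previous or previous[b] != d}
        self.snapshots.append(changed)
        return len(self.snapshots) - 1

    def delete(self, index):
        """Remove a snapshot, folding its blocks into the next one if needed."""
        removed = self.snapshots.pop(index)
        if index < len(self.snapshots):
            following = self.snapshots[index]
            for block, data in removed.items():
                # Keep the block only if the later snapshot did not overwrite it.
                following.setdefault(block, data)

chain = SnapshotChain()
volume = {0: "a", 1: "b", 2: "c"}
chain.take(volume)            # first snapshot stores all three blocks
volume[1] = "B"
chain.take(volume)            # stores only block 1
volume[2] = "C"
chain.take(volume)            # stores only block 2
chain.delete(0)               # oldest snapshot removed; later indexes shift down by one
latest = chain.restore(len(chain.snapshots) - 1)
```

After the oldest snapshot is deleted, `latest` still equals the current volume contents, because the blocks the remaining snapshots depend on were carried forward.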
22
Q
Elastic Volume
A
- Allows you to increase capacity dynamically, tune performance, and change the type of volume live.
- Feature of EBS.
- Can be done with no downtime, no performance impact, and no changes to the application.
23
Q
EBS Optimization
A
- Remember EBS volumes are network-attached (not attached directly to the host like instance stores)
- On instances WITHOUT support for EBS-optimized throughput, network traffic can contend with traffic between your instance and your Amazon EBS volumes.
- On EBS-optimized instances, these two types of traffic are kept separate.
- Some instance configurations incur an extra cost for using Amazon EBS optimization, while others are always EBS-optimized at no extra cost.
24
Q
EBS Encryption
A
- For simplified DATA encryption, create encrypted EBS volumes with EBS Encryption feature
- All EBS volume types support encryption.
- EBS uses 256-bit Advanced Encryption Standard (AES-256) algorithms and an Amazon-managed key infrastructure, AWS Key Management Service (AWS KMS).
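A hedged sketch of how an encrypted volume might be requested from Python. The helper `encrypted_volume_params` is hypothetical; it only builds the keyword arguments (`AvailabilityZone`, `Size`, `VolumeType`, `Encrypted`, `KmsKeyId`) that boto3's EC2 `create_volume` call accepts. The Availability Zone, size, and volume type shown are placeholder assumptions.

```python
def encrypted_volume_params(availability_zone, size_gib,
                            volume_type="gp2", kms_key_id=None):
    """Build keyword arguments for an encrypted EBS volume request."""
    params = {
        "AvailabilityZone": availability_zone,
        "Size": size_gib,            # size in GiB
        "VolumeType": volume_type,
        "Encrypted": True,           # turn on EBS encryption
    }
    if kms_key_id is not None:
        # Optional: encrypt with a specific KMS key instead of the default one.
        params["KmsKeyId"] = kms_key_id
    return params

params = encrypted_volume_params("us-east-1a", 100)  # placeholder values

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").create_volume(**params)
```

When no `KmsKeyId` is given, the account's default key for EBS encryption is used.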