Lec 4: Storage Flashcards by K G

When a bucket is created, AWS provides features that can be managed. List and describe 5 key features.

Bucket versioning: keeps multiple versions of an object in the same bucket.

Tags: key-value pairs that act as customized labels, for example to track the costs incurred by each bucket.

Default encryption: all objects uploaded to the bucket are automatically encrypted using the selected server-side encryption type.

Object Lock: prevents objects in a bucket from being deleted or overwritten for a fixed amount of time or indefinitely.

Server access logging: provides records for the requests that are made to a bucket.

Describe how S3 handles consistency of objects and how this approach affects the state of objects when they are read using a GET.

S3 delivers strong read-after-write and list consistency automatically (this covers GET, PUT and LIST operations). Specifically, what a user writes is what they will read, and the results of a LIST accurately reflect what is in the bucket.

When GET is used to read an object, the read request immediately receives the latest content of the object.
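The read-after-write behaviour described above can be sketched in Python. This is a minimal sketch, not code from the lecture: `write_then_read` is a hypothetical helper, and it assumes it is handed an S3 client object such as the one returned by `boto3.client("s3")`, whose `put_object` and `get_object` calls it uses.

```python
def write_then_read(s3_client, bucket, key, body):
    """PUT an object, then immediately GET it back.

    With strong read-after-write consistency, the GET is expected to
    return the bytes that were just written, not an older version.
    """
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # The response body is a stream; read() returns its bytes.
    return response["Body"].read()


# Assumed usage, with boto3 installed and credentials configured:
#   import boto3
#   write_then_read(boto3.client("s3"), "cits5503-123456-lecture",
#                   "notes.txt", b"version 2")
```

The client is passed in as a parameter so that the same function can be exercised against a test double without touching a real bucket.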

You are asked to store data about music in a local DynamoDB table. Specifically, you need to record the artist name and their song names. Describe the AWS CLI commands you would use to create a table to store such information and write entries to that table.

From a terminal:
mkdir dynamodb; cd dynamodb

Install a JRE if one is not already installed (sudo apt-get install default-jre).

wget https://s3-ap-northeast-1.amazonaws.com/dynamodb-local-tokyo/dynamodb_local_latest.tar.gz

tar -zxvf dynamodb_local_latest.tar.gz

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

To create the table:
aws dynamodb create-table --table-name MusicAlbum \
--attribute-definitions \
AttributeName=Artist,AttributeType=S \
AttributeName=Song,AttributeType=S \
--key-schema AttributeName=Artist,KeyType=HASH \
AttributeName=Song,KeyType=RANGE \
--provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
--endpoint-url=http://localhost:8000

Create table:
aws dynamodb create-table --table-name MusicAlbum \
--attribute-definitions \
AttributeName=Artist,AttributeType=S \
AttributeName=Song,AttributeType=S \
--key-schema AttributeName=Artist,KeyType=HASH \
AttributeName=Song,KeyType=RANGE \
--provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
--endpoint-url=http://localhost:8000

Creates a table named “MusicAlbum” with two attributes (“Artist” and “Song”),

Sets up “Artist” as the partition key and “Song” as the sort key,

Assigns provisioned throughput capacity of 1 read capacity unit and 1 write capacity unit,

Connects to a DynamoDB instance running locally on http://localhost:8000.

Create items:
aws dynamodb put-item \
--table-name MusicAlbum \
--item '{"Artist": {"S": "Tom"}, "Song": {"S": "Call Me Today"}}' \
--return-consumed-capacity TOTAL --endpoint-url=http://localhost:8000

aws dynamodb put-item \
--table-name MusicAlbum \
--item '{"Artist": {"S": "Jerry"}, "Song": {"S": "Happy Day"}}' \
--return-consumed-capacity TOTAL --endpoint-url=http://localhost:8000

Insert two items with values of the attributes,

Request information about the consumed capacity for the operations,

Specify a local DynamoDB for the connection.
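The same two writes can be sketched from Python. This is a minimal sketch under assumptions: `build_put_item_request` is a hypothetical helper that only builds the keyword arguments, and the commented-out lines assume boto3's low-level DynamoDB client (`put_item`) and a DynamoDB Local instance on port 8000.

```python
def build_put_item_request(artist, song, table_name="MusicAlbum"):
    """Return keyword arguments for a DynamoDB put_item call."""
    return {
        "TableName": table_name,
        # Each attribute value is wrapped in a type descriptor; "S" means string.
        "Item": {"Artist": {"S": artist}, "Song": {"S": song}},
        # Ask DynamoDB to report the capacity consumed by the write.
        "ReturnConsumedCapacity": "TOTAL",
    }


requests = [
    build_put_item_request("Tom", "Call Me Today"),
    build_put_item_request("Jerry", "Happy Day"),
]

# Assumed usage, with boto3 installed and DynamoDB Local running:
#   import boto3
#   client = boto3.client("dynamodb", endpoint_url="http://localhost:8000")
#   for request in requests:
#       client.put_item(**request)
```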

Querying:
aws dynamodb query \
--table-name MusicAlbum \
--key-condition-expression "Artist = :artist" \
--expression-attribute-values '{":artist":{"S":"Tom"}}' \
--endpoint-url=http://localhost:8000

Queries the “MusicAlbum” table for items where the “Artist” is “Tom”.

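‘:artist’ is a placeholder for the value supplied in the expression attribute values. Any other name can be used, as long as it starts with a colon and is the same in both places.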
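The role of the placeholder can be sketched in Python. This is a minimal sketch: `build_query_request` is a hypothetical helper that builds the keyword arguments one would pass to the `query` call of boto3's low-level DynamoDB client (an assumed setup), and it shows that the placeholder name is arbitrary.

```python
def build_query_request(artist, placeholder=":artist", table_name="MusicAlbum"):
    """Return keyword arguments for a DynamoDB query on the partition key."""
    if not placeholder.startswith(":"):
        raise ValueError("placeholder names must start with ':'")
    return {
        "TableName": table_name,
        # The expression only mentions the placeholder...
        "KeyConditionExpression": f"Artist = {placeholder}",
        # ...and the actual value is bound to it here.
        "ExpressionAttributeValues": {placeholder: {"S": artist}},
    }


default_request = build_query_request("Tom")
renamed_request = build_query_request("Tom", placeholder=":a")

# Assumed usage, with boto3 installed and DynamoDB Local running:
#   import boto3
#   client = boto3.client("dynamodb", endpoint_url="http://localhost:8000")
#   response = client.query(**default_request)
```

Both requests select the same items; only the placeholder name differs.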

Scan:
aws dynamodb scan \
--table-name MusicAlbum \
--endpoint-url=http://localhost:8000

Outputs the whole table

What will the table be like if we create the first item with 3 attributes and the second item with 2 attributes?

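Apart from the key attributes, DynamoDB items do not share a fixed schema, so the second item simply does not have the third attribute. When the items are shown as a table, the cell for the missing attribute is empty for the second item.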
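The effect can be sketched with plain Python dictionaries standing in for items. This is an illustration only: the `Year` attribute and the `to_rows` helper are hypothetical, and no DynamoDB call is made.

```python
# Two items: the first has three attributes, the second only two.
items = [
    {"Artist": "Tom", "Song": "Call Me Today", "Year": 2020},
    {"Artist": "Jerry", "Song": "Happy Day"},
]


def to_rows(items):
    """Lay items out as a table; a missing attribute becomes an empty cell."""
    columns = []
    for item in items:
        for name in item:
            if name not in columns:
                columns.append(name)
    rows = [[item.get(name, "") for name in columns] for item in items]
    return columns, rows


columns, rows = to_rows(items)
print(columns)
for row in rows:
    print(row)
```

The second item itself carries no `Year` key at all; the empty cell only appears in the tabular view.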

What does the code snippet do (slide 27)

This policy allows two operations to the bucket named “cits5503-123456-lecture” and its contents from IP addresses within the “192.0.2.0/24” range. This range covers all IP addresses from 192.0.2.0 to 192.0.2.255, inclusive.

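What does 192.0.2.0/16 mean? A /16 prefix fixes only the first 16 bits of the address, so the range covers 192.0.0.0 to 192.0.255.255 (65,536 addresses).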
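The CIDR ranges can be checked with Python's standard `ipaddress` module. This is a small sketch; the test address 192.0.3.7 is an arbitrary example chosen for illustration.

```python
import ipaddress

# /24: the first 24 bits are fixed, leaving 8 bits (256 addresses).
net24 = ipaddress.ip_network("192.0.2.0/24")
print(net24[0], net24[-1], net24.num_addresses)

# /16: only the first 16 bits are fixed. strict=False masks off the
# extra host bits, so 192.0.2.0/16 is normalised to 192.0.0.0/16.
net16 = ipaddress.ip_network("192.0.2.0/16", strict=False)
print(net16, net16[0], net16[-1], net16.num_addresses)

# An address outside the /24 range can still fall inside the /16 range.
address = ipaddress.ip_address("192.0.3.7")
print(address in net24, address in net16)
```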

What is a cloud storage? Give examples.

Cloud storage, provided by a third-party cloud provider, allows customers to store and access data over the internet.

Examples: Dropbox, Google Drive, iCloud, Amazon S3

What is Amazon S3 (Simple Storage Service)?

It is a popular and widely used cloud storage service provided by AWS.
It allows users to store and retrieve any amount of data at any time over the internet.
It organises data as objects stored in buckets.

Steps of creating a bucket in Amazon S3

General configuration: bucket name, AWS region

Bucket names must be unique within the global namespace, that is, across all AWS accounts in an AWS partition. A partition is a grouping of AWS Regions. AWS has three partitions: aws (Standard Regions), aws-cn (China Regions), and aws-us-gov (AWS GovCloud (US)).

It must follow the bucket naming rules.

Specify object ownership
-ACLs: access control lists
-ACLs: disable (recommended) or enable
*Bucket ACLs are an older access-control mechanism for buckets.
Block (all) Public Access settings for this bucket
Bucket versioning: disable / enable
Tags (add)
Default encryption: encryption type (SSE-S3, SSE-KMS, DSSE-KMS), bucket key: disable / enable
Object lock: disable / enable
Click create bucket

What are the rules for bucket naming?

The following rules apply for naming buckets in Amazon S3:
Bucket names must be between 3 (min) and 63 (max) characters long.
Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
Bucket names must begin and end with a letter or number.
Bucket names must not contain two adjacent periods.
Bucket names must not be formatted as an IP address (for example, 192.168.5.4).
Bucket names must not start with the prefix xn--.
Bucket names must not start with the prefix sthree- or the prefix sthree-configurator.
Bucket names must not end with the suffix -s3alias. This suffix is reserved for access point alias names.
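The rules listed above can be sketched as a validator. This is a minimal sketch covering only the rules on this card, not the full set AWS enforces; `is_valid_bucket_name` is a hypothetical helper and not part of any AWS SDK.

```python
import ipaddress
import re

# 3 to 63 characters; lowercase letters, digits, dots and hyphens only;
# first and last character must be a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name):
    """Check a name against the bucket naming rules listed on this card."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:
        return False
    # "sthree-" also covers the longer prefix "sthree-configurator".
    if name.startswith(("xn--", "sthree-")):
        return False
    if name.endswith("-s3alias"):
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True   # not an IP address, so the name passes
    return False      # formatted as an IP address


for candidate in ["cits5503-123456-lecture", "192.168.5.4", "My-Bucket", "ab"]:
    print(candidate, is_valid_bucket_name(candidate))
```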

What is an object?

It is an individual unit of data stored in a bucket.

It can be a file of any type: documents, images, videos, etc.

It contains both data and metadata:
-Data refers to the file contents. Metadata includes file attributes.
-e.g., when a file called sunset.jpg is uploaded into a bucket, the image bytes are the data, and attributes such as its size and content type are the metadata.

How to identify an object? object key + version ID (if enabled)

Object key is a string specifying the object’s location and name, e.g., cits5503-123456-lecture/subdir/sunset.jpg

Version ID: denotes a specific version of an object, e.g., v1AbCdEfGhIjKlMnOpQrStUvWxYz1234567890

A combination of an object key and version ID uniquely identifies a specific version of an object in a bucket.

What are tags?

Tags are key-value pairs that act as customized labels, for example to track the costs incurred by each bucket,

e.g.,
-If a bucket is tagged as cits5503/lectures, ‘cits5503’ is the key and ‘lectures’ is the value.
-If another bucket is tagged as cits5503/labs, ‘cits5503’ is the key and ‘labs’ is the value.
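The tag examples above can be expressed in the structure that boto3's `put_bucket_tagging` call expects. This is a minimal sketch: `build_tagging` is a hypothetical helper, and the commented-out call assumes boto3 and an existing bucket.

```python
def build_tagging(tags):
    """Convert a plain dict of tags into the S3 TagSet structure."""
    return {"TagSet": [{"Key": key, "Value": value} for key, value in tags.items()]}


lectures_tagging = build_tagging({"cits5503": "lectures"})
labs_tagging = build_tagging({"cits5503": "labs"})
print(lectures_tagging)

# Assumed usage, with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_tagging(
#       Bucket="cits5503-123456-lecture", Tagging=lectures_tagging)
```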

What are the different encryptions?

Server-side encryption with S3 managed keys (SSE-S3): S3 manages the encryption keys (AES-256 encryption) used to encrypt and decrypt objects.

Server-side encryption with AWS Key Management Service keys (SSE-KMS): It uses KMS to manage the encryption keys for each object stored in the bucket.

Dual-layer server-side encryption with AWS KMS keys (DSSE-KMS): It applies two independent layers of server-side encryption to each object, using keys managed in AWS KMS.
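The three encryption types map onto the `SSEAlgorithm` values used when setting a bucket's default encryption. This is a minimal sketch: `build_default_encryption` is a hypothetical helper that builds the configuration one would pass to boto3's `put_bucket_encryption`, and the key ID in the example is a made-up placeholder.

```python
# Card name -> SSEAlgorithm value used by the S3 API.
SSE_ALGORITHMS = {
    "SSE-S3": "AES256",
    "SSE-KMS": "aws:kms",
    "DSSE-KMS": "aws:kms:dsse",
}


def build_default_encryption(encryption_type, kms_key_id=None):
    """Build a ServerSideEncryptionConfiguration for a bucket."""
    default = {"SSEAlgorithm": SSE_ALGORITHMS[encryption_type]}
    if kms_key_id is not None:
        if encryption_type == "SSE-S3":
            raise ValueError("SSE-S3 uses S3 managed keys, not a KMS key")
        default["KMSMasterKeyID"] = kms_key_id
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": default}]}


sse_s3_config = build_default_encryption("SSE-S3")
# "example-key-id" is a hypothetical placeholder, not a real key.
sse_kms_config = build_default_encryption("SSE-KMS", kms_key_id="example-key-id")

# Assumed usage, with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="cits5503-123456-lecture",
#       ServerSideEncryptionConfiguration=sse_s3_config)
```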

What is object lock?

Stores objects using a write-once-read-many (WORM) model to help you prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.

Object Lock only works in versioned buckets. Enabling Object Lock enables bucket versioning automatically.

What is ARN?

ARN (Amazon Resource Name): a unique identifier for any AWS resource, such as S3 buckets, EC2 instances and IAM users.

An ARN is needed when configuring a bucket policy.

Configure a bucket, what are the properties?

Bucket versioning
Tags
Default encryption
Object Lock
Server access logging
-Provides records for the requests that are made to a bucket.

Configure a bucket, what are the permissions?

Bucket policy
CORS (Cross-origin resource sharing) policy

What is a bucket policy?

It secures access to a bucket and its objects:
-For unauthenticated users, access is denied.
-For authenticated users, access depends on their permissions.

What are the parts of a bucket policy?

Version: indicates the version of the policy language.

Statement: represents a permission rule.

Effect: what the effect will be when a user requests a specific action; this can be either 'Allow' or 'Deny'.

Principal: refers to the set of users/applications that are allowed (or denied) access to the actions and resources below.

Action: defines the set of resource operations a principal is allowed (or denied) to perform.

Resource: specifies the AWS resources on which a principal is allowed (or denied) to take actions. An ARN identifies the bucket.

Id: An optional identifier for the policy, denoting a unique name for the policy.

Sid: An optional identifier for the statement, denoting a unique name for the statement.

s3:GetObject: Allows users to read objects in the bucket.
s3:GetBucketLocation: Allows users to retrieve the bucket’s region.
s3:ListBucket: Allows users to list the objects in the bucket.

What is the difference between the two ARNs in the Resource field?
“arn:…lecture/*”
“arn:…lecture”

First ARN: refers to all the objects within the bucket.
Second ARN: refers to the bucket itself.
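The policy parts and the two kinds of ARN can be put together in one document. This is a sketch under assumptions, not the policy from the slide: the `Id` and `Sid` names, the wildcard principal and the choice of `s3:GetObject` and `s3:ListBucket` as the two actions are illustrative.

```python
import json

BUCKET = "cits5503-123456-lecture"

policy = {
    "Version": "2012-10-17",
    "Id": "ExampleLecturePolicy",  # hypothetical policy name
    "Statement": [
        {
            "Sid": "AllowReadFromIpRange",  # hypothetical statement name
            "Effect": "Allow",
            "Principal": "*",  # assumption: any principal
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",    # the bucket itself (s3:ListBucket)
                f"arn:aws:s3:::{BUCKET}/*",  # all objects in it (s3:GetObject)
            ],
            # Only requests from this IP range match the statement.
            "Condition": {"IpAddress": {"aws:SourceIp": "192.0.2.0/24"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)

# Assumed usage, with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=policy_json)
```

Both ARNs are listed because bucket-level actions such as `s3:ListBucket` apply to the bucket ARN, while object-level actions such as `s3:GetObject` apply to the `/*` ARN.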

Common S3 actions

s3:GetObject: Allows users to read objects from the bucket.
s3:PutObject: Allows users to upload new objects to the bucket.
s3:DeleteObject: Allows users to delete objects from the bucket.
s3:ListBucket: Allows users to list the objects in the bucket.
s3:GetBucketLocation: Allows users to retrieve the bucket's region.
A complete list of actions: https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations.html

What is CORS (Cross-origin Resource Sharing)?

CORS allows specific origins to access a bucket and specifies the allowed HTTP methods and HTTP headers for each origin.

HTTP method: defines the action to be performed against resources.
-GET is used to fetch resources, e.g., images and user data.
-DELETE is used to remove resources.

HTTP header: a key-value pair that contains information about an HTTP request or response.
-Request header: e.g., User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64).
-Response header: e.g., Content-Type: application/json.

A CORS rule is configured with these elements: AllowedHeaders, AllowedMethods, AllowedOrigins, ExposeHeaders.
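A CORS configuration using these four elements can be sketched as follows. This is an illustration under assumptions: the origin `https://www.example.com` and the exposed `ETag` header are example values, and `is_request_allowed` is a hypothetical, simplified matcher (exact-match or `*` origins only), not how S3 itself evaluates rules.

```python
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],  # example origin
            "AllowedMethods": ["GET", "DELETE"],
            "AllowedHeaders": ["*"],
            "ExposeHeaders": ["ETag"],
        }
    ]
}


def is_request_allowed(config, origin, method):
    """Simplified check: does any rule allow this origin and method?"""
    for rule in config["CORSRules"]:
        origins = rule["AllowedOrigins"]
        origin_ok = "*" in origins or origin in origins
        if origin_ok and method in rule["AllowedMethods"]:
            return True
    return False


print(is_request_allowed(cors_configuration, "https://www.example.com", "GET"))
print(is_request_allowed(cors_configuration, "https://www.example.com", "PUT"))

# Assumed usage, with boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_cors(
#       Bucket="cits5503-123456-lecture", CORSConfiguration=cors_configuration)
```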

What is Amazon DynamoDB?

A fast and flexible non-relational database service.
-Examples of relational databases, for contrast: MySQL, PostgreSQL, etc.

Steps of creating a table in Amazon DynamoDB?

Set the table name and partition key.
Set the sort key (optional).
Set table settings: default or customized.
Set table class: DynamoDB Standard or DynamoDB Standard-IA.
Fill out the capacity calculator: average item size (KB), item reads/second, item writes/second, read consistency, write consistency.
-Throughput capacity: 1 read capacity unit (RCU) = one strongly consistent read per second of an item up to 4 KB; 1 write capacity unit (WCU) = one write per second of an item up to 1 KB.
Set read/write capacity settings: capacity mode (on-demand / provisioned); auto scaling of read capacity or write capacity.
Set encryption key management: owned by Amazon DynamoDB / AWS managed key / stored in your account, and owned and managed by you.
Set deletion protection (default: off).
To add an item: Amazon DynamoDB: Actions > Explore items > Create item (provide an attribute name and value, e.g., Student: Kim).
To read items back: Amazon DynamoDB: Explore items > Scan / Query.
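The capacity calculation can be sketched as arithmetic. This is a minimal sketch, assuming the unit sizes given in the AWS documentation (one read unit covers up to 4 KB, one write unit covers up to 1 KB, and an eventually consistent read costs half a strongly consistent one); the function names and the workload figures are hypothetical.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """Estimate the RCUs needed for a steady read workload."""
    units_per_read = math.ceil(item_size_kb / 4)  # 4 KB per read unit
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_kb, writes_per_second):
    """Estimate the WCUs needed for a steady standard-write workload."""
    units_per_write = math.ceil(item_size_kb / 1)  # 1 KB per write unit
    return units_per_write * writes_per_second


# Hypothetical workload: 3 KB items, 80 reads/s and 10 writes/s.
print(read_capacity_units(3, 80))                             # 80
print(read_capacity_units(3, 80, strongly_consistent=False))  # 40
print(write_capacity_units(3, 10))                            # 30
```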

What is read/write consistency?

Reads from DynamoDB can be:
-Eventually consistent: the response of a read may not reflect the result of a recent write operation.
-Strongly consistent: the response of a read returns the most up-to-date data, reflecting all updates from all previous write operations.
-Transactional: groups multiple read actions together and submits them as a single all-or-nothing operation.

Writes to DynamoDB can be:
-Standard
-Transactional: groups multiple write actions together and submits them as a single all-or-nothing operation.

Different Data Types

Attributes can be:
-Scalar: represents exactly one value (Number, String, Binary, Boolean, Null).
-Set: a collection of unique values of the same scalar type, e.g., ["Black", "Green", "Red"] or [42.2, -19, 7.5, 3.14].
-Document: a nested structure (List or Map).

Different Document Types

List: a collection of values, enclosed in square brackets: [ ... ]
-e.g., FavoriteThings: ["Cookies", "Coffee", 3.14159]
Map: a hierarchical structure of attributes within a single attribute, enclosed in curly brackets: { ... }
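The way these types appear in DynamoDB's JSON format (the same format used for the --item argument of the CLI) can be sketched with a small converter. This is a hand-rolled illustration: `to_dynamodb_json` is a hypothetical helper covering only the types named on these cards, except Binary; boto3 ships its own serializer in `boto3.dynamodb.types`.

```python
def to_dynamodb_json(value):
    """Wrap a Python value in its DynamoDB type descriptor."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # must be checked before int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("sets must not be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        raise TypeError("set members must all be strings or all be numbers")
    if isinstance(value, list):
        return {"L": [to_dynamodb_json(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_dynamodb_json(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(to_dynamodb_json({"Black", "Green", "Red"}))
print(to_dynamodb_json(["Cookies", "Coffee", 3.14159]))
print(to_dynamodb_json({"Name": "Kim", "Enrolled": True}))
```

Note that a list may mix types, while a set may not.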

What are the core components of DynamoDB?

In DynamoDB, tables, items, and attributes are the core components that you work with. A table is a collection of items, and each item is a collection of attributes. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility. You can use DynamoDB Streams to capture data modification events in DynamoDB tables.

Write a Python script to create a table called CloudFiles on your local DynamoDB. The attributes for the table are: CloudFiles = { 'userId', 'fileName', 'path', 'lastUpdated', 'owner', 'permissions' }. userId is the partition key and fileName is the sort key.

import boto3

# Get the service resource, pointed at DynamoDB Local
# (assumes it is listening on port 8000, as in the CLI examples).
dynamodb = boto3.resource('dynamodb', endpoint_url='http://localhost:8000')

# Create the DynamoDB table. Only the key attributes are declared here;
# path, lastUpdated, owner and permissions are supplied per item.
table = dynamodb.create_table(
    TableName='CloudFiles',
    KeySchema=[
        {'AttributeName': 'userId', 'KeyType': 'HASH'},     # partition key
        {'AttributeName': 'fileName', 'KeyType': 'RANGE'},  # sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'userId', 'AttributeType': 'S'},
        {'AttributeName': 'fileName', 'AttributeType': 'S'},
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5,
    },
)

# Wait until the table exists.
table.wait_until_exists()

# Print out some data about the table.
print(table.item_count)

Use AWS CLI command to scan the created DynamoDB table, and output what you've got.

aws dynamodb scan --table-name CloudFiles --endpoint-url=http://localhost:8000

Use AWS CLI command to delete the table.

aws dynamodb delete-table --table-name CloudFiles --endpoint-url=http://localhost:8000

Lec 4: Storage Flashcards

(35 cards)