Section 8 - DynamoDB Flashcards
List 6 different database services supported by the AWS ecosystem?
- RDS
- Aurora
- DynamoDB
- DocumentDB
- ElastiCache
- Neptune
What is Amazon DynamoDB?
DynamoDB is a fully managed NoSQL key-value and document database.
- NoSQL key-value database
- Fully managed and serverless
- Non-relational
- Scales automatically to massive workloads with fast performance
- Performance
  - SSD storage
- Resilience
  - Spread across 3 geographically distinct data centers
- Consistency
  - Eventually consistent reads (default)
  - Strongly consistent reads
AWS DynamoDB Consistency Models?
- Eventually consistent reads (default)
  - Consistency across all copies of data is usually reached within a second.
  - Best for read performance
- Strongly consistent reads
  - A strongly consistent read always reflects all writes that succeeded before the read.
  - Writes are reflected across all 3 locations at once.
  - Best for read consistency
- ACID Transactions
  - DynamoDB transactions provide the ability to perform ACID (Atomic, Consistent, Isolated, Durable) transactions.
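The ACID transaction card can be made concrete with a small sketch. It only builds the request parameters for the DynamoDB `TransactWriteItems` API as a plain dictionary and does not call AWS. The `Accounts` table, the `AccountId` and `Balance` attributes, and the `build_transfer` helper are hypothetical names chosen for illustration.

```python
def build_transfer(from_id: str, to_id: str, amount: int) -> dict:
    """Build TransactWriteItems parameters that move `amount` between two items."""
    amount_value = {":amt": {"N": str(amount)}}
    debit = {
        "Update": {
            "TableName": "Accounts",  # hypothetical table name
            "Key": {"AccountId": {"S": from_id}},
            "UpdateExpression": "SET Balance = Balance - :amt",
            # If this condition fails, the whole transaction is cancelled.
            "ConditionExpression": "Balance >= :amt",
            "ExpressionAttributeValues": amount_value,
        }
    }
    credit = {
        "Update": {
            "TableName": "Accounts",
            "Key": {"AccountId": {"S": to_id}},
            "UpdateExpression": "SET Balance = Balance + :amt",
            "ExpressionAttributeValues": amount_value,
        }
    }
    # Both updates succeed together or neither is applied.
    return {"TransactItems": [debit, credit]}


params = build_transfer("acct-1", "acct-2", 100)
print(len(params["TransactItems"]), "operations in one transaction")
```

Assuming boto3 is installed and credentials are configured, these parameters could be passed as `boto3.client("dynamodb").transact_write_items(**params)`.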
AWS DynamoDB Primary Keys?
DynamoDB stores and retrieves data based on a primary key.
Two types of primary keys:
- Partition Key
  - A unique attribute
  - The value of the partition key is the input to an internal hash function, which determines the partition (physical location) where the data is stored.
  - e.g. Customer_ID=7886544, product_id, email address ..
- Composite key (partition key + sort key)
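How a partition key value is mapped to a partition can be illustrated with a toy sketch. DynamoDB's real hash function and partition count are internal and not exposed, so the MD5 hash, the `NUM_PARTITIONS` value, and the `partition_for` helper below are assumptions made purely for illustration.

```python
import hashlib

# Hypothetical partition count; DynamoDB manages the real number internally.
NUM_PARTITIONS = 4


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key value to a partition index using a stand-in hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it into the partition range.
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    for key in ["7886544", "7886545", "alice@example.com"]:
        print(key, "-> partition", partition_for(key))
```

The point of the sketch is that the same key value always lands on the same partition, while different key values are spread across partitions.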
AWS DynamoDB Access Control?
- Fine-grained access control with IAM
- IAM condition parameter
  - dynamodb:LeadingKeys allows users to access only the items where the partition key value matches their User_ID
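A policy using this condition might look like the sketch below, written as a Python dictionary and printed as JSON. It assumes users sign in through web identity federation with Login with Amazon, which is where the `${www.amazon.com:user_id}` policy variable comes from. The table ARN, account ID, and the list of allowed actions are hypothetical.

```python
import json

# Hypothetical table ARN; substitute your own Region, account ID, and table name.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/UserData"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
            "Resource": TABLE_ARN,
            "Condition": {
                # Only items whose partition key equals the caller's user ID are allowed.
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"]
                }
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```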
AWS DynamoDB Indexes?
Query based on an attribute that is not the primary key.
- DynamoDB allows you to run a query on non-primary key attributes using global secondary indexes and local secondary indexes.
- A secondary index allows you to perform fast queries on specific columns (attributes) in a table. You select the columns that you want included in the index and run your searches on the index, rather than on the entire dataset.
AWS DynamoDB Local Secondary Index?
- Primary Key
  - Same partition key as your original table but a different sort key
- A Different View
  - Gives you a different view of your data, organized according to an alternative sort key
- Faster Queries
  - Any queries based on this sort key are much faster using the index than the main table.
- An Example
  - Partition Key: User ID
  - Sort Key: account creation date
- Add at Creation Time
  - Can only be created when you are creating your table. You cannot add, remove, or modify it later.
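Because a local secondary index must be declared at table creation, it is part of the `CreateTable` request. The sketch below builds those parameters as a plain dictionary without calling AWS. The `Users` table, the `UserId`, `LastLogin`, and `AccountCreated` attributes, the index name, and the `key_name` helper are hypothetical, chosen to mirror the example in this card.

```python
create_table_params = {
    "TableName": "Users",  # hypothetical table name
    "AttributeDefinitions": [
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "LastLogin", "AttributeType": "S"},
        {"AttributeName": "AccountCreated", "AttributeType": "S"},
    ],
    # Table primary key: partition key + sort key.
    "KeySchema": [
        {"AttributeName": "UserId", "KeyType": "HASH"},
        {"AttributeName": "LastLogin", "KeyType": "RANGE"},
    ],
    # The LSI reuses the partition key and swaps in a different sort key.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "UserId-AccountCreated-index",
            "KeySchema": [
                {"AttributeName": "UserId", "KeyType": "HASH"},
                {"AttributeName": "AccountCreated", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def key_name(key_schema: list, key_type: str) -> str:
    """Return the attribute name used for the given key type (HASH or RANGE)."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == key_type)


lsi = create_table_params["LocalSecondaryIndexes"][0]
print("Same partition key:",
      key_name(lsi["KeySchema"], "HASH") == key_name(create_table_params["KeySchema"], "HASH"))
```

Assuming boto3 is installed and credentials are configured, the dictionary could be passed as `boto3.client("dynamodb").create_table(**create_table_params)`.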
AWS DynamoDB Global Secondary Index?
- A Completely Different Primary Key
  - Different partition key and sort key
- View Your Data Differently
  - Gives a completely different view of the data.
- Speeds Up Queries
  - Speeds up any queries relating to this alternative partition and sort key.
- An Example
  - Partition Key: email address
  - Sort Key: last login date
- Flexible
  - You can create it when you create your table or add it later.
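Since a global secondary index can be added to an existing table, it can be expressed as an `UpdateTable` request. The sketch below only builds the parameters and does not call AWS. It assumes a hypothetical `Users` table that uses on-demand billing (a provisioned table would also need a `ProvisionedThroughput` entry for the index). The `Email` and `LastLogin` attribute names and the index name are illustrative.

```python
update_table_params = {
    "TableName": "Users",  # hypothetical existing table
    # Every attribute used in the new index key must be defined here.
    "AttributeDefinitions": [
        {"AttributeName": "Email", "AttributeType": "S"},
        {"AttributeName": "LastLogin", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [
        {
            "Create": {
                "IndexName": "Email-LastLogin-index",
                # A completely different partition key and sort key from the table.
                "KeySchema": [
                    {"AttributeName": "Email", "KeyType": "HASH"},
                    {"AttributeName": "LastLogin", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
}

new_index = update_table_params["GlobalSecondaryIndexUpdates"][0]["Create"]
print("Adding index:", new_index["IndexName"])
```

Assuming boto3 is installed and credentials are configured, the dictionary could be passed as `boto3.client("dynamodb").update_table(**update_table_params)`.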
AWS DynamoDB Query?
- Sort Key
  - Results are always sorted by the sort key
- Numeric Order
  - In ascending numeric order by default (e.g. 1,2,3,4)
- ASCII
  - Non-numeric sort keys are ordered by ASCII character code values
- Reverse the Order
  - You can reverse the order by setting the ScanIndexForward parameter to false.
- Eventually Consistent
  - By default, queries are eventually consistent.
- Strongly Consistent
  - You need to explicitly set the query to be strongly consistent.
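The two query options on this card map to the `ScanIndexForward` and `ConsistentRead` parameters of the `Query` API. The sketch below builds the request parameters as a plain dictionary without calling AWS. The `Users` table, the `UserId` key, and the `build_query_params` helper are hypothetical.

```python
def build_query_params(user_id: str,
                       newest_first: bool = False,
                       strongly_consistent: bool = False) -> dict:
    """Build Query parameters for all items sharing one partition key value."""
    return {
        "TableName": "Users",  # hypothetical table name
        "KeyConditionExpression": "UserId = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
        # True (the default) returns ascending sort key order; False reverses it.
        "ScanIndexForward": not newest_first,
        # False (the default) is eventually consistent; True must be set explicitly.
        "ConsistentRead": strongly_consistent,
    }


default_params = build_query_params("user-123")
reversed_strong = build_query_params("user-123", newest_first=True, strongly_consistent=True)
print(default_params["ScanIndexForward"], default_params["ConsistentRead"])
print(reversed_strong["ScanIndexForward"], reversed_strong["ConsistentRead"])
```

Assuming boto3 is installed and credentials are configured, either dictionary could be passed as `boto3.client("dynamodb").query(**params)`.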
AWS DynamoDB Scan vs Query?
- Query Is More Efficient Than a Scan
  - A scan reads the entire table, then filters out the unwanted data to provide the desired result.
- Extra Step
  - A scan adds the extra step of removing the data you don’t want.
  - As the table grows, the scan operation takes longer.
- Provisioned Throughput
  - A scan operation on a large table can use up the table's provisioned throughput in just a single operation.
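The difference in work done can be seen in a toy in-memory model. This is not how DynamoDB is implemented; `ToyTable` and its methods are hypothetical and exist only to count how many items each operation has to read to return the same result.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def put(self, pk, sk, data):
        self._partitions[pk].append({"pk": pk, "sk": sk, "data": data})
        self._partitions[pk].sort(key=lambda item: item["sk"])

    def query(self, pk):
        """Read only the items stored under one partition key."""
        items = list(self._partitions.get(pk, []))
        return items, len(items)

    def scan(self, predicate):
        """Read every item in the table, then filter out the unwanted ones."""
        items_read = 0
        matches = []
        for items in self._partitions.values():
            for item in items:
                items_read += 1
                if predicate(item):
                    matches.append(item)
        return matches, items_read


table = ToyTable()
for user in range(100):
    for order in range(10):
        table.put(f"user-{user}", order, f"order {order}")

query_items, query_reads = table.query("user-7")
scan_items, scan_reads = table.scan(lambda item: item["pk"] == "user-7")
print("query read", query_reads, "items; scan read", scan_reads, "items")
```

Both calls return the same 10 items, but the scan touches all 1000 items in the toy table to find them.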
AWS DynamoDB Improving Query and Scan Performance?
- Set a smaller page size
  - e.g. set the page size to return 40 items
- Avoid Scans
  - Avoid using scan operations if you can. Design tables so that you can use the Query, GetItem, or BatchGetItem APIs.
- Parallel Scans
  - You can configure DynamoDB to use parallel scans instead by logically dividing a table or index into segments and scanning each segment in parallel.
  - Beware: it’s best to avoid parallel scans if your table or index is already incurring heavy read or write activity from other applications.
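Page size and parallel scans correspond to the `Limit`, `Segment`, and `TotalSegments` parameters of the `Scan` API. The sketch below builds one parameter set per segment without calling AWS. The `Orders` table name and the `build_segment_scan_params` helper are hypothetical.

```python
from typing import Dict, List


def build_segment_scan_params(table_name: str,
                              total_segments: int,
                              page_size: int) -> List[Dict]:
    """Build one Scan request per segment, each with a small page size."""
    if total_segments < 1 or page_size < 1:
        raise ValueError("total_segments and page_size must be positive")
    return [
        {
            "TableName": table_name,
            "Segment": segment,               # which slice this worker scans
            "TotalSegments": total_segments,  # how many slices the table is split into
            "Limit": page_size,               # max items evaluated per page
        }
        for segment in range(total_segments)
    ]


requests = build_segment_scan_params("Orders", total_segments=4, page_size=40)
for request in requests:
    print(request)
```

Assuming boto3 is installed and credentials are configured, each dictionary could be handed to its own worker thread and passed as `client.scan(**request)`, following `LastEvaluatedKey` to fetch further pages.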
AWS DynamoDB Provisioned Throughput?
Measured in capacity units:
- When you create your table, you can specify your requirements in terms of read capacity units and write capacity units.
- Write Capacity Units
  - 1 write capacity unit = 1 write of up to 1 KB per second
- Read Capacity Units
  - 1 read capacity unit = 1 strongly consistent read of up to 4 KB per second
  - OR 2 eventually consistent reads of up to 4 KB per second (default)
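The capacity unit rules can be turned into a small calculator. This is a sketch under the assumption that item sizes are rounded up to the next 1 KB for writes and the next 4 KB for reads; the function names are hypothetical.

```python
import math


def write_capacity_units(item_size_kb: float, writes_per_second: int) -> int:
    """Each write consumes 1 WCU per 1 KB of item size, rounded up."""
    return math.ceil(item_size_kb / 1) * writes_per_second


def read_capacity_units(item_size_kb: float,
                        reads_per_second: int,
                        strongly_consistent: bool = False) -> int:
    """Each strongly consistent read consumes 1 RCU per 4 KB, rounded up.

    Eventually consistent reads (the default) need half as many units.
    """
    strong_units = math.ceil(item_size_kb / 4) * reads_per_second
    if strongly_consistent:
        return strong_units
    return math.ceil(strong_units / 2)


# 80 reads per second of 3 KB items.
print(read_capacity_units(3, 80, strongly_consistent=True))  # 80
print(read_capacity_units(3, 80))                            # 40
# 100 writes per second of 512-byte items.
print(write_capacity_units(0.5, 100))                        # 100
```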
AWS DynamoDB On-Demand Capacity?
- DynamoDB instantly scales up and down based on the activity of your application.
- Charges apply for reading, writing, and storing data.
- Ideal for:
  - Unpredictable workloads
  - New applications where you don’t know the use pattern
  - When you want to pay for only what you use (pay per request)
AWS DynamoDB Which Pricing Model to Use?
On-Demand Capacity:
- Unknown workloads
- Unpredictable application traffic
- Spiky, short-lived peaks
- A pay-per-use model is desired
- It might be more difficult to predict the cost.
Provisioned Capacity:
- Read and write capacity requirements can be forecasted
- Predictable application traffic
- Application traffic is consistent or increases gradually
- You have more control over the cost.
AWS DynamoDB Accelerator (DAX)?
- DynamoDB Accelerator (or DAX) is a fully managed, clustered in-memory cache for DynamoDB.
- Delivers up to a 10x read performance improvement. Microsecond performance for millions of requests per second.
- DAX is a write-through caching service. Data is written to the cache and the backend store at the same time.
- DynamoDB API calls are pointed at DAX (the cache). If an item is not found in the cache, DAX gets the item from DynamoDB, caches it, and sends it back to the caller.
- Caters for eventually consistent reads only.
- Not suitable for:
  - Applications which are mainly write intensive
  - Applications that do not perform many read operations
- Ideal for:
  - Read-heavy workloads and bursty workloads
  - e.g. auction applications, gaming sites, e-commerce (Black Friday sale)
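The caching behaviour described in this card (fetch on a cache miss, write-through on writes) can be sketched with a toy in-memory class. This is not the DAX client API; `ToyDaxCache` is a hypothetical stand-in in which a plain dictionary plays the role of the DynamoDB table.

```python
class ToyDaxCache:
    """Toy cache that sits in front of a backend store."""

    def __init__(self, backend: dict):
        self.backend = backend  # stands in for the DynamoDB table
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get_item(self, key):
        """Serve from the cache; on a miss, fetch from the backend and cache it."""
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        item = self.backend.get(key)
        if item is not None:
            self.cache[key] = item
        return item

    def put_item(self, key, item):
        """Write-through: update the backend and the cache together."""
        self.backend[key] = item
        self.cache[key] = item


table = {"auction-1": {"highest_bid": 100}}
dax = ToyDaxCache(table)

dax.get_item("auction-1")                        # miss: loaded from the backend
dax.get_item("auction-1")                        # hit: served from the cache
dax.put_item("auction-1", {"highest_bid": 120})  # written to both places
latest = dax.get_item("auction-1")               # hit: already in the cache
print(dax.hits, "hits,", dax.misses, "misses, latest bid", latest["highest_bid"])
```

Repeated reads of the same item are served from memory, which is why the pattern suits read-heavy workloads and gives little benefit to write-heavy ones.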