DynamoDB Questions Flashcards
What is a row called in DynamoDB?
Item
Do items need to share the same number of attributes?
No
Are there columns in DynamoDB?
No
Can each item in a table be expressed as a tuple?
Yes
What is the maximum size of an item?
400 KB
Can it be provisioned to scale on the fly?
Yes
Is this data model suited for storing serialized data?
Yes
Is this data model suited for storing messaging data in distributed systems?
Yes
Are attributes in name/value pairs?
Yes
What is used to perform DDL operations?
The DynamoDB API
What is used to perform DML operations?
object-oriented code
Does DynamoDB support SQL?
No
Can data be queried?
Yes, programmatically through the API
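As a concrete illustration of querying programmatically: a minimal sketch of the parameter dict a DynamoDB Query request takes. The table and attribute names here are invented for illustration; with boto3 a dict like this would be passed to `client.query`.

```python
def build_query_params(table, hash_key_name, hash_key_value,
                       range_key_name=None, range_prefix=None):
    """Assemble the parameter dict for a DynamoDB Query request.

    A query must always supply the hash key; a condition on the
    range key (here, a begins_with prefix match) is optional.
    """
    params = {
        "TableName": table,
        "KeyConditionExpression": f"{hash_key_name} = :h",
        "ExpressionAttributeValues": {":h": {"S": hash_key_value}},
    }
    if range_key_name and range_prefix:
        params["KeyConditionExpression"] += f" AND begins_with({range_key_name}, :r)"
        params["ExpressionAttributeValues"][":r"] = {"S": range_prefix}
    return params
```

Because there is no SQL layer, every lookup is expressed this way: key conditions against the primary key or an index, not free-form predicates.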
What data can it store well?
cookie states, session data, application state, personalization data, access control data, JSON, sensor and log data.
What type of latency can be expected?
single digit millisecond
Is this suited for horizontal scaling?
Yes
How would you scale horizontally on an RDBMS?
Sharding
What are the two consistency models supported?
Strong consistency and eventual consistency
If you choose strong consistency, can you use a standard EBS?
No, you must use provisioned IOPS
What is strong consistency?
The updates of one user must be seen by all other users. Similar to ACID consistency.
Should classic entity-relationship model be implemented in DynamoDB?
No, data should be denormalized and flattened.
Are ad-hoc queries suitable?
No
Are OLAP queries suitable?
No, use Redshift for OLAP
Are BLOBS suitable?
No, but you can store pointers to S3. Although binary objects can be stored, the 400 KB item limit makes it impractical.
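The pointer pattern looks like this in practice: the blob lives in S3 and the item carries only its location. A minimal sketch; the bucket, key, and attribute names are invented for illustration.

```python
def blob_pointer_item(item_id, bucket, key):
    """Build a DynamoDB item that points at a blob stored in S3
    rather than embedding it, sidestepping the 400 KB item limit."""
    return {
        "Id": {"S": item_id},
        "BlobLocation": {"S": f"s3://{bucket}/{key}"},
    }
```

The application reads the item first, then fetches the blob from S3 using the stored URI.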
Is there a concept such as a table join in DynamoDB?
No.
If you wanted to run the equivalent of a join, what would you need to do?
Use EMR and Hive
Does DynamoDB require a primary key?
Yes
What must the primary key contain?
A hash key, and optionally a range key.
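The hash-plus-optional-range structure maps directly onto a table definition. A minimal sketch of the KeySchema and AttributeDefinitions a CreateTable request expects; the attribute names are invented, and with boto3 these lists would be passed to `client.create_table`.

```python
def build_key_schema(hash_key, range_key=None, key_type="S"):
    """Build the KeySchema and AttributeDefinitions for a CreateTable
    request: the HASH element is mandatory, the RANGE element optional."""
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    attr_defs = [{"AttributeName": hash_key, "AttributeType": key_type}]
    if range_key:
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
        attr_defs.append({"AttributeName": range_key, "AttributeType": key_type})
    return key_schema, attr_defs
```

Only the key attributes are declared up front; all other attributes remain schemaless.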
When is a range key required?
When the hash key alone is not sufficient to uniquely identify an item.
When migrating to DynamoDB what should be done to the data model?
The data model should be denormalized.
What can you tell me about the ideal hash key?
It is uniformly distributed across the items in a table.
Should reference data be used as a hash key?
Generally, no because it is not uniformly distributed
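A common workaround when the natural key is skewed (a handful of hot reference values) is to append a calculated suffix so writes spread across partitions. A minimal sketch under that assumption; the function name and shard count are illustrative.

```python
import hashlib

def sharded_hash_key(natural_key, shard_count=10):
    """Spread a skewed natural key across shard_count synthetic hash keys.

    A deterministic hash keeps the mapping stable, so a reader can
    recompute the same suffix when fetching the item back.
    """
    digest = hashlib.sha256(natural_key.encode()).hexdigest()
    shard = int(digest, 16) % shard_count
    return f"{natural_key}#{shard}"
```

The trade-off is that reading "all items for a value" now requires querying each shard suffix, so this suits write-heavy, read-light keys.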
How does DynamoDB search for data that is not in the primary key?
It uses local indices, global indices, and scans.
How does a local secondary index differ from a primary index?
It uses the same hash key as defined on the table, but a different attribute as the range key.
How does a global secondary index differ?
It can use any scalar attribute as the hash key or range key.
How many local secondary indices can you have?
5
How many global secondary indices can you have?
5
What are global secondary indices similar to in the RDBMS world?
Covering indices, which contain the data they need within the index itself.
How are reads in local secondary indices different from global secondary indices?
Reads on global secondary indices are always eventually consistent, whereas local secondary indices support eventual or strong consistency
What is a “read unit”?
The amount of data read from a DynamoDB table or index; one read unit covers an item of up to 4 KB.
What is a “write unit?”
The amount of data written to a DynamoDB table or index; one write unit covers an item of up to 1 KB.
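Those unit sizes translate directly into capacity arithmetic: a read rounds the item size up to a multiple of 4 KB, a write to a multiple of 1 KB, and an eventually consistent read costs half as much as a strongly consistent one. A minimal sketch (the function names are illustrative):

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read unit = one strongly consistent read up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write unit = one write up to 1 KB

def read_units(item_size_bytes, strongly_consistent=True):
    """Read units consumed by a single read of an item of the given size."""
    units = math.ceil(item_size_bytes / READ_UNIT_BYTES)
    return units if strongly_consistent else units / 2

def write_units(item_size_bytes):
    """Write units consumed by a single write of an item of the given size."""
    return math.ceil(item_size_bytes / WRITE_UNIT_BYTES)
```

For example, a 6 KB item costs 2 strongly consistent read units but only 1 eventually consistent read unit.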
Why are the read and write units important?
They determine the provisioned I/O of the table.
What is “throttling”?
DynamoDB’s rejection of read/write operations that exceed the table’s provisioned I/O.
Why is throttling to be avoided?
Because it degrades performance and causes requests to fail with exceptions.
Can provisioned I/O be altered on an existing table?
Yes
How do a traditional RDBMS and DynamoDB vary in terms of I/O?
DynamoDB can scale its provisioned I/O up or down, whereas an RDBMS has fixed I/O
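Scaling the provisioned I/O of an existing table is a single UpdateTable call. A minimal sketch of the parameters it takes; the table name is invented, and with boto3 this dict would be passed to `client.update_table`.

```python
def build_update_throughput_params(table, read_units, write_units):
    """Assemble UpdateTable parameters to change a table's provisioned I/O."""
    return {
        "TableName": table,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }
```

This is the operational lever behind "scale on the fly": raise the units ahead of a traffic spike, lower them afterwards to cut cost.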
What are the five phases for migrating?
Planning, Data Analysis, Data Modeling, Testing, Migration
What is the purpose of the planning phase?
Identify the data migration goals
What is the purpose of the data analysis phase?
Understand the composition of the source data and identify the data access patterns
What is the purpose of the modeling phase?
Define the indices and tables that will be needed in DynamoDB.
In what phase should the data be profiled?
In the data analysis phase, in order to understand the data distribution needed in the modeling phase.
Why is the data access pattern important?
It will determine the provisioned I/O, performance and cost.
What are common patterns for data access?
Write-only, fetches by distinct values, and queries across a range of values.
Should a randomly generated number ID be used as a hash key?
Yes, because randomly generated IDs are uniformly distributed.
What elements are needed to calculate DynamoDB costs?
Number of Items, item size, write units, read units
What scalar types are supported by DynamoDB?
String, number, binary
DynamoDB DAX
Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second. DAX does all the heavy lifting required to add in-memory acceleration to your DynamoDB tables, without requiring developers to manage cache invalidation, data population, or cluster management. Now you can focus on building great applications for your customers without worrying about performance at scale. You do not need to modify application logic, since DAX is compatible with existing DynamoDB API calls. You can enable DAX with just a few clicks in the AWS Management Console or using the AWS SDK. Just as with DynamoDB, you only pay for the capacity you provision.