Developing Flexible NoSQL Solutions with Amazon DynamoDB Flashcards
What is Amazon RDS
Amazon Relational Database Service (Amazon RDS): Provides relational database services in the cloud with support for the following database engines: • Amazon Aurora • PostgreSQL • MySQL • MariaDB • Oracle • Microsoft SQL Server
What is Amazon Redshift
Fast, fully managed data warehouse.
Amazon Redshift provides custom JDBC and ODBC drivers to enable you to use familiar SQL client tools.
Amazon Redshift also includes Redshift Spectrum, allowing you to directly run SQL queries against exabytes of unstructured data in Amazon S3.
What is Amazon DynamoDB
NoSQL database, fully managed, that supports both document and key-value store models.
Provides low-latency queries and fine-grained access control.
What is Amazon Neptune
Fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets.
A graph database is ideal when you need to create relationships between data and quickly query these relationships.
Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
What is Amazon ElastiCache for Redis
ElastiCache for Redis: Redis is a fast, open source, in-memory key-value data store for use as a database, cache, message broker, and queue.
It is a popular choice for caching, session management, real-time analytics, geospatial, chat/messaging, media streaming, and gaming leaderboards.
What is Amazon Aurora
Aurora is a relational database engine that is fully managed by Amazon Relational Database Service (Amazon RDS).
It has been designed to be compatible with MySQL and with PostgreSQL.
DynamoDB key concepts
- Data are stored in tables.
- A table contains items with attributes.
- DynamoDB stores data in partitions and divides a table’s items into multiple partitions based on the partition key (or hash attribute) value.
- A sort key (or range attribute) can be defined to store all of the items with the same partition key value physically close together and order them by sort key value in the partition.
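The key concepts above can be sketched with a small in-memory model. This is only an illustration, not how DynamoDB is implemented: the class name ToyTable, the fixed NUM_PARTITIONS value, and the use of SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib
from bisect import insort
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical fixed count; DynamoDB manages partitions itself


def partition_for(partition_key: str) -> int:
    """Hash the partition key value to pick a partition (illustrative hash only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class ToyTable:
    """In-memory model: partition index -> partition key -> rows sorted by sort key."""

    def __init__(self):
        self.partitions = defaultdict(lambda: defaultdict(list))

    def put(self, pk: str, sk: str, attributes: dict) -> None:
        rows = self.partitions[partition_for(pk)][pk]
        # Same (pk, sk) means same item: drop the old copy before inserting.
        rows[:] = [row for row in rows if row[0] != sk]
        insort(rows, (sk, attributes))  # keeps rows ordered by sort key

    def items_for(self, pk: str) -> list:
        rows = self.partitions[partition_for(pk)].get(pk, [])
        return [attributes for _, attributes in rows]


table = ToyTable()
table.put("user#1", "2024-03-01", {"total": 30})
table.put("user#1", "2024-01-15", {"total": 12})
table.put("user#2", "2024-02-01", {"total": 7})
print(table.items_for("user#1"))  # items come back ordered by sort key
```

All items sharing a partition key land in the same partition, and within it they stay ordered by sort key regardless of insertion order.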
DynamoDB: content of an item
An item is a collection of attributes. Unlike a relational database, DynamoDB is not constrained by a pre-defined schema.
Attributes can have one of the following data types:
• Scalar types: Number, String, Binary, Boolean, and Null
• Multi-valued types: String Set, Number Set, and Binary Set
• Document types: List and Map
You can store a JSON-formatted document as an item. An item can be a maximum of 400 KB in size.
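As a rough illustration of these data types, the sketch below converts plain Python values into the type-tagged form that DynamoDB uses for attribute values (S, N, B, BOOL, NULL, SS, NS, BS, L, M). The function names to_attribute_value and to_item are made up for this example; in practice an AWS SDK performs this conversion for you.

```python
import base64


def to_attribute_value(value):
    """Convert a Python value to a type-tagged DynamoDB attribute value."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # must come before numbers: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("sets must not be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": [base64.b64encode(v).decode("ascii") for v in sorted(value)]}
        raise TypeError("a set must hold only strings, only numbers, or only bytes")
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(document: dict) -> dict:
    """An item is a map of attribute names to attribute values."""
    return {name: to_attribute_value(v) for name, v in document.items()}


item = to_item({"name": "Ada", "age": 36, "active": True, "nick": None,
                "tags": {"a", "b"}, "scores": [1, "x"], "address": {"city": "Paris"}})
print(item)
```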
DynamoDB: the two types of primary keys
Partition primary key (or simple primary key): has a single attribute, the partition key. DynamoDB builds an unordered index on this primary key attribute. Each item in the table is uniquely identified by its partition key value.
Partition and sort primary key (or composite primary key): is made of two attributes. The first attribute is the partition key and the second is the sort key. DynamoDB builds an unordered index on the partition key attribute and a sorted index on the sort key attribute. Each item in the table is uniquely identified by the combination of its partition key and sort key values.
DynamoDB: the read consistency levels
Read consistency levels:
• Eventually consistent read may return slightly stale data if a read operation is performed immediately after a write operation.
• Strongly consistent read returns most up-to-date data.
• Transactional reads provide atomicity, consistency, isolation, and durability (ACID) guarantees.
DynamoDB: what are RCU & WCU
Read Capacity Unit (RCU): one strongly consistent read per second of an item up to 4 KB in size. Larger items consume one additional RCU per extra 4 KB. Eventually consistent reads use half the provisioned read capacity.
Write Capacity Unit (WCU): one write per second of an item up to 1 KB in size. Larger items consume one additional WCU per extra 1 KB.
DynamoDB returns an error if the provisioned throughput has been exceeded (ProvisionedThroughputExceededException).
Throughput per partition is the total provisioned throughput divided by the number of partitions
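The capacity arithmetic above can be worked through in a few lines. This is a minimal sketch under the rules stated on this card (4 KB read units, 1 KB write units, eventually consistent reads at half cost); the function names and the example workload are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU covers a read of up to 4 KB
WRITE_UNIT_BYTES = 1024      # one WCU covers a write of up to 1 KB


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed for a steady read rate at a given item size."""
    units_per_read = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed for a steady write rate at a given item size."""
    return max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES)) * writes_per_second


def throughput_per_partition(total_units, partitions):
    """Even split of provisioned throughput across partitions, as stated above."""
    return total_units / partitions


# Hypothetical workload: 6 KB items read 10 times per second,
# 1.5 KB items written 10 times per second.
strong = read_capacity_units(6 * 1024, 10)
eventual = read_capacity_units(6 * 1024, 10, strongly_consistent=False)
writes = write_capacity_units(1536, 10)
print(strong, eventual, writes)  # 20 10 20
```

A 6 KB item rounds up to two 4 KB read units, so 10 strongly consistent reads per second need 20 RCUs, and the same rate with eventually consistent reads needs 10.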
DynamoDB: what is a secondary index
A secondary index lets you query the data in the table using an alternate key, in addition to queries against the primary key.
When you create an index, you specify which attributes will be copied, or projected, from the base table to the index
DynamoDB: the two types of secondary index
A local secondary index is considered to be local because the index is located on the same table partition as the items that have a particular partition key. The partition key is the same as the table’s partition key. The sort key can be any scalar attribute.
A global secondary index is considered to be global because queries on this index can span all the data in a table, across all partitions. It can have a partition key and optional sort key that are different from the partition key and sort key of the original table.
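To make the idea of an alternate key and projected attributes concrete, here is a minimal in-memory sketch of building an index from base-table items. The function build_index and the sample orders are invented for illustration. The sketch assumes that items lacking the index key attributes are left out of the index and that the table's key attributes are always copied into it.

```python
from collections import defaultdict


def build_index(items, table_keys, index_pk, index_sk=None, projected=()):
    """Group items by an alternate partition key, keeping only projected attributes."""
    keep = set(table_keys) | {index_pk} | set(projected)
    if index_sk:
        keep.add(index_sk)

    index = defaultdict(list)
    for item in items:
        # Items without the index key attributes do not appear in the index.
        if index_pk not in item or (index_sk and index_sk not in item):
            continue
        index[item[index_pk]].append({k: v for k, v in item.items() if k in keep})

    if index_sk:
        for rows in index.values():
            rows.sort(key=lambda row: row[index_sk])  # order by the index sort key
    return dict(index)


orders = [
    {"order_id": "o1", "customer": "alice", "date": "2024-02-01", "status": "shipped", "total": 20},
    {"order_id": "o2", "customer": "bob", "date": "2024-01-10", "status": "open", "total": 5},
    {"order_id": "o3", "customer": "alice", "date": "2024-01-05", "status": "open", "total": 9},
    {"order_id": "o4", "date": "2024-01-07", "status": "open", "total": 1},  # no customer
]

by_customer = build_index(orders, table_keys=("order_id",), index_pk="customer",
                          index_sk="date", projected=("status",))
print(by_customer["alice"])
```

Here the base table is keyed by order_id, while the index answers "all orders for a customer, by date" without carrying the unprojected total attribute.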
DynamoDB: what are streams
An Amazon DynamoDB stream is an ordered flow of information about changes to a table.
For each modified item, the records appear in the stream in the order in which the changes occurred.
Streams scale by splitting data across shards.
DynamoDB: what are global tables
A global table is a collection of one or more DynamoDB tables, all owned by a single AWS account, identified as replica tables.
A replica table (or replica, for short) is a single DynamoDB table that functions as a part of a global table. Each replica stores the same set of data items.
DynamoDB: how concurrent updates are managed in global tables
DynamoDB global tables use a “last writer wins” reconciliation between concurrent updates.
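A "last writer wins" rule can be sketched as picking the write with the latest timestamp. The Write class, the region names, and the timestamps below are hypothetical, and how ties are broken is an assumption of this sketch rather than documented DynamoDB behavior.

```python
from dataclasses import dataclass


@dataclass
class Write:
    region: str       # replica that accepted the write
    timestamp: float  # time of the write, in seconds (illustrative)
    value: dict       # the item as written


def reconcile(writes):
    """Return the write that survives: the one with the latest timestamp.

    On equal timestamps max() keeps the first candidate; that tie-break
    is an assumption made for this sketch.
    """
    return max(writes, key=lambda w: w.timestamp)


concurrent = [
    Write("us-east-1", 1700000000.120, {"id": "42", "status": "open"}),
    Write("eu-west-1", 1700000000.450, {"id": "42", "status": "closed"}),
]
winner = reconcile(concurrent)
print(winner.region, winner.value)
```

The earlier write is silently discarded, which is why applications that cannot tolerate lost updates usually avoid writing the same item from several Regions at once.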
DynamoDB: how read consistency is managed for global tables
Strongly consistent reads must be served by a replica in the same Region as the one where the write was performed.
DynamoDB does not support strongly consistent reads across AWS regions.
Transactions are enabled for all single-region DynamoDB tables and are disabled on global tables by default.
DynamoDB: how backup & restore work
Amazon DynamoDB provides on-demand backup and restore capabilities. The backup is created asynchronously by applying all changes until the time of the request to the last full table snapshot.
Point-in-time recovery helps protect your Amazon DynamoDB tables from accidental write or delete operations.
With point-in-time recovery, you can restore that table to any point in time during the last 35 days. DynamoDB maintains incremental backups of your table.
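The 35-day window can be expressed as a simple date check. This is a sketch only: the function name can_restore_to is invented, and the enabled_at parameter reflects the assumption that a table cannot be restored to a moment before point-in-time recovery was turned on.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=35)  # recovery window stated above


def can_restore_to(target: datetime, now: datetime, enabled_at: datetime) -> bool:
    """True if `target` falls inside the point-in-time recovery window."""
    earliest = max(enabled_at, now - RETENTION)  # assumption: bounded by enable time
    return earliest <= target <= now


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
enabled_at = datetime(2024, 1, 1, tzinfo=timezone.utc)

print(can_restore_to(now - timedelta(days=10), now, enabled_at))  # inside the window
print(can_restore_to(now - timedelta(days=40), now, enabled_at))  # older than 35 days
```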
DynamoDB: how the API works
The Amazon DynamoDB API allows you to invoke the following types of operations from an application:
- Control operations
- Data operations
- Stream operations
The Amazon DynamoDB API is a low-level HTTP-based API. The API uses JavaScript Object Notation (JSON) as a wire protocol format. The AWS SDKs construct low-level DynamoDB API requests on your behalf and process the responses from DynamoDB.
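To show what a low-level request looks like, the sketch below builds the headers and JSON body for a GetItem call without sending anything. The table name Music and the key attribute Artist are placeholder values, and the helper build_get_item_request is invented for this example. A real request also needs an AWS Signature Version 4 Authorization header, which the SDKs add for you and which is omitted here.

```python
import json


def build_get_item_request(table_name: str, key: dict, consistent_read: bool = False):
    """Return (headers, body) for a low-level GetItem call; nothing is sent."""
    headers = {
        "Content-Type": "application/x-amz-json-1.0",
        "X-Amz-Target": "DynamoDB_20120810.GetItem",  # API version + operation name
        # The Authorization (SigV4) header is deliberately left out of this sketch.
    }
    body = json.dumps({
        "TableName": table_name,
        "Key": key,  # attribute values are type-tagged, e.g. {"S": "..."}
        "ConsistentRead": consistent_read,
    })
    return headers, body


headers, body = build_get_item_request(
    "Music", {"Artist": {"S": "No One You Know"}}, consistent_read=True
)
print(headers["X-Amz-Target"])
print(body)
```

The request would be sent as an HTTP POST to the regional DynamoDB endpoint; the operation is selected by the X-Amz-Target header rather than by the URL path.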
DynamoDB: API: difference between Query and Scan
The Query operation reads from a table or secondary index only the items that match the primary key specified in the key condition expression. It then refines the result set further based on the filter expression, if specified.
The Scan operation is similar to a Query operation except that the Scan operation reads all items from the table or index. The result set can be refined using a filter expression.
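The difference can be illustrated with a toy table that maps each partition key value to its items. Both functions below are simplified stand-ins (not SDK calls) that return the matching items together with the number of items read, to show that a filter expression trims the result set but not the amount of data read.

```python
def query(table, partition_key, filter_fn=None):
    """Read only the items under one partition key, then apply the filter."""
    candidates = table.get(partition_key, [])
    results = [i for i in candidates if filter_fn is None or filter_fn(i)]
    return results, len(candidates)


def scan(table, filter_fn=None):
    """Read every item in the table, then apply the filter."""
    candidates = [item for items in table.values() for item in items]
    results = [i for i in candidates if filter_fn is None or filter_fn(i)]
    return results, len(candidates)


# Hypothetical data: partition key value -> items stored under it.
table = {
    "alice": [{"status": "open"}, {"status": "shipped"}],
    "bob": [{"status": "open"}],
    "carol": [{"status": "open"}],
}


def is_open(item):
    return item["status"] == "open"


q_results, q_read = query(table, "alice", is_open)
s_results, s_read = scan(table, is_open)
print(len(q_results), q_read)  # 1 2
print(len(s_results), s_read)  # 3 4
```

The Query touches only the two items under "alice", while the Scan reads all four items before filtering.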
Comparing relational & non-relational databases for data storage
Relational: data is stored in tables that are related to each other through primary/foreign key relationships
Non-relational: supports wide-column stores, document stores, key-value stores and graph stores.
Comparing relational & non-relational databases for schema
Relational: schema defined in the beginning
Non-relational: does not have a fixed schema
Comparing relational & non-relational databases for querying
Relational: data is queried using SQL, which can allow for complex queries
Non-relational: data is queried by focusing on collections of documents
Comparing relational & non-relational databases for scalability
Relational: supports vertical scaling => a single server must be made more powerful
Non-relational: supports horizontal scaling => you can partition data across multiple (cheaper) servers
Comparing relational & non-relational databases for transactions
Relational: supports ACID transactions (atomicity, consistency, isolation, durability)
Non-relational: support of transactions varies
Comparing relational & non-relational databases for consistency
Relational: automatically supports strong data consistency due to ACID transactions
Non-relational: delivers high performance with eventual consistency (if needed, strong consistency can be used)
What is ElastiCache for Memcached
ElastiCache for Memcached: Memcached-compatible in-memory key-value store service that can be used as a cache or a data store.
It is a popular choice for web, mobile apps, gaming, Ad-Tech, and ecommerce.
DynamoDB: the write consistency levels
Write consistency levels:
• Standard
• Transactional writes provide atomicity, consistency, isolation, and durability (ACID) guarantees