DynamoDB Flashcards
What is DynamoDB?
DynamoDB is a fully managed NoSQL database. Only the primary key attributes must be declared up front; every other attribute is schemaless. It is designed to scale horizontally to handle very high traffic and data throughput, and it automatically distributes and replicates data across multiple servers for high availability and performance. On-demand backups and point-in-time recovery are available. Table classes are Standard and Infrequent Access (IA).
What are the basics of DynamoDB?
DynamoDB is made of tables
Each table has a Primary Key (must be decided at creation time)
Each table can have an unlimited number of items (= rows)
Maximum size of an item is 400 KB
What are the options for primary keys in DynamoDB?
Option 1: Partition Key only (HASH)
Partition key must be unique for each item
Option 2: Partition Key + Sort Key
The combination must be unique
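The two key options can be sketched as request parameters. This is a minimal sketch, assuming the dictionaries would later be passed to boto3's `create_table`; the helper `table_params`, the table names, and the attribute names are hypothetical, and all keys are assumed to be strings.

```python
def table_params(table_name, partition_key, sort_key=None):
    """Build keyword arguments in the shape DynamoDB's CreateTable call expects.

    Hypothetical helper for illustration; it only builds a dict and
    does not talk to AWS.
    """
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        # Option 2: the (partition key, sort key) pair must be unique.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",
    }


# Option 1: partition key only.
users = table_params("Users", "user_id")
# Option 2: partition key + sort key.
orders = table_params("Orders", "user_id", sort_key="order_id")
```

With AWS credentials configured, something like `boto3.client("dynamodb").create_table(**orders)` would create the table.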
What are the two types of secondary index in DynamoDB?
LSI – Local Secondary Index
* Keep the same partition key
* Select an alternative sort key
* Must be defined at table creation time
GSI – Global Secondary Index
* Change the partition key and, optionally, the sort key
* Can be defined after the table is created
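The difference between the two index types shows up in their key schemas. A minimal sketch, assuming a hypothetical Orders table keyed by `user_id` + `order_id`; the index names and attributes are invented, and the dictionaries follow the shapes used by DynamoDB's CreateTable and UpdateTable calls.

```python
# Key schema of the hypothetical base table.
table_key_schema = [
    {"AttributeName": "user_id", "KeyType": "HASH"},
    {"AttributeName": "order_id", "KeyType": "RANGE"},
]

# LSI: same partition key, alternative sort key.
# It has to be part of the CreateTable request.
lsi = {
    "IndexName": "by_order_date",
    "KeySchema": [
        {"AttributeName": "user_id", "KeyType": "HASH"},
        {"AttributeName": "order_date", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "ALL"},
}

# GSI: different partition key, optional sort key.
gsi = {
    "IndexName": "by_status",
    "KeySchema": [
        {"AttributeName": "status", "KeyType": "HASH"},
        {"AttributeName": "order_date", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "KEYS_ONLY"},
}

# A GSI can be added later through an UpdateTable request shaped like this.
add_gsi_request = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "status", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [{"Create": gsi}],
}


def partition_key(key_schema):
    """Return the name of the HASH attribute in a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


def shares_partition_key(table_schema, index_schema):
    """True when an index keeps the table's partition key, as an LSI must."""
    return partition_key(table_schema) == partition_key(index_schema)
```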
What are DynamoDB streams?
DynamoDB Streams lets you capture item-level changes (inserts, updates, and deletes) made to a DynamoDB table in near real time.
Use Cases:
Real-time Data Processing: DynamoDB Streams can be used to process data changes in real-time, enabling applications to react immediately to changes in the database.
Event-Driven Architecture: It facilitates building event-driven architectures where actions are triggered in response to changes in the database, enabling loosely coupled and scalable systems.
Change Tracking and Auditing: DynamoDB Streams can be used to track and audit changes made to the data in DynamoDB tables for compliance, monitoring, or analytics purposes.
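A common way to consume a stream is a Lambda function. The sketch below assumes the standard Lambda event shape for DynamoDB Streams (`Records`, `eventName`, `dynamodb.Keys`); the function names and the sample event are made up for illustration.

```python
def describe_changes(event):
    """Return one (event_name, keys) tuple per stream record in a Lambda event."""
    changes = []
    for record in event.get("Records", []):
        name = record["eventName"]         # "INSERT", "MODIFY" or "REMOVE"
        keys = record["dynamodb"]["Keys"]  # primary key in attribute-value form
        changes.append((name, keys))
    return changes


def handler(event, context=None):
    """Hypothetical Lambda entry point: log each change and report the count."""
    changes = describe_changes(event)
    for name, keys in changes:
        print(f"{name}: {keys}")
    return {"processed": len(changes)}


# Hand-written sample event with the same structure as a real one.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"user_id": {"S": "u1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"user_id": {"S": "u1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"user_id": {"S": "u2"}}}},
    ]
}
result = handler(sample_event)
```

Whether a record also carries `NewImage` or `OldImage` depends on the stream view type chosen for the table.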
What are DynamoDB Global Tables?
Global Tables let you create DynamoDB tables that are fully replicated across multiple AWS Regions. Each Region has its own independent replica of the table, and DynamoDB manages the replication automatically.
When an item is added, updated, or deleted in one Region, DynamoDB propagates the change to all other Regions in near real time, so that all replicas converge on the same data.
How and why would you use Amazon Kinesis Data Streams for DynamoDB?
You can use Kinesis Data Streams to capture item-level changes in DynamoDB
* Custom and longer data retention period (DynamoDB Streams keeps records for only 24 hours)
You can use Kinesis Data Firehose to deliver the data to S3, Redshift or OpenSearch
Or you can use Kinesis Data Analytics to create real time computations or analytics
How can we index objects using DynamoDB?
For example, if we store objects in S3, we can use a Lambda function to save the metadata of each object in DynamoDB and then query that data to answer questions like “total storage used by a customer”
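The idea can be sketched without touching AWS. This assumes the standard S3 event notification shape (`Records[].s3.bucket.name`, `s3.object.key`, `s3.object.size`) and a hypothetical key layout of `<customer>/<file>`; the function names and item attributes are invented.

```python
from collections import defaultdict
from urllib.parse import unquote_plus


def metadata_items(s3_event):
    """Turn an S3 event notification into one metadata item per object."""
    items = []
    for record in s3_event["Records"]:
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        items.append({
            # Assumption: the first path segment identifies the customer.
            "customer_id": key.split("/", 1)[0],
            "object_key": key,
            "bucket": s3["bucket"]["name"],
            "size_bytes": s3["object"]["size"],
        })
    return items


def total_storage_by_customer(items):
    """Aggregate stored metadata: total bytes per customer."""
    totals = defaultdict(int)
    for item in items:
        totals[item["customer_id"]] += item["size_bytes"]
    return dict(totals)


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads"}, "object": {"key": "acme/report+2024.pdf", "size": 1000}}},
        {"s3": {"bucket": {"name": "uploads"}, "object": {"key": "acme/logo.png", "size": 2500}}},
        {"s3": {"bucket": {"name": "uploads"}, "object": {"key": "globex/data.csv", "size": 700}}},
    ]
}
items = metadata_items(sample_event)
totals = total_storage_by_customer(items)
```

In a real Lambda function each item would be written to the table (for example with a `put_item` call) and the aggregation would run as a query against DynamoDB rather than over an in-memory list.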
What is DynamoDB DAX?
- DAX = DynamoDB Accelerator
- Seamless cache for DynamoDB, no application rewrite
- Writes go through DAX to DynamoDB
- Microsecond latency for cached reads & queries
- Solves the Hot Key problem (too many reads)
- 5 minutes TTL for cache by default
- Up to 10 nodes in the cluster
- Multi AZ (3 nodes minimum recommended for production)
- Secure (Encryption at rest with KMS, VPC, IAM, CloudTrail…)
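The caching behaviour described above (cached reads, writes passing through the cache to the table, a 5-minute TTL) can be modelled with a toy class. This is not the DAX client API: `ReadThroughCache` is a hypothetical in-memory stand-in, and a plain dict plays the role of the table.

```python
import time


class ReadThroughCache:
    """Toy model of a read-through / write-through cache with a TTL."""

    def __init__(self, table, ttl_seconds=300, clock=time.monotonic):
        self.table = table        # dict standing in for the DynamoDB table
        self.ttl = ttl_seconds    # 300 s mirrors the 5-minute default TTL
        self.clock = clock
        self._cache = {}          # key -> (value, expiry time)
        self.table_reads = 0      # how many reads actually reached the table

    def get(self, key):
        now = self.clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]       # cache hit: the table is not touched
        self.table_reads += 1     # cache miss or expired entry
        value = self.table.get(key)
        self._cache[key] = (value, now + self.ttl)
        return value

    def put(self, key, value):
        # Write-through: update the table, then refresh the cached copy.
        self.table[key] = value
        self._cache[key] = (value, self.clock() + self.ttl)


# Usage with a fake clock so expiry can be shown without waiting.
now = [0.0]
table = {"user#1": {"name": "Ana"}}
cache = ReadThroughCache(table, clock=lambda: now[0])

cache.get("user#1")               # miss: reads the table
cache.get("user#1")               # hit: served from the cache
reads_before_expiry = cache.table_reads

now[0] = 301.0                    # move past the 5-minute TTL
cache.get("user#1")               # expired: reads the table again
reads_after_expiry = cache.table_reads

cache.put("user#2", {"name": "Bo"})
cached_write = cache.get("user#2")  # served from the cache after the write
```

A hot key read thousands of times within the TTL reaches the table only once in this model, which is the effect the "Hot Key" bullet refers to.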
When would you use DAX vs ElastiCache?
Use DAX to cache reads of items and query results that come straight from DynamoDB. Use ElastiCache to store the results of computations or aggregations that your application performs on that data.
What is Amazon OpenSearch?
- Amazon OpenSearch Service is the new name of Amazon Elasticsearch Service
- Elasticsearch => OpenSearch
- Kibana => OpenSearch Dashboards
- Managed version of OpenSearch (open-source project, fork of Elasticsearch)
- Managed domains run on instances that you choose (a separate OpenSearch Serverless option also exists)
Use cases:
* Log Analytics
* Real Time application monitoring
* Security Analytics
* Full Text Search
* Clickstream Analytics
* Indexing
You can push records from DynamoDB Streams into OpenSearch using a Lambda function, and you can push CloudWatch Logs into it the same way
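The Lambda step for DynamoDB Streams can be sketched as a conversion from stream records to an OpenSearch `_bulk` request body (newline-delimited JSON). This is a minimal sketch: the helper names and index name are hypothetical, only the `S`, `N` and `BOOL` attribute types are handled, and the HTTP call to the domain is left out.

```python
import json


def plain(attr):
    """Convert one DynamoDB attribute value such as {"S": "x"} to a Python value."""
    (type_tag, value), = attr.items()
    if type_tag in ("S", "BOOL"):
        return value
    if type_tag == "N":
        # Numbers arrive as strings in the stream record.
        try:
            return int(value)
        except ValueError:
            return float(value)
    raise ValueError(f"unsupported attribute type in this sketch: {type_tag}")


def bulk_body(records, index_name):
    """Build the NDJSON body of a _bulk request from DynamoDB stream records."""
    lines = []
    for record in records:
        keys = record["dynamodb"]["Keys"]
        # Assumption: the document id is the key values joined with "|".
        doc_id = "|".join(str(plain(keys[name])) for name in sorted(keys))
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": {"_index": index_name, "_id": doc_id}}))
        else:
            # INSERT and MODIFY both re-index the full new image.
            image = record["dynamodb"]["NewImage"]
            doc = {name: plain(value) for name, value in image.items()}
            lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
            lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n" if lines else ""


sample_records = [
    {
        "eventName": "INSERT",
        "dynamodb": {
            "Keys": {"user_id": {"S": "u1"}, "order_id": {"S": "o9"}},
            "NewImage": {"user_id": {"S": "u1"}, "order_id": {"S": "o9"}, "total": {"N": "42"}},
        },
    },
    {
        "eventName": "REMOVE",
        "dynamodb": {"Keys": {"user_id": {"S": "u1"}, "order_id": {"S": "o8"}}},
    },
]
body = bulk_body(sample_records, "orders")
```

This relies on the stream view type including new images; with a keys-only stream the function would need to fetch the item before indexing it.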
What are the features of OpenSearch?
OpenSearch (ex Elasticsearch): provides search and indexing capability
* You must specify instance types, multi-AZ, etc
OpenSearch Dashboards (ex Kibana):
* Provides real-time dashboards on top of the data that sits in OpenSearch
* Alternative to CloudWatch dashboards (more advanced capabilities)
Logstash:
* Log ingestion mechanism, uses the “Logstash Agent”
* Alternative to CloudWatch Logs (you decide on retention and granularity)