DynamoDB Flashcards
Partition Key
Often referred to as the hash key, it determines the partition or storage location of the item based on its value.
Sort Key
Also known as the range key, it comes into play when items share the same partition key: it orders those items within the partition, which makes sorting and range-based querying efficient.
Partition Key and Sort Key Working Together
The partition key determines the partition in which the item will be stored, while the sort key organizes items within the partition.
Use Cases: Sorting items based on attributes such as timestamps or numerical values becomes seamless with the sort key, enabling targeted queries within specific partitions.
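The split of responsibilities between the two keys can be sketched locally. This is only an illustration: DynamoDB's real hash function and partition count are internal to the service, so the MD5 hash, `NUM_PARTITIONS`, and the `pk`/`sk` item fields below are assumptions made for the example.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count


def partition_for(partition_key: str) -> int:
    """Map a partition key to a partition index with a stable hash (illustrative)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


def store(items):
    """Place items by partition key, then order each partition by (pk, sk)."""
    partitions = defaultdict(list)
    for item in items:
        partitions[partition_for(item["pk"])].append(item)
    for bucket in partitions.values():
        bucket.sort(key=lambda it: (it["pk"], it["sk"]))
    return partitions


items = [
    {"pk": "store-1", "sk": "2024-03-01"},
    {"pk": "store-1", "sk": "2024-01-15"},
    {"pk": "store-2", "sk": "2024-02-10"},
]
layout = store(items)
```

Items with the same `pk` always land in the same bucket, and within it they are kept in `sk` order, which is what makes a targeted, sorted read within one partition cheap.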
Optimal Performance
DynamoDB is highly scalable and distributed, but it performs best when reads and writes are spread evenly across partitions and keys. Concentrating data or traffic on a single partition key value creates a bottleneck known as a “hot partition”, which can throttle requests and degrade performance. Design primary keys so that the load is distributed evenly and no single partition is overloaded.
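One common way to spread a busy key is to append a shard suffix to the partition key value. The sketch below assumes a hypothetical `SHARD_COUNT` and a `#` separator; neither comes from these notes, and the right shard count depends on the workload.

```python
import random

SHARD_COUNT = 10  # hypothetical number of suffixes per logical key


def sharded_key(base_key: str, shard_count: int = SHARD_COUNT) -> str:
    """Pick a random suffix so writes for one logical key spread across shards."""
    return f"{base_key}#{random.randrange(shard_count)}"


def all_shard_keys(base_key: str, shard_count: int = SHARD_COUNT) -> list:
    """List every shard key; a reader queries each one and merges the results."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


write_key = sharded_key("2024-03-01")
read_keys = all_shard_keys("2024-03-01")
```

The trade-off is on the read side: a reader has to issue one query per shard key and combine the results.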
Item Retrieval
The partition key is fundamental for item retrieval: every query must specify it with an exact match, and range conditions cannot be applied to the partition key itself. The sort key adds a second level of organization by distinguishing and ordering items that share the same partition key.
Attributes
Attributes accommodate various data types, from simple strings and numbers to complex structures like lists or maps, offering flexibility in data representation.
Key Condition Expression and Filter Expression
When querying DynamoDB, KeyConditionExpression specifies conditions on the partition key and, optionally, the sort key. FilterExpression applies additional conditions to non-key attributes. The main difference is that KeyConditionExpression determines which items are read and is therefore more efficient, while FilterExpression discards items after they have been read, so filtered-out items still consume read capacity.
import boto3
from boto3.dynamodb.conditions import Key

def query_discount_rules_for_store(store_id, date):
    # Exact match on the partition key (StoreID) and the sort key (Date)
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("RetailStoreRules")
    key_condition = Key("StoreID").eq(store_id) & Key("Date").eq(date)
    response = table.query(KeyConditionExpression=key_condition)
    return response.get("Items", [])
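To contrast the two expressions, here is a minimal sketch that builds parameters for the low-level Query operation as a plain dictionary. It reuses the RetailStoreRules table, StoreID, and Date from the example above; the `DiscountPercent` attribute, the assumption that Date is stored as an ISO string, and the `build_query` helper are hypothetical. A `#d` placeholder stands in for Date to avoid a clash with DynamoDB's reserved words.

```python
def build_query(store_id: str, start: str, end: str, min_discount: int) -> dict:
    """Build parameters for the low-level DynamoDB Query operation."""
    return {
        "TableName": "RetailStoreRules",
        # Evaluated against the keys: only matching items are read.
        "KeyConditionExpression": "StoreID = :sid AND #d BETWEEN :start AND :end",
        # Evaluated after the read: non-matching items are dropped from the
        # response but still count toward consumed read capacity.
        "FilterExpression": "DiscountPercent >= :min",
        "ExpressionAttributeNames": {"#d": "Date"},
        "ExpressionAttributeValues": {
            ":sid": {"S": store_id},
            ":start": {"S": start},
            ":end": {"S": end},
            ":min": {"N": str(min_discount)},  # numbers travel as strings
        },
    }


params = build_query("store-42", "2024-01-01", "2024-01-31", 10)
# With AWS credentials configured: boto3.client("dynamodb").query(**params)
```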
LSI
In the context of NoSQL databases like DynamoDB, a Local Secondary Index (LSI) is an index associated with a table that shares the same partition key as the main table but has a different sort key. This means that data is organized differently in the index compared to the main table, enabling efficient queries based on the sort key of the LSI.
GSI
A Global Secondary Index (GSI) is an index with its own partition key and, optionally, its own sort key, both of which can differ from the main table's. Unlike an LSI, a GSI is not tied to the partition key of the main table, so queries on it can span all of the table's data.
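The difference shows up in the key schemas. Below is a sketch of a table definition in the shape accepted by `create_table`; the table name, attribute names, and index names are all hypothetical and chosen only for illustration.

```python
table_definition = {
    "TableName": "Orders",  # hypothetical
    "AttributeDefinitions": [
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "OrderDate", "AttributeType": "S"},
        {"AttributeName": "OrderTotal", "AttributeType": "N"},
        {"AttributeName": "Status", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "CustomerId", "KeyType": "HASH"},   # partition key
        {"AttributeName": "OrderDate", "KeyType": "RANGE"},   # sort key
    ],
    # LSI: same partition key as the table, different sort key.
    "LocalSecondaryIndexes": [{
        "IndexName": "ByTotal",
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderTotal", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    # GSI: its own partition key, optionally its own sort key.
    "GlobalSecondaryIndexes": [{
        "IndexName": "ByStatus",
        "KeySchema": [
            {"AttributeName": "Status", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "KEYS_ONLY"},
    }],
    "BillingMode": "PAY_PER_REQUEST",
}


def partition_key(key_schema):
    """Return the attribute name that plays the HASH (partition key) role."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")

# With AWS credentials configured: boto3.client("dynamodb").create_table(**table_definition)
```

Note that LSIs can only be declared when the table is created, while GSIs can also be added later.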
DynamoDB as a Document Storage DB
DynamoDB is called a document-oriented database because it can store and process semi-structured, JSON-like documents, supports flexible schema design, and enables hierarchical data modeling with Maps and Lists.
user_data = {
    "UserId": "12345",
    "Name": "John Doe",
    "Email": "john.doe@example.com",
    "Address": {
        "Street": "123 Elm St",
        "City": "Los Angeles",
        "State": "CA",
        "ZipCode": "90001"
    },
    "Orders": [
        {
            "OrderId": "A1001",
            "Product": "Laptop",
            "Price": 1200,
            "Status": "Shipped"
        },
        {
            "OrderId": "A1002",
            "Product": "Mouse",
            "Price": 25,
            "Status": "Delivered"
        }
    ]
}
# Insert the document into DynamoDB (table is a boto3 Table resource keyed on UserId)
table.put_item(Item=user_data)
Each top-level element of the document is stored as an attribute. UserId serves as the primary key, assuming the table defines it as the partition key. The values map to these DynamoDB types:
String (S)
Number (N)
Boolean (BOOL)
Map (M) → Used to represent nested JSON objects
List (L) → Used to represent arrays
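The type mapping above can be sketched as a small recursive converter that produces DynamoDB's typed wire format. boto3's resource layer performs this conversion automatically; the `to_attribute_value` helper here is a hypothetical, simplified stand-in that covers only the five types listed.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Convert a Python value into DynamoDB's typed attribute-value form."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers are transmitted as strings
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    raise TypeError(f"unsupported type: {type(value).__name__}")


encoded = to_attribute_value({
    "UserId": "12345",
    "Address": {"City": "Los Angeles"},
    "Orders": [{"OrderId": "A1001", "Price": 1200}],
    "Active": True,
})
```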
Querying
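DynamoDB document paths use dot notation for map attributes (and [n] for list elements), for example in a ProjectionExpression, so a read can return only the nested value it needs.
response = table.get_item(Key={"UserId": "12345"}, ProjectionExpression="Address.City")
print(response["Item"]["Address"]["City"])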
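To make the path syntax concrete, the following sketch resolves a document path against a local Python dictionary. It is not an AWS API; `resolve_path` is a hypothetical helper that only mimics how a path such as `Orders[1].Status` walks through maps and lists.

```python
import re


def resolve_path(item: dict, path: str):
    """Follow a path such as 'Orders[1].Status' through nested dicts and lists."""
    current = item
    for part in path.split("."):
        match = re.fullmatch(r"([^\[\]]+)((?:\[\d+\])*)", part)
        if match is None:
            raise ValueError(f"invalid path segment: {part!r}")
        name, indexes = match.groups()
        current = current[name]                      # step into a map attribute
        for index in re.findall(r"\[(\d+)\]", indexes):
            current = current[int(index)]            # step into a list element
    return current


item = {
    "UserId": "12345",
    "Address": {"City": "Los Angeles", "State": "CA"},
    "Orders": [
        {"OrderId": "A1001", "Status": "Shipped"},
        {"OrderId": "A1002", "Status": "Delivered"},
    ],
}
city = resolve_path(item, "Address.City")
status = resolve_path(item, "Orders[1].Status")
```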
Cassandra vs DynamoDB - Document Storage difference
Feature             | DynamoDB                     | Cassandra
JSON-like storage   | ✅ Yes (Maps, Lists)          | ✅ Yes (UDTs, Maps, Lists)
Schema flexibility  | ✅ No schema required         | ❌ Needs predefined schema
Query performance   | ✅ Fast (indexed attributes)  | ✅ Fast (wide-column store)
Nested object query | ✅ Yes (dot notation)         | ❌ No direct access (must fetch full field)
Secondary indexes   | ✅ Global/local indexes       | ✅ Supports secondary indexes
DynamoDB vs Cassandra - when to use
DynamoDB is more flexible for JSON-heavy applications, while Cassandra is better for high-write workloads with structured queries.
In Cassandra, queries must fetch full fields (e.g., the whole Orders list), while DynamoDB allows partial retrieval.
Range Query in DynamoDB
Yes. Amazon DynamoDB supports range queries, but only on the sort key of a composite primary key (partition key + sort key).
The partition key must be specified with an exact match.
The sort key can then be constrained with one of these operators:
=, <, <=, >, >=
BETWEEN
begins_with (for string sort keys)
IN is not supported in a key condition; it is only available in filter expressions.
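As a rough sketch of how such a condition is written, the helper below assembles a key condition string: an exact match on the partition key plus one sort key condition. The `key_condition` function, the `:pk`/`:val`/`:lo`/`:hi`/`:prefix` placeholders, and the UserId/OrderDate attribute names are hypothetical.

```python
SORT_KEY_OPERATORS = {"=", "<", "<=", ">", ">=", "BETWEEN", "begins_with"}


def key_condition(pk_name: str, sk_name: str, operator: str) -> str:
    """Build a KeyConditionExpression string for a range query on the sort key."""
    if operator not in SORT_KEY_OPERATORS:
        raise ValueError(f"operator not allowed in a key condition: {operator}")
    if operator == "BETWEEN":
        sort_part = f"{sk_name} BETWEEN :lo AND :hi"
    elif operator == "begins_with":
        sort_part = f"begins_with({sk_name}, :prefix)"
    else:
        sort_part = f"{sk_name} {operator} :val"
    # The partition key is always an equality test.
    return f"{pk_name} = :pk AND {sort_part}"


between = key_condition("UserId", "OrderDate", "BETWEEN")
prefix = key_condition("UserId", "OrderDate", "begins_with")
```

The resulting string is passed as KeyConditionExpression, with the placeholder values supplied through ExpressionAttributeValues.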