DocumentDB Flashcards
Overview of DocDB
- fully managed (non-relational) document DB for MongoDB workloads
- JSON documents (nested key-value pairs) stored in collections (~tables)
- compatible with the majority of MongoDB apps, drivers, and tools
- high performance, scalability, and availability
- support for flexible indexing, powerful ad hoc queries, and analytics
- storage and compute can scale independently
- supports 15 low latency read replicas (multi-az)
- auto scaling of storage from 10GB to 64TB
- fault-tolerant and self-healing storage
- automatic, continuous, incremental backups and PITR (point-in-time recovery)
T
docdb stores JSON documents (semi structured data)
T
key-value pairs can be nested
T
why docdb?
- JSON is the de facto format for data exchange
- documentDB makes it easy to insert, query, index, and perform aggregations over JSON data
- store JSON output from APIs straight into DB and start analyzing it
- flexible document model, data types, and indexing
- add/remove indexes easily
- run ad hoc queries for operational and analytics workloads
- for known access patterns, use DynamoDB instead
T
DocDB architecture
___ copies of your data across ___ AZs
6, 3
DocDB Architecture
_____ optimistic algorithm (quorum model)
lock-free
docdb architecture
___ copies out of 6 needed for writes (__/6 write quorum: data is considered durable when at least __/6 copies acknowledge the write)
4
docdb architecture
___ copies out of 6 needed for reads (__/6 read quorum)
3
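The quorum numbers on the cards above (6 copies, 4 for writes, 3 for reads) can be checked with a small model. This is only a study-aid sketch, not how DocumentDB's storage layer is implemented; the even split of 2 copies per AZ and all function names are assumptions made for illustration.

```python
# Study-aid model of the 6-copy quorum; not DocumentDB's actual storage code.
TOTAL_COPIES = 6    # copies of the data (from the cards)
WRITE_QUORUM = 4    # acknowledgements needed for a durable write
READ_QUORUM = 3     # copies needed to serve a read
COPIES_PER_AZ = 2   # assumption: 6 copies spread evenly over 3 AZs


def write_is_durable(acks: int) -> bool:
    """A write counts as durable once at least 4 of 6 copies acknowledge it."""
    return acks >= WRITE_QUORUM


def read_can_proceed(reachable_copies: int) -> bool:
    """A read needs at least 3 of 6 copies to be reachable."""
    return reachable_copies >= READ_QUORUM


def quorums_overlap(n: int = TOTAL_COPIES, w: int = WRITE_QUORUM,
                    r: int = READ_QUORUM) -> bool:
    """W + R > N means every read set shares a copy with every write set."""
    return w + r > n


def copies_left(azs_lost: int, extra_copies_lost: int = 0) -> int:
    """Copies still reachable after losing whole AZs plus individual copies."""
    return max(0, TOTAL_COPIES - azs_lost * COPIES_PER_AZ - extra_copies_lost)


if __name__ == "__main__":
    print("quorums overlap:", quorums_overlap())
    print("writes after losing one AZ:", write_is_durable(copies_left(1)))
    print("reads after losing one AZ plus one copy:",
          read_can_proceed(copies_left(1, 1)))
```

Under the 2-copies-per-AZ assumption, losing one AZ still leaves a write quorum, and losing one AZ plus one more copy still leaves a read quorum.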
docdb architecture
self healing with ____ replication, storage is striped across 100s of volumes
peer to peer
docdb architecture
___ docdb instance takes writes (master)
1
___ nodes on replicas do not need to write/replicate (=improved read performance)
compute
docdb architecture
log-structured distributed storage layer - passes ___ log records from compute to storage layer (=faster)
incremental
docdb architecture
master + up to ___ read replicas serve reads
15
docdb architecture
data is continuously backed up to __ in real time, using storage nodes (compute node performance is unaffected)
s3
docdb cluster
- recommended to connect using the cluster endpoint in replica set mode (enables your SDK to auto discover the cluster arrangement as instances get added or removed from the cluster)
T
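A minimal sketch of building a connection string for the cluster endpoint in replica set mode. The endpoint, user name, and password below are hypothetical placeholders, and the `rs0` replica set name and `secondaryPreferred` read preference are assumptions to verify against your own cluster before use.

```python
from urllib.parse import quote_plus, urlencode


def build_cluster_uri(user: str, password: str, cluster_endpoint: str,
                      port: int = 27017, replica_set: str = "rs0",
                      read_preference: str = "secondaryPreferred") -> str:
    """Build a MongoDB URI that targets the cluster endpoint in replica set mode.

    The replicaSet option lets the driver discover instances as they are
    added to or removed from the cluster.
    """
    options = urlencode({
        "replicaSet": replica_set,          # assumed replica set name
        "readPreference": read_preference,  # send reads to replicas when possible
    })
    # Percent-encode credentials so characters like '@' or ':' stay valid.
    return (f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
            f"@{cluster_endpoint}:{port}/?{options}")


if __name__ == "__main__":
    # Hypothetical endpoint and credentials, for illustration only.
    print(build_cluster_uri(
        "appuser", "p@ss",
        "example-cluster.cluster-abc123.us-east-1.docdb.amazonaws.com"))
```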
docdb replication
- up to ___ read replicas
15
docdb replication
____ replication
ASYNC
docdb replication
replicas share the same underlying storage layer
T
docdb replication
replication lag is typically 10s of milliseconds
T
docdb replication
minimal performance impact on the primary due to replication process
T
docdb HA failovers
-failovers occur automatically
T
docdb HA failovers
-a replica is automatically promoted to be the new primary during DR
T
docdb HA failovers
docdb flips the CNAME of the DB instance to point to the replica and promotes it
T
docdb HA failovers
failover to a replica typically takes 30 seconds (minimal downtime)
T
docdb HA failovers
creating a new instance takes about 8-10 minutes (post failover)
T
docdb backup and restore
- supports automatic backups
T
docdb backup and restore
continuously backs up your data to s3 for PITR (max retention period of __ days)
35
docdb backup and restore
latest restorable time for a PITR can be up to 5 minutes in the past
T
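The PITR cards above (retention of up to 35 days, latest restorable time up to 5 minutes in the past) can be turned into a quick window check. This is a sketch under assumptions: the function names are hypothetical, the 1-day minimum retention is assumed, and the 5-minute lag is treated as the worst case.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35                  # from the cards
MAX_RESTORE_LAG = timedelta(minutes=5)   # latest restorable time may trail by up to this


def restorable_window(now: datetime, retention_days: int = MAX_RETENTION_DAYS):
    """Return (earliest, latest) restore points, using the worst-case lag."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        # Assumption: retention is configured between 1 and 35 days.
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = now - MAX_RESTORE_LAG
    return earliest, latest


def can_restore_to(target: datetime, now: datetime,
                   retention_days: int = MAX_RETENTION_DAYS) -> bool:
    """True if target falls inside the conservative PITR window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


if __name__ == "__main__":
    now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
    print(can_restore_to(now - timedelta(days=10), now))    # inside the window
    print(can_restore_to(now - timedelta(minutes=1), now))  # may be too recent
    print(can_restore_to(now - timedelta(days=40), now))    # needs a manual snapshot
```

Anything older than the retention window is only recoverable from a manual snapshot.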
docdb backup and restore
- the first backup is a full backup
- subsequent backups are ____
incremental
docdb backup and restore
take manual snapshots to retain beyond 35 days
T
docdb backup and restore
backup process does not impact cluster performance
T
docdb backup and restore
- can only restore to a new cluster
T
docdb backup and restore
can restore an unencrypted snapshot to an encrypted cluster (but not the other way round)
T
docdb backup and restore
to restore a cluster from an encrypted snapshot, you must have access to the KMS key
T
docdb backup and restore
can only share manual snapshots (to share an automated snapshot, copy it first and share the copy)
T
docdb backup and restore
can’t share a snapshot encrypted using the default KMS key of the account
T
docdb backup and restore
snapshots can be shared across accounts, but only within the same region
T
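The snapshot-sharing rules from the last three cards can be collected into one check. This is a minimal sketch: the `Snapshot` fields and the `share_blockers` helper are hypothetical names for illustration, not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    kind: str                   # "manual" or "automated"
    encrypted: bool
    uses_default_kms_key: bool  # encrypted with the account's default KMS key
    region: str


def share_blockers(snapshot: Snapshot, target_region: str) -> List[str]:
    """Return the reasons a snapshot cannot be shared as-is (empty list = OK)."""
    reasons = []
    if snapshot.kind != "manual":
        reasons.append("automated snapshot: copy it first, then share the copy")
    if snapshot.encrypted and snapshot.uses_default_kms_key:
        reasons.append("encrypted with the default KMS key")
    if snapshot.region != target_region:
        reasons.append("sharing works only within the same region")
    return reasons


if __name__ == "__main__":
    snap = Snapshot(kind="automated", encrypted=True,
                    uses_default_kms_key=True, region="us-east-1")
    for reason in share_blockers(snap, target_region="us-west-2"):
        print("-", reason)
```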
docdb scaling
- MongoDB sharding not supported (instead offers read replicas/vertical scaling/storage scaling)
- vertical scaling (scale up/down) by resizing instances
- horizontal scaling (scale out/in) by adding/removing up to 15 replicas
- can scale up a replica independently from other replicas (typically for analytical workloads)
T
docdb security - IAM and Network
- you use IAM to manage docdb resources
T
docdb security - IAM and Network
supports mongodb default auth ____ for db authentication
SCRAM (Salted Challenge Response Authentication Mechanism)
docdb security - IAM and Network
supports built in roles for DB users with ____
RBAC (role-based access control)
docdb security - IAM and Network
docdb clusters are VPC only (use private subnets)
T
docdb security - IAM and Network
clients (mongodb shell) can run on ec2 in public subnets within the VPC
T
docdb security - IAM and Network
can connect to your on prem IT infra via VPN
T
Docdb security - encryption
- encryption at rest - AES-256 using KMS
- applied to cluster data/replicas/indexes/logs/backups/snapshots
- encryption in transit - using TLS
- to enable TLS, set the tls parameter in the cluster parameter group
- to connect over TLS:
- download the cert (public key) from AWS
- pass the cert key while connecting to the cluster
T
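As a rough sketch of the "pass the cert while connecting" step, the helper below appends TLS options to an existing MongoDB URI. The `tls` and `tlsCAFile` URI options come from the MongoDB connection string format; the certificate file name and the endpoint are placeholders, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def with_tls(uri: str, ca_file: str) -> str:
    """Append TLS options to a MongoDB URI.

    ca_file is the local path of the certificate bundle downloaded from AWS.
    """
    tls_options = urlencode({"tls": "true", "tlsCAFile": ca_file})
    # Use '&' if the URI already carries query options, otherwise start one.
    separator = "&" if "?" in uri else "?"
    if "?" not in uri and not uri.endswith("/"):
        uri += "/"  # the URI format expects a '/' before the query string
    return f"{uri}{separator}{tls_options}"


if __name__ == "__main__":
    # Hypothetical endpoint and bundle file name, for illustration only.
    base = "mongodb://appuser:secret@example.cluster-abc123.us-east-1.docdb.amazonaws.com:27017/?replicaSet=rs0"
    print(with_tls(base, "global-bundle.pem"))
```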
docdb pricing
- on demand instances - pricing per second with a 10 min minimum
- IOPS - per million IO requests
- each DB page read operation from the storage volume counts as one IO (one page = 8KB)
- write IOs are counted in 4KB units
- DB storage - per GB per month
- backups - per GB per month (backups up to 100% of your cluster's data storage are free)
- data transfer - per GB
- can temporarily stop compute instances for up to 7 days
T
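The billing rules on the pricing card can be worked through with a few helper functions. This is a sketch for practicing the arithmetic only: the function names are hypothetical and the price passed in the example is a made-up placeholder, not an AWS rate.

```python
import math

MIN_BILLED_SECONDS = 10 * 60  # per-second billing with a 10-minute minimum
WRITE_UNIT_KB = 4             # write IOs are counted in 4KB units


def billed_seconds(run_seconds: int) -> int:
    """Seconds charged for an on-demand instance that ran for run_seconds."""
    if run_seconds <= 0:
        return 0
    return max(run_seconds, MIN_BILLED_SECONDS)


def read_ios(pages_read: int) -> int:
    """Each 8KB page read from the storage volume counts as one IO."""
    return pages_read


def write_ios(kb_written: float) -> int:
    """Writes are metered in 4KB units, rounded up to whole units."""
    return math.ceil(kb_written / WRITE_UNIT_KB)


def billable_backup_gb(backup_gb: float, cluster_storage_gb: float) -> float:
    """Backup storage up to 100% of the cluster's data storage is free."""
    return max(0.0, backup_gb - cluster_storage_gb)


def io_cost(io_count: int, price_per_million: float) -> float:
    """IO charge for a given count at a per-million-requests price."""
    return io_count / 1_000_000 * price_per_million


if __name__ == "__main__":
    print(billed_seconds(90))            # short run still pays the minimum
    print(write_ios(10))                 # 10KB written, metered in 4KB units
    print(billable_backup_gb(120, 100))  # only the excess over cluster storage
    print(io_cost(2_500_000, 0.20))      # 0.20 is a placeholder price
```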
docdb monitoring
API calls logged with ____
cloudtrail
docdb monitoring
common CW metrics
- CPU or RAM utilization - CPUUtilization/FreeableMemory
- IOPS metrics - VolumeReadIOPs/VolumeWriteIOPs/WriteIOPS/ReadIOPS
- Database connections - DatabaseConnections
- Network Traffic - NetworkThroughput
- Storage volume consumption - VolumeBytesUsed
T
docdb monitoring
- two types of logs can be published/exported to CW logs
- profiler logs
- audit logs
T
docdb profiler
- logs (into CW logs) the details of ops performed on your cluster
- helps identify slow operations and improve query performance
- accessible from CW logs
- to enable:
- set parameters profiler, profiler_threshold_ms, and profiler_sampling_rate
- enable logs exports for profiler logs by modifying the cluster
- both the steps above are mandatory
T
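To see how the two tuning parameters interact, here is a toy simulation of which operations would end up in the profiler log. It is a sketch under assumptions, not DocumentDB's logic: it assumes an operation qualifies when it runs longer than `profiler_threshold_ms` and that `profiler_sampling_rate` is the fraction of qualifying operations that get logged. The default values and the function name are illustrative.

```python
import random
from typing import Callable


def would_be_profiled(duration_ms: float,
                      threshold_ms: float = 100,    # illustrative default
                      sampling_rate: float = 1.0,   # 1.0 = log every slow op
                      rng: Callable[[], float] = random.random) -> bool:
    """Toy model: log an op if it is slow enough and selected by sampling."""
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0.0 and 1.0")
    if duration_ms <= threshold_ms:
        return False  # fast operations are never logged
    # rng() returns a float in [0.0, 1.0); compare it with the sampling rate.
    return rng() < sampling_rate


if __name__ == "__main__":
    durations = [20, 150, 900, 95, 400]  # made-up operation times in ms
    logged = [d for d in durations if would_be_profiled(d, threshold_ms=100)]
    print(logged)
```

Raising the threshold or lowering the sampling rate reduces log volume, at the cost of missing some slow operations.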
docdb audit logs
- records DDL statements, authentication, authorization, and user mgmt events to CW logs
- exports your cluster’s auditing records (JSON docs) to CW logs
- accessible from CW logs
- to enable:
- set parameter audit_logs=enabled
- enable logs exports for audit logs by modifying the cluster
T
docdb performance management
- use the explain command to identify slow queries
db.runCommand({explain: {<query>}})
- can use db.adminCommand to find and terminate queries
example: to terminate long-running/blocked queries
db.adminCommand({killOp: 1, op: <opid>});
T
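The find-and-terminate step can be sketched as a filter over a list of in-progress operations that builds one killOp command document per long-running operation. The sample operations are made up, and the `opid` and `secs_running` field names are assumed to match what a currentOp-style listing returns; actually sending the commands would need a driver connection, which is left out here.

```python
from typing import Dict, List


def kill_commands(current_ops: List[Dict], max_secs: int) -> List[Dict]:
    """Build killOp command documents for ops running longer than max_secs."""
    return [
        {"killOp": 1, "op": op["opid"]}
        for op in current_ops
        if op.get("secs_running", 0) > max_secs
    ]


if __name__ == "__main__":
    # Hypothetical sample of in-progress operations.
    sample_ops = [
        {"opid": 101, "secs_running": 3, "ns": "shop.orders"},
        {"opid": 102, "secs_running": 95, "ns": "shop.orders"},
        {"opid": 103, "secs_running": 240, "ns": "shop.events"},
    ]
    for command in kill_commands(sample_ops, max_secs=60):
        print(command)
```

Each resulting document has the same shape as the argument passed to db.adminCommand in the card above.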