Database Specialty - RDS Flashcards

1
Q

Instance Class Types in RDS

A

Standard, Memory-optimized, Burstable performance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Storage Type

A

General Purpose, Provisioned IOPS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Parameter groups important points

A

> Define configuration values specific to the select DB engine
Default parameter group cannot be edited
To make config changes, you must create a new parameter group
New parameter group inherits settings from the default parameter group
Can be applied to any DB instances within the AWS region

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Restoring from a snapshot points

A
  • By default, restored cluster gets applied with
    • New security group
    • Default parameter group
    • Option group that was associated with the snapshot
  • While restoring from a snapshot, be sure to
    • Choose the correct security group to
      ensure connectivity for the restored DB
    • Choose correct parameter group for the
      restored DB
    • Recommended to retain parameter group
      of the snapshot to help restore with the
      correct parameter group
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

PITR with RDS

A
  • Point-In-Time Recovery
  • Can only restore to a new instance
  • The backup retention period controls the
    PITR window
  • Can restore to any point in time during your
    backup retention period
  • RDS uploads DB transaction logs to S3 every
    5 minutes (affects latest restorable time)
  • You can move/restore a DB instance from
    outside VPC to inside VPC with PITR (but not
    other way round)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Exporting DB Snapshot Data to S3

A
  • All types of backups can be exported (automatic/manual
    or those created with AWS Backup service)
  • How to export?
    • Setup an S3 bucket with appropriate IAM
      permissions and create a KMS key for SSE
    • Export the snapshot using console (Actions à Export to Amazon S3) or using start-export-task CLI command
  • Export runs in the background
  • Doesn’t affect the DB performance
  • Data exported in Apache Parquet format (=compressed and consistent)
  • Allows you to analyze the DB data using Athena or Redshift Spectrum
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Multi-AZ Deployments in RDS

A
  • For high availability, data durability and fault-tolerance (not used for scaling)
  • Offers SYNC replication to standby instance in another AZ over low latency links
  • Performs automatic failover to standby instance in another AZ in case of planned or unplanned outage
  • Uses DNS routing to point to the new master (no need to update connection strings)
  • Failover times (RTO) are typically 60-120 seconds (minimal downtime)
  • Backups are taken from standby instead of primary to ensure performance level
    during backup activity
  • Recommended for production use cases
  • To force a failover or simulate AZ-failure, reboot the master instance and choose Reboot with failover
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

RDS Read Replicas

A
  • Read-only copies of master (primary) DB instance
  • Up to 5 Read Replicas
  • Within AZ, Cross AZ or Cross Region
  • Replication is ASYNC, so reads are eventually
    consistent
  • Applications must update the connection string to leverage read replicas
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

RDS Read Replicas 2

A
  • Boost DB performance
    and durability
  • Useful for scaling of read- heavy workloads
  • Can be promoted to primary (complements
    Multi-AZ)
  • To create a replica, you must enable automatic backups with at least one
    day retention period
  • Replica can be Multi-AZ (= a replica with its own standby instance)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

RDS Read Replicas as Multi-AZ

A
  • Supported for MySQL / MariaDB / PostgreSQL / Oracle
  • Works as a DR target. When promoted to primary, it works as Multi-AZ
  • There’s added network cost when data goes from one AZ to another
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

RDS Read Replicas – Use Case

A
  • You have a production database that is
    taking on normal load
  • You want to run a reporting application
    to run some analytics
  • You create a Read Replica to run the new
    workload there
  • The production application is unaffected
  • Read replicas are used for SELECT (=read) only kind of statements (not INSERT, UPDATE, DELETE)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Promoting a Read Replica to a Standalone DB
Instance

A
  • Promoted instance is rebooted and becomes an independent DB instance
    (separate from its source)
  • Will no longer work as a replica. Does not affect other replicas of the original DB
    instance
  • You cannot promote a replica to a standalone instance while a backup is running
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Promoting a Read Replica to a Standalone DB
Instance – Use cases

A
  • Use as a DR strategy
  • Avoid performance penalty of DDL operations (like rebuilding indexes)
  • Perform DDL ops on a read replica and promote it to a standalone instance. Then
    point your app to this new instance.
  • Sharding (splitting a large DB
    into multiple smaller DBs)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Enabling writes on a read replica

A
  • For MySQL / MariaDB read replica, set
    the parameter read_only = 0 for the read replica to make it writable
  • You can then perform DDL operations on the read replica as needed without affecting the source DB
  • Actions taken on the read replica don’t
    affect the performance of the source DB instance
  • You can then promote the replica to a standalone DB
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

RDS Read Replica capabilities

A
  • Can create multiple read replicas in quick succession
  • Can use DB snapshot to perform PITR of a Read Replica
  • Can create a replica from an existing replica
  • reduces replication load from the master DB instance
  • second-tier replica can have higher replication lag
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Cross-Region Read Replicas in RDS

A
  • Supported for MariaDB, MySQL, Oracle, and PostgreSQL
  • Not supported for SQL Server
  • Advantages
  • Enhanced DR capability
  • Scale read operations closer to the
    end-users
  • Limitations
  • Higher replica lag times
  • AWS does not guarantee more than
    five cross-region read replica instances
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

RDS replicas with an external database

A
  • Replication b/w an external DB and an
    RDS replica
  • Supported for MySQL / MariaDB engines
  • Two ways
  • Binlog replication
  • GTID based Replication
18
Q

RDS Disaster Recovery Strategies

A
  • To ensure business continuity despite
    unexpected failures/events
  • Multi-AZ is not enough (it can’t protect from
    logical DB corruption, malicious attacks etc.)
  • Key metrics for DR plan – RTO and RPO
  • RDS PITR offers RPO of 5 minutes (typically)
  • RTO (Recovery time objective)
  • How long it takes you to recover after a disaster
  • Expressed in hours
  • RPO (Recovery point objective)
  • How much data you could lose due to a disaster
  • Expressed in hours (e.g. RPO of 1 hour means you could lose an hour worth of data)
19
Q

Comparing RDS DR Strategies Automated backups

A

RTO - Good
RPO - Better
Cost - Low
Scope - Single region

20
Q

Comparing RDS DR Strategies Manual snapshots

A

RTO - Better
RPO - Good
Cost - Medium
Scope - Cross-Region

21
Q

Comparing RDS DR Strategies Read replicas

A

RTO - Best
RPO - Best
Cost - High
Scope - Cross-Region

22
Q

RDS Disaster Recovery Strategies

A
  • To ensure business continuity despite
    unexpected failures/events
  • Multi-AZ is not enough (it can’t protect from
    logical DB corruption, malicious attacks etc.)
  • Key metrics for DR plan – RTO and RPO
  • RDS PITR offers RPO of 5 minutes (typically)
  • RTO (Recovery time objective)
  • How long it takes you to recover after a disaster
  • Expressed in hours
  • RPO (Recovery point objective)
  • How much data you could lose due to a disaster
  • Expressed in hours (e.g. RPO of 1 hour means you could
    lose an hour worth of data)
23
Q

Troubleshooting high replica lag

A
  • Asynchronous logical replication typically results in replica lag
  • You can monitor ReplicaLag metrics in
    CloudWatch
  • ReplicaLag metric reports Seconds_Behind_Master values
  • Replication delays can happen due to:
  • Long-running queries on the primary instance (slow query log can help)
  • Insufficient instance class size or storage
  • Parallel queries executed on the primary instance
24
Q

Troubleshooting replication errors - Recommendations:

A
  • Size the replica to match the source DB (storage size and DB instance class)
  • Use compatible DB parameter group settings for source DB and replica
  • Ex. max_allowed_packet for read replica must same as that of the source DB
    instance
  • Monitor the Replication State field of the replica instance
  • If Replication State = Error, then see error details in the Replication Error field
  • Use RDS event notifications to get alerts on such replica issues
25
Q

Troubleshooting replication errors (contd.)

A
  • Writing to tables on a read replica
  • Set read_only=0 to make read replica writable
  • Use only for maintenance tasks (like creating indexes only on replica)
  • If you write to tables on read replica, it might make it incompatible with source DB and break the replication
  • So set read_only=1 immediately after completing maintenance tasks
  • Replication is only supported with transactional storage engines like InnoDB. Using
    engines like MyISAM will cause replication errors
  • Using unsafe nondeterministic queries such as SYSDATE() can break replication
  • You can either skip replication errors (if its not a major one) or delete and recreate
    the replica
26
Q

Troubleshooting MySQL read replica issues

A
  • Errors or data inconsistencies b/w source instance and replica
    • Can happen due to binlog events or InnoDB redo logs aren’t flushed during a
      replica or source instance failure
    • Must manually delete and recreate the replica
  • Preventive recommendations:
    • sync_binlog=1
    • innodb_flush_log_at_trx_commit=1
    • innodb_support_xa=1
  • These settings might reduce performance (so test before moving to production)
27
Q

Performance hit on new read replicas

A
  • RDS snapshots are EBS snapshots stored in S3
  • When you spin up a new replica, its EBS volume loads lazily in the background
  • This results in first-touch penalty (when you query any data, it takes longer to retrieve it for the first time)
  • Suggestions:
  • If DB is small, run “SELECT * FROM <table>” query on each table on the replica
  • Initiate a full table scan with VACUUM ANALYZE (in PostgreSQL)
  • Another reason could be an empty buffer pool (cache for table and index data)
28
Q

Scaling in RDS

A
  • Vertical Scaling (Scaling up)
    • Single-AZ instance will be unavailable during scaling op
    • Multi-AZ setup offers minimal downtime during scaling op – standby DB gets upgraded first and then primary will failover to the upgraded instance
  • Horizontal Scaling (Scaling out)
    • Useful for read-heavy workloads
    • Use read-replicas
    • Replicas also act as a DR target
29
Q

Sharding in RDS

A
  • Sharding = horizontal partitioning
  • Split and distribute data across
    multiple DBs (called shards)
  • Mapping / routing logic
    maintained at application tier
  • Offers additional fault tolerance
    (since no single point of failure)
  • If any shard goes through failover,
    other shards are not impacted
30
Q

RDS Monitoring - common metrics, native logs, manual monitoring tools

A
  • Common metrics
    • CPU, RAM, disk space consumption /
      Network traffic / DB connections / IOPS
      metrics
  • Native logs / extensions
    • e.g. pgaudit extension in PostgreSQL for
      auditing (DML / DCL / DDL etc)
  • Manual Monitoring Tools
    • RDS console (DB connections, R/W ops,
      storage consumption, memory
      utilization, N/W traffic)
    • AWS Trusted Advisor (cost optimization,
      security, fault tolerance, performance
      improvement checks)
    • CloudWatch (service health status etc.)
31
Q

RDS Monitoring - Automated Monitoring Tools

A
  • RDS event notifications
  • Database logs (can be exported to
    CloudWatch Logs)
  • CloudWatch (Metrics / Alarms / Logs)
  • Enhanced Monitoring (real-time)
  • Performance Insights
  • RDS Recommendations
  • CloudTrail (captures all RDS API calls, can
    be viewed in CloudTrail console or
    delivered to an S3 bucket)
  • Up to 90 days of your account activity can
    be viewed in CloudTrail console (can create
    a trail to deliver the audit logs to S3)
32
Q

RDS Notifications / Event subscriptions

A
  • Available within the RDS console
  • Allows you to create CloudWatch alarms to notify you
    whenever certain metric data crosses a threshold
  • You can send alarm notifications to an SNS topic (email
    / SMS)
  • You can also subscribe to RDS events
  • Event sources can be snapshots, instances, security
    groups, parameter groups, clusters, cluster snapshots,
    etc.
  • Events like DB instance creation, deletion, availability
    (shutdown / restart), backup, recovery, failover, failure,
    backtrack, config change etc.
33
Q

RDS Recommendations

A
  • Periodic automated suggestions for DB instances, read replicas, and DB parameter
    groups
34
Q

RDS Logs

A
  • View / watch / download DB logs from the RDS console
  • Can export logs to CloudWatch Logs (log types vary by DB engine)
  • CloudWatch Logs never expire. To expire them, set log group retention policy (1 day - 10 yrs)
  • Logs are accessible from RDS console even if you disable log export to CloudWatch Logs
35
Q

RDS Logs - Log types that can be exported to
CloudWatch Logs - topics

A
  • Alert log – Oracle
  • Audit log – Oracle, MariaDB, MySQL (must use option group with MARIADB_AUDIT_PLUGIN option for MariaDB and MySQL to audit database activity)
  • Listener log – Oracle
  • Trace log – Oracle
  • Error log – SQL Server, MariaDB, MySQL
  • Postgresql log – PostgreSQL (contains audit
    logs)
  • Upgrade log – PostgreSQL
  • General log – MariaDB, MySQL
  • Slow query log – MariaDB, MySQL
36
Q

Exporting AWS RDS logs to S3

A
  • RDS database log files can be accessed
    via RDS console, CLI or API
  • Transaction logs cannot be accessed
  • You can export log data from
    CloudWatch Logs to S3 by creating an
    export task in CloudWatch (createexport-task CLI command)
  • Log files can also be downloaded using
    the RDS API and uploaded to S3 (using
    Lambda or AWS SDK)
37
Q

RDS Enhanced Monitoring

A
  • To analyze real-time OS level metrics
    (CPU / memory usage etc.)
  • To monitor different processes or
    threads that are using the CPU
  • Helps identify performance issues
  • Increased granularity of 1 to 60
    seconds * 1, 5, 10, 15, 30, or 60 seconds
  • Requires an agent to be installed on the
    DB server to collect metrics
38
Q

RDS Performance Insights

A
  • Offers visual dashboard for performance
    tuning, analysis and monitoring
  • Monitors DB load for the instance (if the
    instance has multiple DBs, you’ll see
    aggregated metrics)
  • DB load – average number of active
    sessions (AAS – average active sessions)
  • Performance problems will appear as
    spikes in the DB load graph
  • Helps identify performance bottlenecks,
    expensive SQL statements, etc.
39
Q

RDS Performance Insights - DB load

A
  • You can visualize the DB load,
    filter it by waits / SQL / hosts /
    users
  • Waits - wait state for CPU, IO, Lock
    etc.
  • SQL – SQL statements
  • Hosts
  • Users
  • Identify slow queries (top SQL),
    locks
40
Q

CloudWatch Application Insights

A
  • For .NET and SQL Server
  • Also supports DynamoDB tables
  • Identifies and sets up key metrics, logs, and alarms for SQL Server workloads
  • Uses CloudWatch events and alarms
  • Useful for problem detection, notification and troubleshooting
41
Q

RDS on VMware

A
  • Lets you deploy RDS DBs in on-premises VMware environments (VMware vSphere)
  • Same user interface as in AWS
  • Supports MySQL, PostgreSQL, and SQL Server
  • Fully managed DBs
  • Uses health monitoring to detect unhealthy
    database instances and automatically recovers them
  • Support manual and automatic backups with PITR
  • Can use CloudWatch for monitoring
42
Q

RDS – Good things to know

A
  • Read replica can be made writable (for MySQL/MariaDB)
  • For other engines
    • read replica cannot be made writable, but you can promote it to make it writable
  • For Oracle and SQL Server
    • Automatic backups or manual snapshots are not supported on the replica
  • For Oracle
    • Does not yet support Oracle RAC (a cluster DB with a shared cache architecture)
  • For SQL Server
    • Supports both Multi-AZ options – Database Mirroring and Always On
  • For PostgreSQL
    • Only manual snapshots are supported on the replica (no automatic backups)
    • Set log retention with parameter rds.log_retention_period
  • For MySQL/MariaDB
    • Set log retention with stored procedures * e.g. call mysql.rds_set_configuration(‘binlog reten