Database Flashcards
RDS Features
- Hardware, OS, and database software deployment and maintenance
- Built-in monitoring
- Data encryption at rest and in transit
- Industry compliance
- Automatic Multi-AZ data replication
- Compute and storage scaling
- Minimal application downtime
RDS DB Engines?
- Amazon Aurora
- PostgreSQL
- MySQL
- MariaDB
- Oracle Database
- SQL Server
Amazon RDS Multi-AZ deployments
Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone. The standby is not used to serve read traffic; it exists only for failover.
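A minimal boto3 sketch of enabling Multi-AZ at instance creation; the identifiers, instance class, and credentials below are placeholder assumptions.

```python
import boto3

rds = boto3.client("rds")

# MultiAZ=True provisions a synchronous standby in a second
# Availability Zone; failover to it is automatic.
rds.create_db_instance(
    DBInstanceIdentifier="my-mysql-db",   # hypothetical name
    DBInstanceClass="db.t3.medium",
    Engine="mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me",       # placeholder credential
    AllocatedStorage=20,
    MultiAZ=True,
)
```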
Read replicas
You can create read replicas of your database that are kept in sync with the primary DB instance.
Read replica DB engines
Read replicas are available in Amazon RDS for Aurora, MySQL, MariaDB,
PostgreSQL, Oracle, and Microsoft SQL Server.
Read replica use cases
- Relieve pressure on your primary node with additional read capacity.
- Bring data close to your applications in different AWS Regions.
- Promote a read replica to a standalone instance as a disaster recovery (DR) solution if the primary DB instance fails.

With Amazon RDS for MySQL and MariaDB, you can also set the read replica as Multi-AZ and use it as a DR target (see the sketch below).
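A minimal boto3 sketch of both operations; the instance identifiers are hypothetical, and replication to the replica is asynchronous.

```python
import boto3

rds = boto3.client("rds")

# Create a read replica of an existing primary instance.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="my-mysql-db-replica",  # hypothetical name
    SourceDBInstanceIdentifier="my-mysql-db",
)

# DR: promote the replica to a standalone, writable instance.
rds.promote_read_replica(DBInstanceIdentifier="my-mysql-db-replica")
```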
RDS Data Encryption at Rest
RDS provides encryption of data at rest by using the AWS Key Management Service (AWS KMS).
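A minimal sketch of requesting encryption at creation time; the instance details and the KMS key alias are hypothetical.

```python
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="my-encrypted-db",  # hypothetical name
    DBInstanceClass="db.t3.medium",
    Engine="postgres",
    MasterUsername="admin",
    MasterUserPassword="change-me",          # placeholder credential
    AllocatedStorage=20,
    StorageEncrypted=True,                   # encrypt data at rest with KMS
    KmsKeyId="alias/my-rds-key",             # hypothetical KMS key alias
)
```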
Amazon Aurora
Amazon Aurora is an enterprise-class relational database that is compatible with MySQL and PostgreSQL and is faster than standard MySQL and PostgreSQL databases. Aurora helps reduce your database costs by reducing unnecessary I/O operations. Consider Aurora if your workloads require high availability: it replicates six copies of your data across three Availability Zones and continuously backs up your data to Amazon Simple Storage Service (Amazon S3).
Amazon Aurora DB cluster
An Amazon Aurora DB cluster consists of one or more DB instances and a cluster volume that manages the data for those DB instances. The cluster volume is a virtual database storage volume that spans multiple Availability Zones, and each Availability Zone has a copy of the DB cluster data.
Aurora offers two instance types?
- Primary instance – Supports read and write operations and performs all the data modifications to the cluster volume. Each Aurora DB cluster has one primary instance.
- Aurora replica – Supports read operations only. Each Aurora DB cluster can have up to 15 Aurora replicas in addition to the primary instance. Multiple Aurora replicas distribute the read workload, and you can increase availability by locating Aurora replicas in separate Availability Zones. You can have a read replica in the same Region as the primary instance (see the sketch below).
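A minimal boto3 sketch of a cluster with one writer and one reader; all identifiers, the instance class, and the credentials are placeholder assumptions.

```python
import boto3

rds = boto3.client("rds")

# The cluster owns the shared cluster volume.
rds.create_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",  # hypothetical name
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me",           # placeholder credential
)

# The first instance becomes the primary (writer); each
# additional instance joins as an Aurora replica (reader).
for name in ("writer-1", "reader-1"):
    rds.create_db_instance(
        DBInstanceIdentifier=f"my-aurora-{name}",
        DBClusterIdentifier="my-aurora-cluster",
        DBInstanceClass="db.r6g.large",
        Engine="aurora-mysql",
    )
```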
Aurora Serverless v2
Aurora Serverless v2 is an on-demand, auto scaling configuration for Amazon Aurora. It automates the processes of monitoring the workload and adjusting the capacity for your databases based on demand.
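A minimal boto3 sketch; the identifiers and capacity bounds are hypothetical. Capacity is measured in Aurora capacity units (ACUs).

```python
import boto3

rds = boto3.client("rds")

rds.create_db_cluster(
    DBClusterIdentifier="my-sv2-cluster",  # hypothetical name
    Engine="aurora-postgresql",
    MasterUsername="admin",
    MasterUserPassword="change-me",        # placeholder credential
    # Aurora scales between these bounds based on demand.
    ServerlessV2ScalingConfiguration={"MinCapacity": 0.5, "MaxCapacity": 16},
)

# Serverless v2 instances use the special db.serverless class.
rds.create_db_instance(
    DBInstanceIdentifier="my-sv2-instance",
    DBClusterIdentifier="my-sv2-cluster",
    DBInstanceClass="db.serverless",
    Engine="aurora-postgresql",
)
```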
DynamoDB
A fully managed NoSQL database service. The service manages the complexity of running a scalable, distributed NoSQL database.
What do you pay for?
You pay for the storage that you consume and the I/O throughput that you provision.
DynamoDB Structure
- Data is structured in key-value pairs and partitioned by key.
- Each table has a name and a partition key.
- Each table consists of items (rows, uniquely identified by the primary key) and attributes (columns, or key-value pairs).
DynamoDB Features
DynamoDB supports end-to-end encryption and fine-grained access control.
Concurrent read-writes
DynamoDB replicates table data across three Availability Zones in a Region
Reads are eventually consistent across all storage locations (typically within about a second) by default, but you can request strongly consistent reads.
DynamoDB capacity configuration
You provision capacity based on the storage and throughput requirements. If you choose auto scaling, additional capacity is provisioned when the required I/O throughput increases, within limits that you set. The on-demand option lets an application grow seamlessly to support its users' concurrent requests to the database.
DynamoDB primary Keys
- Simple primary key – composed of just one attribute, which is designated as the partition key. No two items can have the same partition key value.
- Composite primary key – composed of both a partition key and a sort key. In this case, the partition key value for multiple items can be the same, but their sort key values must be different (see the sketch below).
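A minimal boto3 sketch of a table with a composite primary key; the table and attribute names are hypothetical.

```python
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="Orders",  # hypothetical table
    AttributeDefinitions=[
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "OrderDate", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "CustomerId", "KeyType": "HASH"},  # partition key
        {"AttributeName": "OrderDate", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand capacity
)
```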
How does DynamoDB measure read and write capacity
- RCU (read capacity unit) – one strongly consistent read request per second (or two eventually consistent reads) for an item up to 4 KB.
- WCU (write capacity unit) – one write request per second for an item up to 1 KB (see the worked example below).
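A worked example of the arithmetic, using hypothetical item sizes and request rates:

```python
import math

# 1 RCU = one strongly consistent 4 KB read per second.
item_kb, reads_per_sec = 7, 50
rcus = math.ceil(item_kb / 4) * reads_per_sec   # ceil(7/4) = 2 -> 100 RCUs

# 1 WCU = one 1 KB write per second.
item_kb, writes_per_sec = 2.5, 30
wcus = math.ceil(item_kb / 1) * writes_per_sec  # ceil(2.5) = 3 -> 90 WCUs

print(rcus, wcus)  # 100 90
```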
How can you manage capacity of DynamoDB
Two options:
- On-demand – pay per request for reads and writes.
- Provisioned – set maximum RCUs and WCUs; when traffic exceeds those limits, requests are throttled.
What do you have to consider?
- The expected number of read/write requests per second
- The size of the requests
When is on-demand DynamoDB capacity mode best?
- You have unknown workloads.
- You have unpredictable traffic.
- You prefer to pay for only what you use.
When is provisioned DynamoDB capacity mode best?
- You have predictable application traffic.
- You have traffic that is consistent or changes gradually.
- You can forecast capacity requirements to control costs.

A sketch of configuring each mode follows.
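A minimal boto3 sketch; the table name and throughput figures are hypothetical.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Provisioned mode: set the RCU/WCU ceilings up front.
dynamodb.create_table(
    TableName="Metrics",  # hypothetical table
    AttributeDefinitions=[{"AttributeName": "Id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "Id", "KeyType": "HASH"}],
    BillingMode="PROVISIONED",
    ProvisionedThroughput={"ReadCapacityUnits": 100, "WriteCapacityUnits": 90},
)

# Once the table is ACTIVE, it can be switched to on-demand.
dynamodb.update_table(TableName="Metrics", BillingMode="PAY_PER_REQUEST")
```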
DynamoDB Eventually Consistent
When you read data from a DynamoDB table, the response might not reflect the results of a recently completed
write operation. The response might include some stale data. If you repeat your read request after a short time,
the response should return the latest data.
Strongly consistent reads
When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, which reflects updates from all prior successful write operations. A strongly consistent read might not be
available if a network delay or outage occurs.
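A minimal boto3 sketch of requesting each consistency level, reusing the hypothetical Orders table from above.

```python
import boto3

table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

key = {"CustomerId": "c-1", "OrderDate": "2024-01-01"}

# Default: eventually consistent (may briefly return stale data).
eventual = table.get_item(Key=key)

# Strongly consistent: reflects all prior successful writes.
strong = table.get_item(Key=key, ConsistentRead=True)
```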
DynamoDB global tables
A global table is a collection of one or more DynamoDB tables, all owned by a single AWS account, that are identified as replica tables. Any given global table can have only one replica table per Region.
DynamoDB global tables provide a fully managed solution for deploying a multi-Region, multi-active database. You specify the Regions, and DynamoDB communicates changes between replicas over the AWS network backbone.
DynamoDB replica
A single DynamoDB table that functions as part of a global table. Each replica stores the same set of data items, and every replica has the same table name and the same primary key schema.
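A minimal boto3 sketch of adding a replica Region to the hypothetical Orders table; it assumes the current (2019.11.21) global tables version with DynamoDB Streams already enabled on the table.

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Create a replica table in a second Region; DynamoDB keeps the
# replicas in sync over the AWS network backbone.
dynamodb.update_table(
    TableName="Orders",  # hypothetical table
    ReplicaUpdates=[{"Create": {"RegionName": "eu-west-1"}}],
)
```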
Cache Strategies
- Lazy loading – on a cache miss (data not in cache), retrieve the data from the database and write it to the cache (sketched below).
- Write-through – the application writes data to both the database and the cache whenever it writes data.
- TTL (time to live) – a key expiration value configured on the cache.
- Eviction policy – when the cache is full, evict data based on a policy that typically evaluates TTL and the least recently or least frequently used keys.
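A minimal sketch of lazy loading with a TTL, assuming the redis-py client, a reachable ElastiCache endpoint, and a hypothetical db.load_user accessor.

```python
import json

import redis  # redis-py client

cache = redis.Redis(host="my-cache.example.amazonaws.com", port=6379)  # hypothetical endpoint

def get_user(user_id, db):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                   # cache hit
        return json.loads(cached)
    user = db.load_user(user_id)             # cache miss: read from the database
    cache.setex(key, 300, json.dumps(user))  # write to cache with a 300-second TTL
    return user
```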
Amazon ElastiCache
A web service that facilitates setting up, managing, and scaling a distributed in-memory data store or cache environment in the cloud.
ElastiCache supports two open-source in-memory engines (in memory, as compared to on disk):
- Redis
- Memcached
ElastiCache for Memcached
An ideal candidate for use cases where frequently accessed data must be in memory. It offers a simple caching model with multithreading and supports Auto Discovery.
ElastiCache for Memcached Features
- Simple cache to offload database burden
- Ability to scale horizontally for writes and storage
- Multi-AZ deployments
- Multithreaded performance
ElastiCache for Redis
Provides sub-millisecond latency at internet scale (fast).
ElastiCache for Redis Features
- Simple cache to offload database burden
- Ability to scale horizontally for writes and storage (when using cluster mode)
- Multi-AZ deployments
- Advanced data types
- Sorting and ranking of datasets
- Publish and subscribe capability
- Backup and restore
Amazon DynamoDB Accelerator (DAX)
A caching service compatible with DynamoDB that provides fast in-memory performance for demanding applications.
How do you configure Amazon DynamoDB Accelerator (DAX)
You create a DAX cluster in your Amazon VPC to store cached data closer to your application. You install a DAX client on the EC2 instance that is running your application in that VPC. At runtime, the DAX client directs all of your application’s DynamoDB requests to the DAX cluster. If DAX can process a request directly, it does so.
Otherwise, it passes the request through to DynamoDB.
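A sketch of the client-side wiring, assuming the amazon-dax-client Python package and a hypothetical cluster endpoint; the read API mirrors the plain boto3 DynamoDB resource.

```python
from amazondax import AmazonDaxClient  # assumes the amazon-dax-client package

# Point the client at the DAX cluster endpoint instead of DynamoDB itself.
dax = AmazonDaxClient.resource(
    endpoint_url="dax://my-cluster.abc123.dax-clusters.us-east-1.amazonaws.com"  # hypothetical
)
table = dax.Table("Orders")  # hypothetical table

# DAX serves cached reads itself and passes everything else
# through to DynamoDB.
item = table.get_item(Key={"CustomerId": "c-1", "OrderDate": "2024-01-01"})
```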
AWS Database Migration Service (AWS DMS)
what is it and how is it configured
AWS DMS replicates data from a source to a target database in the AWS Cloud. You create a source and a target connection to tell AWS DMS where to extract data from and where to load it. You then schedule a task that runs on a replication server to move your data. AWS DMS creates the tables and associated primary keys if they don't exist on the target.
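A hedged boto3 sketch of the moving parts; the endpoint details, credentials, replication instance ARN, and table-mapping rule are all placeholder assumptions.

```python
import boto3

dms = boto3.client("dms")

# Endpoints tell AWS DMS where to extract from and where to load to.
source = dms.create_endpoint(
    EndpointIdentifier="src-oracle", EndpointType="source", EngineName="oracle",
    ServerName="onprem.example.com", Port=1521,
    Username="dms_user", Password="change-me",  # placeholder credentials
)
target = dms.create_endpoint(
    EndpointIdentifier="tgt-postgres", EndpointType="target", EngineName="postgres",
    ServerName="mydb.example.rds.amazonaws.com", Port=5432,
    Username="dms_user", Password="change-me",
)

# The task runs on a replication instance (created separately) and moves the data.
dms.create_replication_task(
    ReplicationTaskIdentifier="full-load-task",
    SourceEndpointArn=source["Endpoint"]["EndpointArn"],
    TargetEndpointArn=target["Endpoint"]["EndpointArn"],
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE",  # hypothetical
    MigrationType="full-load",
    TableMappings=(
        '{"rules": [{"rule-type": "selection", "rule-id": "1", "rule-name": "1",'
        ' "object-locator": {"schema-name": "%", "table-name": "%"},'
        ' "rule-action": "include"}]}'
    ),
)
```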
AWS DMS supported DBs? supported migrations?
Supported engines include Oracle, PostgreSQL, SQL Server, Amazon Redshift, Aurora, MariaDB, and MySQL.
AWS DMS supports homogeneous (same engine) and heterogeneous (different engine) migrations.
You can migrate on premises to cloud or cloud to cloud, but either the source or the target must reside in Amazon RDS or on Amazon EC2.
You can use AWS Snowball Edge as a target when the database is too big to move over the internet or connectivity is poor.
AWS Schema Conversion Tool (AWS SCT)
AWS SCT automatically converts the source database schema and a majority of the database code objects. The conversion includes views, stored procedures, and functions.
They are converted to a format that is compatible with the target database. Any objects that cannot be automatically converted are marked so that they can be manually converted to complete the migration.
The AWS SCT can also scan your application source code for embedded SQL statements and convert them as part of a database schema conversion project
After the schema conversion is complete, AWS SCT can help migrate data from various data warehouses to Amazon Redshift by using built-in data migration agents.
When should you use a relational DB?
- You require strict schema rules and data quality enforcement.
- Your database doesn't need extreme read/write capacity.
If you have a relational dataset that does not require extreme performance, a relational database management system can be the best, lowest-effort solution.
When should you use a non-relational DB?
- You need your database to scale horizontally.
- Your data does not lend itself well to traditional schemas.
- Your read/write rates exceed the rates that can be economically supported through a traditional structured query language (SQL) database.