05 - Database Flashcards

1
Q

Database Considerations

A

1) Scalability
2) Storage Requirements
3) Object Size
4) Durability

2
Q

Amazon RDS

A

1) Backups, software patching, failure detection, and recovery are all managed
2) Supports Aurora, MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server
3) Up to 40 Amazon RDS DB instances per account by default
4) High Availability - Primary instance with a synchronous standby instance in a different AZ for failover

5) Read Replicas are available to scale database reads:
* Asynchronous copy from the DB instance to the Read Replica
* Route read queries to the Read Replica
* Can create a Read Replica that has a different storage type from the source DB
* Cross-region Read Replicas are supported
* Cannot have an encrypted Read Replica of an unencrypted DB, or vice versa
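
The read-routing idea in point 5 can be sketched as follows. This is a minimal illustration, not an AWS API: the `ReplicaRouter` class and the endpoint names are hypothetical, and treating only `SELECT` statements as reads is a simplifying assumption.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads over Read Replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no Read Replica exists.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Simplification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReplicaRouter(
    primary="mydb.example.internal",  # hypothetical endpoint
    replicas=[
        "mydb-ro-1.example.internal",  # hypothetical endpoint
        "mydb-ro-2.example.internal",  # hypothetical endpoint
    ],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET paid = 1"))
```

Because replication to a Read Replica is asynchronous, a read routed this way may briefly return data older than the latest write on the primary.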

3
Q

Amazon RDS Auto Scaling

A

1) RDS automatically scales database storage capacity in response to workloads
2) Continuous monitoring of storage consumption and auto scaling when utilisation gets too close to provisioned capacity
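
As a hedged sketch, storage auto scaling is switched on by giving the instance a maximum storage ceiling. The helper below builds the arguments for the boto3 `modify_db_instance` call using its `MaxAllocatedStorage` parameter; the function names, the instance identifier, and the sizes are assumptions made for illustration.

```python
def autoscaling_params(identifier, allocated_gib, max_gib):
    """Build arguments for RDS ModifyDBInstance that set a storage ceiling."""
    # Sanity check for this sketch: the ceiling must leave room to grow.
    if max_gib <= allocated_gib:
        raise ValueError("max_gib must be larger than the current allocation")
    return {
        "DBInstanceIdentifier": identifier,
        "MaxAllocatedStorage": max_gib,  # GiB ceiling for automatic growth
        "ApplyImmediately": True,
    }


def enable_storage_autoscaling(identifier, allocated_gib, max_gib):
    import boto3  # imported here so the helper above works without boto3

    rds = boto3.client("rds")
    return rds.modify_db_instance(
        **autoscaling_params(identifier, allocated_gib, max_gib)
    )


# "orders-db" and the sizes are hypothetical values.
print(autoscaling_params("orders-db", allocated_gib=100, max_gib=500))
```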

4
Q

Amazon RDS Security

A

1) DB Security Groups (Used in EC2-Classic, Controls access to DB instance that is not in a VPC)
2) VPC Security Groups (Used in EC2-VPC, Controls access to DB instance in VPC)
3) EC2 Security Group (Used in EC2, can be used with DB instance as well)

5
Q

Amazon RDS Monitoring

A

1) Amazon CloudWatch Metrics and Alarms
2) Amazon CloudWatch Logs
3) RDS Events

6
Q

Securing RDS

A

1) Run the RDS instance in a Virtual Private Cloud (VPC)
2) Use AWS Identity and Access Management (IAM) policies for authentication and access
3) Use Security Groups to control what IP addresses or EC2 instances can connect to your databases
4) Use Secure Sockets Layer (SSL) or Transport Layer Security (TLS) for encryption in transit
5) Use Amazon RDS encryption on DB instances and snapshots to secure data at rest
6) Use the security features of your DB engine to control who can log in to the databases on a DB instance
7) Configure event notifications to alert you when important Amazon RDS events occur

7
Q

Amazon Aurora

A

1) Managed PostgreSQL and MySQL compatible relational database
2) Serverless option available
3) Performance and high availability (up to 5x the throughput of MySQL, up to 3x the throughput of PostgreSQL)
4) Storage scales automatically up to 64TB (raised to 128TiB on newer engine versions)
5) DB instance types in a cluster
* Primary DB instance - 1 per cluster, Performs all data modifications to the cluster volume
* Aurora Replica - Supports only read operations, Up to 15 replicas, which read from the same cluster volume as the primary

8
Q

Amazon Redshift

A

1) Fully Managed Petabyte-scale data warehouse and analytics
2) Can access data directly in Amazon S3
3) Consistent fast performance
4) Query Parquet, JSON, Avro, CSV
5) A Redshift Cluster is made up of a set of nodes (A leader node and one or more compute nodes, Scale in or out by removing or adding nodes)
6) Amazon Redshift Enhanced VPC Routing (Forces all COPY and UNLOAD traffic between cluster and data repositories to travel through your VPC, Avoids the public internet)

9
Q

Launch private RDS instance with Multi-AZ deployment

A

1) Create DB Subnet Group

2) Launch the RDS instance on the group with multi-AZ deployment
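
The two steps above can be sketched with boto3 as follows. The `create_db_subnet_group` and `create_db_instance` calls are real RDS client methods; the helper functions, resource names, engine, instance class, and storage size are assumptions chosen for illustration.

```python
def subnet_group_params(name, subnet_ids):
    """Arguments for the RDS CreateDBSubnetGroup call (step 1)."""
    return {
        "DBSubnetGroupName": name,
        "DBSubnetGroupDescription": f"Private subnets for {name}",
        "SubnetIds": list(subnet_ids),  # private subnets in at least two AZs
    }


def db_instance_params(identifier, subnet_group, password):
    """Arguments for the RDS CreateDBInstance call (step 2)."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                 # assumed engine
        "DBInstanceClass": "db.t3.micro",  # assumed instance class
        "AllocatedStorage": 20,            # GiB, assumed size
        "MasterUsername": "admin",
        "MasterUserPassword": password,
        "DBSubnetGroupName": subnet_group,
        "MultiAZ": True,                   # standby instance in another AZ
        "PubliclyAccessible": False,       # keep the instance private
    }


def launch(subnet_ids, password):
    import boto3  # imported here so the helpers above work without boto3

    rds = boto3.client("rds")
    group = subnet_group_params("private-db-subnets", subnet_ids)
    rds.create_db_subnet_group(**group)
    rds.create_db_instance(
        **db_instance_params("private-db", group["DBSubnetGroupName"], password)
    )
```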

10
Q

NoSQL Database

A

1) Use dynamic schemas for unstructured data
2) Easy to use, low latency
3) Optimised for applications that have large amounts of data and flexible data models
4) Data models such as in-memory key-value stores and graph models, rather than rows and columns
5) Use Cases: Internet of Things (IoT), Gaming applications, Mobile applications, Online transaction processing (OLTP)

11
Q

Amazon DynamoDB

A

1) Fully Managed NoSQL database offering
2) Supports key-value and document stores
3) Can store and retrieve any amount of data
4) Can serve any level of traffic
5) On-demand backups
6) Point in time recovery (up to 35 days)
7) Automatically replicated across multiple AZs
8) Full support for multi-region, multi-master writes (via global tables)
9) Supports Endpoints with IP addresses for routing and firewall policies

12
Q

Amazon DynamoDB Provisioned Mode

A

1) Differs from On-Demand mode in that you provision the capacity you need in advance
2) By default, reads from a table are eventually consistent; you may get stale data
3) If you request a strongly consistent read, you get the most up-to-date data
4) Strongly consistent reads are not available across AWS regions
5) Throughput requirements must be specified when creating a table or index in DynamoDB
6) Specified in Read Capacity Units (RCU) and Write Capacity Units (WCU)
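
Points 2 and 3 map to a single request flag. As a hedged sketch, the DynamoDB `GetItem` operation accepts a `ConsistentRead` parameter; the helper names, the `Orders` table, and the `OrderId` key below are hypothetical.

```python
def get_item_request(table, key, strongly_consistent=False):
    """Build GetItem arguments; reads are eventually consistent by default."""
    return {
        "TableName": table,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


def read_item(table, key, strongly_consistent=False):
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("dynamodb")
    response = client.get_item(**get_item_request(table, key, strongly_consistent))
    return response.get("Item")


# Hypothetical table and key, using the low-level attribute-value format.
request = get_item_request(
    "Orders", {"OrderId": {"S": "1001"}}, strongly_consistent=True
)
print(request)
```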

13
Q

DynamoDB RCUs & WCUs

A

Amazon DynamoDB RCUs

  • 1 RCU = 1 strongly consistent read per second for an item up to 4 KB in size
  • 1 RCU = 2 eventually consistent reads per second for an item up to 4 KB in size
  • 2 RCUs = 1 transactional read per second for an item up to 4 KB in size

Amazon DynamoDB WCUs

  • 1 WCU = 1 standard write per second for an item up to 1 KB in size
  • 2 WCUs = 1 transactional write per second for an item up to 1 KB in size
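
The rules above can be turned into a small calculator. This is a sketch with hypothetical function names; it assumes item sizes round up to the next 4 KB block for reads and the next 1 KB block for writes.

```python
import math


def read_capacity_units(item_kb, reads_per_second, mode="strong"):
    """RCUs needed; mode is 'strong', 'eventual' or 'transactional'."""
    units_per_read = math.ceil(item_kb / 4)  # round up to 4 KB blocks
    total = units_per_read * reads_per_second
    if mode == "eventual":
        return math.ceil(total / 2)  # half the cost of a strong read
    if mode == "transactional":
        return total * 2             # double the cost of a strong read
    return total


def write_capacity_units(item_kb, writes_per_second, transactional=False):
    """WCUs needed for standard or transactional writes."""
    units_per_write = math.ceil(item_kb)  # round up to 1 KB blocks
    total = units_per_write * writes_per_second
    return total * 2 if transactional else total


print(read_capacity_units(6, 10))                    # 20
print(read_capacity_units(6, 10, "eventual"))        # 10
print(read_capacity_units(6, 10, "transactional"))   # 40
print(write_capacity_units(2.5, 5))                  # 15
print(write_capacity_units(2.5, 5, True))            # 30
```

For example, a 6 KB item spans two 4 KB blocks, so 10 strongly consistent reads per second need 20 RCUs.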
14
Q

DynamoDB Accelerator (DAX)

A

1) Fully managed, in-memory cache for DynamoDB
2) Up to 10x performance improvement, from milliseconds to microseconds
3) Developers need not manage cache, data population, or cluster management
4) Enable via console or using the AWS SDK
5) Pay for capacity you provision

15
Q

Amazon ElastiCache

A

1) Distributed in-memory cache
2) Redis and Memcached engines to choose from
3) Great for storing session state
4) ElastiCache Nodes
* A node is a fixed-size chunk of secure, network-attached RAM
* Multiple nodes make a cluster; all nodes in it are of the same instance type and run the same cache engine
* Each cache node has its own DNS name and port

16
Q

Amazon ElastiCache Redis Endpoints

A

1) Single Node Redis (Cluster mode disabled) - Used for both read and writes
2) Multi-Node Redis (Cluster mode disabled) - Primary endpoint for all writes, Read endpoint points to read replicas
3) Redis (Cluster mode enabled) - Single configuration endpoint, Application connects to configuration endpoint and discovers the primary and read endpoints for each shard in the cluster

17
Q

Amazon RDS Cheat Sheet

A

1) Relational Database Service (RDS) is the AWS Solution for relational databases
2) RDS instances are managed by AWS, You cannot SSH into the VM running the database
3) There are 6 relational database options currently available on AWS - Aurora, MySQL, MariaDB, Postgres, Oracle, Microsoft SQL Server
4) Multi-AZ is an option you can turn on which makes an exact copy of your database in another AZ that acts only as a standby
5) For Multi-AZ, AWS automatically synchronises changes in the database over to the standby copy
6) Read Replicas allow you to run multiple copies of your database; these copies only allow reads (no writes) and are intended to alleviate the workload of your primary database to improve performance
7) Read-Replicas use Asynchronous replication
8) You must have automatic backups enabled to use Read Replicas
9) You can have up to 5 Read Replicas per source database (newer limits allow up to 15 for some engines)
10) You can combine Read Replicas with Multi-AZ
11) You can have Read Replicas in another Region (Cross-Region Read Replicas)
12) Replicas can be promoted to their own database, but this breaks replication
13) You can have Replicas of Read Replicas
14) RDS has 2 backup solutions: Automated Backups and Database Snapshots
15) Automated Backups: you choose a retention period between 1 and 35 days and define your backup window. There is no additional cost for backup storage up to the size of your provisioned database storage
16) Manual Snapshots: you create the backups manually. If you delete your primary, the manual snapshots still exist and can be restored
17) When you restore an instance it will create a new database. You then need to point traffic to the newly restored database and delete your old database
18) You can turn on encryption at-rest for RDS via KMS

18
Q

Amazon Aurora Cheat Sheet

A

1) When you need a fully-managed Postgres or MySQL database with scaling, automatic backups, high availability and fault tolerance, think Aurora
2) Aurora can run the MySQL or Postgres database engine
3) Aurora MySQL is up to 5x faster than regular MySQL
4) Aurora Postgres is up to 3x faster than regular Postgres
5) Aurora is 1/10 the cost of its competitors with similar performance and availability options
6) Aurora keeps 6 copies of your database across 3 availability zones
7) Aurora allows up to 15 Aurora Replicas
8) An Aurora database can span multiple regions via Aurora Global Database
9) Aurora Serverless allows you to stop and start Aurora and scale automatically while keeping costs low
10) Aurora Serverless is ideal for new projects or projects with infrequent database usage

19
Q

Amazon Redshift - Data Warehouse Cheat Sheet

A

1) Data can be loaded from S3, EMR, DynamoDB, or multiple data sources on remote hosts
2) Redshift is a columnar store database which can run SQL-like queries and is an OLAP system
3) Redshift can handle petabytes of data. Redshift is for Data Warehousing
4) Redshift's most common use case is Business Intelligence
5) Redshift can only run in Single-AZ
6) Redshift can run as a single node or multi-node (clusters)
7) A single node is 160GB in size
8) A multi-node cluster consists of a leader node and multiple compute nodes
9) You are billed per hour for each node (excluding the leader node in multi-node)
10) You are not billed for the leader node
11) You can have up to 128 compute nodes
12) Redshift has 2 kinds of Node Type: Dense Compute and Dense Storage
13) Redshift attempts to keep 3 copies of your data: the original, a replica on the compute nodes, and a backup on S3
14) Similar data is stored on disk sequentially for faster read
15) Redshift database can be encrypted via KMS or CloudHSM
16) Backup retention defaults to 1 day and can be increased to a maximum of 35 days
17) Redshift can asynchronously copy your snapshots to S3 in another region
18) Redshift uses Massively Parallel Processing (MPP) to distribute queries and data across all nodes
19) When importing into an empty table, Redshift will sample the data to create a schema

20
Q

Amazon DynamoDB - NoSQL Cheat Sheet

A

1) DynamoDB is a fully managed NoSQL key/value and document database
2) Applications that contain large amounts of data but require predictable read and write performance while scaling are a good fit for DynamoDB
3) DynamoDB scales with whatever read and write capacity you specify per second
4) DynamoDB can be set to use Eventually Consistent Reads (default) or Strongly Consistent Reads
5) Eventually Consistent Reads return data immediately, but the data can be stale. Copies of data are generally consistent within 1 second
6) Strongly Consistent Reads wait until the data is consistent. Data is never stale, but latency is higher. Copies of data are guaranteed to be consistent within 1 second
7) DynamoDB stores 3 copies of data on SSD drives across 3 Availability Zones in a region

21
Q

Amazon ElastiCache Cheat Sheet

A

1) ElastiCache is a managed in-memory caching service
2) ElastiCache can launch either Memcached or Redis
3) Memcached is a simple key / value store preferred for caching HTML fragments and is arguably faster than Redis
4) Redis has richer data types and operations. Great for leaderboard, geospatial data or keeping track of unread notifications
5) A cache is a temporary storage area
6) The results of the most frequent identical queries are stored in the cache
7) Resources only within the same VPC may connect to ElastiCache to ensure low latencies
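
Points 5 and 6 describe what is commonly called the cache-aside pattern. The sketch below uses an in-process dictionary as a stand-in for an ElastiCache node; the `CacheAside` class, the `load` callback, and the time-to-live value are assumptions made for illustration.

```python
import time


class CacheAside:
    """Serve repeated queries from a temporary store, falling back to a loader."""

    def __init__(self, load, ttl_seconds=60, clock=time.monotonic):
        self._load = load        # function that runs the real database query
        self._ttl = ttl_seconds  # how long an entry stays valid
        self._clock = clock
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]      # cache hit
        value = self._load(key)  # cache miss or expired entry: query the database
        self._entries[key] = (self._clock() + self._ttl, value)
        return value


def slow_query(sql):
    # Stand-in for a real database call.
    return f"rows for: {sql}"


cache = CacheAside(slow_query, ttl_seconds=30)
print(cache.get("SELECT * FROM products"))  # loaded from the database
print(cache.get("SELECT * FROM products"))  # served from the cache
```

The time-to-live keeps the cache temporary: once an entry expires, the next request goes back to the database and refreshes it.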