Performance Efficiency - Database performance options Flashcards

1
Q

Database considerations

A
  • Access patterns
  • Availability, durability, and consistency
  • Latency
  • Scalability
  • Partition tolerance: the database keeps functioning when a “partition” (a communication break) separates two nodes that are both up but can’t reach each other
2
Q

EC2 using a customer-managed database

A
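  • The customer installs and manages the database on EC2 instances (a customer-managed environment)
  • Fits database engines that are not certified for, or not available as, an AWS managed service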
3
Q

RDS - Characteristics 1

A
  • It’s a managed, highly available and secure database service. It’s the default choice
  • The options are: Aurora, MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server
  • Provides Multi-AZ deployments and read replicas
4
Q

RDS - Characteristics 2

A
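  • Encryption can only be enabled when the DB instance is created. To encrypt an instance that is already running unencrypted, take a snapshot, create an encrypted copy of that snapshot, and restore a new instance from the copy
  • Supports automated backups and manual snapshots. Automated backups are stored as storage volume snapshots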
  • Offers configuration options by instance type, storage type, network setup, and backup
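
The snapshot-copy route for encrypting an existing instance can be sketched as an ordered plan of RDS API calls. This is a minimal illustration, not a complete procedure: the operation and parameter names follow the boto3 RDS client, while the function name and the identifier suffixes (`-plain`, `-encrypted`, `-enc`) are hypothetical. Nothing is sent to AWS; the function only returns the plan.

```python
def plan_encrypt_existing_instance(instance_id, kms_key_id):
    """Return the ordered (operation, parameters) pairs for the snapshot-copy route.

    The identifier suffixes below are a hypothetical naming scheme.
    """
    snapshot_id = f"{instance_id}-plain"
    encrypted_snapshot_id = f"{instance_id}-encrypted"
    return [
        # 1. Snapshot the running, unencrypted instance.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": instance_id,
            "DBSnapshotIdentifier": snapshot_id,
        }),
        # 2. Copy the snapshot; supplying a KMS key encrypts the copy.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": snapshot_id,
            "TargetDBSnapshotIdentifier": encrypted_snapshot_id,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new, encrypted instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{instance_id}-enc",
            "DBSnapshotIdentifier": encrypted_snapshot_id,
        }),
    ]


if __name__ == "__main__":
    for operation, params in plan_encrypt_existing_instance("orders-db", "alias/my-key"):
        print(operation, params)
```

With a real client, each pair would presumably be executed as `getattr(rds_client, operation)(**params)`, waiting for each snapshot or instance to become available before moving to the next step.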
5
Q

RDS - Automatic backups

A
  • RDS creates automated backups during the backup window of your DB instance and saves them to S3
  • A storage volume snapshot of the entire DB instance is created and then kept for the specified backup retention period
  • It also uploads the DB instance’s transaction logs to S3 every 5 minutes
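
One consequence of the retention period and the 5-minute log upload is a bounded window of restorable points in time. The sketch below is a simplified model under stated assumptions: it treats the window as "retention period back from now" up to "now minus one log interval". The function names are hypothetical, and the real latest restorable time is reported by RDS itself rather than computed like this.

```python
from datetime import datetime, timedelta, timezone

# From the card: transaction logs are uploaded to S3 every 5 minutes.
LOG_UPLOAD_INTERVAL = timedelta(minutes=5)


def restorable_window(now, retention_days):
    """Return (earliest, latest) restore targets under the simplified model."""
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_UPLOAD_INTERVAL
    return earliest, latest


def can_restore_to(target, now, retention_days):
    """True if `target` falls inside the modeled restorable window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(can_restore_to(now - timedelta(hours=1), now, retention_days=7))
```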
6
Q

DynamoDB - Characteristics 1

A
  • It’s a serverless and key-value NoSQL database
  • Offers built-in security, continuous backups, automated multi-region replication, in-memory caching, encryption, and data export tools
  • Global tables replicate a table’s data across multiple regions
  • DynamoDB transactions provide ACID guarantees across one or more tables
7
Q

DynamoDB - Characteristics 2

A
  • DynamoDB Accelerator (DAX) is an in-memory cache that delivers up to 10 times performance improvement. It’s for read-intensive workloads that don’t require strongly consistent reads
  • Uses a partition key, whose value is hashed to determine the partition (node) where the item is stored, and secondary indexes
  • DynamoDB Streams captures a time-ordered sequence of item-level modifications in DynamoDB tables, and stores that information in a log for up to 24 hours
  • Auto Scaling allows DynamoDB to dynamically adjust provisioned throughput capacity in response to actual traffic patterns
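
How a partition key value decides where an item is stored can be illustrated with a small hashing sketch. DynamoDB’s internal hash function and partition layout are not public, so SHA-256 with a modulo and the function name `partition_for` are assumptions chosen only to show the idea: the same key always maps to the same partition, and distinct keys spread across partitions.

```python
import hashlib
from collections import Counter


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index (illustrative hash, not DynamoDB's)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % partition_count


if __name__ == "__main__":
    # Count how many of 1,000 hypothetical user IDs land on each of 4 partitions.
    spread = Counter(partition_for(f"user-{i}", 4) for i in range(1000))
    print(dict(spread))
```

A key with many distinct values tends to spread items evenly, whereas a key with few distinct values concentrates traffic on a few partitions.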
8
Q

DynamoDB - Read / write capacity modes for each table

A
  • On-Demand:
      • For unpredictable application traffic, or tables with unknown workloads
      • Pay for what you use
      • There are no auto scaling settings to configure. AWS scales for you
  • Provisioned:
      • You must define the number of reads and writes per second
      • You pay for that capacity whether or not you use it
      • You can configure auto scaling so capacity follows periods of high demand

NOTE: You can set the read / write capacity mode when creating a table and change it later
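
As a rough sketch of how the two modes differ in configuration, the helper below builds the billing-related parameters that the DynamoDB `CreateTable` and `UpdateTable` APIs accept. `BillingMode`, `PAY_PER_REQUEST`, `PROVISIONED`, and `ProvisionedThroughput` are the API’s own names; the helper `table_billing_params` and its `mode` strings are hypothetical. No AWS call is made.

```python
def table_billing_params(mode, read_units=None, write_units=None):
    """Build the billing-related keyword arguments for a DynamoDB table.

    `mode` is "on-demand" or "provisioned" (labels chosen for this sketch).
    """
    if mode == "on-demand":
        # No throughput to declare: capacity follows the traffic.
        return {"BillingMode": "PAY_PER_REQUEST"}
    if mode == "provisioned":
        if not read_units or not write_units:
            raise ValueError("provisioned mode needs read_units and write_units")
        return {
            "BillingMode": "PROVISIONED",
            "ProvisionedThroughput": {
                "ReadCapacityUnits": read_units,
                "WriteCapacityUnits": write_units,
            },
        }
    raise ValueError(f"unknown capacity mode: {mode!r}")


if __name__ == "__main__":
    print(table_billing_params("on-demand"))
    print(table_billing_params("provisioned", read_units=10, write_units=5))
```

The returned dictionary would be merged into the other table arguments (table name, key schema, attribute definitions) when creating the table, or passed on its own with the table name when switching modes later.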

9
Q

Redshift - Characteristics 1

A
  • It uses SQL to analyze petabytes of structured and semi-structured data across data warehouses, operational databases, and data lakes, using ML at any scale
  • It’s economical and can be set up in minutes. Supports encryption
  • By default, a cluster runs in a single AZ. It attempts to maintain at least three copies of the data: the original and a replica on the compute nodes, and a backup in S3
10
Q

Redshift - Characteristics 2

A
  • Pricing is based on the Redshift node (instance) type. Charges also apply to bytes scanned in S3 data lakes, data stored in Redshift Managed Storage, and seconds of Concurrency Scaling used
  • Redshift Enhanced VPC Routing forces COPY and UNLOAD traffic between a Redshift cluster and its data repositories through your VPC, so it doesn’t go through the internet
  • Redshift Spectrum queries and retrieves structured and semi-structured data from files in an S3 bucket. Its queries employ massive parallelism against large datasets
11
Q

Redshift - Limitations

A
  • A maximum of 100 tables per database
  • A maximum of 100 databases per account
  • A maximum of 20,000 partitions per table
12
Q

Redshift - Node types

A
  • Dense compute: nodes have fast CPUs, large memory, and SSDs for fast performance
  • Dense storage: a more economical way to build large data warehouses, using HDDs

NOTE: The node type can be changed after the cluster has been configured
