Databases Flashcards by erik MOLNAR

RDS runs on VMs

T or F

RDS is serverless

T or F

it is not serverless

aurora serverless is serverless

T or F

read replicas are used for scaling, not DR

T or F

must have auto backups turned on in order to deploy a read replica

T or F

You can have up to ___ read replica copies of any DB

you can have read replicas of read replicas

T or F

T, but watch out for latency

read replica facts:

each read replica will have its own DNS end point

you can have read replicas that have multi AZ

you can create read replicas of multi az source databases

read replicas can be promoted to be their own DB. This breaks replication

you can have a read replica in a second region

yes

2 types of backups for rds

automated backups

database snapshots

yes

read replica facts

multi az

used to increase performance

must have backups turned on

can be in different regions

can be mysql, postgres, mariadb, oracle, aurora

can be promoted to master; this breaks replication

yes

multi az tips

used for DR

you can force a failover from one az to another by rebooting the instance.

yes

This DB service is:

stored on ssd storage

spread across 3 geographically distinct data centers

eventually consistent reads (default)

strongly consistent reads

what is dynamo DB?

consistency across all copies of data is usually reached within a second with this type of read. repeating a read after a short time should return the updated data. (best read performance)

eventually consistent reads

A ____ consistent read returns a result that reflects all writes that received a successful response prior to the read

strongly
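
The two read modes can be sketched with a toy in-memory model. This is not DynamoDB code: the `ToyTable` class and its methods are hypothetical, and it assumes a single leader copy plus one lagging replica. In the real API the equivalent switch is the `ConsistentRead` parameter on reads.

```python
class ToyTable:
    """Toy model (hypothetical): one leader copy plus one lagging replica."""

    def __init__(self):
        self.leader = {}
        self.replica = {}
        self.pending = []  # acknowledged writes not yet copied to the replica

    def put(self, key, value):
        # The write is acknowledged as soon as the leader has it.
        self.leader[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Stands in for the short delay before all copies agree.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def get(self, key, consistent_read=False):
        # Strongly consistent reads reflect every acknowledged write;
        # eventually consistent reads may be served from a stale copy.
        source = self.leader if consistent_read else self.replica
        return source.get(key)


table = ToyTable()
table.put("user#1", "v1")
stale = table.get("user#1")                        # replica not updated yet
fresh = table.get("user#1", consistent_read=True)  # sees the write
table.replicate()
later = table.get("user#1")                        # repeated read catches up
```

Here `stale` misses the write, `fresh` sees it immediately, and `later` shows the repeated eventually consistent read returning the updated data.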

This is a fully managed, highly available, in memory cache for dynamo DB

10x performance improvement

reduces request time from milliseconds to microseconds - even under load

no need for developers to manage cache

compatible with dynamo db api calls

dynamo db accelerator (DAX)

dynamo db transactions notes:

multiple all or nothing operations

financial transactions

fulfilling orders

two underlying reads or writes - prepare/commit

up to 100 items (25 before the 2022 limit increase) or 4 mb of data

yes
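
The all-or-nothing, prepare/commit idea in these notes can be sketched on a plain dictionary. This is a minimal illustration, not the DynamoDB implementation: `transact_write`, `TransactionCancelled`, and the `accounts` data are hypothetical names. The real operation is `TransactWriteItems`.

```python
class TransactionCancelled(Exception):
    """Raised when any condition fails, so no write is applied."""


def transact_write(table, operations):
    """Apply every (key, condition, new_value) operation or none of them."""
    # Prepare phase: check every condition against the current state.
    for key, condition, _ in operations:
        if not condition(table.get(key)):
            raise TransactionCancelled(key)
    # Commit phase: all checks passed, so apply every write.
    for key, _, new_value in operations:
        table[key] = new_value


accounts = {"alice": 100, "bob": 20}

# Transfer 50 from alice to bob: both writes succeed together.
transact_write(accounts, [
    ("alice", lambda balance: balance is not None and balance >= 50, 50),
    ("bob", lambda balance: balance is not None, 70),
])

# A transfer alice cannot afford leaves both balances untouched.
try:
    transact_write(accounts, [
        ("alice", lambda balance: balance is not None and balance >= 500, 0),
        ("bob", lambda balance: True, 570),
    ])
    cancelled = False
except TransactionCancelled:
    cancelled = True
```

The second call fails in the prepare phase, so neither account changes, which is the behavior financial transfers and order fulfilment rely on.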

this type of dynamo db capacity provides:

pay per request pricing

balance cost and performance

no minimum capacity

no charge for read/write capacity while idle - only storage and backups

pay more per request than with provisioned capacity

new product launches

on-demand capacity

dynamo db on demand backup and restore notes:

full backups at any time

zero impact on table performance or availability

consistent within seconds and retained until deleted

operates within the same region as the source table

yes

dynamo db point in time recovery notes:

protects against accidental ______ or deletes

restore to any point in the last ____ days

_____ backups

not enabled by default

latest restorable: ____ minutes in the past

writes

incremental

five

dynamo db ___ are a time ordered sequence of item level changes in a table

they are stored for 24 hours

inserts, updates, and deletes

combine with lambda functions for functionality like stored procedures

streams
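
A minimal sketch of the "combine with lambda" idea: a handler that counts the inserts, updates, and deletes in one batch of stream records. The handler logic and `sample_event` are illustrative assumptions. The `Records` list and the `eventName` values `INSERT`, `MODIFY`, and `REMOVE` follow the stream record format.

```python
def handler(event, context):
    """Count item-level changes in one batch of stream records."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        name = record.get("eventName")
        if name in counts:
            counts[name] += 1
    return counts


# Hand-written sample batch (hypothetical keys), standing in for a real event.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"pk": {"S": "order#1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"pk": {"S": "order#1"}}}},
    ]
}

result = handler(sample_event, None)
```

A real function would act on each change (for example, updating a total), which is how streams plus lambda can stand in for stored procedures.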

dynamo db global tables notes

managed multi master, multi region replication

globally distributed apps

based on dynamo db streams

multi region redundancy for dr or ha

no app rewrites

replication latency under one second

yes

DMS =

database migration service

dynamo db security

encryption at rest using ___

site to site ___

direct ____

IAM policies and ____

___ grained access

CW and CT

VPC endpoints

KMS

vpn

connect

roles

fine

____ is a fast and powerful, fully managed, petabyte scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per TB per year, less than a tenth of the cost of most other data warehousing solutions

redshift

_____ transaction example: net profit for EMEA and pacific for the digital radio product. pulls in a large number of records:

sum of radios sold in EMEA

sum of radios sold in pacific

unit cost of radio in each region

sales price of each radio

sales price - unit cost

OLAP

olap =

online analytical processing

redshift can be configured as follows:

single node (160GB)

multi node

leader node (manages client connections and receives queries)

compute node (stores data and performs queries and computations) - up to 128 compute nodes

yes

redshift advanced ____: columnar data stores can be compressed much more than row based data stores because similar data is stored sequentially on disk. redshift employs multiple compression techniques and can often achieve significant compression relative to traditional relational data stores. in addition, redshift doesn't require indexes or materialized views, and so uses less space than traditional relational database systems. when loading data into an empty table, redshift automatically samples your data and selects the most appropriate compression scheme.

compression
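
Why storing similar data sequentially helps can be sketched with run-length encoding, used here only as one illustrative scheme (the sample rows and function name are made up, and this says nothing about which encoding redshift would pick).

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical sales rows: (region, product).
rows = [
    ("EMEA", "radio"), ("EMEA", "radio"), ("EMEA", "tv"),
    ("PACIFIC", "tv"), ("PACIFIC", "tv"), ("PACIFIC", "radio"),
]

# Row layout: the fields of each row sit next to each other on disk.
row_layout = [field for row in rows for field in row]

# Column layout: all values of one column sit next to each other.
region_column = [region for region, _ in rows]
product_column = [product for _, product in rows]

row_runs = len(run_length_encode(row_layout))
column_runs = (len(run_length_encode(region_column))
               + len(run_length_encode(product_column)))
```

With these rows the row layout needs 12 runs while the two columns together need only 5, because equal values end up adjacent in the column layout.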

mpp =

massively parallel processing

___ ___ ___: redshift automatically distributes data and query loads across all nodes. redshift makes it easy to add nodes to your data warehouse and enables you to maintain fast query performance as your data warehouse grows.

massively parallel processing

redshift backups:

enabled by default with a 1 day retention period

max retention period is 35 days

redshift always attempts to maintain at least three copies of your data (the original and replica on the compute nodes and a backup in s3)

redshift can also asynchronously replicate your snapshots to s3 in another region for disaster recovery.

yes

redshift pricing:

compute node hours (total number of hours you run across all your compute nodes for the billing period. you are billed for 1 unit per node per hour, so a 3 node data warehouse cluster running persistently for an entire month would incur 2,160 instance hours. you will not be charged for leader node hours; only compute nodes will incur charges.)

charged for backups

charged for data transfer (only within vpc, not outside it)

yes
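
The 2,160 figure in the pricing card is plain arithmetic, sketched here assuming a 30-day month (the function name is made up for illustration).

```python
def compute_node_hours(compute_nodes, days, hours_per_day=24):
    """Billable hours: one unit per compute node per hour; leader node is free."""
    return compute_nodes * days * hours_per_day


# 3 compute nodes running for an assumed 30-day month.
hours = compute_node_hours(compute_nodes=3, days=30)
```

Here `hours` works out to 3 x 24 x 30 = 2,160.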

redshift security considerations:

encrypted in transit using SSL

encrypted at rest using AES-256 encryption

by default redshift takes care of key management

you can also manage your own keys through HSM or AWS key management service

yes

redshift availability: ___ AZ(s). can restore snapshots to new AZs in event of an outage

What is aurora? it is a mysql and postgresql compatible _____ db engine that combines the speed and availability of high end commercial databases with the simplicity and cost effectiveness of open source databases.

relational

aurora provides up to ___x better performance than mysql and ___x better than postgres dbs at a much lower price point, whilst delivering similar performance and availability

5, 3

things to know about aurora:

1. starts with __gb, scales in __gb increments to ___tb (storage autoscaling)

2. compute resources can scale up to ___vCPUs and 244GB of RAM

3. ___ copies of your data are contained in each AZ, with max of ___ AZs. ___ copies of your data.

10, 10, 64; 32; 2, 3, 6

aurora is designed to transparently handle the loss of up to ___ copies of data without affecting db write availability and up to ___ copies without affecting read availability

2,3
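
The 2 and 3 in this answer fit a quorum model over the 6 copies. The sketch below assumes a write quorum of 4 and a read quorum of 3. Those sizes are assumptions chosen to match the card, and the function name is hypothetical.

```python
TOTAL_COPIES = 6   # 2 copies in each of 3 AZs, per the card above
WRITE_QUORUM = 4   # assumed: writes need 4 of 6 copies
READ_QUORUM = 3    # assumed: reads need 3 of 6 copies


def availability(copies_lost):
    """Report whether writes and reads still reach their quorum."""
    healthy = TOTAL_COPIES - copies_lost
    return {
        "writes": healthy >= WRITE_QUORUM,
        "reads": healthy >= READ_QUORUM,
    }


after_two_lost = availability(2)    # writes and reads both still available
after_three_lost = availability(3)  # reads still available, writes are not
```

Losing two copies leaves four healthy, enough for writes. Losing three leaves three, enough only for reads.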

t or f aurora storage is self healing. data blocks and disks are continuously scanned for errors and repaired automatically.

three types of aurora replicas are available: aurora replicas (how many?), mysql read replicas (how many?), postgresql read replicas (how many?)

15, 5, 1

t or f backups are always enabled on aurora db instances

t or f backups impact db performance and must be done during slow traffic periods

false, they do not impact performance

t or f aurora snapshots impact performance

f they do not impact performance

t or f aurora snapshots cannot be shared with other aws accounts

f they can

aurora ____ is an on demand autoscaling capable edition of aurora. an aurora ___ db cluster automatically starts up, shuts down, and scales capacity up or down based on your apps needs.

serverless

t or f aurora serverless provides a relatively simple, cost effective option for infrequent, intermittent, or unpredictable workloads

does memcached support simple cache to offload DB

yes

does memcached support ability to scale horizontally

yes

does memcached support multithreaded performance

yes

does memcached support advanced data types

does memcached support ranking/sorting data sets

does memcached support pub/sub capabilities

does memcached support persistence

does memcached support multi AZ

does memcached support backup and restore capabilities?

does redis support simple cache to offload DB

yes

does redis support ability to scale horizontally

yes

does redis support multi threaded performance

does redis support advanced data types

yes

does redis support ranking/sorting data sets

yes

does redis support pub/sub capabilities

yes

does redis support persistence?

yes

does redis support multi az?

yes

does redis support backup and restore capabilities?

yes

use ___ to increase DB and web application performance

elasticache
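
How a cache offloads the database can be sketched with the cache-aside pattern. A dictionary stands in for redis or memcached here, and the `CacheAside` class and loader function are hypothetical.

```python
class CacheAside:
    """Check the cache first; fall back to the database only on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._cache = {}   # stand-in for a redis or memcached client
        self.db_reads = 0  # counts how often the database is actually hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]      # cache hit: no database work
        self.db_reads += 1
        value = self._load_from_db(key)  # cache miss: query the database
        self._cache[key] = value         # store for the next request
        return value


# The lambda stands in for a slow database query.
store = CacheAside(lambda key: key.upper())
first = store.get("radio")   # miss, loads from the "database"
second = store.get("radio")  # hit, served from the cache
```

Only the first call reaches the loader, so repeated reads of hot keys stop costing database capacity.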

___ ___ ___ is a cloud service that makes it easy to migrate relational databases, data warehouses, nosql dbs, and other types of data stores. you can use ___ ___ ___ to migrate your data into the cloud, between on prem instances or between combinations of cloud and on prem setups.

database migration service (DMS)

SCT = ?

schema conversion tool

t or f you need SCT even if you are migrating to identical databases

f you do not need sct if dbs are the same.

DMS - the source can be on prem, inside aws itself, or with another provider such as azure. t or f

t or f dms allows you to migrate databases from one source to aws.

t or f with DMS you can do homogeneous migrations (same db engines) or heterogeneous migrations (different db engines)

t or f if you do a heterogeneous migration with dms, you will need the aws schema conversion tool

the following services have caching capabilities:

api gateway

cloudfront

elasticache - memcached and redis

dynamodb accelerator (DAX)

yes

emr = ?

elastic map reduce

____ is the industry leading cloud big data platform for processing vast amounts of data using open source tools such as apache spark, apache hive, hbase, flink, hudi, presto. with ____ you can run petabyte scale analysis at less than half the cost of traditional on prem solutions and over 3x faster than standard apache spark

emr

the central component of EMR is the ______

cluster

EMR match the nodes: master, core, task

1. a node w/ sw components that only runs tasks and does not store data in HDFS. they are optional

2. a node that manages the cluster. this node tracks the status of tasks and monitors the health of the cluster. every cluster has one.

3. a node with sw components that runs tasks and stores data in the hadoop distributed file system (HDFS) on your cluster. multinode clusters have at least one.

1 = task 2 = master 3 = core

emr archives log files to s3 at ___ minute intervals

emr log files are available even after the cluster terminates? t or f

emr - by default log data is stored on core node. t or f

f data is stored on master

t or f EMR you can configure replication to s3 on 5 min intervals for all log data from the master node, however, this can only be configured when creating the cluster for the first time.

mysql default port is ___

3306

When you add a rule to an RDS DB security group, you must specify a port number or protocol.

false. a destination port is needed, but the rds instance port number is automatically applied to the rds db sg.

If you are using Amazon RDS Provisioned IOPS storage with a Microsoft SQL Server database engine, what is the maximum size RDS volume you can have by default?

16tb

What happens to the I/O operations of a single-AZ RDS instance during a database snapshot or backup?

I/O may be briefly suspended while the backup process initializes (typically under a few seconds), and you may experience a brief period of elevated latency.

In RDS, what is the maximum value I can set for my backup retention period?

35 days

Databases Flashcards

(90 cards)