Database, Analytics, ML Flashcards

1
Q

What is RDS backed by?

A

EBS

2
Q

What happens when you are running out of provisioned space on an RDS database?

A

If RDS Storage Auto Scaling is enabled, AWS will automatically increase the storage for you

3
Q

If you want to limit how much your DB in RDS can hold, what can you do?

A

Set the Maximum Storage Threshold
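
As a minimal sketch, the threshold can be set through the RDS API, where it appears as the `MaxAllocatedStorage` parameter of `modify_db_instance` in boto3. The helper name `set_storage_ceiling` and its arguments are assumptions made for illustration; the client is passed in so the function does not depend on any particular credentials setup.

```python
def set_storage_ceiling(rds_client, instance_id, max_gib):
    """Cap RDS storage autoscaling at max_gib GiB (hypothetical helper).

    rds_client is assumed to be a boto3 RDS client, e.g. boto3.client("rds").
    """
    if max_gib <= 0:
        raise ValueError("max_gib must be positive")
    # MaxAllocatedStorage is the API name for the Maximum Storage Threshold:
    # storage autoscaling will never grow the instance beyond this size.
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        MaxAllocatedStorage=max_gib,
        ApplyImmediately=True,
    )
```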

4
Q

How many read replicas can you have in RDS?

A

Up to 15 for MySQL, MariaDB and PostgreSQL; up to 5 for Oracle and SQL Server

5
Q

True or False: RDS Read Replicas in the same region do not pay the network fee

A

True. Replication traffic to read replicas in the same Region is free, even across AZs. Data transfer to read replicas in other Regions incurs a networking fee

6
Q

Why would we use RDS Multi AZ?

A

It is primarily used for disaster recovery. It performs synchronous replication to a standby instance in another AZ

7
Q

True or False: We cannot set up our Read Replicas as Multi-AZ for disaster recovery

A

False. Read Replicas can be set up as Multi-AZ for DR

8
Q

What is RDS Custom?

A

RDS Custom allows us to use Oracle and SQL Server with OS and database customization. We can access the underlying EC2 instance, which standard managed RDS databases don’t allow us to do

9
Q

What is an Aurora Writer Endpoint?

A

A pointer that points to the Master. If the Master fails and a Replica DB is promoted, the Writer Endpoint automatically shifts to the new Master.
Therefore we do not have to change our app’s endpoint

10
Q

What is an Aurora Reader Endpoint?

A

It is a pointer that points to a Load Balancer that sits in front of all the replica DBs to perform consistent and fault tolerant reads

11
Q

We want to run intensive queries on certain Aurora DB instances that have stronger underlying infra. What can we do?

A

Create a Custom Endpoint. Custom Endpoints allow us to target specific Read Replicas for different types of operations or needs

12
Q

What is Aurora Serverless?

A

It allows us to hand off the instantiation and scaling to AWS. It uses a fleet of DBs provisioned by AWS to hold our data. We do not have to pay for upfront capacity

13
Q

What is Aurora Multi-Master?

A

Every DB Instance is a Read/Write node. If one fails, you can still write to the other Master instances. You may have to configure a conflict avoidance strategy, such as implementing health checks to see if a Writer instance is still available. Great if you need high write capacity

14
Q

What is Global Aurora?

A

Your DB spans multiple regions, with up to 16 DB Read Instances in each. In one region you have your read/write DB, and up to five read-only secondary regions.

If one entire region fails, it will quickly shift read/write capabilities to another region

Cross-region replication takes less than one second

15
Q

What two machine learning services can Aurora integrate with?

A

SageMaker - Deploy machine learning models

Comprehend - Uses machine learning to learn insights and connections in text

16
Q

True or False: RDS Automated backups don’t expire but manual DB Snapshots do

A

False. Automated backups last for 1 to 35 days, manual DB snapshots don’t expire unless deleted

17
Q

If you stop an RDS DB for a long while, what is the recommended protocol?

A

Create a snapshot, delete the DB and then restore from snapshot when needing the DB. Stopped RDS instances still charge for storage

18
Q

What is Aurora Database Cloning?

A

Aurora DB cloning allows us to create a new Aurora DB Cluster from an existing one. It is faster than a snapshot & restore

19
Q

What is an RDS Proxy?

A

RDS Proxy sits in front of the RDS instance and pools together all the connections.

This can be beneficial as it puts less strain on the RDS instance resources (CPU, RAM) and minimizes connection timeouts

20
Q

True or False: The RDS proxy is available to the public

A

False. It is only accessible from within the VPC

21
Q

What two DB engines can we choose from with Aurora?

A

PostgreSQL and MySQL

22
Q

What AWS resource can we sit in front of RDS to take the load off of our DB resources?

A

We can use ElastiCache, which will handle caching (note: our applications will still have to implement our Caching strategy)

23
Q

We want to have Multi-AZ with Auto-Failover and read replicas for our caching, would we use Redis or Memcached?

A

Redis

https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/WhatIs.html#WhatIs.Overview

24
Q

True or False: ElastiCache supports IAM authentication for Redis and Memcached

A

False, it only supports it for Redis

25
Q

Which service supports SASL-based authentication?

A

Memcached

26
Q

What is Redis AUTH?

A

It allows you to create a password/token when creating the cluster; that password or token is then used to start making operations against Redis. Also allows for SSL in-flight encryption

27
Q

What are the three cache patterns for ElastiCache?

A

Lazy Loading: Application checks for data in cache, if the data is not in cache it retrieves it from the database and then writes it to cache

Write-Through: Whenever data is written to the database, it is also added or updated in the cache

Session Store: store temporary data in cache using TTL feature

These three are not mutually exclusive
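
The three patterns can be sketched in a few lines of Python. This is an illustration only: the `Cache` class is a hypothetical in-memory stand-in for ElastiCache, the database is a plain dict, and the function names are made up for this example.

```python
import time


class Cache:
    """Tiny in-memory stand-in for ElastiCache (hypothetical, for illustration)."""

    def __init__(self):
        self._items = {}  # key -> (value, expires_at or None)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._items[key]  # expired: treat as a cache miss
            return None
        return value

    def set(self, key, value, ttl=None, now=None):
        now = time.monotonic() if now is None else now
        expires_at = None if ttl is None else now + ttl
        self._items[key] = (value, expires_at)


def read_lazy(cache, db, key):
    """Lazy loading: check the cache first, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = db[key]        # cache miss: read from the database
        cache.set(key, value)  # populate the cache for later reads
    return value


def write_through(cache, db, key, value):
    """Write-through: update the database and the cache in the same write path."""
    db[key] = value
    cache.set(key, value)


def store_session(cache, session_id, data, ttl_seconds, now=None):
    """Session store: keep temporary data that expires after ttl_seconds."""
    cache.set("session:" + session_id, data, ttl=ttl_seconds, now=now)
```

Because the patterns are not mutually exclusive, an application could call `write_through` on updates and still rely on `read_lazy` for keys that were never written through.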

28
Q

You’re planning for a new solution that requires a MySQL database that must be available even in case of a disaster in one of the Availability Zones. What should you use?

A

Enable Multi-AZ

29
Q

You have set up read replicas on your RDS database, but users are complaining that upon updating their social media posts, they do not see their updated posts right away. What is a possible cause for this?

A

Read Replicas have async replication, therefore it’s likely your users will miss the most up-to-date info because of eventual consistency

30
Q

Which RDS (NOT Aurora) feature when used does not require you to change the SQL connection string: Multi-AZ or Read Replicas?

A

Multi-AZ. It keeps the same connection string regardless of which database is up

31
Q

An analytics application is currently performing its queries against your main production RDS database. These queries run at any time of the day and slow down the RDS database which impacts your users’ experience. What should you do to improve the users’ experience?

A

Setup read replicas. This will take the load off the main production DB

32
Q

You would like to ensure you have a replica of your database available in another AWS Region if a disaster happens to your main AWS Region. Which database do you recommend to implement this easily?

A

Aurora Global Database. Multi-AZ won’t work because it only protects against the failure of an AZ, not of an entire Region

33
Q

How can you enhance the security of your ElastiCache Redis Cluster by forcing users to enter a password when they connect?

A

Use Redis Auth

34
Q

You have migrated the MySQL database from on-premises to RDS. You have a lot of applications and developers interacting with your database. Each developer has an IAM user in the company’s AWS account. What is a suitable approach to give access to developers to the MySQL RDS DB instance instead of creating a DB user for each one?

A

IAM Database Authentication. This allows IAM users to use an authentication token to access the DB

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html
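
A rough sketch of how an application might obtain the token, assuming a boto3 RDS client is passed in. `generate_db_auth_token` is the boto3 call; the helper name `iam_connection_params` and the shape of the returned dict are assumptions for illustration, since the exact connection arguments depend on the database driver in use.

```python
def iam_connection_params(rds_client, host, port, db_user):
    """Build connection settings that use an IAM auth token as the password.

    rds_client is assumed to be a boto3 RDS client, e.g. boto3.client("rds").
    The returned dict is a hypothetical shape, not tied to a specific driver.
    """
    # The token is short-lived (15 minutes) and replaces a stored DB password.
    token = rds_client.generate_db_auth_token(
        DBHostname=host,
        Port=port,
        DBUsername=db_user,
    )
    return {
        "host": host,
        "port": port,
        "user": db_user,
        "password": token,
        "ssl": True,  # IAM database authentication requires an SSL/TLS connection
    }
```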

35
Q

Read replicas use ______ replication and Multi-AZ uses ______ replication

A

Async, Sync

36
Q

How do you encrypt an unencrypted RDS DB instance?

A

Create a snapshot of the unencrypted RDS DB instance, copy the snapshot with encryption enabled, then restore the RDS DB instance from the encrypted snapshot
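
The three steps above could be scripted roughly as follows with boto3. This is a sketch under assumptions: the helper name, the snapshot naming scheme, and the choice to restore into a new instance identifier are all illustrative, and the client is passed in rather than created here.

```python
def encrypt_rds_instance(rds_client, source_id, target_id, kms_key_id):
    """Create an encrypted copy of an unencrypted RDS instance (hypothetical helper).

    rds_client is assumed to be a boto3 RDS client, e.g. boto3.client("rds").
    """
    plain_snapshot = source_id + "-plain-snap"          # assumed naming scheme
    encrypted_snapshot = source_id + "-encrypted-snap"  # assumed naming scheme
    waiter = rds_client.get_waiter("db_snapshot_available")

    # Step 1: snapshot the unencrypted instance.
    rds_client.create_db_snapshot(
        DBInstanceIdentifier=source_id,
        DBSnapshotIdentifier=plain_snapshot,
    )
    waiter.wait(DBSnapshotIdentifier=plain_snapshot)

    # Step 2: copy the snapshot; supplying a KMS key encrypts the copy.
    rds_client.copy_db_snapshot(
        SourceDBSnapshotIdentifier=plain_snapshot,
        TargetDBSnapshotIdentifier=encrypted_snapshot,
        KmsKeyId=kms_key_id,
    )
    waiter.wait(DBSnapshotIdentifier=encrypted_snapshot)

    # Step 3: restore a new, encrypted instance from the encrypted snapshot.
    rds_client.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=target_id,
        DBSnapshotIdentifier=encrypted_snapshot,
    )
    return target_id
```

Applications would then be pointed at the new instance, and the old unencrypted instance could be deleted once the cut-over is verified.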

37
Q

What three DB engines are supported with IAM DB Authentication?

A

MariaDB, MySQL and PostgreSQL

38
Q

You have an un-encrypted RDS DB instance and you want to create Read Replicas. Can you configure the RDS Read Replicas to be encrypted?

A

No, you cannot create encrypted read replicas if the RDS DB instance is unencrypted

39
Q

An application running in production is using an Aurora Cluster as its database. Your development team would like to run a version of the application in a scaled-down environment with the ability to perform some heavy workload on a need-basis. Most of the time, the application will be unused. Your CIO has tasked you with helping the team to achieve this while minimizing costs. What do you suggest?

A

Aurora Serverless. AWS will take care of the scaling for spiked workloads. Capacity is adjusted based on application demands

Great for variable workloads especially if they need intensive use

40
Q

How many Aurora Read Replicas can you have in a single Aurora DB Cluster?

A

15

41
Q

You work as a Solutions Architect for a gaming company. One of the games mandates that players are ranked in real-time based on their score. Your boss asked you to design then implement an effective and highly available solution to create a gaming leaderboard. What should you use?

A

Use ElastiCache for Redis - Sorted Sets
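
To illustrate why a sorted set fits a leaderboard, here is a small in-memory sketch of the operations involved. It does not talk to Redis: the `Leaderboard` class is a hypothetical stand-in, and with ElastiCache for Redis the same operations would map roughly to the `ZINCRBY`, `ZREVRANGE` and `ZREVRANK` commands. Unlike Redis, this sketch re-sorts on every query and breaks ties alphabetically, which is an assumption made to keep the example short and deterministic.

```python
class Leaderboard:
    """In-memory stand-in for a Redis sorted set (hypothetical, for illustration)."""

    def __init__(self):
        self._scores = {}  # player -> total score

    def add_score(self, player, points):
        """Add points to a player's score (similar in spirit to ZINCRBY)."""
        self._scores[player] = self._scores.get(player, 0) + points
        return self._scores[player]

    def _ranked(self):
        # Highest score first; ties broken by player name (an assumption).
        return sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))

    def top(self, n):
        """Return the n best (player, score) pairs (similar to ZREVRANGE)."""
        return self._ranked()[:n]

    def rank(self, player):
        """Return the 0-based rank of a player, or None (similar to ZREVRANK)."""
        for position, (name, _score) in enumerate(self._ranked()):
            if name == player:
                return position
        return None
```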


42
Q

You need to store long-term backups for your Aurora database for disaster recovery and audit purposes. What do you recommend?

A

Perform On-Demand Backups. Backups taken manually persist for as long as needed, whereas automated backups are retained for at most 35 days

43
Q

You have 100 EC2 instances connected to your RDS database and you see that upon a maintenance of the database, all your applications take a lot of time to reconnect to RDS, due to poor application logic. How do you improve this?

A

Use an RDS proxy

44
Q

What is Amazon Athena?

A

A serverless query service that uses SQL to analyze data stored in S3

45
Q

What is Athena Federated Query?

A

It allows you to run SQL queries against data stored in other AWS services and in on-premises databases through data source connectors; the results can be stored back into S3


46
Q

What is Amazon Redshift?

A

Redshift is a fully managed data warehouse, used for storing and analyzing large amounts of data from several locations

47
Q

What is a data warehouse?

A

A data warehouse is a DB that is designed to analyze large amounts of data

48
Q

True or False: Redshift is used for OLTP

A

False. It is used for OLAP. OLTP is for transactions, like those of an application.
OLAP is for analytics and data warehousing

49
Q

What is a Redshift cluster?

A

A Redshift cluster consists of a leader node and compute nodes. The leader node accepts the query and develops an execution plan. The compute node(s) then run the query and send their results back to the leader node, which aggregates them

50
Q

How can we move a Redshift cluster to another region?

A

We can manually or automatically create snapshots, copy those to a new region and create a new cluster from that snapshot

51
Q

What are three ways we can get data into Redshift?

A

Kinesis Data Firehose
S3 COPY command (through the internet or through the VPC with Enhanced VPC Routing)
EC2 instance through the JDBC driver (better to write in batches)

52
Q

What is Redshift Spectrum?

A

Query data that is already in S3 without loading it into Redshift; we must already have a Redshift cluster available to start the query. The query is then submitted to thousands of Redshift Spectrum nodes

53
Q

What is Amazon OpenSearch?

A

OpenSearch allows you to search massive amounts of data and retrieve relevant items

54
Q

Why is OpenSearch not considered serverless?

A

It requires the creation of a cluster of instances

55
Q

What is Amazon EMR?

A

Stands for Elastic MapReduce. Helps create Hadoop clusters to analyze and process vast amounts of data (Big Data)

56
Q

What is Amazon QuickSight?

A

Serverless machine learning-powered business intelligence service to create interactive dashboards

57
Q

What is AWS Glue?

A

It is a managed extract, transform and load (ETL) service

58
Q

What service would we use to convert data into Parquet format?

A

AWS Glue

59
Q

What is AWS Lake Formation?

A

A fully managed service that makes it easy to set up a data lake. Data lake = central place to have all your data for analytics purposes

60
Q

What is Kinesis Data Analytics?

A

Kinesis Data Analytics reads from Kinesis Data Streams or Firehose, applies SQL or Apache Flink to analyze the data, and sends the results to sinks

61
Q

What is Amazon Rekognition?

A

Finds objects, people, text, scenes in images and videos using ML

62
Q

What is Amazon Transcribe?

A

Automatically converts speech into text

63
Q

What is Amazon Polly?

A

Turns text into speech using ML

64
Q

What is Amazon Translate?

A

Translate localizes content using ML

65
Q

What is Amazon Lex?

A

It uses the same technology that powers Alexa. Provides automatic speech recognition (speech to text) and natural language understanding to recognize intent
Helps build chatbots or call center bots

66
Q

What is Amazon Connect?

A

It is a virtual call center

67
Q

What is Amazon Comprehend?

A

Uses Natural Language Processing to find insights and relationships in text

68
Q

What is the difference between Lex and Comprehend?

A

Comprehend is for analytics and insights, Lex is for conversations and interactions

69
Q

What is Amazon SageMaker?

A

It is a service that allows developers and data scientists to build ML models

70
Q

What is Amazon Forecast?

A

Helps you to do predictive modeling and forecasting

71
Q

How many storage nodes are there for Aurora?

A

6 nodes across 3 AZs. Do not confuse this with read replicas, of which you can have up to 15

72
Q

How much can Aurora scale to?

A

128 terabytes

73
Q

Give an overview of how Aurora is structured

A

Aurora helps manage MySQL and PostgreSQL DB engines. It provides a DB “cluster” that includes a Master DB instance, Read Replicas (instances) and a storage volume. The storage volume is six storage nodes across 3 AZs

The Master DB will control reading AND writing to the storage volume which spans multiple AZs. The Read Replicas will read from the storage volume

74
Q

What is the Aurora-like NoSQL service?

A

DocumentDB

75
Q

What is AWS’s fully managed graph DB?

A

Amazon Neptune

76
Q

We want a managed, serverless, scalable Apache Cassandra-compatible DB service. What should we use?

A

Amazon Keyspaces

77
Q

What is Amazon Keyspaces generally used for?

A

IoT device info and time-series data

78
Q

What is Amazon QLDB?

A

Stands for Quantum Ledger Database. Used to view all the changes made to your application data over time. Data is immutable

79
Q

What is Amazon Timestream?

A

A serverless time series database. Faster and less costly than a relational database

80
Q

You are looking to perform Online Transaction Processing (OLTP). You would like to use a database that has built-in auto-scaling capabilities and provides you with the maximum number of replicas for its underlying storage. What AWS service do you recommend?

A

Aurora

81
Q

As a Solutions Architect, a startup company asked you for help as they are working on an architecture for a social media website where users can be friends with each other, and like each other’s posts. The company plans on performing some complicated queries such as “What are the number of likes on the posts that have been posted by the friends of Mike?”. Which database do you recommend?

A

Neptune. A graph database service that makes it easy to build and run applications with highly connected datasets

82
Q

A startup is working on developing a new project to reduce forest fires due to climate change. The startup is developing sensors that will be spread across the entire forest to take readings such as temperature, humidity, and pressure, which will help detect forest fires before they happen. They are going to have thousands of sensors that are going to store a lot of readings each second. There is a requirement to store those readings and do fast analytics so they can predict if there is a fire. Which AWS service can they use to store those readings?

A

Timestream