Databases and Analytics Flashcards

1
Q

What is Amazon RDS?

A

Amazon RDS (Relational Database Service) is a managed database service for databases that use SQL as their query language.

2
Q

What is Amazon Aurora?

A

Aurora is a proprietary, cloud-optimised relational database from AWS that is compatible with PostgreSQL and MySQL.

3
Q

What is a read replica for RDS?

A

A read replica allows you to scale the read workload of your DB.

You can create up to 15 read replicas, but remember that data is only written to the main DB.

4
Q

If you have a multi-region RDS deployment, do local applications write to the RDS read replica or to the main RDS DB?

A

Read replicas give faster read performance to applications in the same region.

Writes are done by applications to the main RDS DB, not the read replica.
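
The read/write split described above can be sketched as a small router. This is only an illustration under stated assumptions: the class name, the hostnames, and the "starts with SELECT" rule are all hypothetical, and real applications usually rely on their database driver or separate connection strings instead.

```python
import itertools


class RoutingSketch:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary for reads when no replica is configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Simplifying assumption: a statement starting with SELECT is a read,
        # anything else is treated as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = RoutingSketch(
    primary="primary.example.internal",  # placeholder hostname
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Reads are spread across the replicas, while every write lands on the single main DB, which is why replicas scale reads but not writes.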

5
Q

What is Amazon ElastiCache?

A

Amazon ElastiCache provides managed Redis or Memcached. These are in-memory databases.

6
Q

What Amazon service would you use if you needed an in-memory database with high performance and low latency?

A

Amazon ElastiCache.

7
Q

What is the purpose of a Cache?

A

Caches take load off databases by serving repeated reads from memory, which helps with read-intensive workloads.
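
One common way to get this effect is the cache-aside pattern, sketched below. It is a minimal illustration, not an ElastiCache client: a plain dictionary stands in for Redis or Memcached, and the class and variable names are hypothetical.

```python
class CacheAsideSketch:
    """Check the cache first and only fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # stands in for a slow database query
        self._cache = {}                   # stands in for Redis or Memcached
        self.db_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: the database is not touched
        self.db_reads += 1
        value = self._load_from_db(key)    # cache miss: read from the database
        self._cache[key] = value           # remember the value for later reads
        return value

    def invalidate(self, key):
        # Call this after a write so the next read fetches fresh data.
        self._cache.pop(key, None)


fake_table = {"user:1": "Ada"}
store = CacheAsideSketch(fake_table.__getitem__)
for _ in range(3):
    store.get("user:1")
print(store.db_reads)  # 1
```

Three reads cost only one database query; the other two are answered from memory.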

8
Q

What is DynamoDB?

A

DynamoDB is a fully managed, highly available NoSQL database that is fast (low latency), massively scalable and serverless.

9
Q

What is another name for DAX?

A

DynamoDB Accelerator.

10
Q

What is DynamoDB Accelerator?

A

A fully managed, in-memory cache for DynamoDB.

11
Q

If you need a cache for DynamoDB, what service would you use and why?

A

You would use DynamoDB Accelerator.

This is because it is purpose built for DynamoDB and is fully integrated into the database service.

12
Q

What does OLAP stand for?

A

Online Analytical Processing

13
Q

What is Redshift?

What are the use cases for Redshift?

A

Redshift is a database based on PostgreSQL and built for OLAP.

Redshift is used for Data Analytics and Data Warehousing.

14
Q

How often is data loaded into Redshift?

A

Typically in batches, for example once every hour, rather than continuously.

15
Q

Does Redshift support visualization?

A

Yes, it integrates with BI tools such as Amazon QuickSight or Tableau.

16
Q

What does Amazon EMR stand for?

A

Amazon Elastic MapReduce.

17
Q

What is Amazon EMR used for?

A

EMR helps create (provision/configure) Hadoop clusters to analyse and process large amounts of data.

18
Q

What are the use cases for Amazon EMR?

A

Data Processing, ML, Web Indexing, Big Data

19
Q

What is Amazon Athena?

A

Athena is a fully serverless database with SQL capabilities used to query data in S3.

20
Q

How are you charged for using Amazon Athena?

A

Pay per query, based on the amount of data the query scans.
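
Athena's per-query charge depends on how much data the query scans, so the arithmetic can be sketched as follows. The rate of 5 USD per terabyte and the choice of 1024**4 bytes per terabyte are assumptions for illustration only; check the current AWS pricing page for real figures. The function name is hypothetical.

```python
def athena_query_cost(bytes_scanned, usd_per_tb=5.0):
    """Estimate one query's cost from the bytes it scans.

    usd_per_tb is an assumed rate, not an authoritative price.
    """
    tb_scanned = bytes_scanned / 1024**4  # assumption: 1 TB = 1024**4 bytes
    return tb_scanned * usd_per_tb


full_scan = athena_query_cost(200 * 1024**3)   # query scans 200 GB
pruned_scan = athena_query_cost(20 * 1024**3)  # same query scanning 20 GB
print(f"{full_scan:.4f} {pruned_scan:.4f}")
```

Under these assumptions, scanning a tenth of the data costs a tenth as much, which is why reducing the data scanned (for example with partitioning or columnar formats) lowers the bill.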

21
Q

What is Amazon QuickSight?

A

QuickSight is a serverless BI tool that allows you to build visualisations (dashboards).

22
Q

What is DocumentDB?

A

DocumentDB is Amazon’s MongoDB-compatible document database service (NoSQL DB).

23
Q

What is Amazon Neptune?

A

Amazon’s graph database, used for highly connected datasets.

24
Q

What does QLDB stand for?

A

Quantum Ledger Database.

25
Q

What is a ledger?

A

A ledger is a book for recording financial transactions.

26
Q

What is Amazon QLDB used for?

A

It is a managed ledger database that lets you review the complete history of all changes made to your application data over time.
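
The general idea of a ledger that keeps every change can be sketched as an append-only list in which each entry stores a hash of the previous one. This is an illustration of the concept only, not QLDB's API or storage format; the class, field names, and hashing scheme are hypothetical.

```python
import hashlib
import json


class LedgerSketch:
    """Append-only change log; each entry is chained to the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder hash used before the first entry

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, change):
        payload = json.dumps(change, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, change):
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        entry = {"change": change, "prev": prev_hash,
                 "hash": self._digest(prev_hash, change)}
        self._entries.append(entry)

    def history(self, key):
        # Every recorded change for one key, oldest first.
        return [e["change"] for e in self._entries if e["change"].get("key") == key]

    def verify(self):
        # Recompute the chain; any edited entry breaks the hashes after it.
        prev_hash = self.GENESIS
        for entry in self._entries:
            if entry["prev"] != prev_hash or \
                    entry["hash"] != self._digest(prev_hash, entry["change"]):
                return False
            prev_hash = entry["hash"]
        return True


ledger = LedgerSketch()
ledger.append({"key": "acct-1", "balance": 100})
ledger.append({"key": "acct-1", "balance": 80})
print(ledger.history("acct-1"))
print(ledger.verify())
```

Nothing is overwritten: the second change is recorded alongside the first, and tampering with an old entry makes `verify()` fail.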

27
Q

What is the difference between Amazon QLDB and Amazon Managed Blockchain?

A

QLDB has a central authority that owns the ledger, whereas Amazon Managed Blockchain is decentralised.

28
Q

What is Amazon Managed Blockchain?

A

It allows you to build applications where multiple parties can execute transactions without the need for a trusted, central authority.

29
Q

What frameworks are compatible with Amazon Managed Blockchain?

A

Hyperledger Fabric

Ethereum

30
Q

What is AWS DMS used for?

A

It is a database migration service (AWS Database Migration Service) that moves data from one database to another.

31
Q

What is AWS Glue used for?

A

It is a fully serverless ETL (extract, transform, and load) service.

32
Q

Which exclusive DynamoDB feature is an in-memory cache that can improve your performance up to 10x?

A

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement—from milliseconds to microseconds—even at millions of requests per second.

33
Q

True or False?

RDS Multi-AZ deployments’ main purpose is high availability, while RDS Read replicas’ main purpose is scalability.

A

True.

RDS Multi-AZ deployments’ main purpose is high availability, and RDS Read replicas’ main purpose is scalability. Moreover, Multi-Region deployments’ main purpose is disaster recovery and local performance.

34
Q

You want to create a decentralized blockchain on AWS. Which AWS service would you use?

A

Amazon Managed Blockchain is a fully managed service that makes it easy to create and manage scalable blockchain networks using the popular open source frameworks Hyperledger Fabric and Ethereum. It allows multiple parties to execute transactions without the need of a trusted, central authority.

35
Q

Which AWS database is a data warehouse?

A

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud.

36
Q

Which AWS database is fully serverless and has SQL capabilities?

A

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

37
Q

You would like to use a serverless service to prepare data so it can be loaded for analytics. Which service would you use?

A

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.
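
The three ETL stages can be sketched in a few lines. This is a local illustration of the extract, transform, and load steps only; it does not use the AWS Glue API, and the sample data, function names, and in-memory "warehouse" list are all hypothetical.

```python
import csv
import io

# Hypothetical raw export with messy whitespace, casing, and a missing value.
raw = "order_id,amount,country\n1, 19.90 ,de\n2,,fr\n3,5.00,DE\n"


def extract(text):
    # Extract: parse the raw CSV text into dictionaries.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transform: drop incomplete rows and normalise types and casing.
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount": float(amount),
            "country": row["country"].strip().upper(),
        })
    return cleaned


def load(rows, target):
    # Load: append the prepared rows to the analytics store.
    target.extend(rows)
    return len(rows)


warehouse = []  # stands in for the analytics destination
loaded = load(transform(extract(raw)), warehouse)
print(loaded, warehouse)
```

Two of the three rows survive the transform step and arrive in a consistent shape, ready for analytics.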

38
Q

Which relational database is a proprietary technology from AWS and is cloud-optimized?

A

Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. It is a proprietary technology from AWS.

39
Q

You would like to migrate databases to AWS while still being able to use the database during the migration. What service allows you to do this?

A

AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

40
Q

How can you create Hadoop clusters to analyze and process a vast amount of data?

A

Amazon EMR is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data.

41
Q

Which in-memory AWS database can you use to reduce the load on databases, with high performance and low latency?

A

Amazon ElastiCache is a web service that makes it easy to deploy and run Memcached or Redis protocol-compliant server nodes in the cloud. ElastiCache caches are in-memory databases with high performance and low latency. They help reduce the load on databases for read-intensive workloads.

42
Q

What is the name of a central repository to store structural and operational metadata for data assets in AWS Glue?

A

The AWS Glue Data Catalog is a central repository to store structural and operational metadata for all your data assets. For a given data set, you can store its table definition, physical location, add business relevant attributes, as well as track how this data has changed over time.

43
Q

Which of the following databases is a managed service with SQL capability suited for Online Transaction Processing (OLTP)?

A

Amazon Relational Database Service (Amazon RDS) is a managed SQL service that makes it easy to set up, operate, and scale a relational database in the cloud. It is suited for OLTP workloads.

44
Q

You would like to set up a NoSQL database that can scale with no downtime and can handle millions of requests per second. Which AWS database is best suited for this work?

A

DynamoDB is a fast and flexible non-relational database service for any scale. It scales with no downtime, can process millions of requests per second, and delivers fast, consistent performance.