Udemy lecture 7: Databases & analytics Flashcards

1
Q

What is a relational database?

A

A relational database stores data in multiple tables that are linked to each other (e.g. one table holds Student ID, Dept ID, Name, & Email, & a second table is keyed on Dept ID & gives further information about each department). Think of it like linked Excel sheets.
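
As a rough illustration of the student/department example, the sketch below builds two linked tables in an in-memory SQLite database and joins them on Dept ID. The table names, column names, and sample rows are made up for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES departments(dept_id),
        name TEXT NOT NULL,
        email TEXT NOT NULL
    );
    INSERT INTO departments VALUES (10, 'Mathematics'), (20, 'Physics');
    INSERT INTO students VALUES
        (1, 10, 'Ada', 'ada@example.com'),
        (2, 20, 'Niels', 'niels@example.com');
""")

# The shared dept_id column is what links the two tables.
rows = conn.execute("""
    SELECT s.name, d.dept_name
    FROM students AS s
    JOIN departments AS d ON d.dept_id = s.dept_id
    ORDER BY s.student_id
""").fetchall()
print(rows)
```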

2
Q

Relational databases use the __________ language to perform queries or lookups

A

SQL (Whenever you hear SQL think of relational databases)

3
Q

______________ databases are nonrelational databases

A

NoSQL

4
Q

____________ databases are purpose built for specific data models & have flexible schemas for building modern applications

A

NoSQL

5
Q

What are some benefits of NoSQL databases?

A
  • Flexible: easy to evolve the data model
  • Scalable: designed to scale out by using distributed clusters
  • High-performance: optimized for a specific data model
  • Highly functional: types optimized for the data model
6
Q

What does JSON stand for?

A

JavaScript Object Notation

7
Q

NoSQL can have its data in _________ format

A

JSON

8
Q

Data can be _______ in the JSON format

A

Nested (data is stored in a structured way, but the fields can change over time, & new types such as arrays are supported)
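
To make "nested" concrete, here is a minimal sketch using Python's standard `json` module. The record and its field names are hypothetical; the point is only that one field can hold another object or an array, and that a new field can be added later without changing a schema.

```python
import json

# A hypothetical student record; field names are illustrative only.
doc = json.loads("""
{
  "student_id": 1,
  "name": "Ada",
  "department": {"dept_id": 10, "name": "Mathematics"},
  "emails": ["ada@example.com"]
}
""")

# Nested fields are reached by walking the structure.
dept_name = doc["department"]["name"]

# New fields can be added later without a schema change.
doc["clubs"] = ["chess"]
print(dept_name, sorted(doc))
```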

9
Q

What is AWS's responsibility related to databases?

A
  • Responsible for patching the entire database
  • Automated backup & restore, operations, upgrades
  • Monitoring, alerting

AWS offers managed services for many different databases.

10
Q

______ is a relational database

A

RDS

11
Q

What does RDS stand for?

A

Relational Database Service

12
Q

What is a Relational database service?

A

A managed database service for databases that use SQL as a query language. It allows you to create databases in the cloud that are managed by AWS.

13
Q

_________ is a proprietary database from AWS

A

Aurora

14
Q

What are the advantages of using RDS over deploying a database on EC2?

A
  • Automated provisioning, OS patching
  • Continuous backups & restore to a specific timestamp (point-in-time restore)
  • Monitoring dashboards
  • Read replicas for improved read performance
  • Multi-AZ setup for DR (disaster recovery)
  • Maintenance windows for upgrades
  • Scaling capability (vertical & horizontal)
15
Q

You can't ________ into an RDS database instance

A

SSH

16
Q

What are the two database technologies that Aurora supports?

A
  1. PostgreSQL
  2. MySQL
17
Q

Aurora is _________-optimized to yield better performance

A

Cloud

18
Q

Aurora storage grows automatically from __________________

A

10 gigabytes to 128 terabytes

19
Q

__________ & ___________ are the two ways to create relational databases on AWS

A

RDS & Aurora (both are managed; Aurora is more cloud-native, whereas RDS runs the database technologies you already know as a managed service)

20
Q

With the __________ option for Amazon Aurora, database instantiation is automated

A

Serverless (also has auto scaling based on your usage)

21
Q

Both ________ & ___________ are supported as engines for Aurora Serverless

A

PostgreSQL & MySQL

22
Q

Aurora serverless is great for _____________ workloads

A

Infrequent/unpredictable workloads

23
Q

If you see Aurora with no management overhead, think of ______________

A

Aurora Serverless

24
Q

___________ can scale the read workload of your database

A

RDS read replicas (can create up to 15 replicas & data is only written to the main database)

25
__________ is useful to have in case of an AZ outage or if the main database has problems (high availability)
Failover database (so it's basically Multi-AZ)
26
In a Multi-AZ setup, data is only read from/written to the main database, & you can only have one other AZ as a ________
Failover
27
You can place read replicas in multiple regions & use them for ___________ in case of a regional issue. This also improves local performance (lower latency), but it has a replication cost
Disaster recovery
28
____________ is used to get managed Redis or Memcached databases
ElastiCache
29
_________ databases are caches that are in-memory databases with high performance & low latency
Redis or Memcached
30
Whenever you see "in-memory database", you should think of ___________
ElastiCache
31
________________ helps take load off databases for read-intensive workloads
ElastiCache
32
____________ is fully managed & highly available, with replication across 3 AZs
DynamoDB
33
DynamoDB is a __________ database
NoSQL database (not a relational database)
34
___________ has single-digit millisecond latency (low-latency retrieval) & scales to massive workloads; it's a distributed "serverless" database
DynamoDB
35
DynamoDB is a __________ database
Key/value
36
_______________ is a fully managed in-memory cache for DynamoDB (it can give you a 10x performance improvement)
DynamoDB Accelerator (DAX)
37
What is the difference between ElastiCache & DynamoDB DAX?
DAX is only used for, & is integrated with, DynamoDB, while ElastiCache can be used for other databases
38
_______________ make DynamoDB tables accessible with low latency in multiple regions
DynamoDB global tables
39
With DynamoDB global tables, it's ___________ replication (read/write to any AWS region)
Active-Active
40
______________ is based on PostgreSQL, but it's not used for OLTP
Redshift
41
Redshift is used for ___________
OLAP (online analytical processing), which is used for analytics & data warehousing
42
Redshift stores data in __________ storage
Columnar (instead of row-based)
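
A toy illustration of why a columnar layout suits analytics: summing one field only needs that column, whereas a row layout visits every whole record. The data and names below are made up, and this says nothing about how Redshift is implemented internally.

```python
# The same three (hypothetical) sales records in two layouts.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Columnar layout: one contiguous list per column.
columns = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 50.0],
}

# Row layout: every whole record is visited to read a single field.
total_from_rows = sum(record["amount"] for record in rows)

# Columnar layout: only the "amount" column is read.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```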
43
Redshift __________ is a feature of Redshift that automatically provisions & scales the data warehouse's underlying capacity
Redshift Serverless
44
With ________________ you run analytics workloads without managing data warehouse infrastructure
Redshift Serverless
45
What does EMR mean?
Elastic MapReduce
46
________ helps create Hadoop clusters (big data) to analyze & process vast amounts of data
EMR
47
A Hadoop cluster on EMR can be made up of hundreds of ______________
EC2 instances
48
What are the different use cases for EMR?
  • Data processing
  • Machine learning
  • Web indexing
  • Big data
49
__________ is a serverless query service to perform analytics against S3 objects
Amazon Athena
50
Amazon Athena uses __________ language to query files
SQL
51
Use ___________ to analyze data in S3 using serverless SQL
Amazon Athena
52
_____________ is a serverless, machine learning-powered business intelligence service for creating interactive dashboards
Amazon QuickSight
53
___________ is the AWS equivalent for MongoDB (which is a NoSQL database)
DocumentDB
54
________________ is a fully managed graph database
Amazon Neptune
55
A popular graph dataset would be a ____________
Social network
56
What does QLDB mean?
Quantum Ledger Database
57
A _________ is a book recording financial transactions
ledger
58
_____________ is used to record financial transactions in AWS
Amazon QLDB
59
Amazon QLDB is used to review the history of all the changes made to your application data over time. It's an _________ system, which means no entry can be removed or modified, & it is cryptographically verifiable
Immutable
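
The idea of an append-only, cryptographically verifiable ledger can be sketched with a simple hash chain. This is a teaching sketch using Python's standard `hashlib`, not a description of how QLDB works internally; the function names and sample entries are hypothetical.

```python
import hashlib
import json

def entry_hash(prev_hash, data):
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append(ledger, data):
    """Append-only write: existing entries are never edited or removed."""
    prev_hash = ledger[-1]["hash"] if ledger else ""
    ledger.append({"data": data, "hash": entry_hash(prev_hash, data)})

def verify(ledger):
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    prev_hash = ""
    for entry in ledger:
        if entry["hash"] != entry_hash(prev_hash, entry["data"]):
            return False
        prev_hash = entry["hash"]
    return True

ledger = []
append(ledger, {"account": "A", "amount": 100})
append(ledger, {"account": "B", "amount": -40})
print(verify(ledger))              # chain is intact

ledger[0]["data"]["amount"] = 999  # tamper with history
print(verify(ledger))              # tampering is detected
```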
60
What is the difference between Amazon Managed Blockchain & Amazon QLDB?
With Amazon QLDB there is no concept of decentralization: there's just a central database owned by Amazon. Managed Blockchain, by contrast, has a decentralized component
61
_______________ makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority
Amazon Managed Blockchain
62
Amazon Managed Blockchain is compatible with the frameworks ___________ & __________
Hyperledger Fabric & Ethereum
63
___________ is a managed extract, transform & load (ETL) service
AWS Glue
64
With __________ you can quickly & securely migrate databases to AWS; it's resilient & self-healing
DMS (Database Migration Service)
65
___________ is a fully managed, petabyte-scale data warehouse service in the cloud.
Amazon Redshift
66
__________ is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
Amazon Athena
67
________ is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.
AWS Glue
68
_____________ helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.
AWS Database Migration Service
69
The ____________ is a central repository to store structural and operational metadata for all your data assets. For a given data set, you can store its table definition, physical location, add business relevant attributes, as well as track how this data has changed over time.
AWS Glue Data Catalog
70
____________ is a managed SQL service that makes it easy to set up, operate, and scale a relational database in the cloud. It is suited for OLTP workloads.
Amazon Relational Database Service (Amazon RDS)