Glue Flashcards

This deck helps you retain key concepts of the AWS Glue service.

1
Q

Which AWS serverless service offers fully managed Extract-Transform-Load (ETL) capabilities, enabling users to prepare and load data for analytics?

A

AWS Glue

2
Q

What is the primary purpose of AWS Glue?

A

To facilitate data movement and transformation between sources and destinations

3
Q

What types of data sources does AWS Glue support?

A

  • Data stores: S3, RDS, JDBC-compatible databases, and DynamoDB
  • Data streams: Kinesis Data Streams and Apache Kafka

4
Q

What destinations can AWS Glue write to?

A

S3, RDS, and JDBC-compatible databases

5
Q

How does AWS Glue deliver its ETL functionality?

A

Through Glue Jobs: a job uses the Glue Data Catalog to locate its sources, extracts the data, transforms it with a script, and loads the result into destinations
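
As a study aid, the extract, transform, and load stages can be pictured with a toy, in-memory sketch. It does not call any Glue API: the CATALOG dictionary, run_job, and to_cents are hypothetical names invented for illustration, with the dictionary standing in for the role the Data Catalog plays in locating source data.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical stand-in for a catalog: table name -> stored rows.
CATALOG: Dict[str, List[Record]] = {
    "raw_orders": [
        {"order_id": "1", "amount": "19.99"},
        {"order_id": "2", "amount": "5.00"},
    ]
}


def run_job(source_table: str,
            transform: Callable[[Record], Record],
            destination: List[Record]) -> int:
    """Extract rows for source_table, transform each one, load into destination."""
    rows = CATALOG[source_table]                 # extract (source found via the catalog)
    transformed = [transform(r) for r in rows]   # transform (the job's script)
    destination.extend(transformed)              # load
    return len(transformed)


def to_cents(record: Record) -> Record:
    """Example transform: convert a decimal amount string to integer cents."""
    cents = round(float(record["amount"]) * 100)
    return {"order_id": record["order_id"], "amount_cents": str(cents)}
```

Calling run_job("raw_orders", to_cents, output_rows) with an empty list output_rows fills it with the transformed records and returns the number of rows loaded.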

6
Q

What AWS Glue component serves as a metadata repository combined with tools for data management and search?

A

Glue Data Catalog

7
Q

How does the Glue Data Catalog help prevent data silos?

A

By providing one shared Data Catalog per region in each AWS account, so that all teams and services read the same metadata

8
Q

Which AWS services integrate with the Glue Data Catalog?

A

Athena, Redshift Spectrum, EMR, and AWS Lake Formation

9
Q

How does Glue Data Catalog discover data?

A

By using crawlers configured with the necessary credentials
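
A minimal sketch of setting up a crawler, assuming a boto3-style Glue client is passed in. The helper names, the role ARN, the database name, and the S3 path are all hypothetical; create_crawler and start_crawler are the Glue client operations the sketch relies on, and the Role parameter is where the crawler's credentials come from.

```python
from typing import Any, Dict, List


def build_crawler_request(name: str, role_arn: str, database: str,
                          s3_paths: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for the Glue client's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,          # IAM role the crawler assumes to read the data
        "DatabaseName": database,  # catalog database that receives the discovered tables
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start_crawler(glue_client: Any, request: Dict[str, Any]) -> None:
    """Create the crawler, then start a crawl with it."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])
```

With boto3 installed and AWS credentials configured, glue_client would typically be boto3.client("glue"); keeping the request builder separate makes the parameters easy to inspect without calling AWS.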

10
Q

What resources does a Glue Job use?

A

A pool of managed (warm) resources

11
Q

How can a Glue Job be triggered?

A

It can be started manually, run on a schedule, or triggered by events (for example, through EventBridge)
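
The manual and scheduled options can be sketched as follows, assuming a boto3-style Glue client is passed in. The helper names and all job and trigger names are hypothetical. Note one assumption: the schedule here uses Glue's own SCHEDULED trigger type (via create_trigger) rather than an EventBridge rule, which is a different route to the same outcome.

```python
from typing import Any, Dict, Optional


def start_job_manually(glue_client: Any, job_name: str,
                       arguments: Optional[Dict[str, str]] = None) -> str:
    """Start one on-demand run of a Glue job and return its run ID."""
    params: Dict[str, Any] = {"JobName": job_name}
    if arguments:
        params["Arguments"] = arguments  # job arguments, string keys and values
    response = glue_client.start_job_run(**params)
    return response["JobRunId"]


def build_scheduled_trigger(name: str, job_name: str, cron: str) -> Dict[str, Any]:
    """Build keyword arguments for the Glue client's create_trigger call."""
    return {
        "Name": name,
        "Type": "SCHEDULED",
        "Schedule": cron,                    # e.g. "cron(0 12 * * ? *)"
        "Actions": [{"JobName": job_name}],  # job to start when the trigger fires
        "StartOnCreation": True,
    }
```

With a real client, glue_client.create_trigger(**build_scheduled_trigger(...)) would register the schedule; event-driven starts would need additional configuration outside this sketch.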

12
Q

What AWS service is ideal for a serverless, ad hoc, and cost-effective ETL solution?

A

AWS Glue

13
Q

Which service is utilized by AWS Data Pipeline for processing?

A

Elastic MapReduce (EMR)
