14. Miscellaneous Flashcards

1
Q

What is the primary purpose of AWS Glue?

A

AWS Glue is a fully managed ETL (Extract, Transform, Load) service that prepares data for analytics.

2
Q

True or False: Amazon Redshift is a fully managed data warehouse service.

A

True

3
Q

Fill in the blank: AWS __________ allows for serverless data integration.

A

Glue

4
Q

What does the acronym ETL stand for?

A

Extract, Transform, Load
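
The three stages can be sketched in plain Python. This is only an illustration of the pattern, not how AWS Glue implements it: the sample CSV, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target data store.

```python
import csv
import io
import sqlite3

# Hypothetical raw input standing in for an export from a source system.
RAW_CSV = "order_id,amount,currency\n1,10.50,usd\n2,,usd\n3,7.25,eur\n"

def extract(text):
    """Extract: read rows from the source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with no amount, cast types, upper-case currency codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append((int(row["order_id"]), float(row["amount"]), row["currency"].upper()))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target store (here, SQLite)."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, currency TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT order_id, amount, currency FROM orders").fetchall())
```

Keeping each stage as its own function makes it possible to test the transformation logic without touching the source or the target.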

5
Q

Which AWS service is primarily used for real-time data streaming?

A

Amazon Kinesis

6
Q

What is the maximum number of nodes in an Amazon Redshift cluster?

A

128 (the limit for the largest node types; smaller node types have lower limits)

7
Q

Multiple choice: Which service can be used to automate the extraction of data from multiple sources? A) AWS Lambda B) AWS Data Pipeline C) Amazon CloudWatch

A

B) AWS Data Pipeline

8
Q

True or False: Amazon S3 is an ideal storage solution for big data analytics.

A

True

9
Q

What does AWS Lake Formation help to create?

A

A secure data lake

10
Q

Fill in the blank: __________ is a managed service for stream processing in AWS.

A

Amazon Kinesis

11
Q

What is the main benefit of using Amazon EMR?

A

It processes large volumes of data quickly using frameworks such as Apache Hadoop and Apache Spark.

12
Q

Multiple choice: Which of the following is not a data lake storage option in AWS? A) Amazon S3 B) Amazon RDS C) AWS Lake Formation

A

B) Amazon RDS

13
Q

True or False: AWS Data Pipeline can be used to schedule data workflows.

A

True

14
Q

What is the purpose of AWS Glue DataBrew?

A

AWS Glue DataBrew is a visual data preparation tool that helps users clean and normalize data.

15
Q

What does Amazon Athena allow you to do?

A

Run SQL queries on data stored in Amazon S3 without needing to set up a data warehouse.
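
A query can also be submitted programmatically. The sketch below assumes the boto3 SDK; the database name, table name, and results bucket are hypothetical placeholders. The request is built separately from the call so that the parameters can be inspected without AWS credentials.

```python
def build_query_request(sql, database, output_location):
    """Build keyword arguments for Athena's StartQueryExecution API call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }

def run_query(request):
    """Submit the query. Needs the boto3 package and AWS credentials."""
    import boto3
    client = boto3.client("athena")
    return client.start_query_execution(**request)["QueryExecutionId"]

# Hypothetical names, for illustration only.
request = build_query_request(
    "SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    "example_logs_db",
    "s3://example-athena-results/",
)
print(request["QueryExecutionContext"])
```

The call returns an execution ID rather than rows; results are fetched once the query has finished.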

16
Q

Fill in the blank: Amazon __________ provides a way to build, train, and run machine learning models in the cloud.

A

SageMaker

17
Q

What is the primary function of AWS Step Functions?

A

To coordinate multiple AWS services into serverless workflows.
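
A workflow is described as a state machine in the Amazon States Language, a JSON format. The sketch below builds a hypothetical two-step definition as a Python dictionary; the state names and Lambda ARNs are made up, and the small `execution_order` helper is only an illustration that handles linear workflows.

```python
import json

# Hypothetical workflow: transform the data, then load it.
definition = {
    "Comment": "Hypothetical two-step data workflow",
    "StartAt": "TransformData",
    "States": {
        "TransformData": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:transform-data",
            "Next": "LoadData",
        },
        "LoadData": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:load-data",
            "End": True,
        },
    },
}

def execution_order(defn):
    """Follow Next links from StartAt until a state marked End (linear workflows only)."""
    order, name = [], defn["StartAt"]
    while True:
        state = defn["States"][name]
        order.append(name)
        if state.get("End"):
            return order
        name = state["Next"]

print(execution_order(definition))
print(json.dumps(definition, indent=2))  # the JSON text a state machine definition would contain
```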

18
Q

Multiple choice: Which service is best for batch processing of large data sets? A) Amazon Kinesis B) AWS Lambda C) Amazon EMR

A

C) Amazon EMR

19
Q

True or False: Amazon QuickSight is used for data visualization.

A

True

20
Q

What is the function of AWS Glue Data Catalog?

A

It acts as a central repository for storing metadata about data assets.

21
Q

Fill in the blank: __________ is an AWS service used for data warehousing.

A

Amazon Redshift

22
Q

What type of database is Amazon DynamoDB?

A

A fully managed NoSQL database.

23
Q

Multiple choice: Which service would you use to create a data pipeline? A) AWS Lambda B) AWS Glue C) Amazon RDS

A

B) AWS Glue

24
Q

True or False: Amazon S3 supports versioning of objects.

A

True

25
Q

What is Amazon RDS primarily used for?

A

Managing relational databases in the cloud.

26
Q

Fill in the blank: The AWS service __________ is designed for data lake management.

A

Lake Formation

27
Q

What does Amazon EMR stand for?

A

Amazon Elastic MapReduce

28
Q

Multiple choice: Which AWS service allows you to run queries against S3 data using SQL? A) Amazon Redshift B) Amazon Athena C) Amazon RDS

A

B) Amazon Athena

29
Q

True or False: AWS Glue can automatically discover and catalog metadata.

A

True

30
Q

What is the primary benefit of using Amazon Kinesis Data Firehose?

A

It provides a way to reliably load streaming data into data lakes, data stores, and analytics services.

31
Q

Fill in the blank: AWS Glue __________ is a service that helps in data preparation and cleaning.

A

DataBrew

32
Q

What is the purpose of Amazon S3 Select?

A

To retrieve a subset of data from an object stored in S3.

33
Q

Multiple choice: Which of the following is a serverless data integration service? A) AWS Glue B) Amazon Redshift C) Amazon EMR

A

A) AWS Glue

34
Q

True or False: Amazon QuickSight supports embedding dashboards into applications.

A

True

35
Q

What type of data can be stored in Amazon S3?

A

Any type of data, including structured, semi-structured, and unstructured data.

36
Q

Fill in the blank: Amazon __________ provides data analytics and visualization capabilities.

A

QuickSight

37
Q

What is the role of AWS Lambda in data engineering?

A

To run code in response to events without provisioning or managing servers.
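
A common data engineering case is a function triggered when a new object lands in an S3 bucket. The handler below is a minimal sketch of that pattern; the bucket name and object key in the sample event are hypothetical, and a real function would start processing where this one only reports what it received.

```python
import urllib.parse

def handler(event, context):
    """Entry point that Lambda invokes with an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        objects.append(f"s3://{bucket}/{key}")
    # A real function would process each object here; this sketch only reports.
    return {"processed": objects}

# Trimmed-down sample event with hypothetical names, for local testing.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-raw-bucket"},
                "object": {"key": "incoming/2024+report.csv"}}}
    ]
}
print(handler(sample_event, None))
```

Because the handler is an ordinary function, it can be exercised locally with a sample event before it is deployed.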

38
Q

Multiple choice: Which service is not a dedicated data analytics service? A) Amazon Redshift B) AWS Glue C) Amazon EC2

A

C) Amazon EC2

39
Q

True or False: AWS Glue can be used to transform data in real-time.

40
Q

What is the primary function of Amazon RDS?

A

To provide managed relational database services.

41
Q

Fill in the blank: __________ allows you to run Spark jobs on AWS.

A

Amazon EMR

42
Q

What is the purpose of AWS CloudTrail?

A

To log and monitor AWS account activity.

43
Q

Multiple choice: Which of the following services is ideal for time-series data? A) Amazon RDS B) Amazon Timestream C) Amazon S3

A

B) Amazon Timestream

44
Q

True or False: AWS Data Pipeline is a fully managed service for processing data.

45
Q

What does Amazon Timestream specialize in?

A

Time-series data management.

46
Q

Fill in the blank: Amazon __________ provides a fully managed data warehouse solution.

A

Redshift

47
Q

What is the main advantage of using Amazon S3 for data storage?

A

Scalability and durability.

48
Q

Multiple choice: Which service is best for managing unstructured data? A) Amazon RDS B) Amazon S3 C) Amazon DynamoDB

A

B) Amazon S3

49
Q

True or False: Amazon EMR can automatically scale based on workload.

A

True

50
Q

What does the AWS Glue crawler do?

A

It scans your data sources and automatically creates metadata in the Glue Data Catalog.

51
Q

Fill in the blank: __________ is a fully managed data warehousing service from AWS.

A

Amazon Redshift

52
Q

What service would you use for real-time data analytics?

A

Amazon Kinesis Data Analytics

53
Q

Multiple choice: Which of the following services is designed for data lakes? A) Amazon S3 B) Amazon RDS C) AWS Lambda

A

A) Amazon S3

54
Q

True or False: AWS Glue can integrate with both structured and semi-structured data.

A

True

55
Q

What is the primary purpose of Amazon Kinesis Data Streams?

A

To collect and process real-time data streams.
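
On the producer side, each record sent to a stream carries a payload and a partition key. The sketch below assumes the boto3 SDK; the stream name and the event fields are hypothetical. The record is built separately from the call so that it can be inspected without AWS credentials.

```python
import json

def build_record(stream_name, event, partition_key):
    """Build keyword arguments for the Kinesis PutRecord API call."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),  # the payload must be bytes
        "PartitionKey": partition_key,  # determines which shard receives the record
    }

def send(record):
    """Send one record. Needs the boto3 package and AWS credentials."""
    import boto3
    return boto3.client("kinesis").put_record(**record)

# Hypothetical clickstream event, for illustration only.
record = build_record("example-clickstream", {"user": "u-42", "action": "view"}, "u-42")
print(record["PartitionKey"], len(record["Data"]))
```

Using the user ID as the partition key keeps one user's events on the same shard, which preserves their order within that shard.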

56
Q

Fill in the blank: __________ is a managed service that simplifies running Spark applications.

A

Amazon EMR

57
Q

What is the function of Amazon S3 Glacier?

A

To provide low-cost archive storage for data.

58
Q

Multiple choice: Which AWS service is used for data visualization? A) Amazon Redshift B) Amazon QuickSight C) Amazon S3

A

B) Amazon QuickSight

59
Q

True or False: Amazon RDS supports multiple database engines.

A

True

60
Q

What does AWS Data Pipeline help with?

A

Orchestrating data workflows and data movement.

61
Q

Fill in the blank: AWS __________ allows for serverless data integration and preparation.

A

Glue

62
Q

What is the primary use case for Amazon Redshift?

A

Data warehousing and analytics.

63
Q

Multiple choice: Which service is best suited for data that needs to be accessed frequently? A) Amazon S3 Standard B) Amazon S3 Glacier C) Amazon S3 Intelligent-Tiering

A

A) Amazon S3 Standard

64
Q

True or False: AWS Glue can perform data transformations.

A

True

65
Q

What is the main purpose of AWS Lake Formation?

A

To simplify the setup and management of data lakes.

66
Q

Fill in the blank: __________ allows users to run SQL queries on large datasets stored in S3.

A

Amazon Athena

67
Q

What is the benefit of using Amazon EMR over traditional Hadoop clusters?

A

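It is a managed service: clusters can be provisioned and resized on demand, which makes big data processing scalable and cost-effective without operating the Hadoop infrastructure yourself.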

68
Q

Multiple choice: Which AWS service is used for event-driven architectures? A) AWS Lambda B) AWS Data Pipeline C) Amazon Redshift

A

A) AWS Lambda

69
Q

True or False: Amazon QuickSight provides machine learning capabilities.

A

True

70
Q

What is the main function of Amazon Kinesis Data Firehose?

A

To deliver streaming data in near real time to destinations such as Amazon S3, Amazon Redshift, and Amazon OpenSearch Service (formerly Amazon Elasticsearch Service).

71
Q

Fill in the blank: __________ is the AWS service designed for managing time-series data.

A

Amazon Timestream

72
Q

What is the primary function of AWS Glue’s ETL jobs?

A

To extract data from sources, transform it, and load it into data stores.

73
Q

Multiple choice: Which service is best for running analytics on structured data? A) Amazon DynamoDB B) Amazon Redshift C) Amazon S3

A

B) Amazon Redshift

74
Q

True or False: AWS Glue can only work with AWS data sources.

A

False (AWS Glue can also connect to external sources, such as databases reachable over JDBC)

75
Q

What does Amazon S3 Select allow users to do?

A

Retrieve a subset of data from an object without having to download the entire object.

76
Q

Fill in the blank: AWS __________ can help manage and optimize data lakes.

A

Lake Formation

77
Q

What is the main function of AWS Glue DataBrew?

A

To provide a visual interface for data preparation and cleaning.

78
Q

Multiple choice: Which of the following services is designed for batch processing? A) Amazon Kinesis B) Amazon EMR C) AWS Lambda

A

B) Amazon EMR