AWS Certified Machine Learning - Specialty (MLS-C01) Flashcards

1
Q

NOTE: You can use Amazon S3 when training ML models with SageMaker. S3 is integrated with SageMaker to store both the training data and the training output.
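
As a rough illustration of that integration, the sketch below builds the S3-related parts of a training-job request, following the request shape of boto3's `create_training_job`. The bucket name, prefixes, and the helper `s3_training_io` are hypothetical placeholders, and the rest of the request (algorithm, role, instance settings) is left out.

```python
# Sketch: the S3 parts of a SageMaker training-job request.
# Bucket and prefixes below are hypothetical placeholders.

def s3_training_io(bucket: str, train_prefix: str, output_prefix: str) -> dict:
    """Build the input and output settings that point a training job at S3."""
    return {
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": f"s3://{bucket}/{train_prefix}",
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        # Model artifacts are written under this S3 location.
        "OutputDataConfig": {"S3OutputPath": f"s3://{bucket}/{output_prefix}"},
    }


io_config = s3_training_io("example-ml-bucket", "data/train/", "output/")
print(io_config["InputDataConfig"][0]["DataSource"]["S3DataSource"]["S3Uri"])

# These keys would be merged into the keyword arguments of
# boto3.client("sagemaker").create_training_job(...), together with the
# algorithm, role, and resource settings omitted here.
```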

A
2
Q

Amazon FSx for Lustre

A

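Use when the training data is already in S3 and you plan to run training jobs several times with different algorithms and parameters.

Speeds up training jobs by serving S3 data to SageMaker at high speed. The data is copied from S3 into the file system the first time it is needed, so later jobs can reuse it without downloading it again.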
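
A minimal sketch of an input channel that reads from FSx for Lustre instead of S3, assuming the `FileSystemDataSource` shape used by boto3's `create_training_job`. The file system ID, the directory path, and the helper name `fsx_lustre_channel` are hypothetical.

```python
# Sketch: a training input channel backed by FSx for Lustre.
# The file system ID and directory path are hypothetical placeholders.

def fsx_lustre_channel(file_system_id: str, directory_path: str,
                       channel_name: str = "train") -> dict:
    """Build one InputDataConfig entry that mounts an FSx for Lustre path."""
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemType": "FSxLustre",
                "FileSystemAccessMode": "ro",  # read-only is enough for training data
                "DirectoryPath": directory_path,
            }
        },
    }


channel = fsx_lustre_channel("fs-0123456789abcdef0", "/examplemount/train")
print(channel["DataSource"]["FileSystemDataSource"]["FileSystemType"])

# The same channel can be reused across repeated training jobs, e.g.
# create_training_job(..., InputDataConfig=[channel], VpcConfig={...}).
# A VPC configuration is needed so the job can reach the file system.
```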

3
Q

NOTE: If your training data is already in Amazon EFS, AWS recommends using it as the training data source.

EFS lets you launch training jobs directly against the file system without moving the data first, which results in faster training start times.
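
For comparison, an EFS-backed input channel might look like the sketch below, again assuming the `FileSystemDataSource` shape of boto3's `create_training_job`. The file system ID and directory path are hypothetical.

```python
# Sketch: a training input channel that reads directly from Amazon EFS.
# The file system ID and directory path are hypothetical placeholders.
efs_channel = {
    "ChannelName": "train",
    "DataSource": {
        "FileSystemDataSource": {
            "FileSystemId": "fs-0abc1234",
            "FileSystemType": "EFS",
            "FileSystemAccessMode": "ro",
            "DirectoryPath": "/datasets/train",
        }
    },
}

# No S3 download step is described here: the job mounts the directory as-is.
print(efs_channel["DataSource"]["FileSystemDataSource"]["DirectoryPath"])
```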

A
4
Q

Amazon Kinesis

A

Recommended for ingesting fast-moving, real-time data. Lets you build custom streaming data applications for specialized needs.
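
A minimal sketch of what a producer might send to a Kinesis data stream, assuming the `put_record` parameters of the boto3 Kinesis client (`StreamName`, `Data`, `PartitionKey`). The stream name, event fields, and helper name are hypothetical.

```python
import json

# Sketch: preparing one record for a Kinesis data stream.
# Stream name and event fields are hypothetical placeholders.

def build_put_record_args(stream_name: str, event: dict, partition_key: str) -> dict:
    """Serialize an event to bytes and wrap it in put_record keyword arguments."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),  # the payload must be bytes
        "PartitionKey": partition_key,  # records with the same key go to the same shard
    }


record_args = build_put_record_args(
    "example-clickstream",
    {"user_id": "u-123", "action": "view"},
    partition_key="u-123",
)
print(record_args["Data"])

# With credentials configured, the record could be sent with:
# boto3.client("kinesis").put_record(**record_args)
```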

5
Q

Amazon Elastic MapReduce (EMR)

A

Provides a managed big-data framework (such as Apache Hadoop and Apache Spark) that can process massive quantities of data.

6
Q

Amazon Athena

A

Amazon Athena is a serverless query service that lets you analyze data directly in Amazon S3 using standard SQL, without complex ETL processes or infrastructure management.
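
As a rough sketch, a SQL query over data in S3 could be submitted with the parameters of boto3's Athena `start_query_execution` call. The database, table, column names, and results bucket below are hypothetical.

```python
# Sketch: arguments for running a SQL query with Athena.
# Database, table, columns, and the results bucket are hypothetical placeholders.

def build_athena_query_args(sql: str, database: str, output_location: str) -> dict:
    """Wrap a SQL string in start_query_execution keyword arguments."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


query_args = build_athena_query_args(
    sql="SELECT label, COUNT(*) AS n FROM training_examples GROUP BY label",
    database="example_ml_db",
    output_location="s3://example-ml-bucket/athena-results/",
)
print(query_args["QueryString"])

# With credentials configured, the query could be started with:
# boto3.client("athena").start_query_execution(**query_args)
```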

7
Q

AWS Glue

A

AWS Glue is a fully managed ETL service that makes preparing and loading your data for analytics easy. It provides a serverless environment to create, run, and monitor ETL jobs.
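
A minimal sketch of the "run and monitor" part, assuming an ETL job has already been created in Glue and using the parameters of boto3's `start_job_run`. The job name, the argument names, and the S3 paths are hypothetical.

```python
# Sketch: arguments for starting a run of an existing Glue ETL job.
# Job name, argument names, and S3 paths are hypothetical placeholders.

def build_job_run_args(job_name: str, source_path: str, target_path: str) -> dict:
    """Build start_job_run keyword arguments; job argument names start with '--'."""
    return {
        "JobName": job_name,
        "Arguments": {
            "--source_path": source_path,
            "--target_path": target_path,
        },
    }


run_args = build_job_run_args(
    "example-clean-training-data",
    "s3://example-ml-bucket/raw/",
    "s3://example-ml-bucket/clean/",
)
print(run_args["Arguments"])

# With credentials configured, the run could be started and then polled:
# glue = boto3.client("glue")
# run_id = glue.start_job_run(**run_args)["JobRunId"]
# state = glue.get_job_run(JobName=run_args["JobName"], RunId=run_id)["JobRun"]["JobRunState"]
```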

8
Q

Amazon SageMaker Ground Truth

A

Amazon SageMaker Ground Truth helps you build highly accurate training datasets for machine learning quickly.

SageMaker Ground Truth offers easy access to public and private human labelers and provides them with built-in workflows and interfaces for common labeling tasks.

9
Q

Amazon Mechanical Turk

A

Amazon Mechanical Turk (MTurk) is a crowdsourcing marketplace that makes it easier for individuals and businesses to outsource their processes and jobs to a distributed workforce who can perform these tasks virtually. This could include anything from conducting simple data validation and research to more subjective tasks like survey participation, content moderation, and more.
