SageMaker - Built-In Algorithms Flashcards

1
Q

Does SageMaker handle the entire ML workflow?

A

Yes

2
Q

What is a SageMaker notebook?

A

It is a Jupyter notebook running on a managed notebook instance that you spin up from the console.

3
Q

Can you use scikit-learn, Spark, and TensorFlow from a SageMaker notebook?

A

Yes

4
Q

Can you launch servers from your SageMaker notebook?

A

Yes

5
Q

What is the SageMaker Input Mode S3 File Mode?

A

It is the default. It copies the whole dataset from S3 to the Docker container before training starts. This is okay for smaller datasets, but not for large ones.

6
Q

What is the SageMaker Input Mode S3 Fast File Mode?

A

It streams the data from the S3 source. This was a replacement for Pipe Mode.
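
As a rough illustration of where the input mode is chosen, the sketch below builds one channel entry in the shape used by the InputDataConfig list of a CreateTrainingJob request. The helper name make_channel, the bucket, and the prefix are hypothetical; only the field names and the three mode strings come from the API.

```python
# Sketch only: builds a channel dict, it does not call AWS.
VALID_INPUT_MODES = ("File", "FastFile", "Pipe")

def make_channel(name, s3_uri, input_mode="File"):
    """Return one training channel; input_mode picks how data reaches the container."""
    if input_mode not in VALID_INPUT_MODES:
        raise ValueError(f"unsupported input mode: {input_mode}")
    return {
        "ChannelName": name,
        "InputMode": input_mode,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,  # hypothetical bucket and prefix
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }

# File copies everything first; FastFile streams objects on demand.
train_channel = make_channel("train", "s3://example-bucket/train/", "FastFile")
print(train_channel["InputMode"])
```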

7
Q

What is the SageMaker Input Mode S3 Express One Zone Mode?

A

It is a high-performance S3 storage class that lives in a single AZ. It works with the other S3 input modes.

8
Q

What is the SageMaker Input Mode Amazon FSx for Lustre?

A

This is for HPC workloads and scales to hundreds of GB/s of throughput. It is really meant for large datasets.

9
Q

What is the SageMaker Input Mode EFS Mode?

A

Uses EFS as a file system for the source data.

10
Q

What is the Linear Learner Model in SageMaker?

A

It fits a linear model to the training data. It can be used for regression (numeric predictions) and for classification.

11
Q

If your model training is taking too much time to get started, what can you do?

A

Use Pipe mode, which streams the data instead of copying it all before training starts.

12
Q

Does Linear Learner require normalized data?

A

Yes. This can be done in advance or within the model.
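
For the "in advance" option, a minimal sketch of z-score normalization on a single feature column is shown below. The function name standardize and the sample values are made up for illustration; this is not the algorithm's internal normalization code.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

scaled = standardize([10.0, 20.0, 30.0])
print(scaled)
```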

13
Q

What kind of regularization does Linear Learner support?

A

L1 and L2.
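
To make the difference concrete, here is a small sketch of the two penalty terms that get added to the training loss. The function names and the example weights are illustrative assumptions, not Linear Learner internals.

```python
def l1_penalty(weights, strength):
    """L1: strength times the sum of absolute weights (pushes weights to exactly zero)."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """L2: strength times the sum of squared weights (shrinks large weights)."""
    return strength * sum(w * w for w in weights)

weights = [0.5, -2.0, 0.0]
print(l1_penalty(weights, 0.1))
print(l2_penalty(weights, 0.1))
```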

14
Q

What is XGBoost?

A

It is a boosted group of decision trees. Each new tree is built to correct the errors of the previous trees.
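
The idea of trees correcting earlier errors can be sketched with plain gradient boosting on squared error, using one-split "stumps" on a single feature. This is a simplified stand-in, not XGBoost's actual algorithm, and the data and function names are invented for the example.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on a 1-D feature that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All inputs identical: fall back to predicting the mean residual.
        m = sum(residuals) / len(residuals)
        return xs[0], m, m
    return best[1:]

def boost(xs, ys, rounds=50, eta=0.3):
    """Each round fits a stump to the current residuals and adds eta times its output."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + eta * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
base, stumps, preds = boost(xs, ys)
baseline_mse = sum((y - base) ** 2 for y in ys) / len(ys)
boosted_mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)
print(baseline_mse, boosted_mse)
```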

15
Q

How can you prevent overfitting when using XGBoost?

A

Lower the subsample or eta hyperparameters.
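
A hedged example of what such settings might look like as a hyperparameter dictionary follows. The variable name and the specific values are assumptions chosen for illustration, not recommended defaults.

```python
# Illustrative values only; tune them against a validation set.
xgb_hyperparameters = {
    "objective": "binary:logistic",
    "num_round": 200,
    "eta": 0.1,        # step-size shrinkage; XGBoost's default is 0.3
    "subsample": 0.8,  # fraction of rows sampled per tree; default is 1
}

for name, value in xgb_hyperparameters.items():
    print(f"{name}={value}")
```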

16
Q

Is XGBoost memory or CPU bound?

A

Memory.

17
Q

What is LightGBM?

A

A gradient boosting decision tree algorithm, similar to XGBoost.

18
Q

What are good use cases for LightGBM?

A

Classification, Regression, or Ranking

19
Q

What does the Seq2Seq model do?

A

It takes an input series of tokens and outputs a series of tokens.

20
Q

What is Seq2Seq good for?

A

Machine translation
Speech to text
Text summarization

21
Q

What is Seq2Seq often implemented with?

A

RNNs and CNNs

22
Q

Are there pre-trained Seq2Seq models available in SageMaker?

A

Yes

23
Q

What can Seq2Seq optimize on?

A

Accuracy

BLEU Score

Perplexity
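
Of the three metrics, perplexity is the easiest to show in a few lines: it is the exponential of the average negative log-probability the model assigned to the correct tokens. The sketch below assumes you already have those per-token probabilities; the function name and inputs are illustrative.

```python
import math

def perplexity(token_probs):
    """exp(mean negative log-likelihood) over the probabilities of the true tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that is as unsure as a uniform pick among 4 tokens scores roughly 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
# A more confident model scores lower (better).
print(perplexity([0.9, 0.8, 0.95]))
```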

24
Q

What is the DeepAR model used for?

A

Forecasting one-dimensional time series data.

25
Q

What is BlazingText?

A

It can predict labels for a sentence (text classification) or create vector representations of words (Word2vec).

26
Q

Is BlazingText used for sentences or documents?

A

Sentences only.

27
Q

What is Word2vec in BlazingText?

A

It creates a vector representation of a word. It does not work on sentences. It finds words that are similar to each other.
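
"Similar" here usually means the word vectors point in nearly the same direction. The sketch below uses cosine similarity on tiny hand-written vectors; the three-dimensional values are invented for the example and are not output from BlazingText.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; real ones have tens to hundreds of dimensions.
vectors = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.0, 0.1, 0.9],
}

def most_similar(word):
    """Return the other word whose vector is closest in direction."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))

print(most_similar("king"))
```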

28
Q

What is Object2Vec used for?

A

Finding similar objects. Similar to Word2vec, but for arbitrary objects.

29
Q

What are some good use cases for Object2Vec?

A

Genre prediction, nearest neighbors of objects, recommendations

30
Q

What is SageMaker Object Detection?

A

It identifies all objects in an image with bounding boxes. It performs both detection and classification.

31
Q

What is SageMaker Image Classification?

A

It assigns one or more labels to an image. It does not perform object recognition.

32
Q

What is the SageMaker Semantic Segmentation algorithm?

A

It is pixel-level object classification.

33
Q

What is SageMaker Random Cut Forest?

A

It is an unsupervised anomaly detection algorithm.

34
Q

Does Random Cut Forest support file or pipe mode?

A

Both

35
Q

What is the Neural Topic Model in SageMaker?

A

It organizes documents into topics.

36
Q

Is Neural Topic Model in SageMaker supervised or unsupervised?

A

Unsupervised.

37
Q

What is the LDA model in SageMaker for?

A

Topic modeling. Similar to Neural Topic Model, but not using deep learning.

38
Q

What is KNN in SageMaker for?

A

It finds the k closest points to your sample and returns the most frequent label among them (k-nearest neighbors).
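
The majority-vote step can be sketched in plain Python as below. This is a brute-force illustration with made-up points and labels, not SageMaker's implementation.

```python
import math
from collections import Counter

def knn_predict(samples, labels, query, k=3):
    """Return the most frequent label among the k samples closest to query."""
    by_distance = sorted(zip(samples, labels),
                         key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

samples = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(samples, labels, (2, 2)))
print(knn_predict(samples, labels, (8.5, 8.5)))
```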

39
Q

Does KNN in SageMaker include a dimensionality reduction stage?

A

Yes

40
Q

What is K-Means Clustering in SageMaker?

A

Unsupervised clustering algorithm. Finds clusters of data.
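
How the clusters are found can be sketched with the classic assign-then-update loop (Lloyd's algorithm). The starting centroids are fixed here so the run is repeatable; the data and function name are illustrative, and SageMaker's own implementation may differ.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(centroids)
```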

41
Q

What is Principal Component Analysis (PCA) in SageMaker?

A

It performs dimensionality reduction.
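
As a rough sketch of what "reduction" means, the code below finds the first principal component of 2-D data with power iteration on the covariance matrix, then projects each row onto it, turning two columns into one. It assumes the data is not constant and uses invented sample points; production PCA would normally use a linear algebra library.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the top principal direction and the 1-D projection of each row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalize (power iteration).
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

# Points lying on the line y = 2x, so one direction explains all the variance.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projected = first_principal_component(rows)
print(direction)
print(projected)
```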

42
Q

Is Principal Component Analysis (PCA) in SageMaker supervised or unsupervised?

A

Unsupervised.

43
Q

What do Factorization Machines models do in SageMaker?

A

They deal with sparse data, such as click prediction and recommendations.
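
Why this suits sparse data can be seen in the standard factorization machine prediction: a bias, a linear term, and pairwise interactions scored by dot products of small per-feature vectors. The sketch below uses the usual linear-time rewrite of the pairwise sum; all weights and the one-hot input are made-up numbers, not a trained model.

```python
def fm_predict(x, w0, w, v):
    """Score = w0 + sum_i w[i]*x[i] + sum_{i<j} dot(v[i], v[j]) * x[i] * x[j]."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        # Per latent factor: 0.5 * ((sum of terms)^2 - sum of squared terms).
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

# One-hot style input: feature 0 (a user) and feature 2 (an item) are active.
x = [1.0, 0.0, 1.0, 0.0]
w0 = 0.1
w = [0.2, 0.0, 0.3, 0.0]
v = [[0.5, 0.1], [0.0, 0.0], [0.4, 0.2], [0.0, 0.0]]
print(fm_predict(x, w0, w, v))
```

Only the active features contribute, so the cost scales with the number of non-zero entries rather than the full feature count.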