SageMaker - Built-In Algorithms Flashcards

1
Q

Does SageMaker handle the entire ML workflow?

A

Yes

2
Q

What is a SageMaker notebook?

A

It is a managed Jupyter notebook instance that is spun up from the SageMaker console.

3
Q

Can you use scikit-learn, Spark, and TensorFlow from a SageMaker notebook?

A

Yes

4
Q

Can you launch servers from your SageMaker notebook?

A

Yes

5
Q

What is the SageMaker Input Mode S3 File Mode?

A

It is the default. It copies all of the data from S3 to the Docker container's local storage before training starts. This is okay for smaller datasets, but slow to start for large ones.

6
Q

What is the SageMaker Input Mode S3 Fast File Mode?

A

It streams the data from the S3 source. This was a replacement for Pipe Mode.
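
The input mode is chosen per data channel when a training job is requested. As a rough sketch, assuming the `InputDataConfig` channel shape of the `CreateTrainingJob` API, the helper below builds one channel entry as a plain dictionary. The helper name `make_channel` and the bucket path are hypothetical, and nothing here calls AWS.

```python
def make_channel(name, s3_uri, input_mode="File"):
    """Build one training-job channel entry (hypothetical helper, no AWS calls)."""
    allowed = {"File", "FastFile", "Pipe"}
    if input_mode not in allowed:
        raise ValueError(f"input_mode must be one of {sorted(allowed)}")
    return {
        "ChannelName": name,
        # "File" copies the data first; "FastFile" and "Pipe" stream it from S3.
        "InputMode": input_mode,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


# The bucket below is a placeholder, not a real location.
channel = make_channel("train", "s3://example-bucket/train/", input_mode="FastFile")
print(channel["InputMode"])
```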

7
Q

What is the SageMaker Input Mode S3 Express One Zone Mode?

A

It is a high-performance S3 storage class that keeps data in a single AZ. It works with the File, Fast File, and Pipe input modes.

8
Q

What is the SageMaker Input Mode Amazon FSx for Lustre?

A

This is a high-performance file system for HPC-style workloads, offering hundreds of GB/s of throughput. It is really meant for large datasets.

9
Q

What is the SageMaker Input Mode EFS Mode?

A

Uses EFS as a file system for the source data.

10
Q

What is the Linear Learner Model in SageMaker?

A

It fits a linear model to the training data. It can be used for numeric predictions (regression) and for classification.

11
Q

If your model training is taking too much time to get started, what can you do?

A

Use Pipe mode, which streams the data from S3 instead of copying it all before training starts.

12
Q

Does Linear Learner require normalized data?

A

Yes. This can be done in advance or within the model.
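
To make "normalized" concrete, here is a minimal sketch of doing it in advance, assuming z-score standardization (zero mean, unit standard deviation) per feature column. The function name `standardize` is illustrative, and this is not claimed to be the exact transform Linear Learner applies internally.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale one feature column to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


scaled = standardize([10.0, 20.0, 30.0])
print(scaled)
```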

13
Q

What kind of regularization does Linear Learner support?

A

L1 and L2.
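
As an illustration of what the two penalties add to a loss, the sketch below computes mean squared error plus an L1 term (sum of absolute weights) and an L2 term (sum of squared weights). The function and parameter names are illustrative and do not mirror Linear Learner's own hyperparameter names.

```python
def regularized_loss(errors, weights, l1=0.0, l2=0.0):
    """Mean squared error plus L1 and L2 penalties on the weights."""
    mse = sum(e * e for e in errors) / len(errors)
    l1_penalty = l1 * sum(abs(w) for w in weights)   # pushes weights to exactly zero
    l2_penalty = l2 * sum(w * w for w in weights)    # shrinks large weights
    return mse + l1_penalty + l2_penalty


loss = regularized_loss([1.0, -1.0], [2.0, -3.0], l1=0.1, l2=0.01)
print(loss)
```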

14
Q

What is XGBoost?

A

It is a boosted group of decision trees. New trees are made to correct the errors of the previous trees.

15
Q

How can you prevent overfitting when using XGBoost?

A

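Use the subsample or eta hyperparameters. Lowering subsample trains each tree on a random fraction of the rows, and lowering eta shrinks the step each new tree takes.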
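
The toy sketch below shows where `eta` and `subsample` enter a boosting loop. It assumes squared-error loss, a single feature, and one-split trees (stumps); it is not XGBoost's actual algorithm, and the names `fit_stump` and `boost` are made up for illustration.

```python
import random


def fit_stump(xs, residuals):
    """Pick the threshold on a 1-D feature that best splits the residuals."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:
        # No valid split: predict the mean residual on both sides.
        m = sum(residuals) / len(residuals)
        return xs[0], m, m
    return best[1], best[2], best[3]


def boost(xs, ys, rounds=50, eta=0.3, subsample=0.8, seed=0):
    """Each round fits a stump to the current errors of the ensemble."""
    rng = random.Random(seed)
    preds = [sum(ys) / len(ys)] * len(xs)
    n_sample = min(len(xs), max(2, int(subsample * len(xs))))
    for _ in range(rounds):
        idx = rng.sample(range(len(xs)), n_sample)   # row subsampling
        residuals = [ys[i] - preds[i] for i in idx]
        t, lv, rv = fit_stump([xs[i] for i in idx], residuals)
        # eta scales down each correction so no single tree dominates.
        preds = [p + eta * (lv if x <= t else rv) for x, p in zip(xs, preds)]
    return preds


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
preds = boost(xs, ys)
print([round(p, 2) for p in preds])
```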

16
Q

Is XGBoost memory or CPU bound?

A

Memory.

17
Q

What is LightGBM?

A

A gradient-boosted decision tree algorithm, similar to XGBoost.

18
Q

What are good use cases for LightGBM?

A

Classification, Regression, or Ranking

19
Q

What does the Seq2Seq model do?

A

It takes an input series of tokens and outputs a series of tokens.

20
Q

What is Seq2Seq good for?

A

Machine translation
Speech to text
Text summarization

21
Q

What is Seq2Seq often implemented with?

A

RNNs and CNNs

22
Q

Are there pre-trained Seq2Seq models available in SageMaker?

23
Q

What can Seq2Seq optimize on?

A

Accuracy

BLEU Score

Perplexity
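
Of the three metrics, perplexity is the least self-explanatory. A minimal sketch, assuming the standard definition (the exponential of the average negative log-probability the model assigned to each correct token), with an illustrative function name:

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability of the correct tokens."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# A model that gives every correct token probability 0.25 is as uncertain
# as a uniform choice among four options.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Lower perplexity is better; a model that always assigns probability 1 to the correct token scores 1.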

24
Q

What is the DeepAR model used for?

A

Forecasting one-dimensional time series data.

25
Q

What is BlazingText?

A

It can predict labels for a sentence (text classification) or create vector representations of words (Word2vec).

26
Q

Is BlazingText used for sentences or documents?

A

Sentences only.

27
Q

What is Word2vec in BlazingText?

A

It creates a vector representation of a word. It does not work on sentences. It finds words that are similar to each other.

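
Finding similar words usually comes down to comparing their vectors. The sketch below assumes cosine similarity as the measure and uses tiny made-up 3-dimensional vectors; real embeddings are learned from text and have far more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
}


def most_similar(word):
    """Return the other word whose vector points in the closest direction."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))


print(most_similar("king"))
```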
28
Q

What is Object2Vec used for?

A

Finding similar objects. Similar to Word2vec, but for arbitrary objects.

29
Q

What are some good use cases for Object2Vec?

A

Genre prediction, nearest neighbors of objects, recommendations

30
Q

What is SageMaker Object Detection?

A

It identifies all objects in an image with bounding boxes. Detection and classification.

31
Q

What is SageMaker Image Classification?

A

It assigns one or more labels to an image. It does not perform object detection, so it does not say where the objects are.

32
Q

What is the SageMaker Semantic Segmentation algorithm?

A

It is pixel-level object classification.

33
Q

What is SageMaker Random Cut Forest?

A

It is an unsupervised anomaly detection algorithm.

34
Q

Does Random Cut Forest support file or pipe mode?

A

Both

35
Q

What is the Neural Topic Model in SageMaker?

A

It organizes documents into topics.

36
Q

Is Neural Topic Model in SageMaker supervised or unsupervised?

A

Unsupervised.

37
Q

What is the LDA model in SageMaker for?

A

Topic modeling. Similar to Neural Topic Model, but not using deep learning.

38
Q

What is KNN in SageMaker for?

A

Finds the closest points to your sample and returns the most frequent label. Nearest neighbor.

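
The "closest points, most frequent label" idea can be sketched in a few lines. This assumes Euclidean distance and a brute-force search over a toy dataset; it is only an illustration of the concept, not SageMaker's implementation, and `knn_predict` is a made-up name.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the most frequent label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labeled points, invented for illustration.
train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"),
]
print(knn_predict(train, (2, 2)))
```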
39
Q

Does KNN in SageMaker include a dimensionality reduction stage?

A

Yes

40
Q

What is K-Means Clustering in SageMaker?

A

Unsupervised clustering algorithm. Finds clusters of data.

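
How clusters get "found" can be sketched with the textbook Lloyd's iteration: assign each point to its nearest centroid, then move each centroid to the mean of its points. This is an assumption-level illustration on toy data with hand-picked starting centroids, not SageMaker's own variant of the algorithm.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(1, 1), (8, 8)])
print(centroids)
```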
41
Q

What is Principal Component Analysis (PCA) in SageMaker?

A

It performs dimensionality reduction.

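
As a small illustration of dimensionality reduction, the sketch below finds the first principal component of 2-D data with power iteration on the covariance matrix, then projects each row onto it to get one number per row. The data and the function name are made up, and this is a teaching sketch rather than how SageMaker computes PCA.

```python
import math


def first_component(rows, iterations=100):
    """Return the top principal direction and each row's 1-D projection onto it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Repeatedly multiplying by the covariance matrix pulls v toward
        # the direction of greatest variance.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected


rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, reduced = first_component(rows)
print(direction, reduced)
```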
42
Q

Is Principal Component Analysis (PCA) in SageMaker supervised or unsupervised?

A

Unsupervised.

43
Q

What do Factorization Machines do in SageMaker?

A

They deal with sparse data. Click predictions, recommendations, etc.

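
Why this suits sparse data can be seen in the standard second-order factorization machine formula: a bias, a linear term, and pairwise feature interactions whose weights are dot products of small latent vectors. The sketch below assumes that textbook formula; the function name and the numbers are illustrative only.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine score for one feature vector x.

    w0 is the bias, w the linear weights, and v[i] the latent vector of feature i.
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        # Standard identity: sum over pairs = 0.5 * ((sum)^2 - sum of squares).
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Sparse input: only features 0 and 2 are active, so only their pair interacts.
score = fm_predict(
    x=[1.0, 0.0, 1.0],
    w0=0.5,
    w=[0.1, 0.2, 0.3],
    v=[[1.0, 2.0], [3.0, 4.0], [0.5, 1.0]],
)
print(score)
```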