Modeling 2 Flashcards

1
Q

Object Detection

A

detects objects in an image and places bounding boxes around them

2
Q

How does Object Detection work?

A

with a single deep neural network:

a CNN with the Single Shot multibox Detector (SSD) algorithm
- the base CNN can be VGG-16 or ResNet-50

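A minimal sketch of launching the built-in Object Detection algorithm with the SageMaker Python SDK (the role ARN, bucket paths, and hyperparameter values are placeholders; channel names assume RecordIO input):

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role ARN

# Resolve the built-in Object Detection (SSD) container for this region
image = image_uris.retrieve("object-detection", session.boto_region_name)

od = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",  # GPU: training this CNN is demanding
    output_path="s3://my-bucket/od-output",  # placeholder bucket
    sagemaker_session=session,
)
od.set_hyperparameters(
    base_network="resnet-50",  # or "vgg-16"
    use_pretrained_model=1,    # transfer learning mode
    num_classes=2,
    num_training_samples=1000,
    mini_batch_size=32,
)
# With RecordIO input, train/validation channels are enough
od.fit({"train": "s3://my-bucket/train", "validation": "s3://my-bucket/validation"})
```
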
3
Q

how does Object Detection provide confidence?

A

using a confidence score for each detected object

4
Q

how do you train Object Detection?

A

i. train from scratch

ii. use pre-trained models based on ImageNet

5
Q

Object Detection input?

A

RecordIO or image format (JPG, PNG) for training images

with image format, a JSON file per image provides metadata like bounding boxes and labels (see the sketch below)

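For image format, the per-image JSON looks roughly like this (a hedged sketch as a Python dict; field names follow the documented layout, all values are made up):

```python
annotation = {
    "file": "images/dog_001.jpg",
    "image_size": [{"width": 500, "height": 400, "depth": 3}],
    "annotations": [  # one bounding box per labeled object
        {"class_id": 0, "left": 111, "top": 134, "width": 61, "height": 128}
    ],
    "categories": [{"class_id": 0, "name": "dog"}],
}
```
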
6
Q

Object Detection output

A

all instances of objects in the image with categories and confidence scores

7
Q

Object Detection transfer learning mode

A

use pre-trained model for the base network weights, instead of random initial weights

8
Q

how does Object Detection avoid overfitting?

A

data augmentation on the input images:

flip
rescale
jitter

9
Q

Object Detection hyperparameters

A

usual ones in a CNN

mini_batch_size
learning_rate
optimizer
- sgd, adam, rmsprop, adadelta

10
Q

Object Detection instance types

A

GPU instances for training (it's a demanding CNN)

multi-GPU and multi-machine supported (scales up nicely)

ml.p2.xlarge
ml.p2.8xlarge
ml.p2.16xlarge
ml.p3.2xlarge
ml.p3.8xlarge
ml.p3.16xlarge

for inference:
CPU or GPU
C5, M5, P2, P3

11
Q

Image Classification

A

like Object Detection, but simpler

doesn't tell you where objects are; just gives you a label for the whole image

12
Q

Image Classification Input

A

Apache MXNet RecordIO

  • not protobuf
  • for interoperability with other deep learning frameworks

Raw JPG or PNG images

image format requires .lst files (see the sketch below) that associate:

  • image index
  • class label
  • path to image

augmented manifest image format enables pipe mode

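A minimal sketch of writing a .lst file with the three tab-separated columns above (file names and labels are made up):

```python
# Each line: image index <TAB> class label <TAB> relative path to the image
samples = [(0, 1, "cats/cat_001.jpg"), (1, 0, "dogs/dog_007.jpg")]
with open("train.lst", "w") as f:
    for index, label, path in samples:
        f.write(f"{index}\t{label}\t{path}\n")
```
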
13
Q

what is pipe mode?

A

allows you to stream data from S3 instead of copying it all over first

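A short sketch of requesting Pipe mode with the SageMaker Python SDK (the bucket path is a placeholder):

```python
from sagemaker.inputs import TrainingInput

# Stream training data from S3 instead of downloading it to the instance first
train_input = TrainingInput(
    "s3://my-bucket/train",
    content_type="application/x-recordio",
    input_mode="Pipe",
)
```
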
14
Q

How does Image Classification work?

A

ResNet CNN

full training mode:
- network initialized with random weights

Transfer Learning mode:

  • initialized with pre-trained weights
  • top fully-connected layer is initialized with random weights
  • network is fine-tuned with new training data
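
A hedged sketch of switching the built-in algorithm into transfer learning mode (assumes `ic` is an Estimator built for the "image-classification" image, as in the Object Detection sketch earlier; values are illustrative):

```python
# `ic` is assumed to be an Estimator for the built-in "image-classification" image
ic.set_hyperparameters(
    use_pretrained_model=1,  # 1 = transfer learning, 0 = full training mode
    num_classes=10,
    num_training_samples=5000,
)
```
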
15
Q

Default image specifications for Image Classification

A

224x224
3-channel
(ImageNet's dataset)

16
Q

Image Classification hyperparameters

A

batch size
learning rate
optimizer

optimizer-specific parameters

  • weight decay
  • beta 1
  • beta 2
  • eps
  • gamma

17
Q

Image classification instance types

A

GPU instances for training (P2, P3)
multi-GPU and multi-machine supported

CPU or GPU for inference (C4, P2, P3)

18
Q

Semantic Segmentation

A

Pixel-level object classification
- unlike object detection, which gives bounding boxes
- unlike image classification, which gives whole-image labels

19
Q

Semantic Segmentation use cases

A

self-driving vehicles
medical imaging diagnosis
robot sensing

20
Q

Semantic segmentation output

A

segmentation mask

21
Q

Semantic Segmentation training input

A

JPG or PNG images for training

label maps to describe annotations
- for training and validation

augmented manifest image format supported for pipe mode

JPG images accepted for inference

22
Q

What is Semantic Segmentation built on?

A

built on MXNet Gluon and GluonCV

23
Q

Semantic Segmentation algorithms

A

Fully-Convolutional Network (FCN)

Pyramid Scene Parsing (PSP)

DeepLabV3

24
Q

Choices of backbones for Semantic Segmentation

A

ResNet50
ResNet101
Both trained on ImageNet

25
Q

Semantic Segmentation training from scratch or incremental

A

both are supported

26
Q

Semantic Segmentation hyperparameters

A
epochs
learning rate
batch size
optimizer
algorithm
backbone
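
A hedged sketch of setting these on the built-in algorithm (assumes `ss` is an Estimator for the "semantic-segmentation" image, as in the Object Detection sketch earlier; values are illustrative):

```python
# `ss` is assumed to be an Estimator for the built-in "semantic-segmentation" image
ss.set_hyperparameters(
    algorithm="fcn",            # or "psp", "deeplab"
    backbone="resnet-50",       # or "resnet-101"; both pre-trained on ImageNet
    use_pretrained_model=True,
    epochs=10,
    learning_rate=0.001,
    optimizer="adam",
    num_classes=21,             # illustrative
    num_training_samples=1000,  # illustrative
)
```
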
27
Q

Semantic Segmentation instance types

A

GPU only: P2, P3
Single Machine Only

ml.p2.xlarge
ml.p2.8xlarge
ml.p2.16xlarge
ml.p3.8xlarge
ml.p3.16xlarge

28
Q

Inference instances for Semantic Segmentation

A

CPU C5, M5

GPU P2, P3

29
Q

Random cut forest

A

anomaly detection
unsupervised
detect unexpected spikes in time series data
breaks in periodicity
unclassifiable data points
based on an algorithm developed by Amazon

30
Q

random cut forest output

A

assigns an anomaly score to each data point

31
Q

random cut forest training input

A

RecordIO-protobuf or CSV

can use file or pipe mode on either

optional test channel for computing accuracy, precision, recall and F1 on labeled data

32
Q

How does random cut forest work?

A

creates a forest of trees
where each tree is a partition of the training data
looks at expected change in complexity of the tree as a result of adding a point to it

33
Q

how is data sampled in random cut forest?

A

randomly sampled, then each tree is trained on its sample

34
Q

is it possible to use random cut forest in Kinesis Analytics?

A

Yes. It can work on streaming data too.

35
Q

random cut forest hyperparameters

A

num_trees
- increasing reduces noise

num_samples_per_tree
- 1/num_samples_per_tree approximates the ratio of anomalous to normal data

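A minimal sketch with the SDK's first-class estimator (the role ARN is a placeholder, and `train_data` is assumed to be a 2-D float32 numpy array):

```python
import sagemaker
from sagemaker import RandomCutForest

rcf = RandomCutForest(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_count=1,
    instance_type="ml.m4.xlarge",  # RCF does not use GPUs
    num_trees=100,                 # more trees -> less noise in scores
    num_samples_per_tree=256,      # ~1/256 expected ratio of anomalies
    sagemaker_session=sagemaker.Session(),
)
rcf.fit(rcf.record_set(train_data))  # train_data: 2-D numpy float32 array
```
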
36
Q

Random cut forest instance types

A

does not use GPU

use M4, C4, C5 for training

ml.c5.xl for inference

37
Q

Neural Topic Modeling

A

organize documents into topics

classify or summarize documents based on topics

not just TF-IDF

unsupervised

38
Q

Neural Topic Modeling algorithm

A

Neural Variational Inference

39
Q

Training input for Neural Topic Modeling

A

Four data channels

  • train is required
  • validation, test and auxiliary are optional

recordIO-protobuf or CSV

words must be tokenized into integers

every document must contain a count for every word in the vocabulary in CSV

the auxiliary channel is for the vocabulary

file or pipe mode; pipe is faster

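A hedged sketch of producing the per-document word counts and the vocabulary with scikit-learn (a stand-in for whatever tokenization pipeline you use; toy corpus):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["machine learning on aws", "aws sagemaker topic models"]  # toy corpus
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)     # document x vocabulary count matrix
vocab = vectorizer.get_feature_names_out()  # vocabulary for the auxiliary channel
```
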
40
Q

how to use Neural Topic Modeling

A

you define how many topics you want

41
Q

does the Neural Topic Modeling give us topic names ?

A

No, topics are a latent representation based on top ranking words

one of two topic modeling algorithms in SageMaker - you can try them both

42
Q

Neural topic model

important hyperparameters

A

lowering mini_batch_size and learning_rate can reduce validation loss, at the expense of training time

num_topics

43
Q

Neural Topic Modeling instance types

A

GPU or CPU
GPU recommended for training
CPU (cheaper) is OK for inference

44
Q

Latent Dirichlet Allocation (LDA)

A

topic modeling not based on Deep Learning

unsupervised
- topics are unlabeled, which means they are just groupings of documents with a shared subset of words

can be used for things other than words

45
Q

how can you use LDA for things other than words ?

A

cluster customers based on purchases

harmonic analysis in music

46
Q

LDA input for training

A

Train Channel, Optional Test Channel

RecordIO-protobuf or CSV

Each doc has counts for every word in vocabulary (CSV)

pipe mode only supported with RecordIO

47
Q

LDA:

un/supervised?

A

unsupervised

48
Q

LDA:

optional test channel can be used for … ?

A

Scoring results

- per-word log likelihood

49
Q

LDA vs Neural Topic Model

A

similar to NTM but CPU based

- therefore cheaper / more efficient

50
Q

LDA hyperparameters

A

num_topics
alpha0
- initial guess for concentration parameter
- smaller values generate sparse topic mixtures
- larger values (>1.0) produce a uniform mixture

51
Q

LDA instance type

A

Single CPU

52
Q

KNN

A

K-Nearest-Neighbors

Simple Classification or regression algorithm

supervised

53
Q

KNN Classification

A

find the K closest points to a sample point and return the most frequent label

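A self-contained sketch of the idea in plain numpy (toy data):

```python
import numpy as np
from collections import Counter

def knn_classify(X, y, query, k):
    """Return the most frequent label among the k nearest training points."""
    distances = np.linalg.norm(X - query, axis=1)  # Euclidean distance to each point
    nearest = np.argsort(distances)[:k]            # indices of the k closest points
    return Counter(y[nearest]).most_common(1)[0][0]

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0]])
y = np.array([0, 0, 1])
print(knn_classify(X, y, np.array([0.2, 0.1]), k=2))  # -> 0
```
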
54
Q

KNN Regression

A

Find the K closest points to a sample point and return the average value

55
Q

KNN input

A

Training channel, contains data
Test channel, emits accuracy or MSE

RecordIO-protobuf or CSV training
- first column is label

File or Pipe mode on either

56
Q

KNN in SageMaker, how does it work?

A

1- Data is sampled
2- SageMaker includes a dimensionality reduction stage
- avoid sparse data (Curse of dimensionality)
- at cost of noise / accuracy
- sign or fjlt (Fast Johnson–Lindenstrauss Transform) methods

3- build an index for looking up neighbours
4- serialize the model
5- query the model for a given K

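A hedged sketch with the SDK's first-class KNN estimator (the role ARN is a placeholder; `train_features`/`train_labels` are assumed numpy float32 arrays):

```python
from sagemaker import KNN

knn = KNN(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    k=10,
    sample_size=5000,                 # step 1: how many points get sampled
    predictor_type="classifier",      # or "regressor"
    dimension_reduction_type="sign",  # step 2: "sign" or "fjlt"
    dimension_reduction_target=50,
)
knn.fit(knn.record_set(train_features, labels=train_labels))
```
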
57
Q

KNN hyperparameters

A

K!

Sample_size

58
Q

KNN Instance types

A

Training on CPU or GPU

  • ml.m5.2xlarge
  • ml.p2.xlarge

Inference

  • CPU for lower latency
  • GPU for higher throughput on large batches

59
Q

K-Means

A

unsupervised clustering

divide the data into K groups where members of a group are as similar as possible to each other

  • you define similar
  • measured by Euclidean distance

SageMaker offers web-scale k-means clustering

60
Q

K-Means input

A

training channel
optional test

  • train ShardedByS3Key,
  • test FullyReplicated

RecordIO-protobuf or CSV
File or Pipe on either

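A short sketch of expressing those channel settings with the SDK (bucket paths are placeholders; the CSV content type for unlabeled data is an assumption following the built-in algorithms' conventions):

```python
from sagemaker.inputs import TrainingInput

channels = {
    "train": TrainingInput("s3://my-bucket/train",
                           content_type="text/csv;label_size=0",  # unlabeled CSV
                           distribution="ShardedByS3Key"),
    "test": TrainingInput("s3://my-bucket/test",
                          content_type="text/csv;label_size=0",
                          distribution="FullyReplicated"),
}
```
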
61
Q

K-Means under the hood

A

every observation mapped to n-dimensional space
n is number of features

works to optimize the center of K clusters
“extra cluster centers” may be specified to improve accuracy (they end up getting reduced to k)
K = k * x (x = extra_center_factor)

62
Q

K-Means algorithm

A

Determine initial cluster centers

  • random or k-means++ approach
  • k-means++ tries to make initial clusters far apart

Iterate over training data and calculate cluster centers

Reduce clusters from K to k
- using Lloyd’s method with k-means++

63
Q

K-Means hyperparameters

A

K!

  • choosing k is tricky
  • plot within-cluster sum of squares as a function of k (see the sketch below)
  • elbow method
  • basically optimize for tightness of clusters

mini_batch_size
extra_center_factor
Init_method

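A small sketch of the elbow method with scikit-learn (toy data):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 2)  # toy data
wcss = []
for k in range(1, 10):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss.append(model.inertia_)  # within-cluster sum of squares
# Plot wcss vs. k and pick the "elbow" where the curve stops dropping sharply
```
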
64
Q

K-Means instance types

A

CPU or GPU, but CPU recommended

if on GPU, only one GPU per instance is used
p*.xlarge

65
Q

PCA

Principal Component Analysis

A

Dimensionality Reduction

avoid the curse of dimensionality

while minimizing loss of information

66
Q

PCA

un/supervised?

A

unsupervised

67
Q

what are the reduced dimensions called?

A

Components

first component has largest possible variability
second component has the next largest

68
Q

PCA input

A

recordIO-protobuf or CSV

File or Pipe on either

69
Q

PCA under the hood?

A

Covariance matrix is created,
then singular value decomposition (SVD)

Two modes:

  • regular
    for sparse data and a moderate number of observations and features

  • randomized
    for a large number of observations and features
    uses an approximation algorithm
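
The same idea in a few lines of numpy (a conceptual sketch, not SageMaker's implementation):

```python
import numpy as np

X = np.random.rand(100, 5)      # 100 observations, 5 features
Xc = X - X.mean(axis=0)         # subtract the mean (cf. subtract_mean)
cov = np.cov(Xc, rowvar=False)  # covariance matrix
U, S, Vt = np.linalg.svd(cov)   # singular value decomposition
components = Vt[:2]             # top-2 principal components
reduced = Xc @ components.T     # project observations onto the components
```
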
70
Q

PCA hyperparameters

A

Algorithm_mode
Subtract_mean
- unbiases the data (zero mean)

71
Q

PCA instance type

A

CPU or GPU

- it depends on the specifics of the input data

72
Q

Factorization Machines

A

Classification and regression

Dealing with Sparse data

73
Q

Factorization Machines use cases

A

Click Prediction
Item Recommendations
Since an individual user doesn't interact with most pages / products, the data is sparse

74
Q

Factorization Machines

un/supervised?

A

supervised

- Classification or Regression

75
Q

Are Factorization Machines limited to pair-wise interactions?

A

yes, e.g. user–item

76
Q

Factorization Machines input

A

recordIO-protobuf with Float32

- Sparse data means CSV isn’t practical

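A hedged sketch of encoding a sparse matrix as RecordIO-protobuf with the SDK helper (toy data):

```python
import io
import numpy as np
import scipy.sparse
from sagemaker.amazon.common import write_spmatrix_to_sparse_tensor

# Sparse user x item interactions in float32 (toy data)
X = scipy.sparse.random(100, 5000, density=0.01, format="csr", dtype=np.float32)
y = np.random.randint(0, 2, size=100).astype(np.float32)  # click / no-click

buf = io.BytesIO()
write_spmatrix_to_sparse_tensor(buf, X, y)  # RecordIO-protobuf encoding
buf.seek(0)  # ready to upload to S3 as the train channel
```
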
77
Q

Factorization Machines, how do they work?

A

Finds factors we can use to predict a classification
e.g. Click or not / Purchase or not

or a value (e.g. a predicted rating)

given a matrix representing some pair of things
(users and items)

usually used in the context of recommender systems

78
Q

Factorization Machines hyperparameters

A

initialization methods for bias, factors, and linear terms

  • uniform, normal or constant
  • can tune properties of each method

79
Q

Factorization Machines instance types

A

CPU or GPU
CPU recommended
GPU only works for dense data

80
Q

IP insights

A

finding fishy behaviour

identify suspicious behaviour from IP addresses

identify logins from anomalous IPs
identify accounts creating resources from anomalous IPs

81
Q

IP insights

un/supervised

A

unsupervised

82
Q

IP insights input

A

usernames, account IDs (raw data; no need to pre-process)

training channel, optional validation (computes AUC scores)

CSV only (Entity, IP)

83
Q

IP insights, how does it work?

A

uses a neural network to learn latent vector representations of entities and IP addresses

entities are hashed and embedded
- need a sufficiently large hash size

automatically generates negative samples during training by randomly pairing entities and IPs

84
Q

IP Insights hyperparameters

A

num_entity_vectors

  • hash size
  • set to twice the number of unique entity identifiers

Vector_dim

  • size of embedding vectors
  • scales model size
  • too large results in overfitting

Epochs, Learning rate, batch size, etc.

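A hedged sketch of constructing the SDK's first-class IPInsights estimator with these hyperparameters (the role ARN and values are placeholders); training would then run on the CSV train channel:

```python
from sagemaker import IPInsights

ipi = IPInsights(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_count=1,
    instance_type="ml.p3.2xlarge",  # GPU recommended for training
    num_entity_vectors=20000,       # hash size: ~2x the unique entities
    vector_dim=128,                 # embedding size; too large can overfit
    epochs=10,
    learning_rate=0.01,
)
```
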
85
Q

IP Insights instance type

A

CPU or GPU
GPU recommended for training
ml.p3.2xlarge or higher
can use multiple GPUs

size of CPU instance (for inference) depends on
- vector_dim
- num_entity_vectors