AWS Machine Learning Foundations Course - Lesson 2 Flashcards

Question

What is the end-to-end training process?

Answer 1

1. Feed the training data into the model 2. Compute the loss function on the results 3. Update the model parameters in a direction that reduces the loss
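The three steps above can be sketched as a small loop. This is a minimal illustration, not the course's own code: the toy data, the one-parameter model `w * x`, the mean squared error loss, and the names `train`, `learning_rate`, and `steps` are all assumptions made for the example.

```python
# Toy data following y = 2x (an assumption for illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def train(xs, ys, learning_rate=0.01, steps=500):
    w = 0.0  # the single model parameter, started at an arbitrary value
    loss = None
    for _ in range(steps):
        # 1. Feed the training data into the model.
        preds = [w * x for x in xs]
        # 2. Compute the loss (here: mean squared error) on the results.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
        # 3. Update the parameter in the direction that reduces the loss
        #    (one gradient descent step).
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        w -= learning_rate * grad
    return w, loss

w, final_loss = train(xs, ys)
```

Here `learning_rate` and `steps` are set by hand and never changed by the loop, while `w` is the value the loop updates.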

Answer 2

If you don’t know how to define the groups, then you can use a clustering algorithm (unsupervised learning) to segment your customers into clusters of similar customers. If you know what groups you would like to have, then you can feed many examples of each group to a classification algorithm (supervised learning) and it will classify all your customers into these groups

Answer 3

reinforcement learning

Answer 4

Roughly the average error across the test dataset. In general, the RMS value gets lower as the model improves
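Assuming this card refers to root mean square (RMS) error, the calculation might look like the sketch below. The function name `rms_error` and the sample numbers are hypothetical.

```python
import math

def rms_error(predictions, targets):
    # Square each error, average the squares, then take the square root.
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))

targets = [10.0, 20.0, 30.0]
rough_model = rms_error([13.0, 16.0, 30.0], targets)   # larger errors
better_model = rms_error([10.5, 19.5, 30.0], targets)  # smaller errors
```

A model whose predictions sit closer to the targets produces a smaller value, and a perfect match gives 0.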

Answer 5

Spam detection is a typical supervised learning problem: the algorithm is fed many emails along with their labels (spam or not spam)

Answer 6

Refers to different statistical tools which can be used to fill in missing values in the dataset
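One of the simplest such tools is to replace each missing value with the mean of the observed values. The sketch below assumes missing entries are stored as `None`; the function name `impute_mean` is hypothetical, and other strategies (median, most frequent value) follow the same shape.

```python
from statistics import mean

def impute_mean(values):
    # Compute the mean over the values that are actually present.
    observed = [v for v in values if v is not None]
    fill = mean(observed)  # raises StatisticsError if nothing was observed
    # Replace each missing entry with that mean.
    return [fill if v is None else v for v in values]

ages = impute_mean([20.0, None, 40.0])
```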

Answer 7

The process of using machine learning to identify different cases based on patterns found in data (example: spam or not spam)

Answer 8

The data on which the model will be trained

Answer 9

1. A training dataset | 2. A test dataset

Answer 10

To test against the bias-variance trade-off

Answer 11

Allows you to keep some data hidden during training so that data can be used to evaluate your model before you put it into production

Answer 12

1. Clustering 2. Visualization 3. Dimensionality Reduction 4. Association Rule Learning

Answer 13

1. Training will be 80% | 2. Test will be 20%
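A random 80/20 split can be sketched as follows. The helper name `split_dataset`, its parameters, and the fixed seed are assumptions for illustration; libraries such as scikit-learn provide their own equivalents.

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=0):
    # Shuffle a copy so the split is random but the input is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Hold out the first test_fraction of the shuffled rows for testing.
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

train_rows, test_rows = split_dataset(range(100))
```

With 100 rows this gives 80 training rows and 20 test rows, and no row appears in both sets.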

Answer 14

The data withheld from the model during training which is used to test how well your model will generalize to new data

Answer 15

There are no labels for the training data, the algorithm tries to learn the underlying patterns or distributions that govern the data

Answer 16

1. Clustering 2. Association 3. Dimensionality Reduction

Answer 17

Enables training over larger datasets involving sequences of data - it is a more modern replacement for RNN/LSTMs

Answer 18

1. Regression | 2. Classification

Answer 19

1. Data is labeled (already has the solution) 2. Every training sample from the dataset has a corresponding label or output value associated with it 3. As a result, the algorithm learns to predict labels or output values

Answer 20

The algorithm figures out which actions to take in a situation to maximize a reward (in the form of a number) on the way to reaching a specific goal

Answer 21

Supervised learning uses labeled input and output data; unsupervised learning does not use labeled data

Answer 22

1. Categorical label | 2. Continuous label

Answer 23

A score from -1 to 1 describing the clusters found during modeling
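Assuming this card describes the silhouette coefficient, the score for a single point is (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is its mean distance to the points of the nearest other cluster. The sketch below uses one-dimensional points and a hypothetical function name `silhouette`; it assumes at least two clusters and points that are not all identical.

```python
def silhouette(index, points, labels):
    """Silhouette score of points[index]; assumes at least two clusters."""
    own = labels[index]
    # Group the distances from this point to every other point by cluster.
    distances = {}
    for i, (q, label) in enumerate(zip(points, labels)):
        if i != index:
            distances.setdefault(label, []).append(abs(points[index] - q))
    if own not in distances:
        return 0.0  # common convention for a cluster with a single member
    a = sum(distances[own]) / len(distances[own])
    b = min(sum(d) / len(d) for label, d in distances.items() if label != own)
    return (b - a) / max(a, b)

points = [0.0, 1.0, 10.0, 11.0]
well_separated = silhouette(0, points, [0, 0, 1, 1])  # close to 1
badly_assigned = silhouette(0, points, [0, 1, 0, 1])  # below 0
```

Scores near 1 indicate a point that sits well inside its own cluster, and negative scores indicate a point that is closer to another cluster.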

Answer 24

A list of words removed by natural language processing tools when building a dataset
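As a rough illustration of how such a list is applied, the sketch below filters a sentence against a tiny hand-written set. The set `STOP_WORDS` and the function `remove_stop_words` are hypothetical; real NLP tools ship much longer lists.

```python
# A tiny illustrative stop word list (real lists are far longer).
STOP_WORDS = {"a", "an", "the", "is", "of", "and"}

def remove_stop_words(text):
    # Lowercase, split on whitespace, and drop any word found in the list.
    return [word for word in text.lower().split() if word not in STOP_WORDS]

tokens = remove_stop_words("The cat is on the mat")
```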

Answer 25

Successful identification of discrete non-overlapping clusters

Answer 26

If you use all the data you have collected during training, you won’t have any with which to test the model during the model evaluation phase

Answer 27

Hyperparameters are not updated during model training and are set manually

Answer 28

1. Generating predictions 2. Finding patterns in your data 3. Using a trained model 4. Testing your model on data it has not seen before

Answer 29

overlapping clusters

Answer 30

Processing sequences of data

Answer 31

Structured to effectively represent for loops in traditional computing, collecting data while iterating over some object

Answer 32

A training set that contains the desired solution (aka label) for each instance

Answer 33

Data that already contains the solution

Answer 34

A mathematical term for a flat surface (like a piece of paper) on which two points can be joined by a straight line

Answer 35

Simple computational units of neural networks

Answer 36

Mathematical representations of how much information to allow to flow from one neuron to the next. The trainable model parameters that are the connections between the neurons of a neural network
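To make the role of these values concrete, the sketch below computes the output of a single neuron as a weighted sum of its inputs plus a bias, passed through an activation function. The sigmoid activation and the function name `neuron_output` are assumptions chosen for the example.

```python
import math

def neuron_output(inputs, weights, bias):
    # Each weight scales how much of its input flows into the neuron.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Sigmoid activation squashes the result into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-total))

# A weight of 0.0 blocks the second input entirely.
out = neuron_output([1.0, 5.0], [2.0, 0.0], bias=-2.0)
```

Training adjusts `weights` and `bias`; the inputs come from the data or from earlier neurons.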

Answer 37

1. FFNN 2. CNN 3. RNN/LSTM 4. Transformer

Answer 38

A collection of very simple models connected together

Answer 39

Data Visualization

Answer 40

Data points that are significantly different from others in the same sample

Answer 41

Settings or configurations the training algorithm can update to change how well the model behaves

Answer 42

1. Weights | 2. Biases

Answer 43

Iteratively update model parameters to minimize some loss function

Answer 44

They work through an iterative process where the current model iteration is analyzed to determine what changes can be made to get closer to the goal. Those changes are made and the iteration continues until the model is evaluated to meet the goals

Answer 45

Randomly split the data set

Answer 46

When the trained model is used to generate predictions

(83 cards)