MODULE 2 S1 Flashcards
M2S1 - Supervised Machine Learning
It predicts continuous numbers (real numbers).
Regression
T/F
Simply duplicating the same data points or collecting very similar data will not help a model generalize better.
TRUE
The situation in which a model performs well on the training set but poorly on the validation set.
Overfitting
It occurs when you fit a model too closely to the particularities of the training set and obtain a model that works well on the training set but is not able to generalize to new data.
Overfitting
This occurs when a model learns the training data too well, including its noise and outliers.
Overfitting
If your model is too simple then you might not be able to capture all the aspects of and variability in the data, and your model will do badly even on the training set. Choosing too simple a model is called ____________.
Underfitting
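Example (overfitting vs. underfitting): a minimal sketch in plain Python, assuming a made-up dataset where y is roughly 2*x plus noise. The three models (`fit_mean`, `fit_line`, `fit_memorizer`) are hypothetical helpers written for this card, not taken from any library. The idea is that the too-simple model should score badly on both sets, while the memorizer should score perfectly on the training set and worse on the validation set.

```python
# Toy data (made up for illustration): y is roughly 2*x plus noise.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.5, 1.6, 4.9, 5.4, 8.8, 9.5]
val_x = [0.4, 1.4, 2.4, 3.4, 4.4]
val_y = [0.8, 2.8, 4.8, 6.8, 8.8]


def mse(model, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_mean(xs, ys):
    """Too simple: ignores x and always predicts the mean label."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares straight line, a reasonable match for this data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Too flexible: returns the label of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# For each model, store (training MSE, validation MSE).
results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, val_x, val_y))
    print(f"{name:10s} train MSE={results[name][0]:.3f}  validation MSE={results[name][1]:.3f}")
```

On this toy data the memorizer reaches zero training error yet has a larger validation error than the straight line, which is the overfitting pattern; the mean model is worse than the line on both sets, which is the underfitting pattern.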
The two phases of the supervised ML process: Training, ________.
Predicting
Input object : __________
Output value : __________
Feature
Label
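Example (features, labels, training, predicting): a minimal sketch, assuming a hypothetical two-feature dataset and a nearest-centroid rule chosen only because it is short. The names `train`, `predict`, `train_features`, and `train_labels` are illustrative, not from any library.

```python
import math

# Hypothetical training data: each input object is a list of features,
# and each output value is a label.
train_features = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.2], [3.4, 2.8]]
train_labels = ["small", "small", "large", "large"]


def train(features, labels):
    """Training phase: compute one centroid (mean feature vector) per label."""
    grouped = {}
    for row, label in zip(features, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Predicting phase: return the label whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], row))


centroids = train(train_features, train_labels)
predictions = [predict(centroids, row) for row in [[1.1, 0.9], [2.9, 3.1]]]
print(predictions)
```

Training sees both features and labels; predicting sees only features and returns a label.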
A model that performs poorly on both training and new data because it hasn’t learned enough from the training data.
Underfitting
It refers to algorithms that address classification problems where the output variable is categorical.
Classification
The type of machine learning that uses labeled training data, where labeled training data refers to a dataset that includes both the input data and the corresponding correct output.
Supervised Learning
It refers to the error from having wrong / too simple assumptions in the learning algorithm.
Bias
Classification : _____________ variable
Regression : ______________ variable
Categorical
Continuous
It predicts one of the possible class labels.
Classification
Two Types of Classification
Binary Classification
Multiclass Classification
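Example (the two types of classification): a small sketch, assuming the distinction is simply the number of distinct class labels, with exactly two being binary and more than two being the second type. `classification_type` is a hypothetical helper, and the label lists are illustrative.

```python
def classification_type(labels):
    """Return 'binary' for two distinct classes, 'multiclass' for more."""
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("classification needs at least two distinct classes")
    return "binary" if n_classes == 2 else "multiclass"


spam_labels = ["spam", "not spam", "spam", "not spam"]  # two classes
iris_labels = ["setosa", "versicolor", "virginica"]     # three classes

print(classification_type(spam_labels))  # binary
print(classification_type(iris_labels))  # multiclass
```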
T/F
Classification algorithms address classification problems where the output variable is categorical.
TRUE
It refers to the error resulting from sensitivity to the noise / fluctuations in the training data.
Variance
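Example (bias vs. variance): a rough sketch, assuming a true function f(x) = x**2 and four hand-picked noise patterns that stand in for four different noisy training sets. Both models (`mean_model`, `memorizer`) are hypothetical and deliberately extreme. Bias is measured here as the average prediction minus the true value at one query point, and variance as the spread of the predictions across the training sets.

```python
# True function (an assumption for this demo): f(x) = x**2 at x = 0..4.
xs = [0, 1, 2, 3, 4]
true_y = [x ** 2 for x in xs]

# Four hand-picked noise patterns stand in for four noisy training sets.
noise_patterns = [
    [1, -1, 1, -1, 1],
    [-1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1],
    [-1, -1, 1, 1, -1],
]
query_index = 4               # evaluate every model at x = 4
target = true_y[query_index]  # true value there


def mean_model(ys, i):
    """Simple model: always predicts the mean label, whatever the input."""
    return sum(ys) / len(ys)


def memorizer(ys, i):
    """Flexible model: repeats the noisy label seen at that training point."""
    return ys[i]


def bias_and_variance(model):
    """Train on each noisy set, then summarize predictions at the query point."""
    preds = []
    for noise in noise_patterns:
        ys = [y + n for y, n in zip(true_y, noise)]
        preds.append(model(ys, query_index))
    avg = sum(preds) / len(preds)
    bias = avg - target
    variance = sum((p - avg) ** 2 for p in preds) / len(preds)
    return bias, variance


mean_bias, mean_var = bias_and_variance(mean_model)
memo_bias, memo_var = bias_and_variance(memorizer)
print(f"mean model: bias={mean_bias:.2f}, variance={mean_var:.2f}")
print(f"memorizer:  bias={memo_bias:.2f}, variance={memo_var:.2f}")
```

In this setup the too-simple model is far off on average (high bias) but barely changes between training sets (low variance), while the memorizer is right on average but follows the noise (higher variance).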
T/F
The more complex we allow our model to be, the better we will be able to predict on the training data.
TRUE
Categories of Supervised Learning
Classification
Regression
T/F
The larger variety of data points your data set contains, the more complex a model you can use without overfitting.
TRUE
T/F
Model complexity does not depend on the variation of inputs contained in the training dataset.
FALSE
(Model complexity is intimately tied to the variation of inputs contained in the training dataset.)
These concepts help us understand how well a model performs: Overfitting, Underfitting, _________.
Generalization
If a model is able to make accurate predictions on unseen data, we say it is able to _____________ from the training set to the test set.
Generalize
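Example (checking generalization): a minimal sketch, assuming made-up one-feature data and a simple rule that holds out every third sample as the test set. The threshold classifier and the helper names (`class_mean`, `predict`, `accuracy`) are hypothetical; the point is only that the model is fitted on the training samples and scored on samples it has not seen.

```python
# Made-up 1-D data: one feature per sample, labels 0 or 1.
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 6.0, 6.5, 7.0, 7.5, 8.0]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Hold out every third sample as the test set; train on the rest.
test_idx = [i for i in range(len(xs)) if i % 3 == 2]
train_idx = [i for i in range(len(xs)) if i % 3 != 2]


def class_mean(idx, label):
    """Mean feature value of the samples in idx that carry the given label."""
    values = [xs[i] for i in idx if ys[i] == label]
    return sum(values) / len(values)


# Training: place the cut-off halfway between the two class means,
# using the training samples only.
threshold = (class_mean(train_idx, 0) + class_mean(train_idx, 1)) / 2


def predict(x):
    """Predict class 1 above the learned threshold, class 0 otherwise."""
    return 1 if x > threshold else 0


def accuracy(idx):
    """Fraction of samples in idx that are predicted correctly."""
    return sum(predict(xs[i]) == ys[i] for i in idx) / len(idx)


train_accuracy = accuracy(train_idx)
test_accuracy = accuracy(test_idx)
print(f"train accuracy={train_accuracy:.2f}, test accuracy={test_accuracy:.2f}")
```

A test accuracy close to the training accuracy suggests the model generalizes; a large gap between the two would point to overfitting.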
T/F
The primary objective of the supervised learning technique is to map the input variable to the output variable.
TRUE
These are algorithms that handle regression problems, where the output variable is continuous and, in the simplest case, has a linear relationship with the input variables.
Regression
It refers to when a model is built on the training data and then is able to make accurate predictions on new, unseen data.
Generalization
Classification Algorithms
Random Forest Algorithm
Decision Tree Algorithm
Logistic Regression Algorithm
Support Vector Machine Algorithm
Regression Algorithms
Simple Linear Regression Algorithm
Multivariate Regression Algorithm
Decision Tree Algorithm
Ridge Algorithm
Lasso Algorithm
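Example (the algorithm lists in code): a sketch that assumes the scikit-learn library, which these cards do not mention, and maps each listed algorithm to a scikit-learn estimator class. Simple linear regression and multivariate regression are both represented by `LinearRegression` here, on the assumption that they differ in the shape of the data rather than in the estimator. The dataset is tiny and made up, and the hyperparameters are arbitrary.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso, LinearRegression, LogisticRegression, Ridge
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=10, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(),
    "Support Vector Machine": SVC(),
}
regressors = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

# Tiny made-up dataset: one feature, four samples.
X = [[0.0], [1.0], [2.0], [3.0]]
y_class = [0, 0, 1, 1]          # categorical target for classification
y_value = [0.1, 1.9, 4.2, 5.8]  # continuous target for regression

# Every estimator follows the same two steps: fit (train), then predict.
for name, model in classifiers.items():
    print(name, model.fit(X, y_class).predict(X))
for name, model in regressors.items():
    print(name, model.fit(X, y_value).predict(X))
```

Decision Tree appears in both lists because the same idea works for either task; scikit-learn exposes it as two separate classes.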
In supervised learning, predicting a market trend is an example of _______________.
Regression
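Example (market trend as regression): a minimal sketch, assuming invented monthly prices and a plain least-squares trend line. The numbers and variable names are hypothetical; real market data would rarely be this clean.

```python
# Hypothetical monthly prices; the numbers are invented for the example.
months = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

# Least-squares fit of price = slope * month + intercept.
mean_x = sum(months) / len(months)
mean_y = sum(prices) / len(prices)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, prices))
sxx = sum((x - mean_x) ** 2 for x in months)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predict a continuous value (a price) for a month not in the data.
next_month = 6
forecast = slope * next_month + intercept
print(f"trend: {slope:.1f} per month, forecast for month {next_month}: {forecast:.1f}")
```

The output is a real number rather than a class label, which is why this counts as regression.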