11_Machine Learning Flashcards
1
Q
AI vs ML vs DL

A
2
Q
Machine Learning Options

A
3
Q
What is Machine Learning
- Process of combining inputs to produce useful predictions on never-before-seen data.
- Makes a machine learn from data to make predictions on future data, instead of programming every scenario.
- How it works (sketched in code after this list):
- Train a model with examples
- Example = input + label
- Training = adjust the model to learn the relationship between features and label, minimizing error:
- Optimize weights and biases (parameters) applied to the different input features.
- Feature = input variable(s)
- Inference = apply trained model to unlabeled examples.
- Separating test and training data ensures the model generalizes to additional data.
- Otherwise, training leads to overfitting (the model fits only the training data, not new data)
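A minimal sketch of the train-then-infer flow above, assuming scikit-learn and a tiny synthetic dataset (both illustrative choices, not from the card):

```python
# Illustrative only: a linear-regression model trained on synthetic examples.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Examples = input (feature) + label
X = np.arange(20, dtype=float).reshape(-1, 1)  # feature: one input variable
y = 3 * X.ravel() + 7                          # label: the value to predict

# Separate training and test data to check generalization
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Training: adjust weights and bias (parameters) to minimize error
model = LinearRegression().fit(X_train, y_train)

# Inference: apply the trained model to never-before-seen examples
print(model.predict(X_test))
print("R^2 on held-out data:", model.score(X_test, y_test))
```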

A
4
Q
Machine Learning Pipeline

A
5
Q
Features and Labels

A
6
Q
Curve Fitting

A
7
Q
Optimization using Gradient Descent

A
8
Q
Machine Learning Types
- Supervised learning (contrasted with unsupervised in the sketch after this list)
- Apply labels to data (“cat”, “spam”)
- Regression - Continuous, numeric variables:
- Predict stock price, student test scores
- Classification - categorical variables:
- yes/no, decision tree
- “is this email spam?” “is this picture a cat?”
- Same types for dataset columns:
- continuous (regression) and categorical (classification)
- income, birth year = continuous
- gender, country = categorical
- Unsupervised learning
- Clustering - finding patterns
- Not labeled or categorized
- e.g., “Which purchases group together by location and amount?”
- Heavily tied to statistics
- Reinforcement learning
- Use positive/negative reinforcement to complete a task
- Complete a maze, learn chess
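A short sketch contrasting supervised learning (features + labels) with unsupervised clustering (no labels), assuming scikit-learn and synthetic data:

```python
# Illustrative only: synthetic 2-D points.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# Supervised classification: features + labels ("is x0 + x1 positive?")
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict([[1.0, 1.0]]))  # categorical answer: 1 ("yes")

# Unsupervised clustering: no labels; the algorithm finds groups on its own
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])            # cluster assignment per point
```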
A
9
Q
Supervised Learning

A
10
Q
Reinforcement Learning

A
11
Q
Model Type - Regression

A
12
Q
Model Type - Classification

A
13
Q
Model Type - Clustering

A
14
Q
Transfer Learning

A
15
Q
Overfitting
- A model overfitted to its training data is unable to generalize to new data.
- Failing to generalize means the model cannot account for slightly different, but close enough, data.
- Causes of Overfitting:
- Not enough training data
- Need a greater variety of samples
- Too many features
- Model is too complex
- Model fitted to unnecessary features unique to the training data, a.k.a. “Noise”
- Solving for Overfitting (a sketch follows this list):
- Use more data:
- Add more training data
- More varied data allows for better generalization
- Make the model less complex:
- Use fewer (but more relevant) features = Feature Selection
- Combine multiple co-dependent/redundant features into a single representative feature
- This also helps reduce model training time
- Remove noise
- Increase regularization parameters
- Regularization
- Early Stopping
- Cross Validation
- Dropout Methods
- If data is scarce:
- Use independent test data
- Cross Validation
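A minimal sketch of spotting and reducing overfitting, assuming scikit-learn: an unconstrained decision tree scores far better on training data than on held-out data, while a less complex tree narrows the gap:

```python
# Illustrative only: synthetic classification data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Unconstrained tree: fits the training data perfectly, generalizes worse
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("deep    train/test:", deep.score(X_tr, y_tr), deep.score(X_te, y_te))

# Less complex model (max_depth limits complexity): smaller train/test gap
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
print("shallow train/test:", shallow.score(X_tr, y_tr), shallow.score(X_te, y_te))
```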

A
16
Q
Regularization
- Training: minimize(loss(data | model))
- Regularization: complexity(model)
- L2 Regularization term
- L1 Regularization term
- Training with Regularization: minimize(loss(data | model) + lambda * complexity(model))
- Adds a penalty to a model as it becomes more complex
- Penalizing parameters = better generalization
- Cuts out noise and unimportant data, to avoid overfitting
Regularization Types
L1 and L2 regularization - different approaches to tuning out noise. Each has a different use case and purpose.
- L1 - Lasso Regression: assigns greater importance to more influential features
- Shrinks the influence of less important features to zero
- Good for models with many features, some more important than others
- Example: choosing features to predict the likelihood of a home selling:
- House price is a more influential feature than carpet color
- L2 - Ridge Regression: performs better when all the input features influence the output and all weights are roughly equal in size (see the sketch below)
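A sketch of L1 (Lasso) vs. L2 (Ridge) behavior, assuming scikit-learn; alpha plays the role of lambda in the penalized objective above, and the data is synthetic with only one truly influential feature:

```python
# Illustrative only: five features, of which only the first matters.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 4 * X[:, 0] + 0.1 * rng.normal(size=100)

# L1 (Lasso): shrinks the unimportant coefficients all the way to zero
print(Lasso(alpha=0.1).fit(X, y).coef_)

# L2 (Ridge): shrinks all coefficients toward zero but keeps them nonzero
print(Ridge(alpha=0.1).fit(X, y).coef_)
```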
A
17
Q
Hyperparameters
- Selection: hyperparameter values need to be specified before training begins
- Types of Hyperparameters:
- Model hyperparameters relate directly to the model that is selected.
- Algorithm hyperparameters relate to the training of the model.
- Tuning: the process of finding optimal, or near-optimal, hyperparameter values (see the sketch after the examples).
- Unlike model parameters, hyperparameters are not learned from the training data!
- Examples:
- Batch size
- Training epochs
- Number of hidden layers in neural network
- Number of nodes in hidden layers in neural network
- Regularization type
- Regularization rate
- Learning rate, a.k.a. step size
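A sketch of hyperparameter tuning via grid search with cross-validation, assuming scikit-learn; the hyperparameter values searched are illustrative only:

```python
# Illustrative only: tuning two decision-tree hyperparameters on iris.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters are fixed before training; the search tries combinations
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 5], "min_samples_leaf": [1, 5]},
    cv=5,  # 5-fold cross validation
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```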

A
18
Q
Feature Engineering
Transform data so it is fit for machine learning (a few of these transforms are sketched after the list).
- Imputation (for missing data)
- Outliers and Feature Clipping
- One-hot Encoding (for categorical data)
- Linear Scaling
- Log Scaling
- Bucketing/Bucketization
- Feature prioritization
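A sketch of imputation, linear scaling, and one-hot encoding, assuming pandas and scikit-learn; column names and values are made up for illustration:

```python
# Illustrative only: made-up column names and values.
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({
    "income": [40_000, np.nan, 85_000, 120_000],  # continuous, one missing
    "country": ["US", "DE", "US", "JP"],          # categorical
})

# Imputation: fill the missing value (here, with the column mean)
df["income"] = df["income"].fillna(df["income"].mean())

# Linear scaling: map income into [0, 1]
df["income_scaled"] = MinMaxScaler().fit_transform(df[["income"]])

# One-hot encoding: one binary column per category
df = pd.get_dummies(df, columns=["country"])
print(df)
```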

A
19
Q
Techniques Glossary
- Precision: of the examples the model labeled positive, the fraction that are actually positive. Precision = TP / (TP + FP).
- Recall: of the examples that are actually positive, the fraction the model found. Recall = TP / (TP + FN).
- Gradient Descent: optimization algorithm to find the minimum of a function. Gradient descent is used to find the parameter values that minimize the RMSE or cost function (sketched below).
- Dropout Regularization: regularization method that removes a random selection of a fixed fraction of units in a neural network layer during training. The more units dropped out, the stronger the regularization.
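A minimal sketch of gradient descent on a one-parameter cost function f(w) = (w - 3)^2, whose minimum is at w = 3; the step size here is illustrative:

```python
# Gradient descent on f(w) = (w - 3)^2; the minimum is at w = 3.
def gradient(w):
    return 2 * (w - 3)  # derivative of (w - 3)^2

w = 0.0               # initial guess
learning_rate = 0.1   # a.k.a. step size (a hyperparameter)
for _ in range(100):
    w -= learning_rate * gradient(w)

print(w)              # converges toward 3.0
```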
A