11_Machine Learning Flashcards
1
Q
AI vs ML vs DL

A
2
Q
Machine Learning Options

A
3
Q
What is Machine Learning
- Process of combining inputs to produce useful predictions on never-before-seen data.
- Makes a machine learn from data to make predictions on future data, instead of programming every scenario.
- How it works (sketched in code after this list):
- Train a model with examples
- Example = input + label
- Training = adjust the model to learn the relationship between features and label, minimizing error:
- Optimize weights and biases (parameters) applied to the different input features.
- Feature = input variable(s)
- Inference = apply trained model to unlabeled examples.
- Separating test and training data ensures the model generalizes to additional data.
- Otherwise, training leads to overfitting (the model fits only the training data, not new data)
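A minimal sketch of the train-then-infer flow above, assuming scikit-learn and a tiny synthetic dataset (both illustrative choices, not from the card):

```python
# Illustrative only: a linear-regression model trained on synthetic examples.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Examples = input (feature) + label
X = np.arange(20, dtype=float).reshape(-1, 1)  # feature: one input variable
y = 3 * X.ravel() + 7                          # label: the value to predict

# Separate training and test data to check generalization
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Training: adjust weights and bias (parameters) to minimize error
model = LinearRegression().fit(X_train, y_train)

# Inference: apply the trained model to never-before-seen examples
print(model.predict(X_test))
print("R^2 on held-out data:", model.score(X_test, y_test))
```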

A
4
Q
Machine Learning Pipeline

A
5
Q
Features and Labels

A
6
Q
Curve Fitting

A
7
Q
Optimization using Gradient Descent

A
8
Q
Machine Learning Types
- Supervised learning (contrasted with unsupervised in the sketch after this list)
- Apply labels to data (“cat”, “spam”)
- Regression - Continuous, numeric variables:
- Predict stock price, student test scores
- Classification - categorical variables:
- yes/no, decision tree
- “is this email spam?” “is this picture a cat?”
- Same types for dataset columns:
- continuous (regression) and categorical (classification)
- income, birth year = continuous
- gender, country = categorical
- Unsupervised learning
- Clustering - finding patterns
- Not labeled or categorized
- e.g., “Which purchases group together by location and amount?”
- Heavily tied to statistics
- Reinforcement learning
- Use positive/negative reinforcement to complete a task
- Complete a maze, learn chess
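A short sketch contrasting supervised learning (features + labels) with unsupervised clustering (no labels), assuming scikit-learn and synthetic data:

```python
# Illustrative only: synthetic 2-D points.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# Supervised classification: features + labels ("is x0 + x1 positive?")
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict([[1.0, 1.0]]))  # categorical answer: 1 ("yes")

# Unsupervised clustering: no labels; the algorithm finds groups on its own
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])            # cluster assignment per point
```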
A
9
Q
Supervised Learning

A
10
Q
Reinforcement Learning

A
11
Q
Model Type - Regression

A
12
Q
Model Type - Classification

A
13
Q
Model Type - Clustering

A
14
Q
Transfer Learning

A
15
Q
Overfitting
- A model overfitted to its training data is unable to generalize to new data.
- Failing to generalize means the model cannot account for slightly different, but close enough, data.
- Causes of Overfitting:
- Not enough training data
- Need a greater variety of samples
- Too many features
- Model is too complex
- Model fitted to unnecessary features unique to the training data, a.k.a. “Noise”
- Solving for Overfitting (a sketch follows this list):
- Use more data:
- Add more training data
- More varied data allows for better generalization
- Make the model less complex:
- Use fewer (but more relevant) features = Feature Selection
- Combine multiple co-dependent/redundant features into a single representative feature
- This also helps reduce model training time
- Remove noise
- Increase regularization parameters
- Regularization
- Early Stopping
- Cross Validation
- Dropout Methods
- If data is scarce:
- Use independent test data
- Cross Validation
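A minimal sketch of spotting and reducing overfitting, assuming scikit-learn: an unconstrained decision tree scores far better on training data than on held-out data, while a less complex tree narrows the gap:

```python
# Illustrative only: synthetic classification data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Unconstrained tree: fits the training data perfectly, generalizes worse
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("deep    train/test:", deep.score(X_tr, y_tr), deep.score(X_te, y_te))

# Less complex model (max_depth limits complexity): smaller train/test gap
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
print("shallow train/test:", shallow.score(X_tr, y_tr), shallow.score(X_te, y_te))
```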

A
16
Q
Regularization
- Training: minimize(loss(data | model))
- Regularization: complexity(model)
- L2 Regularization term
- L1 Regularization term
- Training with Regularization: minimize(loss(data | model) + lambda * complexity(model))
- Adds a penalty to a model as it becomes more complex
- Penalizing parameters = better generalization
- Cuts out noise and unimportant data, to avoid overfitting
Regularization Types
L1 and L2 regularization - different approaches to tuning out noise. Each has a different use case and purpose.
- L1 - Lasso Regression: assigns greater importance to more influential features
- Shrinks the influence of less important features to zero
- Good for models with many features, some more important than others
- Example: choosing features to predict the likelihood of a home selling:
- House price is a more influential feature than carpet color
- L2 - Ridge Regression: performs better when all the input features influence the output and all weights are roughly equal in size (see the sketch below)
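A sketch of L1 (Lasso) vs. L2 (Ridge) behavior, assuming scikit-learn; alpha plays the role of lambda in the penalized objective above, and the data is synthetic with only one truly influential feature:

```python
# Illustrative only: five features, of which only the first matters.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 4 * X[:, 0] + 0.1 * rng.normal(size=100)

# L1 (Lasso): shrinks the unimportant coefficients all the way to zero
print(Lasso(alpha=0.1).fit(X, y).coef_)

# L2 (Ridge): shrinks all coefficients toward zero but keeps them nonzero
print(Ridge(alpha=0.1).fit(X, y).coef_)
```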
A
17
Q
Hyperparameters
- Selection: hyperparameter values need to be specified before training begins
- Types of Hyperparameters:
- Model hyperparameters relate directly to the model that is selected.
- Algorithm hyperparameters relate to the training of the model.
- Tuning: the process of finding optimal, or near-optimal, hyperparameter values (see the sketch after the examples).
- Unlike model parameters, hyperparameters are not learned from the training data!
- Examples:
- Batch size
- Training epochs
- Number of hidden layers in neural network
- Number of nodes in hidden layers in neural network
- Regularization type
- Regularization rate
- Learning rate, a.k.a. step size
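A sketch of hyperparameter tuning via grid search with cross-validation, assuming scikit-learn; the hyperparameter values searched are illustrative only:

```python
# Illustrative only: tuning two decision-tree hyperparameters on iris.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters are fixed before training; the search tries combinations
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 5], "min_samples_leaf": [1, 5]},
    cv=5,  # 5-fold cross validation
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```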

A
18
Q
Feature Engineering
Transform data so it is fit for machine learning (a few of these transforms are sketched after the list).
- Imputation (for missing data)
- Outliers and Feature Clipping
- One-hot Encoding (for categorical data)
- Linear Scaling
- Log Scaling
- Bucketing/Bucketization
- Feature prioritization
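A sketch of imputation, linear scaling, and one-hot encoding, assuming pandas and scikit-learn; column names and values are made up for illustration:

```python
# Illustrative only: made-up column names and values.
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({
    "income": [40_000, np.nan, 85_000, 120_000],  # continuous, one missing
    "country": ["US", "DE", "US", "JP"],          # categorical
})

# Imputation: fill the missing value (here, with the column mean)
df["income"] = df["income"].fillna(df["income"].mean())

# Linear scaling: map income into [0, 1]
df["income_scaled"] = MinMaxScaler().fit_transform(df[["income"]])

# One-hot encoding: one binary column per category
df = pd.get_dummies(df, columns=["country"])
print(df)
```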

A
19
Q
Techniques Glossary
- Precision: of the examples the model labeled positive, the fraction that are actually positive. Precision = TP / (TP + FP).
- Recall: of the examples that are actually positive, the fraction the model found. Recall = TP / (TP + FN).
- Gradient Descent: optimization algorithm to find the minimum of a function. Gradient descent is used to find the parameter values that minimize the RMSE or cost function (sketched below).
- Dropout Regularization: regularization method that removes a random selection of a fixed fraction of units in a neural network layer during training. The more units dropped out, the stronger the regularization.
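A minimal sketch of gradient descent on a one-parameter cost function f(w) = (w - 3)^2, whose minimum is at w = 3; the step size here is illustrative:

```python
# Gradient descent on f(w) = (w - 3)^2; the minimum is at w = 3.
def gradient(w):
    return 2 * (w - 3)  # derivative of (w - 3)^2

w = 0.0               # initial guess
learning_rate = 0.1   # a.k.a. step size (a hyperparameter)
for _ in range(100):
    w -= learning_rate * gradient(w)

print(w)              # converges toward 3.0
```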
A