11_Machine Learning Flashcards

1
Q

AI vs ML vs DL

A
2
Q

Machine Learning Options

A
3
Q

What is Machine Learning

  • Process of combining inputs to produce useful predictions on never-before-seen data.
  • Makes a machine learn from data to make predictions on future data, instead of programming every scenario.
  • How it works:
    • Train a model with examples
    • Example = input + label
    • Training = adjust the model to learn the relationship between features and labels by minimizing error:
      • Optimize the weights and biases (parameters) applied to the input features.
    • Feature = input variable(s)
    • Inference = apply the trained model to unlabeled examples.
    • Keeping test and training data separate ensures the model generalizes to additional data (see the sketch below).
      • Otherwise, training leads to overfitting (the model fits only the training data, not new data).
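
A minimal sketch of this train/inference loop, assuming scikit-learn is available; the iris dataset and the logistic-regression model are illustrative choices, not part of the card:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # examples: X = features, y = labels

# Keep test data separate from training data to measure generalization.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Training: adjust the model's weights and biases to minimize error.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Inference: predict labels for examples the model has never seen.
print("held-out accuracy:", model.score(X_test, y_test))
```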
A
4
Q

Machine Learning Pipeline

A
5
Q

Features and Labels

A
6
Q

Curve Fitting

A
7
Q

Optimization using Gradient Descent

A
8
Q

Machine Learning Types

  • Supervised learning
    • Learn from labeled data to apply labels to new data (“cat”, “spam”)
    • Regression - Continuous, numeric variables:
      • Predict stock price, student test scores
    • Classification - categorical variables:
      • yes/no, decision tree
      • “is this email spam?” “is this picture a cat?”
    • The same two types apply to dataset columns:
      • continuous (regression) and categorical (classification)
      • income, birth year = continuous
      • gender, country = categorical
  • Unsupervised learning
    • Clustering - finding patterns
    • Data is not labeled or categorized
    • “Given the location of a purchase, what is the likely amount purchased?”
    • Heavily tied to statistics
  • Reinforcement learning
    • Use positive/negative reinforcement to complete a task
      • Complete a maze, learn chess
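
A hedged sketch of the first two types plus clustering, assuming scikit-learn; the synthetic datasets and estimator choices are illustrative (reinforcement learning needs an environment loop and is omitted):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression

# Supervised / regression: predict a continuous target (e.g. a stock price).
X_r, y_r = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
reg = LinearRegression().fit(X_r, y_r)
print("regression output:", reg.predict(X_r[:1]))  # a real number

# Supervised / classification: predict a category (e.g. spam / not spam).
X_c, y_c = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X_c, y_c)
print("classification output:", clf.predict(X_c[:1]))  # a class label

# Unsupervised / clustering: find groups in unlabeled data.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_c)
print("cluster assignments:", clusters[:5])
```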
A
9
Q

Supervised Learning

A
10
Q

Reinforcement Learning

A
11
Q

Model Type - Regression

A
12
Q

Model Type - Classification

A
13
Q

Model Type - Clustering

A
14
Q

Transfer Learning

A
15
Q

Overfitting

  • A model overfitted to its training data is unable to generalize to new data.
  • An overfitted model fails to generalize: it cannot account for data that is slightly different but close enough.
  • Causes of Overfitting:
    • Not enough training data
      • Need more variety of samples
    • Too many features
      • Too complex
    • The model fits unnecessary features unique to the training data, a.k.a. “noise”
  • Solving for Overfitting:
    • Use more data:
      • Add more training data
      • More varied data allows for better generalization
    • Make the model less complex:
      • Use fewer (but more relevant) features = Feature Selection
      • Combine multiple co-dependent/redundant features into a single representative feature
        • This also helps reduce model training time
    • Remove noise
      • Increase regularization parameters
    • Regularization
    • Early Stopping
    • Cross Validation
    • Dropout Methods
  • If data is scarce:
    • Use independent test data
    • Cross Validation
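
A small sketch of how overfitting shows up and how reducing complexity helps, assuming scikit-learn; the noisy synthetic data and the decision-tree depths are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy data invites a complex model to memorize the training set.
X, y = make_classification(n_samples=300, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Overfitted: an unconstrained tree fits the training noise.
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("deep tree  train/test:", deep.score(X_train, y_train), deep.score(X_test, y_test))

# Less complex: limiting depth narrows the train/test gap.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print("shallow    train/test:", shallow.score(X_train, y_train), shallow.score(X_test, y_test))
```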
A
16
Q

Regularization

  • Training: minimize(loss(Data | Model))
  • Regularization: complexity(Model)
    • L2 Regularization term
    • L1 Regularization term
  • Training with Regularization: minimize(loss(Data | Model) + λ · complexity(Model))
  • Adds a penalty to a model as it becomes more complex
  • Penalizing parameters = better generalization
  • Cuts out noise and unimportant data, to avoid overfitting

Regularization Types

L1 and L2 regularization - Different approaches to tuning out noise. Each has a different use case and purpose.

  • L1 - Lasso Regression: Assigns greater importance to more influential features
    • Shrinks the weights of less important features to exactly zero
    • Good for models with many features, some more important than others
    • Example: Choosing features to predict likelihood of home selling:
      • House price more influential feature than carpet color
  • L2 - Ridge Regression: Performs better when all input features influence the output and all weights are of roughly equal size.
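
A sketch of the L1-vs-L2 contrast, assuming scikit-learn, where alpha plays the role of λ; the synthetic dataset and alpha value are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 10 features, but only 3 actually influence the target.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)  # L1: drives unimportant weights to exactly zero
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks all weights, keeps them nonzero

print("L1 weights at zero:", int(np.sum(lasso.coef_ == 0)))  # several
print("L2 weights at zero:", int(np.sum(ridge.coef_ == 0)))  # typically none
```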
A
17
Q

Hyperparameters

  • Selection: Hyperparameter values need to be specified before training begins
  • Types of Hyperparameters:
    • Model hyperparameters relate directly to the model that is selected.
    • Algorithm hyperparameters relate to the training of the model.
  • Training and Tuning: Tuning is the process of finding optimal, or near-optimal, values for the hyperparameters.
  • Not related to training data!
  • Examples:
    • Batch size
    • Training epochs
    • Number of hidden layers in neural network
    • Number of nodes in hidden layers in neural network
    • Regularization type
    • Regularization rate
    • Learning rate, a.k.a. step size
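
A minimal tuning sketch, assuming scikit-learn; the grid mixes two of the hyperparameters listed above (regularization type and rate) and its values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)

# Hyperparameters are fixed before training; the search trains one model per combination.
grid = {
    "penalty": ["l1", "l2"],  # regularization type
    "C": [0.01, 0.1, 1.0],    # inverse regularization rate
}
search = GridSearchCV(LogisticRegression(solver="liblinear"), param_grid=grid, cv=5)
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
```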
A
18
Q

Feature Engineering

Transform data so it is fit for Machine Learning.

  • Imputation (for missing data)
  • Outliers and Feature Clipping
  • One-hot Encoding (for categorical data)
  • Linear Scaling
  • Log Scaling
  • Bucketing/Bucketization
  • Feature prioritization
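
A hedged sketch of several transforms from this list, assuming pandas and scikit-learn; the toy columns, clipping threshold, and bin count are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import KBinsDiscretizer, MinMaxScaler, OneHotEncoder

df = pd.DataFrame({
    "income":  [40_000.0, np.nan, 250_000.0, 60_000.0],  # continuous, one value missing
    "country": ["US", "DE", "US", "JP"],                 # categorical
})

# Imputation: fill missing values (here with the column median).
income = SimpleImputer(strategy="median").fit_transform(df[["income"]])

# Feature clipping: cap outliers at a chosen threshold.
income = np.clip(income, 0, 100_000)

# Linear scaling: map values into [0, 1].
income_scaled = MinMaxScaler().fit_transform(income)

# Log scaling: compress a long-tailed distribution.
income_log = np.log1p(income)

# Bucketing: discretize a continuous feature into bins.
income_buckets = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform").fit_transform(income)

# One-hot encoding: turn each category into an indicator column.
country_onehot = OneHotEncoder().fit_transform(df[["country"]]).toarray()
print(country_onehot)
```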
A
19
Q

Techniques Glossary

  • Precision: of all examples the model predicted positive, the fraction that are actually positive: TP / (TP + FP).
  • Recall: of all actually positive examples, the fraction the model correctly predicted positive: TP / (TP + FN).
  • Gradient Descent: optimization algorithm that finds the minimum value of a function; in ML it is used to minimize the RMSE or cost function.
  • Dropout Regularization: regularization method that removes a random selection of a fixed fraction of units in a neural network layer during training. The more units dropped out, the stronger the regularization.
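
A small sketch of the precision/recall formulas and a gradient-descent step in plain Python; the toy predictions and the quadratic cost function are illustrative:

```python
# Precision and recall from true-positive, false-positive, false-negative counts.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print("precision:", tp / (tp + fp))  # of predicted positives, fraction correct
print("recall:   ", tp / (tp + fn))  # of actual positives, fraction found

# Gradient descent on a toy cost function f(w) = (w - 3)^2.
w, learning_rate = 0.0, 0.1
for _ in range(100):
    gradient = 2 * (w - 3)  # df/dw
    w -= learning_rate * gradient
print("minimum found near w =", w)  # converges to 3
```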
A