Lecture 2 Flashcards

1
Q

Terminology Mapping: Number of observations, Data set size, Variables, Dependent variable, Coefficient.

A

Number of observations (Statistics) = Number of samples (Machine Learning)
Data set size (Statistics) = Sample size (Machine Learning)
Variables (Statistics) = Features (Machine Learning)
Dependent variable (Statistics) = Label (Machine Learning)
Coefficient (Statistics) = Weight (Machine Learning)

3
Q

What is the purpose of splitting data into training and test sets?

A

Training set: Used to train the ML model.
Test set: Used to evaluate the model’s performance on unseen data.
Why split? To detect overfitting: a model that memorizes the training data but fails to generalize will score well on the training set yet poorly on the test set.
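The split can be sketched with scikit-learn's train_test_split; the 80/20 ratio and random_state value below are illustrative choices, not from the lecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 observations, 2 features
y = np.arange(10)                  # labels

# Hold out 20% of the data for evaluation on unseen observations.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```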

4
Q

Why is shuffle=False important for time-series data splitting?

A

Time-series data relies on temporal order.
Setting shuffle=False preserves the sequence, ensuring future data isn’t used to predict past events.
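A minimal sketch of a sequential split, assuming scikit-learn's train_test_split; with shuffle=False the test set is simply the most recent 20% of observations.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten time-ordered observations (e.g., daily prices); toy data.
y = np.arange(10)
X = y.reshape(-1, 1)

# shuffle=False preserves temporal order: train on the past, test on the future.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)
print(y_train)  # [0 1 2 3 4 5 6 7]
print(y_test)   # [8 9]
```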

5
Q

Distinguish classification and regression.

A

Classification: Predicts a discrete class label (e.g., spam/not spam).
Regression: Predicts a continuous value (e.g., stock price).

6
Q

Name models that work for both classification and regression.

A
  • K-Nearest Neighbors
  • Decision Trees
  • Support Vector Machines (SVMs)
  • Ensemble Methods
  • ANNs (including deep neural networks)
    Exceptions: Linear/Logistic Regression are task-specific.
7
Q

List common supervised ML algorithms.

A
  • Linear Regression
  • Regularized Regression (Ridge, LASSO, Elastic Net)
  • Logistic Regression
  • K-Nearest Neighbors (KNN)
  • Support Vector Machines (SVMs)
  • Naive Bayes Classifiers
  • CART (Classification and Regression Trees)
  • ANN-Based models
8
Q

Explain the No Free Lunch Theorem.

A

No single ML algorithm works best for all problems.
Performance depends on assumptions about the data.

Every model is a simplification of reality; the assumptions behind that simplification can fail on a given dataset (model → simplification → assumptions → failure).

9
Q

What are the two steps to train a linear regression model?

A
  • Define a loss function: Residual Sum of Squares (RSS)
  • Minimize the loss: Adjust coefficients to fit the data.
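The two steps can be sketched directly with NumPy: the loss is the RSS, and for linear regression its minimizer has a closed form (the normal equations). The toy data below, generated from y = 2x + 1, is an illustrative assumption.

```python
import numpy as np

# Toy data lying exactly on the line y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Step 1: the loss is RSS(beta) = sum((y - Xb @ beta)**2).
Xb = np.hstack([np.ones((len(X), 1)), X])  # prepend an intercept column

# Step 2: minimize the RSS. The closed-form solution solves
# (Xb^T Xb) beta = Xb^T y (the normal equations).
beta = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
print(beta)  # approximately [1. 2.]: intercept 1, slope 2
```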
10
Q

Strengths and weaknesses of linear regression.

A
  • Strengths: Simple, interpretable, no hyperparameters to tune.
  • Weaknesses: Prone to overfitting, assumes linearity, sensitive to multicollinearity.
11
Q

What is the goal of regularized regression?

A

Add penalties (L1/L2) to coefficients to reduce overfitting.
Types:
* Ridge (L2): Shrinks coefficients toward zero.
* LASSO (L1): Sets some coefficients to zero (feature selection).
* Elastic Net: Combines L1 and L2.
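A minimal sketch of all three penalties with scikit-learn. Note that alpha is the library's name for the penalty strength λ (and l1_ratio for the L1/L2 mix); the toy data, with two informative and three irrelevant features, is an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Toy data: only features 0 and 3 actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([3.0, 0.0, 0.0, 1.5, 0.0]) + rng.normal(scale=0.1, size=50)

ridge = Ridge(alpha=1.0).fit(X, y)                     # L2: shrinks all coefficients
lasso = Lasso(alpha=0.1).fit(X, y)                     # L1: zeros some coefficients
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)   # L1/L2 mix

# LASSO drives the irrelevant coefficients to exactly zero (feature selection).
print(np.round(lasso.coef_, 2))
```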

12
Q

Ridge regression minimizes which objective function?

A

RSS + λ Σ_{j=1}^{p} β_j²
λ controls penalty strength.
Larger λ → simpler model.
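The effect of λ can be sketched as follows (scikit-learn calls the penalty strength alpha; the toy data and the alpha grid are illustrative). The norm of the coefficient vector shrinks as the penalty grows:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Illustrative toy data; the true coefficients (2, -1, 0.5) are an assumption.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=30)

# Larger lambda (alpha) -> stronger penalty -> smaller coefficients.
norms = []
for alpha in (0.01, 1.0, 100.0):
    coef = Ridge(alpha=alpha).fit(X, y).coef_
    norms.append(np.linalg.norm(coef))
    print(f"alpha={alpha}: ||beta|| = {norms[-1]:.3f}")
```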

13
Q

How does LASSO differ from Ridge?

A

LASSO uses the L1 penalty (Σ_{j=1}^{p} |β_j|), forcing some coefficients to exactly zero.
Enables automatic feature selection.

14
Q

What is Elastic Net?

A

Combines L1 and L2 penalties.
Requires tuning two parameters: λ (strength) and α (L1/L2 mix).

15
Q

Why is logistic regression a classification algorithm?

A

Logistic regression is a classification algorithm because it uses the logistic (sigmoid) function to predict probabilities for class labels and applies a threshold (e.g., 0.5) to assign a category to each observation.
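A minimal sketch, assuming scikit-learn's LogisticRegression and its default 0.5 threshold; the tiny linearly separable data set is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy one-feature data: class 0 at small x, class 1 at large x.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# The sigmoid output is a probability P(y=1 | x)...
proba = clf.predict_proba([[4.0]])[0, 1]
# ...and predict() turns it into a class label via the 0.5 threshold.
label = clf.predict([[4.0]])[0]
print(round(proba, 2), label)
```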

16
Q

Strengths and weaknesses of logistic regression.

A
  • Strengths: Easy to implement, interpretable, works well for linearly separable data.
  • Weaknesses: Overfits when there are many features, cannot model complex (non-linear) relationships between the features and the label, and handles multicollinearity poorly.
17
Q

What makes KNN a lazy learner?

A

No explicit training phase; memorizes the entire dataset.
Predictions rely on distance metrics (e.g., Euclidean) to find nearest neighbors.

18
Q

How does KNN handle classification vs. regression?

A
  • Classification: Majority vote of neighbors.
  • Regression: Average (or weighted average) of the target values of the k nearest neighbors.
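Both behaviors can be sketched with scikit-learn's two KNN estimators (the toy data and k=3 are illustrative choices):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Toy data: two well-separated clusters on a single feature.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_cls = np.array([0, 0, 0, 1, 1, 1])
y_reg = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

# Classification: majority vote among the k=3 nearest neighbors.
knn_c = KNeighborsClassifier(n_neighbors=3).fit(X, y_cls)
print(knn_c.predict([[2.5]]))   # [0]: all three neighbors are class 0

# Regression: mean of the k=3 nearest neighbors' targets.
knn_r = KNeighborsRegressor(n_neighbors=3).fit(X, y_reg)
print(knn_r.predict([[2.5]]))   # [2.]: mean of 1, 2, 3
```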
19
Q

What are KNN’s main weaknesses?

A
  • Slow predictions with large datasets.
  • Requires feature scaling.
  • Performs poorly on sparse data.
20
Q

Key parameters for KNN.

A
  • Number of neighbors (k): Small values (3-5) often work.
  • Distance metric: Default is Euclidean.