Midterm 1 Flashcards

1
Q

The model learns the relationship between inputs and outputs by minimizing _____ between predicted and actual values

A

the difference

2
Q

Supervised learning in regression tasks involves fitting the data to a _____ line using _____ data to predict an output y=h(x) from a given input x.

A

Straight, labeled

3
Q

The gradient descent algorithm is an optimization method used to minimize a cost function by iteratively updating model parameters in the direction of the _____, which is the _____ of the function

A

Steepest descent, negative gradient

4
Q

The update step size is controlled by the _____ , and the algorithm continues until convergence, which is typically defined by a sufficiently small change in the cost function

A

Learning rate

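To make cards 3 and 4 concrete, here is a minimal sketch of batch gradient descent on a mean-squared-error cost for the straight-line fit from card 2. The learning rate, tolerance, and toy data are illustrative choices, not values from the course.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, tol=1e-9, max_iters=100_000):
    """Minimize J(theta) = (1/2m) * ||X @ theta - y||^2 by stepping
    in the direction of steepest descent (the negative gradient)."""
    m, n = X.shape
    theta = np.zeros(n)
    prev_cost = np.inf
    for _ in range(max_iters):
        residual = X @ theta - y               # predicted minus actual
        cost = residual @ residual / (2 * m)
        theta -= lr * (X.T @ residual) / m     # negative-gradient step, scaled by lr
        if prev_cost - cost < tol:             # converged: cost barely changed
            break
        prev_cost = cost
    return theta

# Recover the line y = 1 + 2x from labeled data
X = np.c_[np.ones(50), np.linspace(0, 1, 50)]  # bias column + input x
y = X @ np.array([1.0, 2.0])
print(gradient_descent(X, y))                  # approximately [1. 2.]
```
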
5
Q

(T/F) Logistic regression can only be used for binary classification problems

A

False

6
Q

(T/F) The output of logistic regression is a probability value between 0 and 1

A

True

7
Q

(T/F) Logistic regression does not assume any relationship between the input features and the output

A

False

8
Q

(T/F) Logistic regression uses the sigmoid function to model the probability of a class

A

True

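A short sketch of the mechanics behind cards 6 and 8: the sigmoid squashes a linear score into (0, 1), which logistic regression reads as a class probability. The weights below are made up for illustration, not fitted. (Card 5 is False because the same idea extends past two classes, e.g., via one-vs-rest or softmax; card 7 is False because the model does assume a linear relationship between the features and the log-odds.)

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, theta):
    """Logistic regression: P(y = 1 | x) = sigmoid(theta . x),
    i.e., a linear model of the log-odds."""
    return sigmoid(X @ theta)

theta = np.array([0.5, -1.2])          # illustrative weights
X = np.array([[1.0, 0.0],              # bias term plus one feature
              [1.0, 2.0]])
probs = predict_proba(X, theta)
print(probs)                            # each value lies strictly in (0, 1)
print((probs >= 0.5).astype(int))       # threshold at 0.5 to pick a class
```
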
9
Q

(T/F) SVMs aim to find a decision boundary that maximizes the margin between classes

A

True

10
Q

(T/F) The SVM cost function can be approximated by piecewise linear functions, though this increases computational complexity

A

False

11
Q

(T/F) The data points closest to the decision boundary are called support vectors

A

True

12
Q

(T/F) SVMs achieve better generalization by maximizing the margin of separation

A

True

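A sketch of the soft-margin objective behind cards 9 through 12, assuming the standard linear hinge-loss formulation; C and the toy points are illustrative. Note that the hinge terms are already piecewise linear, which keeps the optimization simple rather than more expensive, which is why card 10 is False.

```python
import numpy as np

def svm_cost(w, b, X, y, C=1.0):
    """Soft-margin SVM objective: 0.5 * ||w||^2 + C * sum of hinge losses.
    Minimizing ||w|| maximizes the margin; labels y are +1 / -1."""
    margins = y * (X @ w + b)                 # how far each point sits from the boundary
    hinge = np.maximum(0.0, 1.0 - margins)    # piecewise-linear loss per point
    return 0.5 * (w @ w) + C * hinge.sum()

# Points with margin >= 1 contribute zero loss; the points on or inside
# the margin (the support vectors) are the ones that shape the boundary.
X = np.array([[2.0, 2.0], [-2.0, -1.0], [0.5, 0.5]])
y = np.array([1.0, -1.0, 1.0])
w, b = np.array([1.0, 0.0]), 0.0
print(svm_cost(w, b, X, y))                   # only the third point adds loss
```
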
12
Q

(T/F) Overfitting occurs when a model is too complex and captures noise in the training data

A

True

14
Q

(T/F) A model that overfits will have high training accuracy but poor test accuracy

A

True

15
Q

(T/F) Overfitting typically happens when the model has too few parameters relative to the training data

A

False

16
Q

(T/F) Regularization techniques like L1 or L2 can help prevent overfitting by simplifying the model

A

True

17
Q

(T/F) A model that overfits will perform well on both training and test data

A

False

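A small demo of the symptom described in cards 14 and 17: fitting polynomials of increasing degree to a few noisy points. The data, noise level, and degrees are arbitrary illustration choices; the typical outcome is that the high-degree fit drives training error to nearly zero while test error grows.

```python
import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(2 * np.pi * x)            # underlying pattern
x_train = np.linspace(0, 1, 10)
x_test = np.linspace(0.05, 0.95, 10)                # held-out points
y_train = true_f(x_train) + rng.normal(0, 0.2, 10)  # noisy training labels
y_test = true_f(x_test) + rng.normal(0, 0.2, 10)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    err = lambda x, y: np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: train MSE {err(x_train, y_train):.4f}, "
          f"test MSE {err(x_test, y_test):.4f}")
# degree 1 underfits (both errors high); degree 9 can interpolate all 10
# training points (train MSE ~ 0) yet do poorly on the held-out points.
```
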
18
Q

(T/F) Underfitting typically happens when the model has too many parameters relative to the training data

A

False

19
Q

(T/F) Underfitting occurs when a model is too simple to capture the underlying patterns in the data

A

True

20
Q

(T/F) Underfitting occurs when a model learns only the noise in the training data

A

False

21
Q

(T/F) A model that underfits will have both low training accuracy and low test accuracy

A

True

22
Q

(T/F) Increasing model complexity can help address underfitting

A

True

23
Q

(T/F) Regularization techniques add a penalty term to the model cost function to reduce the risk of overfitting

A

True

24
Q

(T/F) The regularization parameter controls the strength of regularization applied to the model

A

True

25
Q

(T/F) Small regularization parameters may allow the model to overfit the training data

A

True

26
Q

(T/F) Too large regularization parameters may result in underfitting by making the model too simple

A

True

27
Q

(T/F) Regularization can help prevent overfitting even when the training data is small, though more data may be needed for optimal test results

A

True

28
Q

(T/F) Increasing regularization always improves model performance, regardless of the situation

A

False
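
To ground cards 23 through 28, here is a hedged sketch using ridge (L2) regression, where the closed-form solution makes the effect of λ easy to see; the data and λ values are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
true_theta = np.array([3.0, 0.0, 0.0, 0.0, 0.0])    # only one feature matters
y = X @ true_theta + rng.normal(0, 0.5, 20)

for lam in (0.0, 1.0, 100.0):
    # Ridge: minimize ||X theta - y||^2 + lam * ||theta||^2, whose
    # closed form is theta = (X^T X + lam * I)^{-1} X^T y
    theta = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
    print(f"lambda = {lam:>5}: ||theta|| = {np.linalg.norm(theta):.3f}")
# lam = 0 is unregularized (free to fit noise, risk of overfitting);
# lam = 100 shrinks all the weights toward zero (risk of underfitting).
```
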

29
Q

(T/F) Removing irrelevant or redundant features can help prevent overfitting and improve model generalization

A

True

30
Q

(T/F) Adding more training data can reduce overfitting and improve generalization, especially for high-variance models

A

True

31
Q

(T/F) Adding polynomial features is a good strategy to fix high-variance (overfitting) problems

A

False

32
Q

(T/F) Reducing the number of features by removing irrelevant ones is a method of reducing the risk of overfitting in a machine learning model

A

True

33
Q

(T/F) Increasing the regularization parameter (λ) is a method of reducing the risk of overfitting in a machine learning model

A

True

34
Q

(T/F) Collecting more training data is a method of reducing the risk of overfitting in a machine learning model

A

True

35
Q

(T/F) Using a more complex model with higher polynomial features is a method of reducing the risk of overfitting in a machine learning model

A

False

36
Q

Formula for accuracy?

A

(TP + TN) / (TP + FP + TN + FN)

37
Q

When do we use regression?

A

When the target is a continuous value and we are trying to predict it

38
Q

When do we use classification?

A

When the data is labeled with discrete classes and we are trying to categorize it

39
Q

What are the evaluation metrics for regression?

A

MSE, RMSE, R-Squared

40
Q

What are the evaluation metrics for classification?

A

Accuracy, Precision, Recall, F1-Score

41
Q

Formula for precision?

A

TP/(TP + FP)

42
Q

Formula for recall?

A

TP/(TP + FN)

43
Q

Formula for F1-Score?

A

2 * [(Precision * Recall) / (Precision + Recall)]
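
A quick numeric check of the formulas in cards 36 and 41 through 43, using a made-up confusion matrix; the counts are chosen only to exercise the formulas.

```python
# Hypothetical confusion-matrix counts (not from any real dataset)
TP, FP, TN, FN = 40, 10, 30, 20

accuracy = (TP + TN) / (TP + FP + TN + FN)             # 70 / 100 = 0.700
precision = TP / (TP + FP)                             # 40 / 50  = 0.800
recall = TP / (TP + FN)                                # 40 / 60  ≈ 0.667
f1 = 2 * (precision * recall) / (precision + recall)   # ≈ 0.727

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```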