8. Machine Learning & Statistical Concepts Flashcards
What is overfitting in machine learning?
Overfitting occurs when a model learns the training data too well, capturing noise instead of general patterns.
What is underfitting?
Underfitting occurs when a model is too simple to capture underlying patterns in the data.
What is the bias-variance tradeoff?
The tradeoff between error from overly simple assumptions (bias) and error from sensitivity to the particular training set (variance); for squared error this is the standard decomposition expected test error = bias² + variance + irreducible noise, so making a model more flexible lowers bias but raises variance, and vice versa.
What is cross-validation?
A resampling method used to evaluate a model’s performance on unseen data.
What is k-fold cross-validation?
A method where the dataset is split into k subsets, and the model is trained k times, each time using a different subset as the validation set.
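A minimal k-fold sketch using scikit-learn; the dataset and the LogisticRegression stand-in classifier are hypothetical:

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.linear_model import LogisticRegression

    X = np.random.rand(100, 3)                    # hypothetical features
    y = (X[:, 0] + X[:, 1] > 1).astype(int)       # hypothetical labels

    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = []
    for train_idx, val_idx in kf.split(X):
        model = LogisticRegression().fit(X[train_idx], y[train_idx])   # train on k-1 folds
        scores.append(model.score(X[val_idx], y[val_idx]))             # evaluate on the held-out fold
    print(sum(scores) / len(scores))              # average validation accuracy across folds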
What is leave-one-out cross-validation (LOOCV)?
A special case of k-fold where k equals the number of samples, leaving one sample for validation at each step.
What is feature scaling?
A preprocessing step that normalizes or standardizes data for better model performance.
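A minimal NumPy sketch of the two most common scalings, on hypothetical data:

    import numpy as np

    X = np.random.rand(100, 3) * 50               # hypothetical unscaled features
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)                       # standardization: zero mean, unit variance
    X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))   # min-max normalization to [0, 1]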
What is the curse of dimensionality?
The phenomenon where high-dimensional data causes issues such as sparsity and increased computational cost.
What is PCA (Principal Component Analysis)?
A dimensionality reduction technique that projects data onto new axes maximizing variance.
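A minimal PCA sketch via the SVD of centered data (the data is hypothetical):

    import numpy as np

    X = np.random.rand(200, 5)                    # hypothetical data
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = 2
    X_reduced = Xc @ Vt[:k].T                     # project onto the top-k principal axes
    explained = (S ** 2) / np.sum(S ** 2)         # fraction of variance captured by each axis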
What is the difference between supervised and unsupervised learning?
Supervised learning involves labeled data, while unsupervised learning deals with unlabeled data.
What is a loss function?
A function that measures how well a model’s predictions match actual values.
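Two common loss functions sketched in NumPy, on hypothetical predictions:

    import numpy as np

    y_true = np.array([3.0, -0.5, 2.0])           # hypothetical regression targets
    y_pred = np.array([2.5,  0.0, 2.0])           # hypothetical predictions
    mse = np.mean((y_pred - y_true) ** 2)         # mean squared error

    labels = np.array([1, 0, 1])                  # hypothetical binary labels
    probs  = np.array([0.9, 0.2, 0.7])            # predicted probabilities of class 1
    bce = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))  # binary cross-entropy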
What is a confusion matrix?
A table that summarizes classification model performance by showing TP, FP, TN, FN.
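A minimal sketch of the four counts, using hypothetical labels and predictions:

    import numpy as np

    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # hypothetical labels
    y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])   # hypothetical predictions

    tp = np.sum((y_pred == 1) & (y_true == 1))    # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))    # false positives
    tn = np.sum((y_pred == 0) & (y_true == 0))    # true negatives
    fn = np.sum((y_pred == 0) & (y_true == 1))    # false negatives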
What is precision in classification?
The proportion of true positives among all positive predictions (TP / (TP + FP)).
What is recall (sensitivity)?
The proportion of true positives among all actual positives (TP / (TP + FN)).
What is F1-score?
The harmonic mean of precision and recall, balancing the two metrics.
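A minimal sketch of precision, recall, and F1 from assumed toy confusion-matrix counts:

    tp, fp, fn = 8, 2, 4                          # hypothetical counts
    precision = tp / (tp + fp)                    # 0.8: fraction of positive predictions that are correct
    recall = tp / (tp + fn)                       # ~0.67: fraction of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean, ~0.73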
What is an ROC curve?
A graph that plots a classifier’s true positive rate against its false positive rate as the decision threshold is varied.
What is AUC (Area Under Curve)?
A measure of a model’s ability to distinguish between classes, where 1 is perfect and 0.5 is random guessing.
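A minimal sketch using scikit-learn’s roc_curve and roc_auc_score on hypothetical labels and scores:

    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    y_true  = np.array([0, 0, 1, 1, 0, 1])                # hypothetical labels
    y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9])   # hypothetical predicted probabilities

    fpr, tpr, thresholds = roc_curve(y_true, y_score)     # points on the ROC curve
    auc = roc_auc_score(y_true, y_score)                   # area under that curve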
What is L1 regularization (Lasso)?
A method that adds absolute values of coefficients to the loss function, encouraging sparsity.
What is L2 regularization (Ridge)?
A method that adds squared coefficients to the loss function, preventing overfitting.
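A minimal sketch of both penalties added to a linear model’s squared-error loss; the data, weights, and alpha are hypothetical:

    import numpy as np

    X = np.random.rand(100, 4)                    # hypothetical features
    y = np.random.rand(100)                       # hypothetical targets
    w = np.random.rand(4)                         # linear model weights
    alpha = 0.1                                   # regularization strength

    mse = np.mean((X @ w - y) ** 2)
    lasso_loss = mse + alpha * np.sum(np.abs(w))  # L1 penalty: pushes weights toward exactly zero
    ridge_loss = mse + alpha * np.sum(w ** 2)     # L2 penalty: shrinks weights smoothly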
What is dropout in neural networks?
A regularization technique that randomly drops units during training to prevent overfitting.
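A minimal sketch of inverted dropout applied to hypothetical layer activations:

    import numpy as np

    activations = np.random.rand(4, 16)           # hypothetical layer activations
    p = 0.5                                       # dropout probability
    mask = (np.random.rand(*activations.shape) > p) / (1 - p)  # scaling keeps the expected activation unchanged
    dropped = activations * mask                  # applied at training time only; inference uses the full layer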
What is ensemble learning?
A technique combining multiple models to improve performance.
What is bagging?
An ensemble method (bootstrap aggregating) that trains multiple models on bootstrap samples of the data and averages or votes on their predictions.
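A minimal bagging sketch with scikit-learn decision trees on hypothetical regression data:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.random((200, 3))                      # hypothetical data
    y = X[:, 0] + rng.normal(0, 0.1, 200)

    preds = []
    for _ in range(20):                           # 20 bootstrap rounds
        idx = rng.integers(0, len(X), len(X))     # sample rows with replacement
        tree = DecisionTreeRegressor().fit(X[idx], y[idx])
        preds.append(tree.predict(X))
    bagged = np.mean(preds, axis=0)               # average the individual predictions

For classification, the averaging step would be replaced by a majority vote.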
What is boosting?
An ensemble method that trains models sequentially, focusing on errors made by previous models.
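A minimal boosting-style sketch (gradient boosting on residuals with decision stumps); the data and hyperparameters are hypothetical:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.random((200, 3))                      # hypothetical data
    y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.1, 200)

    pred = np.full(len(y), y.mean())              # start from a constant prediction
    lr = 0.1                                      # learning rate
    for _ in range(100):
        residuals = y - pred                      # errors of the current ensemble
        stump = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
        pred += lr * stump.predict(X)             # each new model corrects the previous models' errors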