Machine Learning Flashcards

Question 1

Q

What are model performance metrics?

Answer

A

Performance metrics are quantitative measures of how well a model performs. Different metrics offer different perspectives on performance, allowing data scientists to choose evaluation criteria that match the project objectives.

Question 2

Q

What is Ridge Regularization?

Answer

A

Ridge Regularization (L2) counters overfitting by adding to the loss function a penalty proportional to the sum of the squares of the coefficients.

Question 3

Q

What does Ridge Regression aim to achieve?

Answer

A

Ridge Regression aims to reduce model complexity while keeping all predictors in the model.

Question 4

Q

What does the loss function in Ridge Regression consist of?

Answer

A

The loss function consists of the residual sum of squares (RSS) and a penalty term controlled by λ.
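
As a sketch, the loss can be written out as below. The symbols n (number of observations), p (number of predictors), ŷ_i (predicted value), and β_j (coefficient of predictor j) are notation assumed here for illustration; the card itself only names RSS and λ.

```latex
L(\beta) = \underbrace{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}_{\text{RSS}}
         + \lambda \sum_{j=1}^{p} \beta_j^2
```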

Question 5

Q

What is the effect of a larger λ in Ridge Regression?

Answer

A

A larger λ forces the coefficients to shrink more.
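
A minimal sketch of this shrinkage, assuming the simplest possible case: one predictor and no intercept, where the ridge estimate has the closed form β = Σxy / (Σx² + λ). The function name and the data are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for a single predictor with no intercept.

    Minimizes sum((y - beta * x)^2) + lam * beta^2,
    which gives beta = sum(x * y) / (sum(x^2) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# The estimate moves toward zero as lam grows.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, round(ridge_slope(xs, ys, lam), 4))
```

With lam = 0 this reduces to ordinary least squares; the coefficient shrinks toward zero as lam grows but does not reach exactly zero for any finite lam.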

Question 6

Q

What is Lasso Regularization?

Answer

A

Lasso Regularization (L1) prevents overfitting by adding a penalty term that penalizes the sum of the absolute values of the model’s coefficients.

Question 7

Q

What is the main benefit of Lasso Regularization?

Answer

A

Lasso Regularization reduces overfitting by promoting sparsity and implicitly performing feature selection.

Question 8

Q

How does Lasso Regularization affect coefficients?

Answer

A

Lasso can shrink some coefficients to exactly zero, simplifying the model by excluding some features altogether.
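
A minimal sketch of why exact zeros appear, assuming the special case of orthonormal predictors and the objective ½·RSS + λ·Σ|β|. In that case the lasso solution is the soft-thresholded least-squares coefficient. The coefficient values below are hypothetical.

```python
def soft_threshold(beta, lam):
    """Move beta toward zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

# Hypothetical least-squares coefficients for four features.
ols_coefficients = [3.0, -0.4, 0.2, -2.5]
lasso_coefficients = [soft_threshold(b, 0.5) for b in ols_coefficients]
print(lasso_coefficients)
```

The two small coefficients are set to exactly zero, which is the feature-selection effect; the large ones are only reduced in magnitude.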

Question 9

Q

What is Elastic Net?

Answer

A

Elastic Net is a hybrid method that combines both Ridge and Lasso penalties.
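
A minimal sketch of the combined penalty, assuming one common parametrization in which a mixing weight (called l1_ratio here, a hypothetical name) splits a single strength λ between the L1 and L2 terms. Libraries differ in the exact scaling, so treat the formula as illustrative.

```python
def elastic_net_penalty(coefficients, lam, l1_ratio):
    """Penalty = lam * (l1_ratio * sum|b| + (1 - l1_ratio) * sum(b^2)).

    l1_ratio = 1.0 gives a pure Lasso penalty, l1_ratio = 0.0 a pure Ridge penalty.
    """
    l1 = sum(abs(b) for b in coefficients)
    l2 = sum(b * b for b in coefficients)
    return lam * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)

coefficients = [1.0, -2.0]  # hypothetical values
print(elastic_net_penalty(coefficients, lam=1.0, l1_ratio=1.0))  # Lasso part only
print(elastic_net_penalty(coefficients, lam=1.0, l1_ratio=0.0))  # Ridge part only
print(elastic_net_penalty(coefficients, lam=1.0, l1_ratio=0.5))  # even mix
```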

Question 10

Q

When is Elastic Net particularly useful?

Answer

A

Elastic Net is useful when there are many correlated predictors.

Question 11

Q

What are the three main types of regularization techniques?

Answer

A

Ridge Regularization (L2)
Lasso Regularization (L1)
Elastic Net (Hybrid Model)

Question 12

Q

What is the purpose of regularization techniques?

Answer

A

Regularization techniques are used to prevent overfitting and improve model performance.

Question 13

Q

What is the confusion matrix?

Answer

A

A confusion matrix is a table used to describe the performance of a classification model on a set of test data.

Question 14

Q

What does accuracy measure?

Answer

A

Accuracy measures the proportion of correct predictions among the total number of cases examined.

Question 15

Q

What is precision in classification metrics?

Answer

A

Precision is the ratio of true positive predictions to all positive predictions (true positives plus false positives).

Question 16

Q

What is recall in classification metrics?

Answer

A

Recall is the ratio of true positive predictions to the actual positives in the data.

Question 17

Q

What is the F1 Score?

Answer

A

The F1 Score is the harmonic mean of precision and recall, providing a balance between the two.
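
A minimal sketch computing precision, recall, and F1 from confusion-matrix counts. The counts are hypothetical, and returning 0.0 when a ratio is undefined is a convention assumed here, not something the cards specify.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true positive, false positive, false negative counts."""
    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts chosen only for illustration.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(precision, recall, f1)
```

Because it is a harmonic mean, F1 sits closer to the lower of the two values than an arithmetic mean would.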

Question 18

Q

What does high bias indicate?

Answer

A

High bias indicates that the model is unable to learn the patterns in the data, leading to underfitting.

Question 19

Q

What does high variance indicate?

Answer

A

High variance indicates that the model learns noise from the training data, leading to overfitting.

Question 20

Q

What is the role of regularization in bias and variance?

Answer

A

Regularization helps balance bias and variance: it accepts a small increase in bias in exchange for a reduction in variance.

Question 21

Q

What is logistic regression used for?

Answer

A

Logistic regression is used for predicting a categorical dependent variable using independent variables.

Question 22

Q

How does logistic regression differ from linear regression?

Answer

A

Logistic regression predicts probabilities for categorical outcomes, while linear regression predicts continuous values.

Question 23

Q

What is the sigmoid function?

Answer

A

The sigmoid function maps predicted values to probabilities between 0 and 1, forming an S-shaped curve.
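
A minimal sketch of the sigmoid, σ(z) = 1 / (1 + e^(−z)). The two-branch form is an implementation choice made here to avoid overflow for large negative inputs.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z) so exp() cannot overflow.
    ez = math.exp(z)
    return ez / (1.0 + ez)

for z in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(z, round(sigmoid(z), 4))
```

The output is 0.5 at z = 0 and approaches 0 and 1 at the extremes, which gives the S-shaped curve.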

Question 24

Q

What are the three types of logistic regression?

Answer

A

Binomial
Multinomial
Ordinal

Question 25

Q

What is a key step in logistic regression modeling?

Answer

A

Define the problem by identifying the dependent and independent variables.

Question 26

Q

What is the purpose of exploratory data analysis (EDA) in logistic regression?

Answer

A

EDA visualizes relationships between variables and identifies outliers or anomalies.

Question 27

Q

What is overfitting?

Answer

A

Overfitting occurs when a model performs well on training data but fails to generalize to new, unseen data.

Question 28

Q

What are key indicators of overfitting?

Answer

A

High training accuracy but low test accuracy
High variance
Model complexity

Question 29

Q

What are methods to avoid overfitting?

Answer

A

Simplify the model
Apply regularization
Use cross-validation
Use ensemble methods
Increase training data

Question 30

Q

What is overfitting?

Answer

A

Overfitting occurs when a model learns specific details of the training data, leading to poor generalization on new data.

Question 31

Q

What can help a model generalize better?

Answer

A

Increasing training data can help a model generalize better by exposing it to diverse patterns.

Question 32

Q

What does ROC-AUC stand for?

Answer

A

Receiver Operating Characteristic - Area Under Curve.

Question 33

Q

What does the AUC value represent?

Answer

A

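AUC represents how well the model distinguishes between the positive and negative classes: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example.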
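
A minimal sketch of AUC as a ranking probability: over all (positive, negative) pairs, count how often the positive example gets the higher score, with ties counted as half. The labels and scores are hypothetical, and this pairwise loop is meant for illustration, not for large datasets.

```python
def auc_pairwise(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive is scored higher."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_pairwise(labels, scores)
print(auc)
```

A value of 1.0 means every positive outranks every negative; 0.5 is what random scoring would give on average.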

Question 34

Q

What is a confusion matrix?

Answer

A

A confusion matrix is a table showing the actual vs. predicted classifications.

Question 35

Q

What do TN, TP, FP, and FN stand for in a confusion matrix?

Answer

A

TN - True Negatives
TP - True Positives
FP - False Positives
FN - False Negatives
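
A minimal sketch that counts the four cells from binary labels (1 = positive, 0 = negative) and derives accuracy and specificity from them. The label lists are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, tp, fp, fn) for binary labels where 1 is the positive class."""
    tn = tp = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tn, tp, fp, fn

# Hypothetical actual and predicted labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tn, tp, fp, fn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + tn + fp + fn)
specificity = tn / (tn + fp)
print(tn, tp, fp, fn, accuracy, specificity)
```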

Question 36

Q

How is accuracy defined?

Answer

A

Accuracy is defined as the ratio of the number of correct predictions to the total number of predictions.

Question 37

Q

What is R-squared (R²)?

Answer

A

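R-squared measures the proportion of the variation in the outcome that a statistical model explains. Values usually fall between 0 and 1, although it can be negative when a model predicts worse than simply using the mean.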
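
A minimal sketch using the standard definition R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations from the mean. The data are hypothetical.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]
good_predictions = [1.1, 1.9, 3.2, 3.8]
bad_predictions = [4.0, 3.0, 2.0, 1.0]

print(r_squared(actual, good_predictions))  # close to 1
print(r_squared(actual, bad_predictions))   # below 0: worse than predicting the mean
```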

Question 38

Q

What does an R-squared value of 0.81 indicate?

Answer

A

It indicates that the input variables explain 81% of the variation in the output variable.

Question 39

Q

What is Adjusted R-squared?

Answer

A

Adjusted R-squared adjusts R-squared based on the number of predictors in the model, penalizing for irrelevant variables.
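
A minimal sketch using the standard formula Adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1), where n is the number of observations and p the number of predictors. The sample size and predictor counts below are hypothetical.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors used."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Same R^2, same hypothetical sample size, different numbers of predictors.
print(adjusted_r2(0.81, n_samples=50, n_predictors=5))
print(adjusted_r2(0.81, n_samples=50, n_predictors=20))
```

For the same R², the model with more predictors gets the lower adjusted value.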

Question 40

Q

What are common techniques of feature engineering?

Answer

A

Handling missing values
Categorical encoding
Feature scaling
Feature creation
Dimensionality reduction
Variable transformations

Question 41

Q

What is feature selection?

Answer

A

Feature selection focuses on choosing a subset of the most relevant features from the available ones.

Question 42

Q

What are filter methods in feature selection?

Answer

A

Filter methods use statistical tests to score features based on their correlation with the target variable.

Question 43

Q

What is the purpose of using regularization techniques?

Answer

A

Regularization techniques reduce overfitting by penalizing large coefficients. L1-based methods such as Lasso also help with feature selection by shrinking the coefficients of irrelevant features to zero.

Question 44

Q

What is the Mean Absolute Error (MAE)?

Answer

A

MAE is the average of the absolute errors between predicted and actual values.

Question 45

Q

What does RMSE stand for?

Answer

A

Root Mean Squared Error.

Question 46

Q

How is RMSE calculated?

Answer

A

RMSE is calculated as the square root of the mean squared error (MSE).
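
A minimal sketch computing MAE, MSE, and RMSE on the same hypothetical data, to show how the three relate.

```python
import math

def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the MSE, expressed in the same units as the target."""
    return math.sqrt(mse(actual, predicted))

# Hypothetical values.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
print(mae(actual, predicted), mse(actual, predicted), rmse(actual, predicted))
```

Squaring weights the single error of 2 more heavily than the errors of 1, so RMSE comes out larger than MAE here.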

Question 47

Q

What does precision measure?

Answer

A

Precision measures the ratio of true positives to the sum of true positives and false positives.

Question 48

Q

What is recall also known as?

Answer

A

Recall is also known as sensitivity.

Question 49

Q

What is specificity?

Answer

A

Specificity is the ratio of true negatives to the sum of true negatives and false positives.

Question 50

Q

What does log loss measure?

Answer

A

Log loss measures the performance of a classification model where the prediction is a probability value.
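
A minimal sketch of binary log loss, −mean(y·log(p) + (1 − y)·log(1 − p)). Clipping probabilities with a small eps is a common safeguard against log(0) and is assumed here; the inputs are hypothetical.

```python
import math

def log_loss(labels, probabilities, eps=1e-15):
    """Average negative log-likelihood of the true labels (1 or 0)."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 or 1
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

labels = [1, 0, 1]
confident_and_right = [0.9, 0.1, 0.8]
confident_and_wrong = [0.1, 0.9, 0.2]
print(log_loss(labels, confident_and_right))
print(log_loss(labels, confident_and_wrong))
```

Confident wrong predictions are penalized much more heavily than confident correct ones are rewarded.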

Question 51

Q

What is the F1-score?

Answer

A

F1-score is the harmonic mean of precision and recall.

Question 52

Q

What does AUC-ROC stand for?

Answer

A

Area Under the Curve - Receiver Operating Characteristic.

Question 53

Q

What is the formula for Mean Squared Error (MSE)?

Answer

A

MSE = (1/n) Σ (y_i - p_i)², where y_i is the actual value, p_i is the predicted value, and n is the number of observations.

Question 54

Q

What is logistic regression used for?

Answer

A

Logistic regression is used to predict a binary output variable.

Question 55

Q

Fill in the blank: The output variable in logistic regression is transformed using a _______ function.

Answer

A

Sigmoid (logistic).

Question 56

Q

True or False: A model with an AUC score of 0.5 is considered perfect.

Answer

A

False. An AUC of 0.5 corresponds to random guessing; a perfect model has an AUC of 1.0.

Question 57

Q

What happens to R-squared when irrelevant variables are added?

Answer

A

R-squared either stays the same or increases, even if the new variables do not relate to the output variable.

Question 58

Q

What is logistic regression?

Answer

A

A type of classification algorithm used to predict a binary output variable.

Question 59

Q

In logistic regression, what does the output variable get transformed into?

Answer

A

A probability value between 0 and 1.

Question 60

Q

What is Random Forest?

Answer

A

An ensemble technique capable of performing both regression and classification tasks using multiple decision trees.

Question 61

Q

What technique is commonly known as bagging?

Answer

A

Bootstrap Aggregation.
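
A minimal sketch of the idea: train each base model on a bootstrap sample (drawn with replacement), then aggregate by majority vote. The base model here is a hypothetical one-feature threshold rule, chosen only to keep the example short. A Random Forest would use decision trees and also subsample features, which this sketch does not do.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Pick the threshold t with the fewest errors for the rule: predict 1 if x > t else 0."""
    best_t, best_err = None, None
    for t, _ in sample:
        err = sum((1 if x > t else 0) != y for x, y in sample)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagged_predict(stumps, x):
    """Aggregate the individual predictions by majority vote."""
    votes = [1 if x > t else 0 for t in stumps]
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical (x, label) pairs.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]

stumps = []
for _ in range(25):  # odd number of models, so a two-class vote cannot tie
    sample = [rng.choice(data) for _ in data]  # bootstrap: draw with replacement
    stumps.append(fit_stump(sample))

print(bagged_predict(stumps, 0.2), bagged_predict(stumps, 5.0))
```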

Question 62

Q

What is the basic idea behind Random Forest?

Answer

A

To combine multiple decision trees in determining the final output.

Question 63

Q

What is the first step in building a decision tree?

Answer

A

Select the root node based on which feature best splits the data.
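
A minimal sketch of picking the root feature, assuming Gini impurity as the measure of how well a feature splits the data (the card does not name a criterion; entropy or information gain are other common choices). The tiny dataset and feature names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for a more mixed one."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(rows, labels, feature):
    """Weighted Gini impurity after grouping rows by the value of one feature."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    n = len(labels)
    return sum(len(group) / n * gini(group) for group in groups.values())

rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

# The root is the feature whose split leaves the least impurity.
root = min(["outlook", "windy"], key=lambda f: split_impurity(rows, labels, f))
print(root)
```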

Question 64

Q

What does a confusion matrix show?

Answer

A

Counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).

Answer 63

A

A type of support vector machine used for both classification and regression tasks.

Answer 64

A

The predicted label is determined by the labels of its k nearest neighbors.
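
A minimal sketch of that rule, assuming Euclidean distance and a simple majority vote among the k closest training points. The training points and labels are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Hypothetical (point, label) pairs forming two clusters.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((4.0, 4.0), "b"), ((4.2, 3.9), "b"), ((3.8, 4.1), "b"),
]
print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.0, 3.8)))
```

All of the work happens inside knn_predict at query time; nothing is fitted beforehand.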

Answer 65

A

Instance-based learning.

Answer 66

A

To improve prediction accuracy and avoid underfitting or overfitting.

Answer 67

A

y = θx + b.

Answer 68

A

When a model learns detail and noise in the training data, performing poorly on new data.

Answer 69

A

To model a non-linear relationship between the dependent variable and independent variables.

Answer 70

A

y = β₀ + β₁x + β₂x² + β₃x³ + … + β_dx^d + ϵ.

Answer 71

A

The y-intercept (b), the value of y when x is zero.

Answer 72

A

It performs computations only when making predictions.

Answer 73

A

A supervised learning algorithm that splits the dataset into subsets based on feature values.

Answer 74

A

Prediction
Generalization
Optimization
Evaluation

Answer 75

A

A supervised learning task where the goal is to assign a label to an input based on learned patterns.

Answer 76

A

Binary Classification
Multiclass Classification
Multilabel Classification

Answer 77

A

To assess model performance using appropriate evaluation metrics.

Answer 78

A

To identify relevant features.

Answer 79

A

KNN doesn’t explicitly build a model during the training phase.

Answer 80

A

Overfitting and underfitting.

Answer 81

A

To understand the relationship between one dependent variable and one or more independent variables.

Answer 82

A

Linear Regression
Polynomial Regression
Decision Tree Regression
Random Forest Regression
Support Vector Regression
Ridge Regression
Lasso Regression
ElasticNet Regression
Bayesian Linear Regression

Answer 83

A

To model the relationship between a scalar response and multiple predictors.

Answer 84

A

The highest power of x.

Answer 85

A

overfitting.

Answer 86

A

underfitting.

Answer 87

A

y = β₀ + β₁x + β₂x² + β₃x³ + … + β_dx^d + ϵ

Here, βᵢ are the coefficients, d is the degree of the polynomial, and ϵ is the error term.
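
A minimal sketch of evaluating this model for given coefficients, using Horner's rule and leaving out the error term ϵ. The coefficient values are hypothetical; fitting them from data is not shown.

```python
def poly_predict(betas, x):
    """Evaluate β0 + β1*x + ... + βd*x^d, where betas = [β0, β1, ..., βd]."""
    result = 0.0
    for beta in reversed(betas):  # Horner's rule: start from the highest-degree term
        result = result * x + beta
    return result

quadratic = [1.0, 2.0, 3.0]  # degree d = 2: 1 + 2x + 3x²
linear = [1.0, 2.0]          # degree d = 1: 1 + 2x
print(poly_predict(quadratic, 2.0))
print(poly_predict(linear, 3.0))
```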

Answer 88

A

The complexity of the relationship being modeled

Higher degrees can capture more intricate curves but also risk overfitting.

Answer 89

A

To find the values for β that minimize the error (typically the sum of squared residuals) and provide the best fit for the data.

Answer 90

A

d = 1

This is essentially standard linear regression.

Answer 91

A

U-shaped or inverted U-shaped relationships.

Answer 92

A

It can capture more complex S-shaped curves.

Answer 93

A

They can be used for very intricate relationships but are prone to overfitting.

Answer 94

A

A statistical process for estimating the relationships between dependent and independent variables.

Answer 95

A

When dealing with a dataset that has the target variable in the form of continuous data.

Answer 96

A

A flowchart-like tree structure for classification and prediction.

Answer 97

A

A test on an attribute.

Answer 98

A

Bayes’ Theorem

It is primarily used for classification tasks.

Answer 99

A

Features are independent of each other.

Answer 100

A

Spam filtering
Sentiment analysis
Document classification
Medical diagnosis

Answer 101

A

They follow a normal distribution.

Answer 102

A

Text classification (e.g., bag-of-words model).

Answer 103

A

Binary feature data (e.g., presence or absence of a word in a document).

Answer 104

A

A continuous value.

Answer 105

A

To predict the class or category of the target variable based on input variables.

Answer 106

A

Linear regression
Polynomial regression
Decision trees

Answer 107

A

Logistic regression
Decision trees
Support vector machines
Neural networks

Answer 108

A

real or continuous value.

Answer 109

A

continuous.