Machine Learning Flashcards
What are model performance metrics?
Performance metrics offer various perspectives on the model’s performance, allowing data scientists to choose appropriate evaluation criteria based on project objectives.
What is Ridge Regularization?
Ridge Regularization (L2) reduces overfitting by adding a penalty to the loss function that is proportional to the sum of the squares of the coefficients.
What does Ridge Regression aim to achieve?
Ridge Regression aims to reduce model complexity while keeping all predictors in the model.
What does the loss function in Ridge Regression consist of?
The loss function consists of the residual sum of squares (RSS) and a penalty term controlled by λ.
What is the effect of a larger λ in Ridge Regression?
A larger λ forces the coefficients to shrink more.
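A minimal sketch of the effect of λ, assuming a single predictor and no intercept, in which case the ridge solution has the closed form β = Σxy / (Σx² + λ). The function name and the data are illustrative only.

```python
def ridge_coefficient(xs, ys, lam):
    """Ridge estimate for one predictor without an intercept.

    Minimizes sum((y - beta * x) ** 2) + lam * beta ** 2.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Made-up data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# The coefficient shrinks toward zero as lam grows.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, round(ridge_coefficient(xs, ys, lam), 4))
```

With lam = 0 this reduces to ordinary least squares. The coefficient approaches zero as lam grows but never reaches it exactly.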
What is Lasso Regularization?
Lasso Regularization (L1) prevents overfitting by adding a penalty term that penalizes the sum of the absolute values of the model’s coefficients.
What is the main benefit of Lasso Regularization?
Lasso Regularization reduces overfitting by promoting sparsity and implicitly performing feature selection.
How does Lasso Regularization affect coefficients?
Lasso can shrink some coefficients to exactly zero, simplifying the model by excluding some features altogether.
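One way to see where the exact zeros come from: under the simplifying assumption of orthonormal predictors, the Lasso solution is each least-squares coefficient passed through a soft-thresholding operator. The sketch below uses hypothetical names and made-up coefficients. In practice the threshold t depends on λ and on how the loss is scaled.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


# Hypothetical least-squares coefficients for four features.
ols_coefficients = [2.5, -0.3, 0.8, -1.7]
shrunk = [soft_threshold(b, 1.0) for b in ols_coefficients]
print(shrunk)  # the two small coefficients become exactly zero
```

Features whose coefficient ends up at zero drop out of the model, which is the implicit feature selection described above.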
What is Elastic Net?
Elastic Net is a hybrid method that combines both Ridge and Lasso penalties.
When is Elastic Net particularly useful?
Elastic Net is useful when there are many correlated predictors.
What are the three main types of regularization techniques?
- Ridge Regularization (L2)
- Lasso Regularization (L1)
- Elastic Net (Hybrid Model)
What is the purpose of regularization techniques?
Regularization techniques are used to prevent overfitting and improve model performance.
What is the confusion matrix?
A confusion matrix is a table used to describe the performance of a classification model on a set of test data.
What does accuracy measure?
Accuracy measures the proportion of correct predictions among the total number of cases examined.
What is precision in classification metrics?
Precision is the ratio of true positive predictions to all positive predictions made by the model (true positives plus false positives).
What is recall in classification metrics?
Recall is the ratio of true positive predictions to the actual positives in the data.
What is the F1 Score?
The F1 Score is the harmonic mean of precision and recall, providing a balance between the two.
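The three definitions above can be sketched directly from confusion-matrix counts. The function name and the counts are illustrative, and returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 40 true positives, 10 false positives, 20 false negatives.
p, r, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print(round(p, 3), round(r, 3), round(f1, 3))
```

Because it is a harmonic mean, F1 sits closer to the lower of the two values, so a model cannot score well by maximizing only one of them.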
What does high bias indicate?
High bias indicates that the model is unable to learn the patterns in the data, leading to underfitting.
What does high variance indicate?
High variance indicates that the model learns noise from the training data, leading to overfitting.
What is the role of regularization in bias and variance?
Regularization techniques help to balance between bias and variance.
What is logistic regression used for?
Logistic regression is used for predicting a categorical dependent variable using independent variables.
How does logistic regression differ from linear regression?
Logistic regression predicts probabilities for categorical outcomes, while linear regression predicts continuous values.
What is the sigmoid function?
The sigmoid function maps predicted values to probabilities between 0 and 1, forming an S-shaped curve.
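A small sketch of the sigmoid, σ(z) = 1 / (1 + e^(−z)). The two-branch form is an implementation choice to avoid overflow for large negative inputs, and the weights in the usage lines are hypothetical.

```python
import math


def sigmoid(z):
    """Map a real number to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z this equivalent form avoids overflow in exp(-z).
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical fitted weight and intercept for a single feature.
weight, intercept = 1.5, -3.0
for x in (0.0, 2.0, 4.0):
    print(x, round(sigmoid(weight * x + intercept), 3))
```

At z = 0 the output is 0.5, which is why 0.5 is a common default threshold for turning the probability into a class label.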
What are the three types of logistic regression?
- Binomial
- Multinomial
- Ordinal
What is a key step in logistic regression modeling?
Define the problem by identifying the dependent and independent variables.
What is the purpose of exploratory data analysis (EDA) in logistic regression?
EDA visualizes relationships between variables and identifies outliers or anomalies.
What is overfitting?
Overfitting occurs when a model performs well on training data but fails to generalize to new, unseen data.
What are key indicators of overfitting?
- High training accuracy but low test accuracy
- High variance
- Model complexity
What are methods to avoid overfitting?
- Simplify the model
- Apply regularization
- Use cross-validation
- Use ensemble methods
- Increase training data
What is overfitting?
Overfitting occurs when a model learns specific details of the training data, leading to poor generalization on new data.
What can help a model generalize better?
Increasing training data can help a model generalize better by exposing it to diverse patterns.
What does ROC-AUC stand for?
Receiver Operating Characteristic - Area Under Curve.
What does the AUC value represent?
AUC represents the probability that the model ranks a randomly chosen positive example higher than a randomly chosen negative one.
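AUC can be read as a ranking score: the fraction of (positive, negative) pairs in which the positive example gets the higher score. The sketch below computes it by brute force over all pairs. The function name and the data are made up, and ties count as half a correct pair.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct pair
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
print(round(roc_auc(labels, scores), 4))
```

The pairwise loop is quadratic in the number of examples, so it is only meant to illustrate the definition on small inputs.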
What is a confusion matrix?
A confusion matrix is a table showing the actual vs. predicted classifications.
What do TN, TP, FP, and FN stand for in a confusion matrix?
- TN - True Negatives
- TP - True Positives
- FP - False Positives
- FN - False Negatives
How is accuracy defined?
Accuracy is defined as the ratio of the number of correct predictions to the total number of predictions.
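A sketch of how the four counts and accuracy come out of a list of actual and predicted labels, assuming a binary problem with 1 as the positive class. The names and data are illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)

# Accuracy is the share of predictions that were correct.
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(tp, fp, tn, fn, accuracy)
```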
What is R-squared (R²)?
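R-squared measures the proportion of variance in the outcome that a statistical model explains. It typically falls between 0 and 1, but can be negative when the model fits worse than simply predicting the mean.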
What does an R-squared value of 0.81 indicate?
It indicates that the input variables explain 81% of the variation in the output variable.
What is Adjusted R-squared?
Adjusted R-squared adjusts R-squared based on the number of predictors in the model, penalizing for irrelevant variables.
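A sketch of both quantities, using the standard formula Adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1), where n is the number of samples and p the number of predictors. The function names and the sample sizes in the usage lines are illustrative.

```python
def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R-squared for the number of predictors used."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Same R-squared of 0.81 on 50 samples, with 3 versus 10 predictors.
print(round(adjusted_r_squared(0.81, 50, 3), 4))
print(round(adjusted_r_squared(0.81, 50, 10), 4))
```

For a fixed R-squared, adding predictors lowers the adjusted value, so an extra variable only helps if it raises R-squared enough to offset the penalty.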
What are common techniques of feature engineering?
- Handling missing values
- Categorical encoding
- Feature scaling
- Feature creation
- Dimensionality reduction
- Variable transformations
What is feature selection?
Feature selection focuses on choosing a subset of the most relevant features from the available ones.
What are filter methods in feature selection?
Filter methods use statistical tests to score features based on their correlation with the target variable.
What is the purpose of using regularization techniques?
Regularization techniques reduce overfitting by penalizing large coefficients. L1 (Lasso) penalties also help with feature selection by driving the coefficients of irrelevant features to zero.
What is the Mean Absolute Error (MAE)?
MAE is the average of the absolute errors between predicted and actual values.
What does RMSE stand for?
Root Mean Squared Error.
How is RMSE calculated?
RMSE is calculated as the square root of the mean squared error (MSE).
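A sketch of MAE, MSE and RMSE on a small made-up set of actual and predicted values. The function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
print(mae(y_true, y_pred), mse(y_true, y_pred), round(rmse(y_true, y_pred), 4))
```

RMSE is never smaller than MAE on the same data, because squaring gives large errors more weight.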
What does precision measure?
Precision measures the ratio of true positives to the sum of true positives and false positives.
What is recall also known as?
Recall is also known as sensitivity.
What is specificity?
Specificity is the ratio of true negatives to the sum of true negatives and false positives.
What does log loss measure?
Log loss measures the performance of a classification model where the prediction is a probability value.
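A sketch of binary log loss (cross-entropy), assuming labels of 0 or 1 and predicted probabilities of the positive class. The clipping constant eps is an implementation choice to avoid taking log(0).

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Average binary cross-entropy; probs are predicted P(y = 1)."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


y_true = [1, 0, 1]
print(round(log_loss(y_true, [0.9, 0.1, 0.8]), 4))  # confident and correct
print(round(log_loss(y_true, [0.1, 0.9, 0.2]), 4))  # confident and wrong
```

Lower is better. Confident wrong predictions are penalized far more heavily than hesitant ones.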
What is the F1-score?
F1-score is the harmonic mean of precision and recall.
What does AUC-ROC stand for?
Area Under the Curve of the Receiver Operating Characteristic.
What is the formula for Mean Squared Error (MSE)?
MSE = Σ(y_i - p_i)² / n, where y_i is the actual value, p_i is the predicted value, and n is the number of observations.
What is logistic regression used for?
Logistic regression is used to predict a binary output variable.
Fill in the blank: The output variable in logistic regression is transformed using a _______ function.
logistic
True or False: A model with an AUC score of 0.5 is considered perfect.
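False. An AUC of 0.5 means the model does no better than random guessing; a perfect model has an AUC of 1.0.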
What happens to R-squared when irrelevant variables are added?
R-squared either stays the same or increases, even if the new variables do not relate to the output variable.
What is logistic regression?
A type of classification algorithm used to predict a binary output variable.
In logistic regression, what does the output variable get transformed into?
A probability value between 0 and 1.
What is Random Forest?
An ensemble technique capable of performing both regression and classification tasks using multiple decision trees.
What technique is commonly known as bagging?
Bootstrap aggregating (bootstrap aggregation).
What is the basic idea behind Random Forest?
To combine multiple decision trees in determining the final output.
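A sketch of the two ingredients of bagging for classification: drawing a bootstrap sample (same size as the data, with replacement) and aggregating predictions by majority vote. The trees themselves are omitted; the votes below are hypothetical outputs of five trees.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Aggregate class predictions from several models."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = list(range(10))
sample = bootstrap_sample(rows, rng)  # same size; duplicates are likely
print(sample)

# Hypothetical votes from five trees for one input.
print(majority_vote(["spam", "ham", "spam", "spam", "ham"]))
```

For regression, the aggregation step would average the trees' numeric predictions instead of voting.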
What is the first step in building a decision tree?
Select the root node based on which feature best splits the data.
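The card does not say how "best" is measured. The sketch below assumes Gini impurity, one common criterion (entropy-based information gain is another), and compares two hypothetical candidate splits. The split with the lower weighted impurity would be preferred.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of a two-way split."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


# Two hypothetical candidate splits of the same six labels.
split_a = (["yes", "yes", "yes"], ["no", "no", "no"])
split_b = (["yes", "no", "yes"], ["no", "yes", "no"])
print(split_impurity(*split_a), round(split_impurity(*split_b), 4))
```

Split A separates the classes perfectly (impurity 0), so the feature producing it would be chosen for the root.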
What does a confusion matrix show?
Counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).
What is Support Vector Regression (SVR)?
A type of support vector machine adapted for regression tasks, that is, for predicting continuous values.
What is the principle behind K-Nearest Neighbors (KNN)?
The predicted label is determined by the labels of its k nearest neighbors.
What type of learning does KNN represent?
Instance-based learning.
What is the purpose of choosing the right K value in KNN?
To improve prediction accuracy and avoid underfitting or overfitting.
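A minimal KNN classifier, assuming Euclidean distance and a simple majority vote. Both are common defaults rather than requirements, and the points and labels are made up.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (2, 2), k=3))
print(knn_predict(points, labels, (6, 5), k=3))
```

There is no training step: all the work happens inside `knn_predict` at query time. An odd k avoids tied votes in two-class problems.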
What is the formula for linear regression?
y = θx + b.
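A sketch of fitting θ and b by ordinary least squares for a single predictor, using the closed-form slope and intercept. The function name and the data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = theta * x + b with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - theta * mean_x
    return theta, b


# Data generated from y = 2x + 1 with no noise.
theta, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(theta, b)
```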
What does the term ‘overfitting’ refer to in linear regression?
When a model learns detail and noise in the training data, performing poorly on new data.
What is polynomial regression used for?
To model a non-linear relationship between the dependent variable and independent variables.
What is the general form of a polynomial regression model?
y = β₀ + β₁x + β₂x² + β₃x³ + … + β_dx^d + ϵ.
What does the term ‘bias’ refer to in linear regression?
The y-intercept (b), the value of y when x is zero.
What is a key characteristic of K-Nearest Neighbors (KNN)?
It performs computations only when making predictions.
What is a Decision Tree Classifier?
A supervised learning algorithm that splits the dataset into subsets based on feature values.
What are the major objectives of supervised learning?
- Prediction
- Generalization
- Optimization
- Evaluation
What is the definition of classification in machine learning?
A supervised learning task where the goal is to assign a label to an input based on learned patterns.
What are the types of classification?
- Binary Classification
- Multiclass Classification
- Multilabel Classification
What is the purpose of model evaluation in supervised learning?
To assess model performance using appropriate evaluation metrics.
What is the role of feature selection in the workflow of a classification model?
To identify relevant features.
What does ‘lazy learning’ mean in the context of KNN?
KNN doesn’t explicitly build a model during the training phase.
What is a common challenge in classification tasks?
Overfitting and underfitting.
What is the main goal of regression analysis?
To understand the relationship between one dependent variable and one or more independent variables.
What are some major types of regression techniques?
- Linear Regression
- Polynomial Regression
- Decision Tree Regression
- Random Forest Regression
- Support Vector Regression
- Ridge Regression
- Lasso Regression
- ElasticNet Regression
- Bayesian Linear Regression
What is the main focus of linear regression?
To model the linear relationship between a scalar response and one or more predictors.
What does the ‘degree of the polynomial’ indicate in polynomial regression?
The highest power of x.
Fill in the blank: A small K in KNN can lead to _______.
overfitting.
Fill in the blank: A large K in KNN can lead to _______.
underfitting.
True or False: KNN uses a model to predict values during the training phase.
False.
What is the formula for a polynomial regression model?
y = β₀ + β₁x + β₂x² + β₃x³ + … + β_dx^d + ϵ
Here, βᵢ are the coefficients, d is the degree of the polynomial, and ϵ is the error term.
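A sketch of evaluating the deterministic part of the model (without the error term ϵ) for given coefficients. Horner's rule is used here as an implementation choice, and the coefficients are hypothetical.

```python
def polynomial_predict(betas, x):
    """Evaluate beta_0 + beta_1*x + ... + beta_d*x**d using Horner's rule."""
    result = 0.0
    for beta in reversed(betas):
        result = result * x + beta
    return result


betas = [1.0, 0.5, -2.0]  # hypothetical coefficients, degree d = 2
print(polynomial_predict(betas, 3.0))  # 1 + 0.5*3 - 2*9
```

The length of `betas` is d + 1, so the degree is set simply by how many coefficients are supplied.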
What does the degree (d) in polynomial regression determine?
The complexity of the relationship being modeled.
Higher degrees can capture more intricate curves but also risk overfitting.
What is the goal of polynomial regression?
To find the values of β that minimize the sum of squared errors between predicted and actual values, giving the best fit for the data.
In terms of polynomial degree, what is linear regression characterized by?
d = 1
This is essentially standard linear regression.
What type of relationship does quadratic regression (d = 2) model?
U-shaped or inverted U-shaped relationships.
What is the defining feature of cubic regression (d = 3)?
It can capture more complex S-shaped curves.
What is a key characteristic of higher-order polynomials (d > 3)?
They can be used for very intricate relationships but are prone to overfitting.
What is regression analysis?
A statistical process for estimating the relationships between dependent and independent variables.
When is regression analysis typically used?
When dealing with a dataset that has the target variable in the form of continuous data.
What does a decision tree represent?
A flowchart-like tree structure for classification and prediction.
What does each internal node in a decision tree denote?
A test on an attribute.
What is a Naïve Bayes classifier based on?
Bayes’ Theorem
It is primarily used for classification tasks.
What is the key assumption of the Naïve Bayes algorithm?
Features are independent of each other.
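A sketch of what the independence assumption buys: the posterior for each class is proportional to the prior times a plain product of per-feature likelihoods. All probabilities below are made-up numbers for illustration, and the function name is hypothetical.

```python
def naive_bayes_posterior(priors, likelihoods):
    """Normalized posteriors from priors and per-feature likelihoods.

    priors: {class: P(class)}
    likelihoods: {class: [P(feature_i | class), ...]} for the observed features
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for p in likelihoods[cls]:
            score *= p  # independence assumption: multiply per-feature terms
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}


# Hypothetical numbers for an email containing the words "free" and "offer".
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {"spam": [0.8, 0.6], "ham": [0.1, 0.2]}
posterior = naive_bayes_posterior(priors, likelihoods)
print({cls: round(p, 4) for cls, p in posterior.items()})
```

With many features, implementations usually sum log-probabilities instead of multiplying, to avoid numerical underflow.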
What are the main applications of Naïve Bayes classifiers?
- Spam filtering
- Sentiment analysis
- Document classification
- Medical diagnosis
What does Gaussian Naïve Bayes assume about features?
They follow a normal distribution.
What is Multinomial Naïve Bayes used for?
Text classification (e.g., bag-of-words model).
What type of data does Bernoulli Naïve Bayes handle?
Binary feature data (e.g., presence or absence of a word in a document).
In regression, what is the target variable?
A continuous value.
What is the goal of classification?
To predict the class or category of the target variable based on input variables.
What are some examples of regression algorithms?
- Linear regression
- Polynomial regression
- Decision trees
What are examples of classification algorithms?
- Logistic regression
- Decision trees
- Support vector machines
- Neural networks
Fill in the blank: A regression problem is when the output variable is a _______.
real or continuous value.
Fill in the blank: A decision tree can be used to predict a _______ outcome.
continuous (with a regression tree) or categorical (with a classification tree).
True or False: Naïve Bayes is a non-parametric method.
False.