V2024 ML Flashcards
Which of the following best describes supervised learning?
The machine learns from data that is unlabeled.
The machine learns from data that is labeled.
The machine learns to identify patterns in unlabeled data.
The machine uses a set of rules provided by a human expert to make decisions.
The machine learns from data that is labeled.
Supervised learning involves training a model on a labeled dataset, meaning that each training example is paired with an output label. The model learns to map inputs to outputs based on these labeled examples.
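As a quick illustration, here is a minimal Python sketch (using scikit-learn; the tiny dataset is a made-up assumption) of fitting a model on labeled examples:

    # Supervised learning: the model is fit on inputs X paired with labels y.
    from sklearn.tree import DecisionTreeClassifier

    X = [[1], [2], [9], [10]]    # inputs (features)
    y = [0, 0, 1, 1]             # the label paired with each input
    model = DecisionTreeClassifier().fit(X, y)
    print(model.predict([[8]]))  # maps a new input to a label, e.g. [1]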
What is the primary objective of a classification algorithm?
To predict a continuous value output.
To predict discrete class labels for given input data.
To find the mean and variance of the dataset.
To group similar data points together without prior knowledge of the categories.
To predict discrete class labels for given input data.
The primary objective of a classification algorithm is to assign discrete class labels to input data.
This contrasts with regression, which predicts continuous values, and clustering, which groups similar data points together without prior knowledge of the categories.
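A minimal sketch of the contrast (scikit-learn; the toy data is assumed for illustration):

    from sklearn.linear_model import LinearRegression, LogisticRegression

    X = [[1], [2], [3], [4]]
    clf = LogisticRegression().fit(X, [0, 0, 1, 1])        # discrete labels
    reg = LinearRegression().fit(X, [1.1, 2.0, 3.1, 3.9])  # continuous targets
    print(clf.predict([[2.5]]))  # a class label such as [0] or [1]
    print(reg.predict([[2.5]]))  # a continuous value, roughly 2.5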
In the context of machine learning, what is overfitting?
A model that performs well on training data but poorly on unseen data.
A model that performs poorly on both training data and unseen data.
A model that performs well on unseen data but poorly on training data.
A model that uses too few features to make predictions.
A model that performs well on training data but poorly on unseen data.
Overfitting occurs when a model learns the training data too well, capturing noise and details that do not generalize to unseen data. This results in a model that performs well on training data but poorly on new, unseen data.
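The gap can be seen in a small sketch (scikit-learn; the synthetic dataset and random seeds are assumptions):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # flip_y adds label noise that an unconstrained tree will memorize
    X, y = make_classification(n_samples=300, flip_y=0.2, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    print(tree.score(X_tr, y_tr))  # typically 1.0 on the training data
    print(tree.score(X_te, y_te))  # noticeably lower on unseen data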
Which of the following statements about decision trees is false?
Decision trees can handle both numerical and categorical data.
Decision trees require feature scaling.
Decision trees can be prone to overfitting.
Decision trees are easy to interpret and visualize.
Decision trees require feature scaling.
Decision trees do not require feature scaling because they partition the data based on feature values without considering their scale. They are known for their interpretability and ability to handle both numerical and categorical data. However, they can be prone to overfitting, particularly when they are allowed to grow deep.
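The scale-invariance is easy to demonstrate in a sketch (scikit-learn; the toy data is assumed): a tree splits on the ordering of feature values, so rescaling a feature leaves its predictions unchanged.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    X = np.array([[1.0], [2.0], [3.0], [4.0]])
    y = [0, 0, 1, 1]
    t1 = DecisionTreeClassifier(random_state=0).fit(X, y)
    t2 = DecisionTreeClassifier(random_state=0).fit(X * 1000, y)  # rescaled feature
    print(t1.predict([[2.5]]), t2.predict([[2500.0]]))  # same label either way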
Which technique can be used to prevent overfitting in decision trees?
Increasing the depth of the tree.
Decreasing the depth of the tree.
Using more test data.
Decreasing the depth of the tree.
Limiting a tree's depth (a simple form of pruning) restricts how finely it can partition the training data, so it is less able to memorize noise. Increasing the depth makes overfitting more likely, and using more test data does not change how the model fits the training set.
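A brief sketch of the effect (scikit-learn; the noisy synthetic dataset and the depth limit of 3 are illustrative assumptions):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=300, flip_y=0.2, random_state=0)
    deep = cross_val_score(DecisionTreeClassifier(random_state=0), X, y).mean()
    shallow = cross_val_score(DecisionTreeClassifier(max_depth=3, random_state=0), X, y).mean()
    print(deep, shallow)  # the depth-limited tree usually generalizes better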
What is ensemble learning?
Using a single machine learning model to make predictions.
Combining multiple machine learning models to improve performance.
Training models without any labeled data.
Using artificial neural networks to solve complex problems.
Combining multiple machine learning models to improve performance.
Ensemble learning is a technique where multiple machine learning models are combined to improve overall performance. This can help in reducing errors and improving prediction accuracy compared to using a single model.
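One common form is a voting ensemble, sketched here (scikit-learn; the choice of the three base models is an assumption for illustration):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(random_state=0)
    # Three different models vote; the majority label is the prediction.
    ensemble = VotingClassifier([
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier()),
    ]).fit(X, y)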
Which ensemble method combines multiple models by averaging their predictions to improve accuracy and reduce overfitting?
Boosting
Bagging
Stacking
Blending
Bagging
Bagging (Bootstrap Aggregating) involves training multiple models on different subsets of the data and then averaging their predictions to improve accuracy and reduce overfitting. Random forests are a popular example of bagging. Boosting, stacking, and blending are other ensemble methods but use different approaches for combining models.
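A bagging sketch (scikit-learn; 50 estimators is an arbitrary illustrative choice):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(random_state=0)
    # Each tree is fit on a bootstrap sample of the data; the ensemble
    # aggregates their predictions (majority vote for classification).
    bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                            random_state=0).fit(X, y)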
Which of the following statements about random forests are true? Choose all correct statements.
Random forests use a single decision tree to make predictions.
Random forests can handle large datasets with high dimensionality.
In random forests, each tree is trained on the entire dataset.
Random forests reduce overfitting by averaging multiple decision trees.
Random forests can handle large datasets with high dimensionality.
Random forests reduce overfitting by averaging multiple decision trees.
Random forests can handle large datasets with high dimensionality effectively (B), and they reduce overfitting by averaging the predictions of multiple decision trees, which leads to a more robust model (D). Random forests do not use a single decision tree; instead, they combine the results of multiple trees to make predictions (A). Each tree in a random forest is trained on a different bootstrap sample of the data, not on the identical full dataset, which helps in creating diverse trees (C).
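A minimal random forest sketch (scikit-learn; the 500-feature synthetic dataset is an assumption chosen to illustrate high dimensionality):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=500, random_state=0)
    # 100 trees, each grown on a bootstrap sample with a random subset
    # of features considered at each split; their votes are averaged.
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)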
Which of the following statements about cross-validation are true? Choose all correct statements.
Cross-validation helps to assess the performance of a model on an independent dataset.
Cross-validation is used to increase the size of the training dataset.
Cross-validation is only applicable to classification models, not to regression models.
Cross-validation can help in tuning hyperparameters of a machine learning model.
Cross-validation helps to assess the performance of a model on an independent dataset.
Cross-validation can help in tuning hyperparameters of a machine learning model.
Cross-validation is a technique used to assess how well a model generalizes to an independent dataset, addressing potential overfitting issues (A). It is also useful in hyperparameter tuning, providing a reliable estimate of model performance across different hyperparameter settings (D).
Cross-validation does not increase the size of the training dataset; rather, it splits the existing data into subsets to evaluate the model's performance (B). It is not limited to classification models; it applies equally to regression and other machine learning models (C).
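A k-fold cross-validation sketch (scikit-learn; the iris dataset and 5 folds are illustrative choices):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
    print(scores.mean())  # average held-out score across the 5 folds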
Which of the following statements about handling missing values in machine learning are true? Choose all correct statements.
Using algorithms that ignore missing values is an effective strategy.
Imputing missing values can improve the performance of a machine learning model.
Dropping rows with missing values is always the best approach.
Missing values can be ignored if they are not present in the test dataset.
Using algorithms that ignore missing values is an effective strategy.
Imputing missing values can improve the performance of a machine learning model.
Imputing missing values, such as using the mean, median, or mode, can improve the performance of a machine learning model by providing a more complete dataset (B). Some algorithms can effectively handle or ignore missing values internally, which simplifies preprocessing and can improve the model's performance (A). Missing values in the training data cannot simply be ignored just because the test set lacks them; they still affect how the model is fit (D). Dropping rows with missing values is not always the best approach, especially if it leads to a significant loss of data (C).
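An imputation sketch (scikit-learn; the toy array is assumed):

    import numpy as np
    from sklearn.impute import SimpleImputer

    X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan]])
    imputer = SimpleImputer(strategy="mean")  # "median", "most_frequent" also available
    print(imputer.fit_transform(X))  # each NaN replaced by its column's mean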
Which metric measures the proportion of actual positives that are correctly identified by a classification model?
Accuracy
Precision
Recall
F1 Score
Recall
Recall (also known as sensitivity or true positive rate) measures the proportion of actual positives that are correctly identified by the model. It is calculated as the number of true positives divided by the sum of true positives and false negatives, indicating how well the model captures the positive instances in the dataset. Precision, on the other hand, measures the proportion of predicted positives that are actually positive. Accuracy measures the overall correctness of the model, and the F1 score is the harmonic mean of precision and recall.
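Worked example (scikit-learn; the label vectors are made up): with 4 actual positives of which 2 are predicted positive, recall = TP / (TP + FN) = 2 / (2 + 2) = 0.5.

    from sklearn.metrics import recall_score

    y_true = [1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    # TP = 2 (indices 0, 1), FN = 2 (indices 2, 3)
    print(recall_score(y_true, y_pred))  # 0.5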
Describe the process of hyperparameter tuning and explain why it is crucial in machine learning. Provide an example of a method used for hyperparameter tuning.
Hyperparameter tuning involves selecting the best set of hyperparameters for a machine learning model to optimize its performance. Hyperparameters are parameters that are not learned during training but are set before the training process, such as the learning rate, the number of trees in a random forest, or the kernel type in an SVM. Tuning them is crucial because they can significantly affect the model's accuracy, convergence, and generalization. Common methods for hyperparameter tuning include Grid Search, which exhaustively searches over a specified parameter grid, and Random Search, which randomly samples parameter settings to find the best combination.
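A grid search sketch (scikit-learn; the SVM parameter grid shown is an illustrative assumption):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}, cv=5)
    grid.fit(X, y)  # cross-validates every combination in the grid
    print(grid.best_params_)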