Decision Trees, Boosting, SVMs Flashcards

1
Q

Gini Impurity

A

The most common splitting criterion (impurity measure) used in classification trees within random forests. Gini impurity measures the likelihood of incorrectly classifying a randomly chosen element if it were randomly labeled according to the distribution of labels in the node. The goal is to minimize Gini impurity at each split, producing more homogeneous child nodes.
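In symbols, for class proportions p_k within a node, Gini impurity is G = 1 - sum_k p_k^2. A minimal sketch of the calculation (the label values are illustrative):

    from collections import Counter

    def gini_impurity(labels):
        # 1 minus the sum of squared class proportions; 0 for a pure node
        n = len(labels)
        return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

    gini_impurity(["blue"] * 10)               # 0.0 (pure node)
    gini_impurity(["blue"] * 5 + ["red"] * 5)  # 0.5 (maximally mixed two-class node)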

2
Q

Hinge Loss

A

The most common loss function for classification SVMs. Hinge loss penalizes predictions that are wrong, as well as correct predictions that are not far enough on the correct side of the decision boundary (i.e., within the margin). This encourages the model to create a larger margin between classes.
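In symbols, with labels y in {-1, +1} and a raw decision score f(x), the loss is max(0, 1 - y * f(x)). A minimal sketch:

    def hinge_loss(y, score):
        # Zero only when the prediction is correct AND outside the margin (y * score >= 1)
        return max(0.0, 1.0 - y * score)

    hinge_loss(+1, 2.0)   # 0.0 -> correct, outside the margin
    hinge_loss(+1, 0.5)   # 0.5 -> correct, but inside the margin
    hinge_loss(+1, -1.0)  # 2.0 -> wrong side of the boundary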

3
Q

Pruning

A
  • Removing branches that have weak predictive power in order to reduce the complexity of the model and improve the generalization accuracy of a decision tree.
  • Can happen bottom-up or top-down, with approaches such as reduced error pruning and cost complexity pruning (see the sketch after this list).
  • Reduced error pruning is perhaps the simplest, and it directly optimizes for held-out accuracy:
    • Replace each node with its most common class; if that doesn’t decrease predictive accuracy on a validation set, keep it pruned.
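A minimal scikit-learn-flavored sketch of cost complexity pruning; the alpha chosen here is illustrative (in practice it is selected on a validation set):

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    # Enumerate the effective alphas at which subtrees get collapsed...
    path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
    # ...then refit with a nonzero alpha to obtain a smaller, pruned tree.
    pruned = DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2], random_state=0).fit(X, y)
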
4
Q

What is a decision tree?

A

A tree structure where internal nodes represent feature tests, branches represent test outcomes, and leaf nodes represent decisions or classifications.

5
Q

What is information gain?

A

The reduction in entropy after a dataset is split on an attribute. It’s used to build decision trees.

Entropy can be thought of as how mixed or uncertain the data is. For example, a dataset of only blues would have very low entropy, while a dataset of mixed blues, greens, and reds would have relatively high entropy. High entropy means more uncertainty, while low entropy means more predictability.

Information gain is a measure of how much information a feature provides about a class. It’s calculated using entropy and is used to determine which feature should be used to split the data at each internal node of the decision tree. The greater the information gain, the greater the decrease in entropy or uncertainty.
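In symbols, entropy is H = -sum_k p_k * log2(p_k) over the class proportions p_k, and information gain is the parent's entropy minus the size-weighted entropy of the children. A minimal sketch:

    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(parent, children):
        # Parent entropy minus the size-weighted entropy of the child splits
        n = len(parent)
        return entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)

    # A perfectly separating split recovers all of the parent's uncertainty:
    information_gain(["b", "b", "r", "r"], [["b", "b"], ["r", "r"]])  # 1.0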

6
Q

What is a random forest?

A

An ensemble of decision trees where each tree is built on a random subset of the data and features. Predictions are made by averaging (regression) or majority voting (classification) over the trees.
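A minimal scikit-learn sketch (the hyperparameter values are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    # Each tree sees a bootstrap sample and considers sqrt(n_features) features per split.
    forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
    forest.fit(X, y)
    forest.predict(X[:3])  # majority vote across the 100 trees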

7
Q

What is bagging?

A

Bootstrap Aggregating (bagging) is a method that trains multiple models on different bootstrap samples of the training data (random samples drawn with replacement) and combines their predictions to improve accuracy.

This is essentially what a random forest is. Bagging is good for high-variance, low-bias models, since averaging many models reduces variance.
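A minimal scikit-learn sketch; BaggingClassifier bags decision trees by default:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import BaggingClassifier

    X, y = load_iris(return_X_y=True)
    # 50 trees, each fit on its own bootstrap sample of the training data
    bagger = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
    bagger.predict(X[:3])  # aggregated (voted) prediction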

8
Q

What is boosting?

A

A technique that combines weak learners (usually decision trees) sequentially, with each learner correcting errors of its predecessors.

This is what XGBoost does. Boosting is good for high-bias, low-variance situations, since sequential error-correction reduces bias.

9
Q

What is AdaBoost?

A

A boosting technique that increases the weights of incorrectly classified instances, so subsequent models focus on those harder cases.
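A minimal scikit-learn sketch (by default the weak learners are depth-1 trees, i.e., decision stumps):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import AdaBoostClassifier

    X, y = load_iris(return_X_y=True)
    # After each stage, misclassified samples are up-weighted for the next stump.
    ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
    ada.predict(X[:3])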

10
Q

What is gradient boosting?

A

A boosting technique where new models are trained to predict the residual errors of the existing model in a gradient descent manner.
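A minimal from-scratch sketch for squared-error regression, where the negative gradient is simply the residual (the tree depth and learning rate are illustrative choices):

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def gradient_boost(X, y, n_rounds=100, lr=0.1):
        pred = np.full(len(y), y.mean())  # start from the mean prediction
        trees = []
        for _ in range(n_rounds):
            residuals = y - pred  # negative gradient of squared error
            tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
            pred = pred + lr * tree.predict(X)  # take a small step toward the residuals
            trees.append(tree)
        return trees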

11
Q

What is XGBoost?

A

An optimized implementation of gradient boosting that adds regularization and highly efficient tree construction; it is widely used in machine learning competitions.
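A minimal sketch using the scikit-learn-style wrapper that ships with the xgboost package (assuming it is installed; the hyperparameters are illustrative):

    from sklearn.datasets import load_iris
    from xgboost import XGBClassifier

    X, y = load_iris(return_X_y=True)
    model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
    model.fit(X, y)
    model.predict(X[:3])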

12
Q

What is LightGBM?

A

A gradient boosting framework that uses tree-based learning algorithms and is designed for speed and efficiency.
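A minimal sketch using its scikit-learn-style wrapper (assuming the lightgbm package is installed):

    from sklearn.datasets import load_iris
    from lightgbm import LGBMClassifier

    X, y = load_iris(return_X_y=True)
    # LightGBM bins features into histograms and grows trees leaf-wise for speed.
    model = LGBMClassifier(n_estimators=200, num_leaves=31).fit(X, y)
    model.predict(X[:3])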

13
Q

What is a kernel in SVMs?

A

A function that lets an SVM operate in a high-dimensional feature space by implicitly mapping the input space into a higher-dimensional one, computing only inner products rather than the mapped coordinates themselves.

Hence, the “kernel trick.”
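A minimal sketch of the widely used RBF (Gaussian) kernel, which scikit-learn's SVC uses by default:

    import numpy as np

    def rbf_kernel(x, z, gamma=1.0):
        # K(x, z) = exp(-gamma * ||x - z||^2): an inner product in an implicit
        # infinite-dimensional feature space, computed without ever visiting it.
        return np.exp(-gamma * np.sum((x - z) ** 2))

    # In scikit-learn this is the default: SVC(kernel="rbf")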

14
Q

What is the margin in SVM?

A

The distance between the hyperplane and the nearest data points from each class (the support vectors). SVMs aim to maximize this margin, i.e., the separation between the classes.
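For a linear SVM with decision function f(x) = w·x + b, scaled so the nearest points satisfy |f(x)| = 1, the margin width is 2/||w||; maximizing the margin is therefore equivalent to minimizing ||w||.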

15
Q

What is soft margin in SVM?

A

A concept in SVMs where some misclassifications are allowed in order to balance the tradeoff between margin maximization and classification accuracy.
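Formally, slack variables ξ_i ≥ 0 let points sit inside the margin or on the wrong side, and the objective becomes minimizing (1/2)||w||² + C · Σ ξ_i. A large C punishes violations heavily (closer to a hard margin), while a small C tolerates them (a softer margin); in scikit-learn this is the C parameter of SVC.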

16
Q

What is the difference between bagging and boosting?

A

Bagging trains models independently and aggregates them, while boosting trains models sequentially, with each one focusing on the errors of the previous model.

17
Q

What is feature importance in decision trees?

A

A measure of the contribution of each feature to the model’s predictions, often based on Gini impurity reduction or information gain.
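A minimal scikit-learn sketch reading the impurity-based importances off a fitted forest:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    data = load_iris()
    forest = RandomForestClassifier(random_state=0).fit(data.data, data.target)
    # Impurity-based importances, normalized to sum to 1:
    for name, score in zip(data.feature_names, forest.feature_importances_):
        print(f"{name}: {score:.3f}")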

18
Q

What is the out-of-bag error in random forests?

A

The error rate estimated using the samples that were not included in the bootstrap samples (out-of-bag data) used to train individual trees.
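A minimal scikit-learn sketch; each tree's bootstrap sample leaves out roughly a third of the data, which serves as that tree's built-in validation set:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    forest = RandomForestClassifier(oob_score=True, random_state=0).fit(X, y)
    print(forest.oob_score_)  # accuracy measured only on out-of-bag samples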

19
Q

What is the CART algorithm?

A

Classification and Regression Trees (CART) is a decision tree algorithm that recursively splits data into subsets based on the values of input features. At each node, CART picks the split with the lowest weighted impurity score (e.g., Gini impurity for classification, mean squared error for regression).

20
Q

What is early stopping in gradient boosting?

A

A technique that stops training when performance on a validation set stops improving, in order to avoid overfitting.
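A minimal scikit-learn sketch (the holdout fraction and patience are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = load_iris(return_X_y=True)
    # Hold out 10% internally; stop once 10 consecutive rounds fail to improve on it.
    model = GradientBoostingClassifier(
        n_estimators=1000, validation_fraction=0.1, n_iter_no_change=10, random_state=0
    ).fit(X, y)
    print(model.n_estimators_)  # rounds actually trained before stopping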