Machine Learning Flashcards

1
Q

What is a confidence interval?

A

A 95% confidence interval is defined as the range of values that, with 95% probability, will contain the true value of the unknown parameter.
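
A concrete illustration (not part of the original card): given an estimate and its standard error, an approximate 95% interval is the estimate ± 2 standard errors. A minimal numpy sketch for a population mean:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=200)   # true mean is 5.0

mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(len(sample))      # standard error of the mean

# Approximate 95% confidence interval: estimate +/- 2 standard errors.
lower, upper = mean - 2 * se, mean + 2 * se
print(f"Approximate 95% CI for the mean: [{lower:.3f}, {upper:.3f}]")
```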

2
Q

What is residual standard error?

A

It is an estimate of the standard deviation of the error term ε. Roughly, it is the average amount by which the response deviates from the true regression line.
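
For simple linear regression the formula is RSE = sqrt(RSS / (n − 2)); with p predictors the denominator becomes n − p − 1. A minimal numpy sketch (the function name is illustrative, not from any library):

```python
import numpy as np

def residual_standard_error(y, y_hat, p=1):
    """Estimate of the std. dev. of the error term: sqrt(RSS / (n - p - 1))."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    rss = np.sum((y - y_hat) ** 2)            # residual sum of squares
    return np.sqrt(rss / (len(y) - p - 1))

# Toy usage with observed vs. fitted values from some regression:
print(residual_standard_error([3.1, 4.9, 7.2, 9.1], [3.0, 5.0, 7.0, 9.0]))
```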

3
Q

Define the R2 statistic.

A

Proportion of variability in Y that can be explained by X.
R2 = (TSS - RSS)/TSS
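
The same quantity as a short numpy sketch (function name illustrative):

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 = (TSS - RSS) / TSS = 1 - RSS / TSS."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    rss = np.sum((y - y_hat) ** 2)        # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - rss / tss
```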

4
Q

Define the population regression line.

A

The best linear approximation to the true relationship between X and Y.

5
Q

Why is the F-statistic preferred over individual t-statistics in multiple regression?

A

The F-statistic adjusts for the number of predictors: when there are many predictors, some individual t-statistics will appear significant purely by chance, whereas the single F-test for the overall model accounts for this.
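
For reference, the overall F-statistic for testing whether all p slope coefficients are zero is F = ((TSS − RSS) / p) / (RSS / (n − p − 1)). A minimal numpy sketch (helper name illustrative):

```python
import numpy as np

def f_statistic(y, y_hat, p):
    """F = ((TSS - RSS) / p) / (RSS / (n - p - 1)) for a model with p predictors."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return ((tss - rss) / p) / (rss / (n - p - 1))
```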

6
Q

Types of variable selection methods

A

Forward selection, backward selection, and mixed (stepwise) selection.
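
As an illustration of forward selection, a hedged sketch using scikit-learn's SequentialFeatureSelector (assuming a scikit-learn version that provides it; the dataset is only for demonstration):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# Greedily add, one at a time, the predictor that most improves the
# cross-validated fit, stopping once 4 predictors have been selected.
selector = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=4, direction="forward"
)
selector.fit(X, y)
print(selector.get_support())   # boolean mask of the selected predictors
```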

7
Q

What are the assumptions of the linear regression model?

A
  1. Linear relationship between X and Y.
  2. Uncorrelated error terms
  3. Constant variance of the error terms
  4. No outliers
  5. No leverage points
  6. No collinearity
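
Several of these assumptions (linearity, uncorrelated errors, constant variance, absence of outliers) are commonly checked with a residuals-versus-fitted plot; a minimal matplotlib sketch, with random placeholder values standing in for a fitted model:

```python
import matplotlib.pyplot as plt
import numpy as np

# In practice y_hat and residuals come from a fitted linear model;
# random values are used here only so the snippet runs on its own.
rng = np.random.default_rng(0)
y_hat = rng.uniform(0, 10, size=200)
residuals = rng.normal(scale=1.0, size=200)

# A funnel shape suggests non-constant variance, a curve suggests
# non-linearity, and isolated extreme points suggest outliers.
plt.scatter(y_hat, residuals, s=10)
plt.axhline(0.0, color="red", linewidth=1)
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```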
8
Q

Difference between prediction intervals and confidence intervals?

A

A confidence interval quantifies how close ŷ is to f(X), the average response, while a prediction interval quantifies how close ŷ is to an individual response y. Prediction intervals are wider than confidence intervals because they include both the reducible and the irreducible error.
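
A hedged sketch comparing the two intervals with statsmodels, assuming its OLS get_prediction API (column names may differ slightly across versions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 1.0 + 2.0 * x + rng.normal(scale=2.0, size=100)

X = sm.add_constant(x)
model = sm.OLS(y, X).fit()

frame = model.get_prediction(X).summary_frame(alpha=0.05)
# mean_ci_* bounds the average response f(x); obs_ci_* bounds an individual y,
# so the obs_ci (prediction) interval is always the wider of the two.
print(frame[["mean_ci_lower", "mean_ci_upper", "obs_ci_lower", "obs_ci_upper"]].head())
```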

9
Q

Briefly explain how decision trees work.

A

Decision trees divide the predictor space into distinct, non-overlapping regions. The tree is constructed by recursive binary splitting, a top-down greedy approach: we start at the top of the tree, where all observations belong to a single region, and successively split each region into two new branches as we move down the tree.
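
A minimal sketch of a single step of recursive binary splitting for a regression tree: scan every predictor and cutpoint and keep the split that most reduces RSS (illustrative code, not a library routine):

```python
import numpy as np

def best_split(X, y):
    """Return (feature index, cutpoint) of the single split that minimizes RSS."""
    best_j, best_s, best_rss = None, None, np.inf
    for j in range(X.shape[1]):
        for s in np.unique(X[:, j]):
            left, right = y[X[:, j] < s], y[X[:, j] >= s]
            if len(left) == 0 or len(right) == 0:
                continue
            rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
            if rss < best_rss:
                best_j, best_s, best_rss = j, s, rss
    return best_j, best_s

# The full greedy algorithm applies best_split recursively to each new region.
```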

10
Q

What is tree pruning?

A

Tree pruning is done to avoid overfitting. We grow a very large tree and then prune it back to obtain a subtree.
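
A hedged sketch of cost-complexity (weakest-link) pruning, assuming scikit-learn's ccp_alpha / cost_complexity_pruning_path API:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

# Grow a large tree and compute the sequence of subtrees indexed by alpha.
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)

# Choose the amount of pruning (alpha) with the best cross-validated score.
scores = [
    cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0), X, y).mean()
    for a in path.ccp_alphas
]
best_alpha = path.ccp_alphas[int(np.argmax(scores))]
pruned_tree = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)
```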

11
Q

What are some measures that can be used for splitting a node in tree-based methods?

A
  1. Classification error rate: the fraction of training observations in a region that do not belong to the most common class in that region.
  2. Gini index: a measure of total variance across the K classes and also a measure of node purity; a small value indicates that the node is pure. G = sum_k ( p_mk * (1 - p_mk) )
  3. Entropy: like the Gini index, entropy takes a small value when the node is pure. D = - sum_k ( p_mk * log(p_mk) )
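
The three measures as a short numpy sketch, applied to the vector of class proportions in a node (function name illustrative):

```python
import numpy as np

def node_impurity(p):
    """p: class proportions within a node (they should sum to 1)."""
    p = np.asarray(p, float)
    error = 1.0 - p.max()                              # classification error rate
    gini = np.sum(p * (1.0 - p))                       # Gini index
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))    # entropy, with 0*log(0) = 0
    return error, gini, entropy

print(node_impurity([0.9, 0.1]))   # nearly pure node: all three values are small
```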
12
Q

Why is node purity important?

A

High node purity indicates higher confidence in the predictions of the model.

13
Q

Why are random forest, bagging and boosting more commonly used than decision trees?

A

Decision trees tend to have high variance. In ensemble modelling, we combine multiple weak learners, which helps to reduce the variance of the final model.
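
A minimal sketch of the bagging idea behind these methods: fit many trees on bootstrap samples and average their predictions, which reduces variance (illustrative; uses scikit-learn's DecisionTreeRegressor):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X_train, y_train, X_test, n_trees=100, seed=0):
    """Average the predictions of trees fit on bootstrap resamples of the data."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y_train), size=len(y_train))   # bootstrap sample
        tree = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        preds.append(tree.predict(X_test))
    return np.mean(preds, axis=0)   # averaging many high-variance trees lowers variance
```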

14
Q

Why is recursive binary splitting a greedy approach?

A

Because at each step the best split is chosen for that particular step, rather than looking ahead and selecting a split that would lead to a better tree at some future step.
15
Q

What are the advantages and disadvantages of decision trees?

A

Advantages:
1. Easy to interpret.
2. Can capture non-linear relationships between the predictors and the response.
3. Can handle qualitative predictors without creating dummy variables.

Disadvantages:
1. High variance.
2. Non-robust: small changes in the data can cause large changes in the fitted tree.

16
Q

How does a random forest improve over bagging?

A

Random forests improve over bagging by further reducing variance through decorrelating the trees. In bagging, if there is one strong predictor in the set, most of the trees will look alike because they all use that predictor in the top split. Random forests prevent this by considering only a random subset of the predictors at each split.
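
In scikit-learn terms (hedged; parameter names assume its RandomForestRegressor API), bagging corresponds to letting every split consider all predictors, while a random forest restricts each split to a random subset:

```python
from sklearn.ensemble import RandomForestRegressor

# Bagging: every split may use all p predictors, so a strong predictor tends to
# appear in the top split of every tree, and the trees are highly correlated.
bagging = RandomForestRegressor(n_estimators=500, max_features=None)

# Random forest: each split considers only a random subset of predictors
# (here sqrt(p)), which decorrelates the trees and further reduces variance.
random_forest = RandomForestRegressor(n_estimators=500, max_features="sqrt")
```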

17
Q

How can we measure variable importance in tree-based methods?

A

By averaging, over all the trees, the total reduction in RSS (for regression trees) or in the Gini index (for classification trees) due to splits on a given predictor.
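
A hedged sketch of reading impurity-based importances from a fitted forest, assuming scikit-learn's feature_importances_ attribute (dataset chosen only for illustration):

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

data = load_diabetes()
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(data.data, data.target)

# Mean decrease in impurity attributable to splits on each predictor,
# averaged over all trees in the forest.
for name, importance in zip(data.feature_names, forest.feature_importances_):
    print(f"{name}: {importance:.3f}")
```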