1. Statistical Learning Flashcards
What is the difference between supervised and unsupervised learning?
Supervised: has a response variable
Unsupervised: analyzes the observations or the variables without a response variable. Main idea is to identify patterns that may exist in the data.
What is the difference between a parametric method and a non-parametric method?
Parametric: specifies a functional form for f that includes free parameters (parameters that we estimate).
Non-parametric: makes no assumption about f’s functional form; the estimate of f is then mainly algorithmic.
What are the two main objectives of supervised learning?
Inference and prediction
A method’s predictive strength coincides with its _______
Flexibility
What does one’s ability to make inferences depend on?
The interpretability of the model
Why are flexibility and interpretability inversely related?
Because a very flexible model (one that can fit the data very closely) is likely to be complicated, and therefore not easily interpreted.
Methods that are less flexible, but more interpretable?
Lasso and subset selection
Methods that are moderately flexible and interpretable?
Least squares
Regression trees
Classification trees
Methods that are very flexible, but not interpretable?
Bagging
Boosting
Do flexibility and predictive accuracy go hand in hand? Why or why not?
They do not. A highly flexible method can fit the training data very closely, but that says nothing about how well it predicts the test data; it may be overfitting.
Highly flexible = near-perfect predictions on past (training) data, not necessarily on new data
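The gap between training and test error can be seen in a small simulation. This is a minimal sketch, not a reference implementation: the data-generating function (a sine curve plus Gaussian noise), the sample sizes, and the helper names knn_predict, mse, and make_data are all assumptions made for illustration. KNN regression with a small k stands in for a "highly flexible" method.

```python
import math
import random

def knn_predict(train, x0, k):
    """Average the responses of the k training points nearest to x0."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    """Mean squared error of k-NN predictions (fit on `train`) over `data`."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in data) / len(data)

def make_data(n, rng):
    """Hypothetical data: y = sin(2*pi*x) + Gaussian noise (an assumption)."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, math.sin(2 * math.pi * x) + rng.gauss(0, 0.3)))
    return data

rng = random.Random(0)
train = make_data(50, rng)
test = make_data(200, rng)

# Columns: k, training MSE, test MSE
for k in (1, 5, 25):
    print(k, round(mse(train, train, k), 3), round(mse(train, test, k), 3))
```

With k = 1 every training point is its own nearest neighbor, so the training MSE is exactly zero, while the test MSE is not. A perfect fit on past data is therefore not evidence of good predictions on new data.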
What does the bias of a model speak to?
The bias relates to the average closeness between f-hat and f.
What is the difference between prediction and inference?
Prediction: output of f-hat
Inference: comprehension of f
In KNN regression, which of the following are true as k increases?
A. Flexibility increases
B. Squared bias increases
C. Variance decreases
As k increases, the model becomes less flexible: each prediction averages over more neighbors, so the fit is smoother (higher bias, lower variance)
A. False
B. True
C. True
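The answers above can be checked with a simulation. The following is a sketch under stated assumptions: the true function f (a sine curve), the noise level, the fixed grid of x values, and the evaluation point x0 = 0.25 are all hypothetical choices, as are the helper names. It estimates the squared bias and the variance of the KNN prediction at x0 over many simulated training sets.

```python
import math
import random

def f(x):
    """Hypothetical true regression function (an assumption for this sketch)."""
    return math.sin(2 * math.pi * x)

def knn_at(xs, ys, x0, k):
    """k-NN regression prediction at x0: mean response of the k nearest x values."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in order) / k

def bias2_and_variance(k, reps=500, n=40, sigma=0.3, x0=0.25, seed=0):
    """Estimate squared bias and variance of the k-NN prediction at x0
    over `reps` simulated training sets on a fixed grid of x values."""
    rng = random.Random(seed)
    xs = [i / n for i in range(n)]
    preds = []
    for _ in range(reps):
        # Only the noise changes between training sets; the x grid is fixed.
        ys = [f(x) + rng.gauss(0, sigma) for x in xs]
        preds.append(knn_at(xs, ys, x0, k))
    mean_pred = sum(preds) / reps
    variance = sum((p - mean_pred) ** 2 for p in preds) / reps
    return (mean_pred - f(x0)) ** 2, variance

# Columns: k, estimated squared bias, estimated variance
for k in (1, 5, 20):
    b2, var = bias2_and_variance(k)
    print(k, round(b2, 4), round(var, 4))
```

In this setup the estimated variance falls as k grows, because more noisy responses are averaged, while the squared bias rises, because the average is taken over points farther from x0.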
Rank these three in terms of flexibility, in decreasing order.
Linear regression
Ridge regression
Regression tree
Most flexible: regression tree
Linear regression
Least flexible: Ridge regression
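One way to see why ridge regression is no more flexible than linear regression is the single-predictor, no-intercept case, where the ridge slope has a closed form. This is an illustrative sketch only: the data are made up, and ridge_slope is a hypothetical helper, not a library function.

```python
def ridge_slope(xs, ys, lam):
    """Minimizer of sum((y - b*x)**2) + lam * b**2
    for a single predictor with no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # made-up data for illustration

# lam = 0 reproduces the least squares slope; larger lam shrinks it toward 0.
for lam in (0.0, 10.0, 100.0):
    print(lam, round(ridge_slope(xs, ys, lam), 3))
```

Setting lam to 0 gives the least squares slope, and any positive lam pulls the slope toward zero. The penalty restricts how closely the fit can follow the data, which is what "less flexible" means here.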
Rank in decreasing order of flexibility.
Linear regression
Lasso regression
Boosting
Most flexible: Boosting
Linear regression
Least flexible: Lasso regression
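The lasso's position at the bottom of this ranking can be illustrated in the single-predictor, no-intercept case, where the solution is a soft-thresholded version of the least squares slope. This is a sketch under assumptions: the data are made up, lasso_slope is a hypothetical helper, and the objective is scaled as 0.5 times the residual sum of squares plus lam times |b|.

```python
import math

def lasso_slope(xs, ys, lam):
    """Minimizer of 0.5 * sum((y - b*x)**2) + lam * abs(b)
    for a single predictor with no intercept (soft-thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Shrink |sxy| by lam, and set it to zero if the penalty dominates.
    shrunk = max(abs(sxy) - lam, 0.0)
    return math.copysign(shrunk, sxy) / sxx

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # made-up data for illustration

# lam = 0 reproduces least squares; a large enough lam sets the slope to exactly 0.
for lam in (0.0, 30.0, 60.0):
    print(lam, round(lasso_slope(xs, ys, lam), 3))
```

With lam = 0 the result matches least squares. As lam grows the slope shrinks, and once lam exceeds |sum of x*y| the slope is exactly zero, so the lasso can only restrict the linear fit, never make it more flexible.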