Wronged Questions: Linear Models Flashcards
T/F: Error terms are considered to have a dimensionless measure.
False. The error term is not dimensionless. Since it is defined as ε_i = y_i - β_0 - β_1·x_i, it has the same units as the target variable.
T/F: The error representation is based on the Poisson theory of errors.
False. The error representation is based on the Gaussian theory of errors. The error terms follow a Gaussian/normal distribution.
T/F: Error terms are also known as disturbance terms.
True. The Frees text (page 31) states that error terms are also called disturbance terms.
T/F: Error terms are observable quantities.
False. Error terms are unobservable because they depend on the unknown regression parameters β_0 and β_1. The residuals, computed from the fitted model, are their observable counterparts.
T/F: A model with a higher sum of squared errors has a higher total sum of squares compared to a model with lower sum of squared errors.
False. The total sum of squares depends only on the observed values of y, so it is the same for both models.
T/F: The validation set approach is a special case of k-fold cross-validation.
False. Neither approach is a special case of the other.
Note that LOOCV is a special case of k-fold CV with k = n.
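A minimal sketch (hypothetical data; scikit-learn assumed) illustrating the note above: k-fold CV with k = n reproduces LOOCV exactly, while the validation set approach is just a single random split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1))
y = 2 + 3 * X[:, 0] + rng.normal(size=20)

model = LinearRegression()
n = len(y)

# k-fold CV with k = n gives the same folds (one observation each) as LOOCV.
kfold_n = cross_val_score(model, X, y, cv=KFold(n_splits=n),
                          scoring="neg_mean_squared_error")
loocv = cross_val_score(model, X, y, cv=LeaveOneOut(),
                        scoring="neg_mean_squared_error")
print(np.allclose(kfold_n.mean(), loocv.mean()))  # True
```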
T/F: The validation set approach is conceptually complex to implement.
False. The validation set approach is conceptually simple and easy to implement.
T/F: Performing the validation set approach multiple times always yields the same results.
False. While performing LOOCV multiple times always yields the same results, this is not true for the validation set approach, where results vary due to randomness in the split.
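A minimal sketch (hypothetical data; scikit-learn assumed) of that randomness: rerunning the validation set approach with different random splits yields different error estimates.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 1))
y = 1 + 2 * X[:, 0] + rng.normal(size=50)

for seed in range(3):  # three different random 50/50 splits
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.5,
                                              random_state=seed)
    fit = LinearRegression().fit(X_tr, y_tr)
    print(mean_squared_error(y_va, fit.predict(X_va)))  # varies with the seed
```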
T/F: The validation error rate will tend to underestimate the test error rate.
False. Because the model is fit on only a subset of the observations, the validation set approach tends to overestimate the test error rate.
LOOCV trains on nearly all of the data, so it suffers far less from this problem.
T/F: The validation set approach has higher bias than leave-one-out cross-validation.
True. The LOOCV approach has lower bias than the validation set approach since almost all data is used in the training set, meaning it does not overestimate the test error rate as much as the validation set approach.
T/F: The validation set approach is conceptually simple and straightforward to implement.
True
T/F: The validation estimate of the test error rate can exhibit high variability, depending on the composition of observations in the training and validation sets.
True
T/F: The model is trained using only a subset of the observations, specifically those in the training set rather than the validation set.
True
T/F: Given that statistical methods typically perform worse when trained on fewer observations, this implies that the validation set error rate may tend to underestimate the test error rate for the model fitted on the entire dataset.
False. Training on fewer observations makes statistical methods perform worse, so the validation set error rate tends to overestimate, not underestimate, the test error rate for the model fitted on the entire dataset.
T/F: The leverage for each observation in a linear model must be between 1/n and 1.
True
T/F: The n leverages in a linear model must sum to the number of explanatory variables.
False. The leverages must sum to p+1, which is the number of predictors plus one for the intercept.
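A minimal sketch (hypothetical design matrix) verifying both leverage facts numerically: each leverage lies in [1/n, 1] for a model with an intercept, and the leverages sum to p + 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # intercept + p predictors

H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
h = np.diag(H)                        # leverages

print(h.min() >= 1 / n, h.max() <= 1)  # True True
print(np.isclose(h.sum(), p + 1))      # True: trace(H) = p + 1
```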
T/F: If an explanatory variable is uncorrelated with all other explanatory variables, the corresponding variance inflation factor would be zero.
False. If an explanatory variable is uncorrelated with all other explanatory variables, the corresponding variance inflation factor would be 1.
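A minimal sketch (hypothetical data) of the VIF computation: regress each predictor on the others and take VIF_j = 1/(1 - R_j^2); an uncorrelated predictor yields a VIF near 1 (exactly 1 in the population).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # highly correlated with x1
x3 = rng.normal(size=n)              # uncorrelated with x1 and x2
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    r2 = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
    print(f"VIF_{j + 1} = {1 / (1 - r2):.2f}")  # VIF_1, VIF_2 large; VIF_3 near 1
```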
T/F: In best subset selection the predictors in the k-variable model must be a subset of the predictors in the (k+1)-variable model.
False. The predictors in the k-variable model do not need to be a subset of those in the (k+1)-variable model.
T/F: In best subset selection, if p is the number of potential predictors, then 2^(p-1) models have to be fitted.
False. The correct number of models that need to be fitted is 2^p, not 2^(p-1).
T/F: In best subset selection, the residual sum of squares of the k-variable model is always lower than that of the (k+1)-variable model.
False. The residual sum of squares of the best k-variable model is greater than or equal to that of the best (k+1)-variable model, since adding a variable can never increase the RSS.
T/F: In each step of best subset selection, the most statistically significant variable is dropped.
False. Best subset selection does not drop variables one step at a time; it fits every possible model and compares them. Dropping the least statistically significant variable at each step describes backward stepwise selection.
T/F: In high-dimensional settings, best subset selection is computationally infeasible.
True. In high-dimensional settings, the computational complexity of fitting all possible models makes best subset selection infeasible.
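A minimal sketch (hypothetical data) of best subset selection tying the cards above together: exactly 2^p models are fitted, and the best RSS never increases as more variables are allowed.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=n)

count = 0
best_rss = []
for k in range(p + 1):
    rss_k = []
    for subset in combinations(range(p), k):  # every size-k subset of predictors
        count += 1
        if k == 0:
            rss = np.sum((y - y.mean()) ** 2)  # intercept-only model
        else:
            cols = list(subset)
            fit = LinearRegression().fit(X[:, cols], y)
            rss = np.sum((y - fit.predict(X[:, cols])) ** 2)
        rss_k.append(rss)
    best_rss.append(min(rss_k))

print(count == 2 ** p)                     # True: 2^3 = 8 models fitted
print(np.all(np.diff(best_rss) <= 1e-9))   # True: best RSS non-increasing in k
```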
se_b0
se(b_0) = s·sqrt[1/n + bar(x)^2 / sum((x_i - bar(x))^2)]
se_b1
se(b_1) = s / sqrt[sum((x_i - bar(x))^2)]
se_hat(y) - used for estimating the mean response at x*
se(hat(y)) = s·sqrt[1/n + (x* - bar(x))^2 / sum((x_i - bar(x))^2)]
se_hat(y)_n+1 - used for predicting a new observation at x*
se(hat(y)_n+1) = s·sqrt[1 + 1/n + (x* - bar(x))^2 / sum((x_i - bar(x))^2)]
(s is the residual standard error)
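A minimal sketch (hypothetical data; statsmodels assumed) checking the simple-regression standard error formulas above against library output.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 25
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(x)).fit()
s = np.sqrt(fit.mse_resid)                # residual standard error, df = n - 2
sxx = np.sum((x - x.mean()) ** 2)

se_b0 = s * np.sqrt(1 / n + x.mean() ** 2 / sxx)
se_b1 = s / np.sqrt(sxx)
print(np.allclose([se_b0, se_b1], fit.bse))  # True
```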
Frees rule of thumb for identifying outliers
Observation is an outlier if its standardised residual exceeds 2 in absolute value
High leverage point
Observation that is unusual in the horizontal direction
R^2 adj
R^2_adj = 1 - [(n - 1)/(n - p - 1)](1 - R^2) = 1 - s^2/s_y^2
F statistic
F = (SSR/p) / [SSE/(n - p - 1)] = MSR/MSE
Variance-covariance matrix
σ^2(X^TX)^-1, estimated by s^2(X^TX)^-1
Mallows' C_p
C_p = [RSS + 2d·sigma_hat^2]/n, where d is the number of predictors and sigma_hat^2 estimates the error variance (ISLR convention)
AIC
-2ln(L)+2k
BIC
kln(n)-2ln(L)
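A minimal sketch (hypothetical data; statsmodels assumed) computing AIC and BIC from the definitions above. Conventions for the parameter count k vary across texts; here k counts the regression coefficients including the intercept, matching statsmodels.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x = rng.normal(size=30)
y = 1 + 2 * x + rng.normal(size=30)

fit = sm.OLS(y, sm.add_constant(x)).fit()
k = fit.df_model + 1                  # slope coefficients + intercept
aic = -2 * fit.llf + 2 * k            # -2ln(L) + 2k
bic = k * np.log(len(y)) - 2 * fit.llf  # kln(n) - 2ln(L)
print(np.allclose([aic, bic], [fit.aic, fit.bic]))  # True
```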
Leverage formula
h_i = x_i^T(X^TX)^-1 x_i (the i-th diagonal entry of the hat matrix); for simple linear regression, h_i = 1/n + (x_i - bar(x))^2 / sum((x_j - bar(x))^2)
Cook's distance
D_i = e_i^2·h_i / [(p + 1)·s^2·(1 - h_i)^2]
Breusch-Pagan test for heteroscedasticity
Regress the squared residuals on the explanatory variables; under the null hypothesis of homoscedasticity, the test statistic n·R^2 from this auxiliary regression is approximately chi-square with degrees of freedom equal to the number of slope coefficients
LOOCV Error
CV(n) = (1/n)·sum(MSE_i); for least squares regression, CV(n) = (1/n)·sum[((y_i - hat(y)_i)/(1 - h_i))^2]
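A minimal sketch (hypothetical data; scikit-learn assumed) verifying the least squares LOOCV shortcut above against a brute-force leave-one-out loop.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(7)
n = 20
X = rng.normal(size=(n, 2))
y = X @ np.array([1.0, -1.0]) + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])            # design matrix with intercept
h = np.diag(Xd @ np.linalg.inv(Xd.T @ Xd) @ Xd.T)  # leverages
resid = y - LinearRegression().fit(X, y).predict(X)
shortcut = np.mean((resid / (1 - h)) ** 2)       # one fit, no refitting

brute = -cross_val_score(LinearRegression(), X, y, cv=LeaveOneOut(),
                         scoring="neg_mean_squared_error").mean()
print(np.isclose(shortcut, brute))  # True
```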
Centered variable
Variable resulting from subtracting the sample mean from the variable
Scaled variable
Variable resulting from dividing a variable by its standard deviation
Standardised variable
Variable resulting from first centering, then scaling the variable
Ridge regression
- shrinks coefficients toward zero via an L2 penalty, λ·sum(b_j^2)
- does not perform variable selection (coefficients are never set exactly to zero)
Lasso Regression
- performs variable selection
- yields more interpretable models
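A minimal sketch (hypothetical data; scikit-learn assumed) contrasting the two penalties from the cards above: lasso zeroes out irrelevant coefficients, ridge only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(8)
n = 100
X = rng.normal(size=(n, 5))
y = 3 * X[:, 0] + rng.normal(size=n)  # only the first predictor matters

print(Ridge(alpha=10.0).fit(X, y).coef_)  # all five coefficients nonzero
print(Lasso(alpha=0.5).fit(X, y).coef_)   # irrelevant coefficients driven to 0
```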
Frees rule of thumb for high leverage points
[3(p+1)]/n
Coefficient Matrix
b = (X^TX)^-1 X^T y
T/F: The best model by AIC will not also be the best model by C_p.
False. The best model by AIC will also be the best model by C_p, since AIC and C_p are proportional for least squares models.
T/F: AIC, BIC, C_p, and R^2adj are not reliable when the model has been overfitted.
True
List the cross validation techniques in order of least to most bias
LOOCV < k-fold < validation set
List the cross validation techniques in order of most to least variance
LOOCV > k-fold > validation set
F statistic using R^2
F = (R^2/p) / [(1 - R^2)/(n - p - 1)]
Sum of Squares Regression (SSR)
SSR = sum((hat(y)_i - bar(y))^2) = SST - SSE
T/F: The standard error of the regression provides an estimate of the variance of y for a given x based on n-1 degrees of freedom.
False. The standard error of the regression provides an estimate of the variance of y for a given x based on n-2 degrees of freedom.
T/F: In forward stepwise selection, if p is the number of potential predictors, then 2^p models have to be fitted.
False. Forward stepwise selection fits 1 + p(p+1)/2 models; the 2^p count applies to best subset selection.
T/F: The predictors in the k-variable model must be a subset of the predictors in the (k+1)-variable model in forward stepwise selection.
True
T/F: At each iteration, the variable chosen is the one that minimizes the test RSS based on cross-validation in forward stepwise selection.
False. At each iteration of forward stepwise selection, the variable chosen is the one that minimizes the training RSS (equivalently, maximizes R^2), not a cross-validated estimate of test error.
T/F: Forward subset selection cannot be used even if the number of variables is greater than the number of observations.
False. Forward stepwise selection can still be used when the number of variables exceeds the number of observations (though only submodels with fewer than n predictors can be fit). It is backward stepwise selection that cannot be used in that case, since the full model cannot be fit.
T/F: The least squares line always passes through the point [bar(x), bar(y)].
True.
T/F: The squared sample correlation between x and y is equal to the coefficient of determination of the model.
True
T/F: The choice of explanatory variable x affects the total sum of squares.
False. SST = sum((y_i - bar(y))^2) depends only on y, not on the choice of x.
T/F: The F-statistic of the model is the square of the t-statistic of the coefficient estimate for x.
True. This is true if both tests have the same set of hypotheses.
T/F: A random pattern in the scatterplot of y against x indicates a coefficient of determination close to zero.
True
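A minimal sketch (hypothetical data; statsmodels assumed) checking two of the simple-regression facts above: the F-statistic equals the squared t-statistic of the slope, and R^2 equals the squared sample correlation between x and y.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
x = rng.normal(size=40)
y = 0.5 + 1.5 * x + rng.normal(size=40)

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(np.isclose(fit.fvalue, fit.tvalues[1] ** 2))             # F = t^2
print(np.isclose(fit.rsquared, np.corrcoef(x, y)[0, 1] ** 2))  # R^2 = r^2
```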
Var(X+Y)
Var(X) + Var(Y) + 2Cov(X,Y)
Var(X-Y)
Var(X) + Var(Y) - 2Cov(X,Y)
T/F: As λ increases, the budget parameter increases.
False. An increase in λ actually corresponds to a decrease in the “budget” allowed for the coefficients’ magnitudes, not an increase.
T/F: As λ decreases towards 0, the model becomes more biased.
False. As λ decreases towards 0, the model becomes less biased due to flexibility (and thus variance) increasing when λ decreases.
T/F: Increasing the budget parameter decreases the variance of the model.
False. Increasing the budget parameter decreases λ, which results in an increase in variance.
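A minimal sketch (hypothetical data; scikit-learn assumed) of the λ/budget relationship from the last few cards: as λ grows, the L2 budget actually used by the fitted coefficients shrinks toward zero.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(10)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(size=100)

for lam in [0.01, 1.0, 100.0]:
    coef = Ridge(alpha=lam).fit(X, y).coef_
    print(lam, np.sum(coef ** 2))  # L2 "budget" used decreases as lambda grows
```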