Linear Regression Flashcards
Financial Analysis (MBA 728)
What is the primary function of the ‘lm()’ command in R programming for regression analysis?
a. To plot data points
b. To perform linear regression
c. To calculate the mean of a dataset
d. To calculate the standard deviation
B
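A minimal sketch of ‘lm()’ in R, using the built-in mtcars data set (an illustrative choice, not part of the card above):

```r
# Fit a simple linear regression with the built-in mtcars data:
# mpg (response) modeled as a function of wt (predictor).
fit <- lm(mpg ~ wt, data = mtcars)

# summary() reports the coefficient estimates, their standard errors,
# t-values, p-values, and the R-squared of the fit.
summary(fit)

# coef() extracts just the intercept and slope estimates.
coef(fit)
```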
What does it mean if the coefficient of a predictor variable in a regression model is negative?
a. The predictor variable has no relationship with the response variable.
b. The predictor variable has a direct relationship with the response variable.
c. The predictor variable is not statistically significant.
d. The predictor variable inversely affects the response variable.
D
In regression analysis, what is the implication of a high standard error for a coefficient?
a. The coefficient is highly significant
b. There is a high level of uncertainty associated with the coefficient estimate
c. The coefficient is not important in the model
d. The coefficient is precisely estimated
B
In a simple linear regression model, how is the value of R-squared related to the correlation between the response and predictor variables?
a. R-squared is equal to the square of the correlation coefficient
b. There is no relationship between R-squared and the correlation coefficient
c. R-squared is inversely proportional to the correlation coefficient
d. R-squared is the square root of the correlation coefficient
A
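This identity is easy to verify in R (again using the built-in mtcars data as an example):

```r
# In simple linear regression, R-squared equals the squared
# correlation between the response and the predictor.
fit <- lm(mpg ~ wt, data = mtcars)
r   <- cor(mtcars$mpg, mtcars$wt)

all.equal(summary(fit)$r.squared, r^2)  # TRUE
```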
What is the purpose of regression analysis?
a. To calculate the mean value of a dataset
b. To explain the relationship between a response variable and one or more predictor variables
c. To classify data into distinct categories
d. To summarize data in a concise way
B
In multiple regression analysis, each beta coefficient represents the partial effect of its predictor on the response variable, holding the other predictors constant.
a. TRUE
b. FALSE
A
In linear regression, it is necessary for the predictor variables themselves to be normally distributed.
a. FALSE
b. TRUE
A
What does the term ‘partial regression coefficient’ in multiple regression analysis refer to?
a. The average change in the response variable for a one-unit change in the predictor, holding other predictors constant.
b. The coefficient that measures the relationship between two predictor variables.
c. The change in the response variable when all predictors are set to zero.
d. The coefficient obtained from univariate regression of each predictor.
A
Which of the following is a key assumption of linear regression analysis?
a. The response variable should be categorical.
b. The relationship between the predictor and the response variable is curvilinear.
c. The variance of the residuals should be constant across all levels of the predictor variables (homoscedasticity).
d. The predictor variables must be normally distributed.
C
Which of the following is true about the residuals in a regression model?
a. They are always positive
b. They have a mean of zero
c. They represent the slope of the regression line
d. They are independent of the predictor variables
B
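A quick check in R (mtcars again as the example data): OLS residuals from a model with an intercept always average to zero.

```r
# OLS residuals from a model with an intercept sum to zero,
# so their mean is zero up to floating-point error.
fit <- lm(mpg ~ wt, data = mtcars)
mean(resid(fit))  # effectively 0
```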
What does the R-squared value in regression analysis indicate?
a. The error term of the regression model
b. The correlation between the predictor and response variables
c. The percentage of variability in the response variable that is explained by the predictor variable(s)
d. The slope of the regression line
C
What does a high value of Adjusted R-squared in a regression model indicate?
a. The model has overfitted the data.
b. The model explains a significant portion of the variance in the dependent variable.
c. The predictors in the model are highly correlated.
d. The model is highly biased.
B
In regression analysis, the significance of coefficients is solely determined by their value, with larger coefficients being more significant.
a. TRUE
b. FALSE
B
In regression analysis, what is the purpose of conducting an ANOVA (Analysis of Variance)?
a. To determine the overall fit of the model
b. To plot the regression line
c. To test the significance of individual predictors
d. To calculate the mean value of the response variable
A
What does the Adjusted R-squared value adjust for in regression analysis?
a. The mean of the response variable
b. The variance of the error term
c. The correlation between predictor variables
d. The number of predictors in the model
D
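The adjustment can be reproduced by hand. A sketch in R with the built-in mtcars data, using the standard formula Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
s   <- summary(fit)
n   <- nrow(mtcars)  # number of observations
p   <- 2             # number of predictors (wt and hp)

# Adjusted R-squared penalizes R-squared for each added predictor.
adj <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)
all.equal(adj, s$adj.r.squared)  # TRUE
```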
How is the Total Sum of Squares (TSS) calculated in regression analysis?
a. By summing the squared differences between predicted and actual values.
b. By summing the squared differences between actual values and the mean of the dependent variable.
c. By dividing the sum of squares by the number of observations.
d. By multiplying the sum of squared residuals by the degrees of freedom.
B
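The sums of squares and their link to R-squared can be checked directly. A sketch in R (mtcars as the example data):

```r
fit <- lm(mpg ~ wt, data = mtcars)
y   <- mtcars$mpg

tss <- sum((y - mean(y))^2)  # total sum of squares
rss <- sum(resid(fit)^2)     # residual sum of squares

# R-squared is the share of TSS explained by the model.
all.equal(1 - rss / tss, summary(fit)$r.squared)  # TRUE
```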
In regression models, interaction terms are used to investigate the effect of one predictor variable at different levels of another predictor variable.
a. FALSE
b. TRUE
B
In multiple regression, adding more predictor variables always improves the accuracy of the model.
a. FALSE
b. TRUE
A
The Adjusted R-squared value can be used to compare models with a different number of predictors.
a. FALSE
b. TRUE
B
What does the F-statistic in regression analysis test?
a. The variance of the error term
b. The mean of the residuals
c. The overall significance of the model
d. The correlation between variables
C
In multiple regression, the presence of an interaction term implies that the effect of one predictor on the response variable depends on the value of another predictor.
a. FALSE
b. TRUE
B
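In R's formula syntax, `x1 * x2` expands to the main effects plus their interaction. A sketch with the built-in mtcars data:

```r
# mpg ~ wt * hp expands to wt + hp + wt:hp; the wt:hp term lets
# the effect of wt on mpg depend on the value of hp.
fit <- lm(mpg ~ wt * hp, data = mtcars)
names(coef(fit))  # "(Intercept)" "wt" "hp" "wt:hp"
```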
Which component in a linear regression equation represents the slope?
a. The constant term.
b. The coefficient of the predictor variable.
c. The coefficient of the intercept.
d. The residual term.
B
The presence of significant interaction effects in a regression model implies that the main effects of the involved variables are also significant.
a. TRUE
b. FALSE
B
What is the main objective of using multiple regression over simple regression?
a. To allow for a single predictor variable
b. To account for the influence of several predictor variables
c. To reduce the number of data points required
d. To reduce the complexity of the analysis
B
Residual analysis is not necessary for validating a regression model.
a. FALSE
b. TRUE
A
In regression analysis, what does collinearity refer to?
a. The relationship between two residual terms.
b. The linearity between the predictor and response variables.
c. The relationship between two predictor variables.
d. The error between observed and predicted values.
C
What is the primary purpose of residual analysis in regression?
a. To determine the number of predictor variables.
b. To check the assumptions of the regression model.
c. To estimate the coefficients of the regression model.
d. To test the strength of the relationship between variables.
B
In the context of regression analysis, what does the term ‘residual’ refer to?
a. The slope of the regression line
b. The difference between observed and predicted values
c. The correlation coefficient
d. The intercept of the regression line
B
In multiple regression, if two predictor variables are highly correlated, removing one of them will significantly change the coefficients of the other predictors.
a. TRUE
b. FALSE
A
What type of variable is typically the response or dependent variable in OLS regression?
a. Discrete
b. Categorical
c. Binary
d. Continuous
D
In a regression model, a significant F-test implies that all individual predictors are significant.
a. FALSE
b. TRUE
A
What is the consequence of multicollinearity in a regression model?
a. Increased residual error.
b. Decreased significance of the F-statistic.
c. Increased accuracy of the coefficient estimates.
d. Reduced reliability of the coefficient estimates.
D
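One common diagnostic is the variance inflation factor (VIF), which can be computed by hand. A sketch in R with the built-in mtcars data (the choice of disp, wt, and hp is illustrative):

```r
# The VIF for predictor j is 1 / (1 - R_j^2), where R_j^2 comes
# from regressing predictor j on the other predictors.
# Large VIFs signal multicollinearity.
r2_disp  <- summary(lm(disp ~ wt + hp, data = mtcars))$r.squared
vif_disp <- 1 / (1 - r2_disp)
vif_disp  # well above 1: disp is largely predictable from wt and hp
```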
What is the effect of outliers on a regression model?
a. They can significantly skew the results of the model
b. They reduce the R-squared value
c. They increase the accuracy of the model
d. They have no effect on the model
A
What is the significance of the t-value in regression analysis?
a. It tests whether a predictor's coefficient is significantly different from zero
b. It represents the correlation coefficient
c. It represents the total sum of squares
d. It is used to calculate the residuals
A
In the context of regression analysis, what does ‘overfitting’ refer to?
a. A model with a low number of predictors
b. A model that accurately predicts the outcome for the sample data but not for new data
c. A model with a high R-squared value
d. A model with standardized coefficients
B
In linear regression, what is the implication of a p-value greater than 0.05 for a predictor variable?
a. The model does not fit the data.
b. The variable is not statistically significant.
c. The variable is statistically significant.
d. The model fits the data well.
B
What is the purpose of the ‘cor()’ function in R in the context of regression analysis?
a. To standardize variables
b. To plot a scatter plot
c. To calculate the regression coefficients
d. To perform correlation analysis between variables
D
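A brief sketch of ‘cor()’ as a pre-regression screen, using the built-in mtcars data as the example:

```r
# Pairwise correlations between the response and candidate predictors;
# a useful first look before fitting a regression.
round(cor(mtcars[, c("mpg", "wt", "hp")]), 2)

cor(mtcars$mpg, mtcars$wt)  # negative: heavier cars get fewer mpg
```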
In regression analysis, what is the purpose of using interaction terms?
a. To handle missing data.
b. To investigate if the effect of one predictor varies with another predictor.
c. To increase the linearity between predictors and the response variable.
d. To reduce the number of predictor variables.
B
The absence of multicollinearity can be confirmed solely by observing low correlations between pairs of predictor variables.
a. TRUE
b. FALSE
B
In regression analysis, what does ‘interaction effect’ refer to?
a. The joint effect of two or more predictors on the response variable beyond their separate effects
b. The correlation between the predictor variables
c. The relationship between the predictors and residuals
d. The effect of outliers on the model
A
In multiple regression analysis, what is the purpose of using dummy variables?
a. To replace missing values.
b. To incorporate categorical variables into the model.
c. To standardize the predictor variables.
d. To identify outliers in the dataset.
B
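In R, wrapping a column in `factor()` is enough for ‘lm()’ to build the dummy variables itself. A sketch with the built-in mtcars data:

```r
# factor() marks cyl as categorical; lm() then encodes it with
# 0/1 dummy variables, one per non-baseline level.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
names(coef(fit))  # includes dummies for cyl = 6 and cyl = 8 (cyl = 4 is the baseline)
```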
Which of the following indicates a strong correlation between a predictor variable and the response variable in a regression analysis?
a. A large standard error.
b. A low p-value.
c. A small coefficient estimate.
d. A high residual standard error.
B
In regression analysis, how is the adjusted R-squared different from the regular R-squared?
a. Adjusted R-squared is only used in simple linear regression.
b. Adjusted R-squared measures the correlation between variables.
c. Adjusted R-squared is always higher than regular R-squared.
d. Adjusted R-squared considers the number of predictors in the model.
D
In regression analysis, what is the significance of a p-value less than 0.05 for a coefficient?
a. The coefficient is not significant
b. The model is overfitted
c. The model has a high R-squared value
d. The coefficient is significantly different from zero
D
What does a scatter plot with a fitted regression line show in regression analysis?
a. The error terms of the regression model
b. The relationship between predictor and response variables
c. The standard deviation of the data
d. The mean value of the data points
B
Outliers in the data can significantly affect the coefficients in a linear regression model.
a. FALSE
b. TRUE
B
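This is easy to see by refitting after adding a single made-up point. A sketch in R with the built-in mtcars data (the extra point is hypothetical):

```r
# Compare slopes with and without one artificial outlier.
x <- mtcars$wt
y <- mtcars$mpg
slope_clean <- coef(lm(y ~ x))[2]

# Append one extreme, hypothetical point: a very heavy car with very high mpg.
x2 <- c(x, 10)
y2 <- c(y, 60)
slope_out <- coef(lm(y2 ~ x2))[2]

c(clean = slope_clean, with_outlier = slope_out)  # one point moves the slope noticeably
```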