Regression Flashcards
When is linear regression used?
Used when the relationship between variables x and y can be described with a straight line
Determines the strength of the relationship between x and y but it doesn’t tell us how much y changes based on a given change in x
a. Correlation
b. Regression
a. Correlation
Define correlation
Determines the strength of the relationship between x and y but it doesn’t tell us how much y changes based on a given change in x
Determines the strength of the relationship between x and y and tells us how much y changes based on a given change in x
a. Correlation
b. Regression
b. Regression
Define regression
Determines the strength of the relationship between x and y and tells us how much y changes based on a given change in x
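Worked example (a minimal sketch, not part of the original cards): the data and the helper names `pearson_r` and `slope` below are made up for illustration. Multiplying y by 10 leaves the correlation unchanged but makes the slope ten times larger, which is why correlation alone cannot say how much y changes for a given change in x.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    # Strength and direction of the linear relationship (unit-free).
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def slope(x, y):
    # Least-squares slope: change in y per 1-unit increase in x.
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy data, chosen only to keep the arithmetic simple.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_scaled = [10 * v for v in y]

print("r:", round(pearson_r(x, y), 4), "slope:", round(slope(x, y), 4))
print("r:", round(pearson_r(x, y_scaled), 4), "slope:", round(slope(x, y_scaled), 4))
```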
By proposing a model of the relationship between x and y, regression allows us to …?
Estimate how much y will change as a result of a given change in x
Estimate how much y will change as a result of a given change in x
a. Correlation
b. Regression
b. Regression
Distinguishes between the variable being predicted and the variable(s) used to predict
a. Correlation
b. Regression
b. Regression
True or False?
Correlation distinguishes between the variable being predicted and the variable(s) used to predict
False
Regression distinguishes between the variable being predicted and the variable(s) used to predict
How many predictor variables are in a simple linear regression?
There is only one predictor variable
What is the variable that is being predicted?
a. x
b. y
b. y
What is the outcome variable?
a. x
b. y
b. y
What is the predictor variable?
a. x
b. y
a. x
What is the variable that is used to predict?
a. x
b. y
a. x
y is…?
a. The criterion variable
b. The dependent variable
c. The outcome variable
d. The predictor variable
e. The independent variable
f. The explanatory variable
a. The criterion variable
b. The dependent variable
c. The outcome variable
x is…?
a. The predictor variable
b. The dependent variable
c. The independent variable
d. The criterion variable
e. The outcome variable
f. The explanatory variable
a. The predictor variable
c. The independent variable
f. The explanatory variable
Why might researchers use regression?
List 3 reasons
- To investigate the strength of the effect x has on y
- To estimate how much y will change as a result of a given change in x
- To predict a future value of y, based on a known value of x
Makes the assumption that y is (to some extent) dependent on x
a. Correlation
b. Regression
b. Regression
True or False?
The dependence of y on x will always reflect causal dependency
False
The dependence of y on x may or may not reflect causal dependency
True or False?
Regression provides direct evidence of causality
False
Regression does not provide direct evidence of causality
Linear regression consists of 3 stages
What are they?
- Analysing the relationship between variables
- Proposing a model to explain that relationship
- Evaluating the model
- Analysing the relationship between variables
- Proposing a model to explain that relationship
- Evaluating the model
These are stages of…?
a. Regression
b. Correlation
c. ANOVA
d. t-test
a. Regression
The first stage of regression involves analysing the relationship between variables
How do we do this?
By determining the strength and direction of the relationship (equivalent to correlation)
The second stage of regression involves proposing a model to explain the relationship
How do we do this?
By drawing the line of best-fit (regression line)
The third stage of regression involves evaluating the model to explain that relationship
How do we do this?
By assessing the goodness of the line of best-fit
What is the intercept?
Value of y when x is 0
What is the slope?
How much y changes as a result of a 1 unit increase in x
How much y changes as a result of a 1 unit increase in x
This is known as…?
a. The slope
b. The intercept
a. The slope
Value of y when x is 0
This is known as…?
a. The slope
b. The intercept
b. The intercept
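Worked example (a sketch under stated assumptions): the toy data and the names `fit_line` and `predict` are hypothetical. The code fits a least-squares line and checks the two definitions above: the prediction at x = 0 is the intercept, and moving x up by 1 unit changes the prediction by the slope.

```python
def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    # Ordinary least squares for one predictor; returns (intercept, slope).
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def predict(intercept, slope, x_value):
    # Regression line: predicted y = intercept + slope * x
    return intercept + slope * x_value


# Hypothetical toy data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x, y)
at_zero = predict(intercept, slope, 0)  # value of y when x is 0
one_unit_change = predict(intercept, slope, 4) - predict(intercept, slope, 3)

print("intercept:", round(intercept, 4), "prediction at x=0:", round(at_zero, 4))
print("slope:", round(slope, 4), "change for +1 in x:", round(one_unit_change, 4))
```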
Assumes no relationship between x and y (b=0)
a. Best model
b. Simplest model
b. Simplest model
Based on the relationship between x and y
a. Best model
b. Simplest model
a. Best model
Consists of the regression line
a. Best model
b. Simplest model
a. Best model
Consists of a flat, horizontal line
a. Best model
b. Simplest model
b. Simplest model
What does the simplest model assume?
Assumes no relationship between x and y (b=0)
What is the best model based on?
Based on the relationship between x and y
How do we assess the goodness of fit of the simplest model?
Refer to the total variance
Variance not explained by the mean of y
a. Best model
b. Simplest model
b. Simplest model
Variance not explained by the regression line
a. Best model
b. Simplest model
a. Best model
What is the residual variance for the best model?
Variance not explained by the regression line
What is the total variance for the simplest model?
Variance not explained by the mean of y
How do we assess the goodness of fit of the best model?
Refer to the residual variance
Calculate goodness of fit using residual variance
a. Best model
b. Simplest model
a. Best model
Calculate goodness of fit using total variance
a. Best model
b. Simplest model
b. Simplest model
The sum of squared differences between the observed values of y and the mean of y
i.e. the variance in y not explained by the simplest model (b = 0)
a. SST
b. SSR
a. SST
SST is…?
a. The sum of squared differences between the observed values of y and those predicted by the regression line
b. The sum of squared differences between the observed values of y and the mean of y
b. The sum of squared differences between the observed values of y and the mean of y
The sum of squared differences between the observed values of y and those predicted by the regression line
i.e. the variance in y not explained by the regression model
a. SST
b. SSR
b. SSR
What does the difference between SST and SSR reflect?
Reflects the improvement in prediction using the regression model compared to the simplest model
i.e. the reduction in unexplained variance using the regression model compared to the simplest model
What is the formula to calculate SSM?
SST - SSR = SSM
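Worked example (a minimal sketch with made-up data; `fit_line` is a hypothetical helper): SST is computed from the simplest model (the mean of y), SSR from the regression line, and SSM as their difference. Each is a sum of squared differences.

```python
def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    # Ordinary least squares for one predictor; returns (intercept, slope).
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical toy data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x, y)
y_mean = mean(y)

# Simplest model: predict the mean of y for every x (b = 0).
sst = sum((obs - y_mean) ** 2 for obs in y)

# Best model: predict from the regression line.
ssr = sum((obs - (intercept + slope * xi)) ** 2 for xi, obs in zip(x, y))

# Improvement due to the regression model.
ssm = sst - ssr

print("SST:", round(sst, 4), "SSR:", round(ssr, 4), "SSM:", round(ssm, 4))
```

For this toy data SST is 6, SSR is about 2.4 and SSM is about 3.6, so the regression line removes more than half of the variance left unexplained by the mean.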
The larger the SSM, the _______ the improvement in prediction using the regression model over the simplest model
a. Smaller
b. Bigger
b. Bigger
The larger the SSM, the bigger the …?
Improvement in prediction using the regression model over the simplest model
The _____ the SSM, the bigger the improvement in prediction using the regression model over the simplest model
a. Larger
b. Smaller
a. Larger
How do we evaluate the improvement due to the model (SSM), relative to the variance the model does not explain (SSR)?
Use an F-test (ANOVA)
What do we use an F-test (ANOVA) for when assessing the goodness of fit for a regression?
To evaluate the improvement due to the model (SSM), relative to the variance the model does not explain (SSR)
The improvement due to the model is known as…?
a. SSM
b. SSR
a. SSM
The variance the model does not explain is known as…?
a. SSM
b. SSR
b. SSR
Rather than using the Sums of Squares (SS) values, the F-test uses …?
Mean Squares (MS) values
True or False?
F-test uses Sums of Squares (SS) values
False
F-test uses Mean Squares (MS) values
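Worked example (a sketch, not a full ANOVA): the toy data and variable names are hypothetical. It assumes simple linear regression, where the model has 1 degree of freedom (one predictor) and the residual has n - 2. Each Mean Square is a Sum of Squares divided by its degrees of freedom, and F is the ratio MSM / MSR. Looking up the p-value for F needs an F distribution table or a statistics library, which is left out here.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical toy data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(y)

# Least-squares line for one predictor.
mx, my = mean(x), mean(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx

# Sums of Squares.
sst = sum((obs - my) ** 2 for obs in y)
ssr = sum((obs - (intercept + slope * xi)) ** 2 for xi, obs in zip(x, y))
ssm = sst - ssr

# Degrees of freedom for simple linear regression.
df_model = 1         # one predictor
df_residual = n - 2  # n observations minus two estimated parameters

# Mean Squares and the F-ratio.
msm = ssm / df_model
msr = ssr / df_residual
f_ratio = msm / msr

print("MSM:", round(msm, 4), "MSR:", round(msr, 4), "F:", round(f_ratio, 4))
```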