2A Flashcards
What is regression analysis?
- Regression analysis is about predicting values of a dependent variable (Y) from one or more independent variables (X), using cases for which Y has already been observed.
- You try to determine what the association is between these variables.
Explanatory purpose of regression analysis
It can be informative about causal relationships (but correlation does not imply causation!)
"Could wealth be a cause of the quality of democracy?"
Descriptive purpose of regression analysis
Even without a causal relation, it is interesting in its own right to know that two things often go together
for example poverty and authoritarianism often go together
The simple linear regression model
to make a prediction, we use a model:
Yi = b0 + b1Xi + εi
Y = dependent variable
i = index of the case (each case i has its own values Yi and Xi)
b0 = intercept / constant (where the line crosses the y-axis)
b1 = regression coefficient (how steep the line is: when X increases by one unit, how much does Y change?)
X = independent variable
ε = error term (residual, the difference between the observed Y and the predicted Y)
- the intercept is also the value for Y if the value of X is 0.
- without the residual, the equation wouldn’t really be true because not every case is on the line
- The predicted value Ŷ increases or decreases by the value of the regression coefficient for every one-unit increase in X.
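A minimal sketch of how b0 and b1 could be estimated, assuming ordinary least squares (the notes do not name the estimation method). The data and the function name fit_simple_regression are invented for illustration.

```python
# Hypothetical sample data, invented for illustration only.
x = [1, 2, 3, 4, 5]   # independent variable X
y = [2, 4, 5, 4, 5]   # dependent variable Y


def fit_simple_regression(x, y):
    """Return (b0, b1) estimated by ordinary least squares (assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how X and Y vary together, divided by how much X varies.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_simple_regression(x, y)
print(f"intercept b0 = {b0:.2f}")  # predicted Y when X is 0
print(f"slope b1     = {b1:.2f}")  # change in predicted Y per one-unit increase in X
```

For this made-up sample the slope is 0.6 and the intercept is 2.2, so the predicted Y rises by 0.6 for every one-unit increase in X.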
predicted values
values on Y for each case based on the estimated model
they do not include the residuals, so they all lie exactly on the regression line
observed values
values on Y for each case that we actually observe in the sample
they include the residuals: observed value = predicted value + residual
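A small sketch of the difference between predicted and observed values. The data and the estimates for b0 and b1 are hypothetical, and the residual is taken as observed minus predicted.

```python
# Hypothetical estimates and data, invented for illustration only.
b0, b1 = 2.2, 0.6
x = [1, 2, 3, 4, 5]
y_observed = [2, 4, 5, 4, 5]

# Predicted values come from the model alone, so they lie on the line.
y_predicted = [b0 + b1 * xi for xi in x]

# Residual = observed value minus predicted value for the same case.
residuals = [yo - yp for yo, yp in zip(y_observed, y_predicted)]

for xi, yo, yp, e in zip(x, y_observed, y_predicted, residuals):
    print(f"X={xi}  observed={yo}  predicted={yp:.1f}  residual={e:+.1f}")
```

Adding each residual back to its predicted value returns the observed value, which is why the model equation needs the error term.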
why is this called simple linear regression?
simple: we have just one independent variable X
linear: the effect of X on Y can be represented by a straight line
- steepness of the line (slope) given by value of regression coefficient
- we multiply the value of X of each case by b1
how do you test the significance of regression coefficients?
- to test the statistical significance of the regression coefficients, we use the t-test
- t = (b1 estimated - b1 expected under H0) / SE b1
- so, t = b1 estimated / SE b1, because b1 expected under H0 is 0.
- this is because H0 means there is no effect, so the regression coefficient is 0.
- we can look up the value of t in the t-distribution (with n - 2 degrees of freedom in simple regression) to determine the p-value
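A sketch of the t computation for the slope, on invented data. The notes do not give a formula for SE b1. The one used here is the standard least-squares formula and is an assumption of this example.

```python
import math

# Hypothetical sample data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares estimates of the slope and the intercept.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Sum of squared residuals and degrees of freedom (n minus 2 estimated coefficients).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
df = n - 2

# Assumed standard formula: SE(b1) = sqrt((SSE / df) / Sxx).
se_b1 = math.sqrt((sse / df) / sxx)

# Under H0 the slope is 0, so t is the estimate divided by its standard error.
b1_under_h0 = 0.0
t = (b1 - b1_under_h0) / se_b1

print(f"b1 = {b1:.3f}, SE b1 = {se_b1:.3f}, t = {t:.3f}, df = {df}")
```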
meaning of the p-value
- the probability that you would have found the estimated coefficient for b1 (or one even further from 0) in our sample if X and Y were completely unrelated in the population.
- if the p-value < α, we can reject the null hypothesis
= the effect of X on Y is statistically significant, so we infer that there is an effect in the population
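A rough sketch of how a p-value follows from t, using only the Python standard library. It assumes a two-sided test and approximates the area under the t-distribution numerically. The values for t, the degrees of freedom and α are hypothetical, and in practice the p-value would come from statistical software.

```python
import math


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_value, df, steps=10_000):
    """Approximate P(|T| >= |t_value|) with Simpson's rule (steps must be even).

    Sketch only: math.gamma overflows for very large df.
    """
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * t_pdf(k * h, df)
    # The density is symmetric, so double the area between 0 and |t|.
    central_area = 2 * (h / 3) * total
    return 1 - central_area


# Hypothetical inputs, invented for illustration only.
t_value = 2.1213
df = 3
alpha = 0.05

p_value = two_sided_p_value(t_value, df)
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

With these made-up inputs the p-value is above 0.05, so H0 would not be rejected even though the estimated slope is not 0.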
where in the output can you find the values you need
How to interpret SPSS-output for simple regression analysis
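- You look at the Coefficients table (the ANOVA table only reports the F-test for the model as a whole)
- b0 is the value in column B in the (Constant) row
- b1 is the value in column B in the row of the independent variable
- SEb1 is the Std. Error in the row of the independent variable
- t is the t in the row of the independent variable
- the p-value is in the Sig. column of the same row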