simple regression Flashcards
1
Q
regression
A
- it is very common to attempt to predict variance in an outcome (DV) from one or more predictors (IV)
- aims to tell us: whether our model is a good fit (does it explain a good amount of variance), whether there are significant relationships between a predictor variable and the outcome variable, and the direction of these relationships
2
Q
R2
A
- the proportion of total variation (SST) that is explained by the regression (SSR) is known as the coefficient of determination
- often referred to as R2
- this value can range between 0 and 1, and the higher it is the more of the variance in the outcome the regression model explains
- it is often expressed as a percentage, e.g. R2=.7 means 70% of variance is accounted for
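The ratio on this card can be sketched in plain Python. The toy data and variable names below are made up for illustration. SSR is taken to mean the regression (explained) sum of squares, as on the card.

```python
# Hypothetical toy data: x is the predictor (IV), y is the outcome (DV).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for a simple regression.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# SST: total variation of the outcome around its mean.
sst = sum((yi - mean_y) ** 2 for yi in y)
# SSR: variation explained by the regression (predicted values around the mean).
ssr = sum((pi - mean_y) ** 2 for pi in predicted)

r_squared = ssr / sst
print(round(r_squared, 2))
```

For this toy data SST is 6 and SSR is 3.6, so R2 comes out at .60, i.e. 60% of the variance is accounted for.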
3
Q
adjusted R2
A
- interpreted in the same way as R2 and is never higher than it
- an adjustment based on the number of predictors in the model (relative to the sample size)
- useful because adding new predictors can never decrease R2, even when they explain nothing meaningful
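A minimal sketch of the adjustment, assuming the commonly used formula 1 - (1 - R2)(N - 1)/(N - k - 1). The function name and the example numbers are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R2 for n cases and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same R2, same sample size, different numbers of predictors.
few_predictors = adjusted_r_squared(0.70, n=100, k=3)
many_predictors = adjusted_r_squared(0.70, n=100, k=10)
print(round(few_predictors, 3), round(many_predictors, 3))
```

With the same R2 and sample size, the model with more predictors is penalised more heavily, which is the point of the adjustment.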
4
Q
model
A
- R2 and adjusted R2 are used to evaluate model fit
- Whether it is significant or not is determined by an ANOVA (analysis of variance)
- We get an F statistic, degrees of freedom and a p value for the model
- The F statistic for a regression is calculated using the mean square for the regression (MSR, not the sum of squares for the regression, SSR) and the mean square of the error (MSE, not the sum of squares for the error, SSE) -> F = MSR/MSE
- Degrees of freedom are the number of predictors (k) and N-k-1
○ So if I had 100 participants and three predictors my df would be 3, 96
- Using the F and the df, a p value for the model can be calculated
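The F calculation and the degrees of freedom can be sketched as follows. The function name and the sums of squares are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def f_test_parts(ssr, sse, n, k):
    """Return (F, df_model, df_error) for a regression with k predictors and n cases."""
    df_model = k            # number of predictors
    df_error = n - k - 1    # cases minus predictors minus one
    msr = ssr / df_model    # mean square for the regression
    mse = sse / df_error    # mean square of the error
    return msr / mse, df_model, df_error

# Hypothetical sums of squares for 100 participants and three predictors.
f_value, df1, df2 = f_test_parts(ssr=30.0, sse=96.0, n=100, k=3)
print(f"F({df1}, {df2}) = {f_value:.2f}")

# The p value comes from the F distribution with these df, for example
# scipy.stats.f.sf(f_value, df1, df2) if SciPy is installed (not run here).
```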
5
Q
linear regression
A
- The overall regression model doesn’t tell you information about specific predictors
- From the overall model alone you do not know the direction of the association between variables (positive or negative)
- In order to understand individual predictors it is necessary to look at the individual regression coefficients
6
Q
regression coefficient
A
- A regression coefficient, B/b, is the number of units the DV changes for each one unit increase in the IV
○ B=2.00 means for each unit increase of the IV the DV increases by two units
○ B=-.01 means for each one unit increase of the IV the DV decreases by .01 units
- In isolation this value has limited use, and must be interpreted alongside its standard error (SE)
- SE = how much the estimated regression coefficient would be expected to vary from sample to sample
- Ideally the standard error is small, meaning the regression coefficient is estimated precisely
- The t statistic is the regression coefficient divided by its SE (t = B/SE), so the larger the coefficient is (in absolute terms) compared to the SE, the larger the t statistic and the smaller the p value calculated for the association
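A rough worked example of how B, its SE and the t statistic relate in a simple regression, using the standard least-squares formulas. The data are invented for illustration.

```python
import math

# Hypothetical toy data: x is the IV, y is the DV.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sxy / sxx                      # unstandardised coefficient B
intercept = mean_y - b * mean_x

# Error sum of squares and mean square (n - 2 df in a simple regression).
sse = sum((yi - (intercept + b * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

se_b = math.sqrt(mse / sxx)        # standard error of B
t = b / se_b                       # t statistic for the predictor

print(round(b, 2), round(se_b, 3), round(t, 2))
```

In a simple regression the square of this t statistic equals the model F statistic, because there is only one predictor.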
7
Q
standardised regression coefficients
A
- Beta (β) values are also commonly reported
- They explain the association between each IV and DV in terms of standard deviation changes
○ β=.50 means that for every one standard deviation increase in the IV there is a .50 standard deviation increase in the DV
- The most useful property of the beta value is that it allows simple comparison of the strength of the association between your IVs and DV: the larger the absolute value of beta, the stronger the association
- It is notable that a standardised regression coefficient is just a different way of expressing the same information as an unstandardised regression coefficient so they have exactly the same p value
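One way to see that beta is the same information as B on a different scale is to rescale B by the standard deviations of the IV and DV. This is a sketch with invented data and a hypothetical function name.

```python
import statistics

def standardised_beta(b, x, y):
    """Convert an unstandardised coefficient B to beta using the sample SDs."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Hypothetical toy data; 0.6 is the least-squares B for these values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta = standardised_beta(0.6, x, y)
print(round(beta, 3))
```

In a simple regression with one predictor, beta is equal to Pearson's r between the IV and the DV.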
8
Q
simple and multiple regression assumptions
A
- Normally distributed continuous outcome (strictly, it is the residuals that need to be normally distributed)
- Independent data, i.e. not a within-subjects design: each data point needs to come from a different participant
- Interval/ratio predictors
- Nominal predictors with 2 categories (dichotomous)
- No multicollinearity for multiple regression
- Be careful of influential cases (e.g. outliers that have a large effect on the model)
9
Q
what to do in a simple regression
A
- load in data
- run the regression
- obtain confidence intervals and beta coefficients
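The confidence interval step can be sketched as B plus or minus a critical t value times the SE. The numbers below are hypothetical, and the critical value is passed in by hand rather than looked up by a library.

```python
def confidence_interval(b, se, t_critical):
    """Return (lower, upper) bounds for a coefficient: B +/- t_critical * SE."""
    margin = t_critical * se
    return b - margin, b + margin

# Hypothetical values: B = 0.60, SE = 0.283, and 3.182 is the two-tailed
# 95% critical t value for 3 error degrees of freedom.
lower, upper = confidence_interval(0.60, 0.283, 3.182)
print(f"95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that includes zero, as this one does, corresponds to a predictor that is not significant at the .05 level.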
10
Q
full write up
A
- a simple regression was carried out to investigate the relationship between…
- whether the model was significant, variance explained (adjusted R2), the F statistic (with df) and the p value
- significance of the predictor: B, standard error, p value and confidence intervals