Linear Regression Flashcards

1
Q

What is a simple linear regression?

A

It predicts ONE outcome variable from ONE predictor variable

2
Q

Can a significant value for a coefficient (p < .05) tell us about the magnitude of an effect?

A

NO. That tells us whether estimates are significantly different from ZERO, but not about the magnitude of the effect.

3
Q

What does a p value really tell us in a coefficient table?

A

Whether the estimate is significantly different from zero, i.e. more than just by chance. It's just a YES or NO.

It does not tell us about magnitude of this effect though.

4
Q

In the coefficient output, for a simple linear regression, when we are looking to see what is the value of Y when X is 0, we are looking at the intercept. What coefficient value do we look to?

A

Unstandardised B next to the “intercept” (Constant) word in the output.

The intercept is a CONSTANT value - remember this.

And remember, the unstandardised coefficient next to the predictor variable, beneath the “intercept” word, also tells us about the slope: that figure gives the direction of the relationship.
Standardised beta (b) would give the magnitude of the effect.

Remember our model has this formula - positive affect (the outcome variable) is predicted by, first, the intercept, which in this output is the CONSTANT UNSTANDARDISED B of 2.853; that is the value of Y when X is 0. Sometimes it's called b naught (b0).
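
A minimal Python (statsmodels) sketch of where these numbers sit, using made-up data and a hypothetical predictor named optimism (the variable names and values are assumptions, not the lecture output):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(42)
    optimism = rng.normal(5, 1, 100)                  # hypothetical predictor (X)
    positive_affect = 2.853 + 0.5 * optimism + rng.normal(0, 1, 100)  # outcome (Y)

    X = sm.add_constant(optimism)                     # adds the intercept (constant) column
    model = sm.OLS(positive_affect, X).fit()
    print(model.params[0])                            # intercept b0: value of Y when X = 0
    print(model.params[1])                            # slope b1: its sign gives the direction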

6
Q

Why is it better to use the standardised coefficient instead of the unstandardised coefficient when looking at effect size?

A

Because the standardised coefficient shows us, for every 1 SD change in the predictor variable, how many SDs the outcome variable changes - as opposed to how many raw units it changes (unstandardised). Using SDs helps when comparing models, as the units will always be SDs rather than arbitrary units that depend on the measures/variables.
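
A sketch of what standardising does, in Python with made-up variables x and y (the data are assumptions for illustration): after z-scoring both, the slope is in SD units and is comparable across differently scaled measures.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.normal(50, 10, 200)                  # predictor in arbitrary units
    y = 3 + 0.2 * x + rng.normal(0, 2, 200)      # outcome in different units

    zx = (x - x.mean()) / x.std()                # z-scored predictor
    zy = (y - y.mean()) / y.std()                # z-scored outcome

    b = sm.OLS(y, sm.add_constant(x)).fit().params[1]        # unstandardised: y units per x unit
    beta = sm.OLS(zy, sm.add_constant(zx)).fit().params[1]   # standardised: SDs of y per SD of x
    print(b, beta)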

7
Q

So, we can get effect sizes from standardised coefficients. How can we get effect sizes by examining variance?

A

Through looking at the r squared.

R squared indicates the proportion (or percentage) of total variance accounted for by the model.

8
Q

What is R squared?

A

The squared CORRELATION between the ACTUAL DV scores and the PREDICTED DV scores.

Essentially it is the proportion of variance explained by the model.
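
A quick check of this definition in Python, with simulated data (an illustrative sketch, not a required procedure): the squared correlation between actual and predicted DV scores matches the reported R2.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.normal(0, 1, 100)
    y = 2 + 1.5 * x + rng.normal(0, 1, 100)

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    r = np.corrcoef(y, fit.fittedvalues)[0, 1]   # correlation of actual vs predicted DV
    print(r**2, fit.rsquared)                    # the two values match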

9
Q

What is another word for R?

A

Correlation

10
Q

What is another word for R2?

A

Squared correlation

11
Q

The variance of an outcome variable is 5. The regression tells us the variance of the residuals is 4. We then subtract residual variance from total variance, which leads us to…?

A

The variance explained by the model: 5 - 4 = 1.

The model-explained variance.

Dividing it by the total variance gives the proportion explained: R2 (the squared correlation) = 1/5 = .20.
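
The arithmetic from this card, spelled out in Python:

    total_var = 5.0        # variance of the outcome variable
    residual_var = 4.0     # variance of the residuals from the regression

    explained_var = total_var - residual_var   # variance explained by the model = 1
    r2 = explained_var / total_var             # proportion explained: R2 = 0.20
    print(explained_var, r2)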

12
Q

What does 0 in R2 indicate?

What does 1 in R2 indicate?

A

0 indicates NONE of the variance is explained by the model

1 indicates ALL of the variance is explained by the model

13
Q

Do people consider an r2 of .25 small?

A

No.

.04 is considered small (4%)
.09 is medium (9%)
.25 is large (25%)

14
Q

Are effect size guidelines for r squared definitive, or are they “t-shirt” sizes?

A

T-shirt sizes. No set rules.

15
Q

So what is adjusted r square?

A

As opposed to r2, which looks at the proportion of variance explained by a model derived from data from a specific sample, ADJUSTED r square gives an estimate of the r2 in the population!

Meaning, how much variability would be explained if the MODEL were derived from the population rather than the sample.

It is more conservative.
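
The usual adjustment formula, as a Python sketch (n = sample size, k = number of predictors; the example numbers are made up):

    def adjusted_r2(r2, n, k):
        # estimate of R2 in the population, given n cases and k predictors
        return 1 - (1 - r2) * (n - 1) / (n - k - 1)

    print(adjusted_r2(0.25, n=50, k=3))   # about 0.20: always a bit below the sample R2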

16
Q

Why would an adjusted r square be important - why can’t you just use the r squared provided as an estimate for the population?

A

Because the regression model might overfit your particular data set. Therefore it may not work as well with other samples as it does with YOUR data.

17
Q

Why can r2 be expected to vary?

A

Because sample correlations vary around the population correlation

18
Q

True or false: the sample becomes less representative as the sample size decreases

A

TRUE

19
Q

What is sampling error?

A

The discrepancy between sample and population

20
Q

Why does sampling error increase as sample size decreases AND as the number of predictors increases?

A

Regarding predictors: because there is error associated with each predictor.

And regarding sample size: a smaller sample is less representative of the population.

21
Q

The R2 is likely to overestimate the size of the effect because:

A. Sampling error decreases as sample size decreases
B. Sampling error increases as sample size decreases
C. Sampling error increases as the number of predictors increases
D. B and C

A

D

22
Q

Regression chooses the ____ therefore it is prone to overfitting the data

A

Best fit

23
Q

What is failure to replicate the r2 called?

A

Shrinkage

24
Q

How is shrinkage best evaluated?

A

Cross Validation Study
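
A minimal two-half cross-validation sketch in Python with simulated data (an illustration of the idea, not a full procedure): fit the model on one half, predict the held-out half, and compare the two R2 values.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n = 200
    X = rng.normal(0, 1, (n, 3))                 # three predictors
    y = 0.4 * X[:, 0] + rng.normal(0, 1, n)

    train, test = slice(0, 100), slice(100, None)
    fit = sm.OLS(y[train], sm.add_constant(X[train])).fit()
    pred = sm.add_constant(X[test]) @ fit.params      # predict holdout from training model

    r2_train = fit.rsquared
    r2_test = np.corrcoef(y[test], pred)[0, 1] ** 2   # squared correlation, actual vs predicted
    print(r2_train - r2_test)                         # a positive gap estimates shrinkage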

25
Q

If there is a large effect size in a regression model, does this mean the same model will show that effect size in a different sample?

A

NO. Because an effect size from one sample is overestimated because of overfitting issues. Therefore the next model you run on a new sample might have a smaller effect size, or one that is not statistically significant.

26
Q

What does it mean if another model, run on a new sample, has a smaller effect size than the previous model?

A

SHRINKAGE

27
Q

What does a cross validation study assess?

A

How generalisable your model is.

It estimates the shrinkage in your r2 value and adjusted r2 value.

28
Q

Are small or large datasets more likely to highlight odd things about the data?

A

Small datasets

29
Q

What is the word for a large discrepancy between r2 and adjusted r2?

A

Shrinkage

30
Q

What does shrinkage indicate?

A

The regression model does not generalise well to the population

31
Q

A difference of about 0.5% between r2 and adjusted r2 is probably acceptable. The larger the difference is, the less our model will _____

A

generalise

More than a few percent (+3%) between r2 and adjusted r2 is considered unacceptable

32
Q

Are there guidelines about how much shrinkage is too much?

A

No

33
Q

Some people argue that shrinkage can be evaluated as ___% being acceptable if r2 is .50. What is the advantage of this?

A

5%.

More leeway for shrinkage.

34
Q

What is the most useful effect size to be looking at for multiple regression?

A

f2! It gives the unique variance for individual predictors.

35
Q

How did f2 come about?

A

It’s based on R2. It tells us the unique effect of a variable on the outcome.

The f2 gives an effect size for the proportion of residual variance explained - for the unique effects of a predictor

36
Q

When an overall model explains a lot of variance, would we expect the effect size for the same amount of unique variance to go down or up?

A

UP.

SO, f2 gives an effect size for the PROPORTION of residual variance explained.

37
Q

What is the difference between linear regression and multiple regression?

A

If two or more explanatory variables have a linear relationship with the dependent variable, the regression is called a multiple linear regression.

Simple linear regression is one predictor variable with a dependent variable.

38
Q

When we run a regression, we hope to be able to _____ the ____ model to the _____?

A

Generalise the sample model to the population

39
Q

Why do assumptions need to be met in order to be able to generalise to the population?

A

Because violating these assumptions can affect how well the regression model fits the data and how well it can be generalised. Violations call the model's validity into question.

40
Q

Does population mean everybody or just the people you are interested in?

A

The latter

41
Q

What are the assumptions in a regression that we have to meet?

A

L.I.N.E: Linearity, Independence of errors, Normality of errors, Equal variances (homoscedasticity)

42
Q

What do regression plots show us?

A

Whether the data are linear or not

43
Q

What is a partial regression plot and why is it important in multiple regression?
(residuals vs. predicted)

A

It’s a plot of residuals against fitted values - it is not about prediction.

If you have multiple predictors in your model, you may just want to look at one.

BUT this plot is NOT the effect of X on Y. It is just a plot of FITTED VALUES.

The fitted values go on the X axis. On the Y axis we have the residuals (shown as ranges for regular residuals and standardised residuals), i.e. the error: how far away the observed data points are from the line of best fit.

44
Q

On the residuals vs predicted plot, is the data standardised?

A

Yes

45
Q

Why do you NOT want to see the residuals fitting a line (a pattern) in the residuals vs. predicted plot?

A

We want to see the residual data points randomly dispersed around a horizontal line going through 0 on the Y axis. If they are, the model is probably OK for linearity.
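
A sketch of how to draw this plot in Python (matplotlib/statsmodels) with simulated data:

    import numpy as np
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(3)
    x = rng.normal(0, 1, 150)
    y = 1 + 2 * x + rng.normal(0, 1, 150)

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    plt.scatter(fit.fittedvalues, fit.resid)   # residuals vs fitted (predicted) values
    plt.axhline(0)                             # we want random scatter around this line
    plt.xlabel("Fitted values")
    plt.ylabel("Residuals")
    plt.show()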

46
Q

IF there was a clear pattern in the residuals vs. predicted, what could you do to deal with this?

A

Try transforming the data with Log etc.

47
Q

Why is it not usually a great idea to transform data with Log in terms of what this would mean for the variable in the model?

A

Because the model would then be about the transformed variable rather than the actual one. And there may not be a significant relationship between the untransformed predictor and the outcome variable. So perhaps look at a different test to linear regression.

48
Q

The tricky assumptions for regression that involve the error terms are

Independent errors
Normally distributed errors
and Homoscedasticity

How do we check this?

A
  1. To check the error terms are not correlated, you can look at the correlation between residuals (as mentioned in the correlations week).
  2. For normally distributed errors, look at the residuals vs. predicted plot and check for an absence of pattern. We want the residuals to be random and normally distributed with a mean of 0.
    BUT the predictors do NOT need to be normally distributed; that just improves the chances of this assumption being met.
  3. Homoscedasticity: for each value or level of the predictor, the variance of the error term is constant (residuals should have the SAME variance). This is checked visually. (A sketch of these checks follows below.)
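
A Python sketch of concrete checks, with simulated data. The Durbin-Watson statistic is not named on the card but is one standard check for correlated errors; treat the thresholds as rules of thumb.

    import numpy as np
    import statsmodels.api as sm
    import matplotlib.pyplot as plt
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(5)
    x = rng.normal(0, 1, 150)
    y = 1 + 0.5 * x + rng.normal(0, 1, 150)
    fit = sm.OLS(y, sm.add_constant(x)).fit()

    print(durbin_watson(fit.resid))   # independence: values near 2 suggest uncorrelated errors
    sm.qqplot(fit.resid, line="s")    # normality: points should hug the line
    plt.show()
    # homoscedasticity: plot fit.resid against fit.fittedvalues and look for
    # constant vertical spread at every level of the fitted values
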
49
Q

True or false: predictors have to be normally distributed for assumption of normal distribution being met

A

False - know this

50
Q

When assessing the error assumptions and checking normally distributed errors visually, why won’t we see the error terms perfectly fitting a straight line?

A

Because it’s an error term, which represents what is left over; therefore it won’t be exactly on the line (if it were, the error term would be zero). The closer the better, yes, but each individual data point has its own error.

The linear equation has an error term with subscript i: individual i’s outcome Y(i) = the intercept (b0), plus the slope (b1) times X(i), plus i’s residual (error term e(i)).

51
Q

Can histograms and QQ plots show us whether errors are normally distributed?

A

Yes

52
Q

What do we expect to see when looking at homoscedasticity of residuals?

A

That at each level of the predictor, residuals have the same variance. So equal variances. Heteroscedasticity = unequal.

53
Q

The residuals at value X should be the ____ at each level of the predictor to meet the assumption of homoscedasticity?

A

SAME

54
Q

What do the residuals do?

A

They tell us how good the model is and if it’s modelling what we think it is modelling.

55
Q

What are predicted values?

A

The estimated outcome values

56
Q

What are residuals?

A

The deviations of observed from predicted values

57
Q

Why are standardised predicted and residual values preferred?

A

They allow for easier interpretation: they are on a common z score scale, with a mean of 0 and an SD of 1.

58
Q

Historically, the most commonly reported statistics for a multiple regression were:

A

unstandardised coefficient, the standard error, and the p-value

59
Q

What are the most commonly reported statistics for multiple regression now?

A

STANDARDISED coefficient, AND confidence intervals instead of the standard error

60
Q

What does a smaller confidence interval indicate?

A

A more precise estimate of the true population value, whereas a wide CI indicates more uncertainty about the true value, usually due to sample size.
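
In Python (statsmodels), the 95% CIs for each coefficient can be pulled straight from a fitted model; a sketch with simulated data:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(11)
    x = rng.normal(0, 1, 100)
    y = 2 + 0.6 * x + rng.normal(0, 1, 100)

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    print(fit.conf_int(alpha=0.05))   # one [lower, upper] row per coefficient

With a smaller sample the intervals widen, reflecting more uncertainty about the true value.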

61
Q

What is the common threshold for 95% CI?

A

1 - .05 (i.e. alpha = .05)

62
Q

What is preferred to be reported in results for multiple regression?

A. unstandardised coefficient
B. standardised coefficient
C. standard error
D. confidence interval and standardised coefficient

A

D

63
Q

For categorical predictors, is it best to report standardised or unstandardised beta?

A

Unstandardised (B), because 1 unit = the difference between male and female, for example.

Whereas Beta (b), which refers to a 1 SD change, makes it more difficult to show the difference between male and female

64
Q

A multiple regression is used to predict values of an outcome from ____ predictors

A

several

65
Q

Is it always best to report standardised (b) effect sizes?

A

No. Unstandardised (B) can be useful for categorical predictors, because a 1 unit change conveys more helpful information than a 1 SD change.

66
Q

Why will each X variable’s coefficient in a multiple regression equation be different?

A

Because in a MR, each coefficient is adjusted for all the other predictors in the model. MR takes into account variance explained by other IVs in the model.

67
Q

Why, in the equation for multiple regression, is the e(i) not included for prediction?

A

Because when formulating the prediction equation we are PREDICTING, and we cannot know what someone’s error term would be - it does not exist yet; we are just using the regression line to predict the score. We are not observing a score and seeing how far off the model is (we did that with the observed data). So this is prediction.

68
Q

What does Y hat (the ^ symbol above the Y) refer to?

A

Predicted score. It means it wasn’t something observed, but something you think will happen if you use the model for prediction. Y(i), by contrast, is the actual observed score.

69
Q

Why can’t we compare the unstandardised coefficients (B) to see which predictor has most influence on the outcome variable?

A

SCALE issue.

Because unstandardised coefficients do not have a mean of zero and are not z score versions, they cannot be compared directly. As predictors might be measured on different scales, standardised coefficients (beta values) are the z score version of B, which means they all have the same mean of zero and an SD of one, so we can compare them directly.

70
Q

Why can we compare the standardised regression coefficient directly and therefore see which predictor has the biggest effect on the DV?

A

Because the standardised coefficient values are the z score version of the b values (the unstandardised values), which all have the same mean (0) and the same standard deviation (1). So we can compare them directly.

71
Q

When looking at a coefficient table, which figure do we look at to see which variable is the best predictor of the DV in the model?

A

The standardised coefficient. It tells us how much the outcome changes for each 1 SD increase IN THE PREDICTOR.

72
Q

What does multicollinearity show us?

A

When two predictor variables overlap a lot, i.e. HIGH intercorrelations between predictor variables.

If you have a VIF of more than 10, something is multicollinear.
Tolerance (which is 1/VIF) is the other diagnostic to check.
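
A VIF sketch in Python (statsmodels), with two deliberately collinear made-up predictors:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(2)
    x1 = rng.normal(0, 1, 200)
    x2 = x1 + rng.normal(0, 0.1, 200)     # nearly a copy of x1: deliberately collinear
    X = sm.add_constant(np.column_stack([x1, x2]))

    for i in (1, 2):                               # skip column 0 (the constant)
        print(variance_inflation_factor(X, i))     # large values (> 10) flag multicollinearity
    # tolerance is simply 1 / VIF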

73
Q

What would a high VIF (over 10) suggest?

A

That perhaps these variables, given their overlap, are measuring something similar conceptually

74
Q

Why might we include multiple indicators or variables into a latent or combined variable?

A

Multiple ways of measuring the same conceptual thing can give rich data, and help explain more of the variance in the construct.

75
Q

What does a squared semi partial correlation (sr2) tell us?

A

The proportion of variability in the outcome uniquely accounted for by that predictor. So, just like correlations, it is the unique variance shared with that predictor, taking into account the variance shared with the other predictors.

76
Q

What do we use to calculate f2?

A

sr2 and r2
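
The formula, as a small Python sketch with made-up numbers: f2 for a predictor's unique effect is its sr2 divided by the variance the full model leaves unexplained.

    def cohens_f2_unique(sr2, r2):
        # proportion of otherwise-unexplained variance this predictor accounts for
        return sr2 / (1 - r2)

    print(cohens_f2_unique(sr2=0.05, r2=0.40))   # 0.05 / 0.60 = 0.083...

This also shows the earlier point: the same sr2 gives a larger f2 when the overall model explains more variance.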

77
Q

If your sample size is small in a regression model, and there are quite a few predictor variables, why is this problematic?

A

Each time you add a predictor variable you decrease your degrees of freedom. This means your r2 value will approach 1, and an r2 approaching 1 means your model will explain 100% of the variance. But the model will probably be fitting noise. It might just be a model specific to your sample and not generalisable to the population

78
Q

Just like Pearson correlations are tested for significance, regression equations should also be tested for significance, to show whether predictions are significantly better than chance. How do we compute this?

A

By computing an F ratio.

80
Q

What does a significant F ratio indicate?

A

That the equation predicts a significant proportion of the variability in the Y scores (i.e. more than would be expected by chance alone)

It examines whether total model variance accounted for is significantly greater than 0.

81
Q

True or false: Total variance is ALWAYS greater than 0 because R2 cannot be less than 0, but greater than 0 does NOT indicate significance

A

True

82
Q

The variance accounted for by the regression model DIVIDED by the total variance is also known as..

A

R squared

83
Q

When looking at the ANOVA table, what does the F ratio show?

A

The F ratio compares the variance predicted by the model with the variance that’s left over: the residual variance, or the variance not predicted by the model.

84
Q

The F ratio in an ANOVA table, when looking at the output, is calculated using mean squares (MS). How are mean squares calculated?

A

By taking the sum of squares and dividing by the degrees of freedom.
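
The ANOVA-table arithmetic in Python, with made-up sums of squares and degrees of freedom:

    ss_model, df_model = 120.0, 2          # e.g. 2 predictors
    ss_residual, df_residual = 300.0, 97   # e.g. n = 100 cases: 100 - 2 - 1

    ms_model = ss_model / df_model         # MS = SS / df
    ms_residual = ss_residual / df_residual
    f_ratio = ms_model / ms_residual       # variance predicted vs variance left over
    print(ms_model, ms_residual, f_ratio)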

85
Q

True or false: we expect our model sum of squares to be much greater than the residual or error sum of squares

A

TRUE. Because this means a good model

86
Q

When looking at the output for a regression, if we had to find out whether the model overall resulted in a significantly better prediction of the outcome than simply using the mean, what would we look at?

A

The ANOVA table - specifically the F ratio and the p value.

87
Q

What makes up total variance in data of a whole model?

A

Whatever variance the model we made explains, AND the variance in the data that is left over.

88
Q

What is another word for the variance in the data left over by the model?

A

residual variance

89
Q

If, after running a regression model, there is a better prediction than just using the mean, what would we expect the model’s sum of squares to be in comparison to the residual or error sum of squares?

A

Expect model sum of squares to be GREATER than residual sum of squares

90
Q

If we looked at the ANOVA table output and wanted to see if it was a good model, what would we hope to see when looking at the model sum of squares?

A

That it is greater than the error sum of squares

91
Q

Do the outcomes in regression have to be continuous?

A

Yes

92
Q

Can both predictors be categorical in a regression?

A

Yes

93
Q

If all of your predictors are categorical, you should use ANOVA. True or False.

A

True

94
Q

Predictor variables in a multiple regression should be:

A. Continuous
B. Dichotomous/Binary (0,1)
C. Recoded if categorical
D. All of the above

A

D

95
Q

is gender a nominal or ordinal categorical variable?

A

Nominal

96
Q

What should you do with a nominal variable like gender in a regression?

A

Dummy code: 0 and 1. If 1 refers to woman, then the first variable recoding is Woman YES/NO. Then Man YES/NO.
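
A dummy-coding sketch in Python (pandas), with a made-up gender column:

    import pandas as pd

    gender = pd.Series(["woman", "man", "woman", "man", "woman"])
    print(pd.get_dummies(gender, dtype=int))                   # one 0/1 column per category
    print(pd.get_dummies(gender, dtype=int, drop_first=True))  # drop one column: the omitted
                                                               # category is the reference group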

97
Q

If dummy coding for gender, and 1 refers to a woman, what would the first variable recoding be?

A

Woman YES NO

98
Q

When dummy coding for gender, how many variables do you create?

A

Three: one is the reference group, and the other two (man or woman) go into the model as predictors.