Lecture 5: Research Questions for Predictions II Flashcards

1
Q

What does the OLS estimator do?

A

It chooses the regression coefficients so that the sum of squared residuals is the smallest possible, given the data.
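The "smallest possible sum of squared residuals" property can be checked numerically. A minimal sketch in Python/NumPy with made-up data (the lecture itself works in R; this is only an illustration): the OLS solution from the normal equations has a lower SSR than any perturbed coefficient vector.

```python
import numpy as np

# Hypothetical data: one IV (x) and a DV (y).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(size=50)

# OLS via the normal equations: b = (X'X)^-1 X'y
X = np.column_stack([np.ones_like(x), x])
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

def ssr(b):
    """Sum of squared residuals for coefficient vector b."""
    resid = y - X @ b
    return float(resid @ resid)

# Nudging the OLS solution in any direction increases the SSR.
assert ssr(b_ols) < ssr(b_ols + np.array([0.1, 0.0]))
assert ssr(b_ols) < ssr(b_ols + np.array([0.0, -0.1]))
```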

2
Q

What is R-squared?

A

An effect size measure of the strength of prediction in a regression analysis, sometimes called the coefficient of determination (R-squared = SSreg/SStotal).
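The SSreg/SStotal definition can be verified directly on simulated data. A sketch (hypothetical data, Python rather than the course's R): with an intercept in the model, SStotal decomposes exactly into SSreg + SSres, so SSreg/SStotal equals 1 − SSres/SStotal.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(size=100)

# Fit the regression and get predicted scores.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b

ss_total = np.sum((y - y.mean()) ** 2)    # total variation in the DV
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the model
ss_res = np.sum((y - y_hat) ** 2)         # residual (unexplained) variation

r_squared = ss_reg / ss_total
# Equivalent formulation: 1 - SSres/SStotal
assert np.isclose(r_squared, 1 - ss_res / ss_total)
```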

3
Q

What are the possible values of R-squared?

A

0 to 1

4
Q

What data do we require to calculate the CI of R-squared?

A
  1. The estimated R-squared value
  2. Numerator df of the F-statistic
  3. Denominator df of the F-statistic
  4. Desired level of confidence (usually 95%)
5
Q

What is the difference between the observed R-squared value and the adjusted R-squared estimate?

A

In small samples and/or with many IVs, the observed R-squared value will be larger than the true population value. The car package contains an adjusted R-squared estimate that is less biased, so we report both values.
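The size of the adjustment can be illustrated with the standard adjusted R-squared formula, 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of IVs. A sketch (not the course's own R code):

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared: 1 - (1 - R^2)(n - 1)/(n - k - 1).

    Shrinks the observed R-squared toward the population value,
    penalising small samples (n) and many IVs (k)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Small sample with several IVs: the observed value shrinks noticeably.
print(adjusted_r_squared(0.30, n=25, k=4))    # ~0.16
# Large sample: the adjustment is negligible.
print(adjusted_r_squared(0.30, n=1000, k=4))  # ~0.297
```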

6
Q

What determines the size of each partial regression coefficient?

A

The scaling/metric of each independent variable (expressed by the size of its SD). If the scaling differs, we cannot use the relative size of the coefficients to say which IV is the stronger predictor.
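A quick numerical sketch of why scaling matters (hypothetical data, Python rather than R): rescaling an IV, e.g. from years to months, shrinks its b by the same factor, so raw coefficients from differently scaled IVs are not comparable.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)            # IV measured in, say, years
y = 3.0 * x + rng.normal(size=200)

def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_years = slope(x, y)
b_months = slope(x * 12, y)         # same IV rescaled to months

# The coefficient changes by exactly the rescaling factor:
assert np.isclose(b_years, 12 * b_months)
```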

7
Q

What are the two ways to make meaningful comparisons between IVs?

A
  1. Transform b (from y = bx + e) into a standardised partial regression coefficient. This is appropriate only for arbitrarily scaled IVs (not for meaningful metrics such as age or RT). We use the lm() function to do this.
  2. Use the semi-partial correlation as an effect size estimate. This removes the effects of the remaining IVs from the focal IV. The squared semi-partial correlation indicates the proportion of variation in the DV uniquely explained by each IV. We can use the srCorr function to do this.
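Both approaches can be sketched on simulated data: standardised coefficients by z-scoring the DV and IVs before fitting, and the squared semi-partial correlation as the drop in R-squared when an IV is removed from the full model. Illustrative Python, not the lm() or srCorr code from the course:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)     # correlated IVs
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

def r_squared(X, y):
    """R-squared from regressing y on the columns of X (with intercept)."""
    X = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# 1. Standardised coefficients: z-score the DV and each IV, then refit.
z = lambda v: (v - v.mean()) / v.std()
Xz = np.column_stack([np.ones(n), z(x1), z(x2)])
beta = np.linalg.lstsq(Xz, z(y), rcond=None)[0][1:]   # standardised betas

# 2. Squared semi-partial correlation of each IV = drop in R-squared
#    when that IV is removed from the full model.
r2_full = r_squared(np.column_stack([x1, x2]), y)
sr2_x1 = r2_full - r_squared(x2[:, None], y)
sr2_x2 = r2_full - r_squared(x1[:, None], y)
```

Because the reduced models are nested in the full model, each squared semi-partial correlation is non-negative and bounded by the full-model R-squared.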
8
Q

What are the four statistical assumptions of the linear model?

A
  1. Independence of observations: The scores obtained are independent and not duplicated
  2. Linearity: Scores on the DV are an additive linear function of scores on the set of independent variables (checked with scatterplots)
  3. Constant variance of residuals (homoscedasticity): The variance of the residual scores is the same for any score on the IV
  4. Normality of residual scores: The residual scores are normally distributed (checked with histograms, qq-plots or boxplots)
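Assumptions 3 and 4 are usually checked on the residuals after fitting. A minimal sketch of two crude numerical checks on simulated data (illustrative Python; in practice the plots named above are used):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=300)
y = 2.0 + 1.0 * x + rng.normal(size=300)

# Fit the model and extract residuals.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b

# With an intercept in the model, residuals sum to (essentially) zero.
assert abs(resid.mean()) < 1e-10

# Crude homoscedasticity check: residual spread should be similar
# for low and high scores on the IV.
low, high = resid[x < np.median(x)], resid[x >= np.median(x)]
ratio = low.std() / high.std()   # should be near 1 under constant variance
print(ratio)
```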