Lecture 5: Research Questions for Predictions II Flashcards

1
Q

What does the OLS estimator do?

A

It chooses the regression coefficients so that the sum of squared residuals is the smallest possible, given the data.
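The "smallest possible sum of squared residuals" property can be checked numerically. A minimal sketch in Python/NumPy with made-up data (the lecture itself works in R; this is only an illustration): the OLS solution from the normal equations has a lower SSR than any perturbed coefficient vector.

```python
import numpy as np

# Hypothetical data: one IV (x) and a DV (y).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(size=50)

# OLS via the normal equations: b = (X'X)^-1 X'y
X = np.column_stack([np.ones_like(x), x])
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

def ssr(b):
    """Sum of squared residuals for coefficient vector b."""
    resid = y - X @ b
    return float(resid @ resid)

# Nudging the OLS solution in any direction increases the SSR.
assert ssr(b_ols) < ssr(b_ols + np.array([0.1, 0.0]))
assert ssr(b_ols) < ssr(b_ols + np.array([0.0, -0.1]))
```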

2
Q

What is R-squared?

A

An effect size measure of the strength of prediction in a regression analysis, sometimes called the coefficient of determination (R-squared = SSreg/SStotal).
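The SSreg/SStotal definition can be verified directly on simulated data. A sketch (hypothetical data, Python rather than the course's R): with an intercept in the model, SStotal decomposes exactly into SSreg + SSres, so SSreg/SStotal equals 1 − SSres/SStotal.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(size=100)

# Fit the regression and get predicted scores.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b

ss_total = np.sum((y - y.mean()) ** 2)    # total variation in the DV
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the model
ss_res = np.sum((y - y_hat) ** 2)         # residual (unexplained) variation

r_squared = ss_reg / ss_total
# Equivalent formulation: 1 - SSres/SStotal
assert np.isclose(r_squared, 1 - ss_res / ss_total)
```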

3
Q

What are the possible values of R-squared?

A

0 to 1

4
Q

What data do we require to calculate the CI of R-squared?

A
  1. The estimated R-squared value
  2. Numerator df of the F-statistic
  3. Denominator df of the F-statistic
  4. Desired level of confidence (usually 95%)
5
Q

What is the difference between the observed R-squared value and the adjusted R-squared estimate?

A

In small samples and/or with many IVs, the observed R-squared value will be larger than the true population value. The car package contains an adjusted R-squared estimate that is less biased, so we report both values.
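The size of the adjustment can be illustrated with the standard adjusted R-squared formula, 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of IVs. A sketch (not the course's own R code):

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared: 1 - (1 - R^2)(n - 1)/(n - k - 1).

    Shrinks the observed R-squared toward the population value,
    penalising small samples (n) and many IVs (k)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Small sample with several IVs: the observed value shrinks noticeably.
print(adjusted_r_squared(0.30, n=25, k=4))    # ~0.16
# Large sample: the adjustment is negligible.
print(adjusted_r_squared(0.30, n=1000, k=4))  # ~0.297
```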

6
Q

What determines the size of each partial regression coefficient?

A

The scaling/metric of each independent variable (expressed by the size of its SD). If the scaling differs, we cannot use the relative size of the coefficients to say which IV is the stronger predictor.
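A quick numerical sketch of why scaling matters (hypothetical data, Python rather than R): rescaling an IV, e.g. from years to months, shrinks its b by the same factor, so raw coefficients from differently scaled IVs are not comparable.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)            # IV measured in, say, years
y = 3.0 * x + rng.normal(size=200)

def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_years = slope(x, y)
b_months = slope(x * 12, y)         # same IV rescaled to months

# The coefficient changes by exactly the rescaling factor:
assert np.isclose(b_years, 12 * b_months)
```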

7
Q

What are the two ways to make meaningful comparisons between IVs?

A
  1. Transform b (from y = bx + e) into a standardised partial regression coefficient. This is appropriate only for arbitrarily scaled IVs (not for meaningful metrics such as age or RT). We use the lm() function to do this.
  2. Use the semi-partial correlation as an effect size estimate. This removes the effects of the remaining IVs from the focal IV. The squared semi-partial correlation indicates the proportion of variation in the DV uniquely explained by each IV. We can use the srCorr function to do this.
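Both approaches can be sketched on simulated data: standardised coefficients by z-scoring the DV and IVs before fitting, and the squared semi-partial correlation as the drop in R-squared when an IV is removed from the full model. Illustrative Python, not the lm() or srCorr code from the course:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)     # correlated IVs
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

def r_squared(X, y):
    """R-squared from regressing y on the columns of X (with intercept)."""
    X = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# 1. Standardised coefficients: z-score the DV and each IV, then refit.
z = lambda v: (v - v.mean()) / v.std()
Xz = np.column_stack([np.ones(n), z(x1), z(x2)])
beta = np.linalg.lstsq(Xz, z(y), rcond=None)[0][1:]   # standardised betas

# 2. Squared semi-partial correlation of each IV = drop in R-squared
#    when that IV is removed from the full model.
r2_full = r_squared(np.column_stack([x1, x2]), y)
sr2_x1 = r2_full - r_squared(x2[:, None], y)
sr2_x2 = r2_full - r_squared(x1[:, None], y)
```

Because the reduced models are nested in the full model, each squared semi-partial correlation is non-negative and bounded by the full-model R-squared.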
8
Q

What are the four statistical assumptions of the linear model?

A
  1. Independence of observations: The scores obtained are independent and not duplicated
  2. Linearity: Scores on the DV are an additive linear function of scores on the set of independent variables (checked with scatterplots)
  3. Constant variance of residuals (homoscedasticity): The variance of the residual scores is the same for any score on the IV
  4. Normality of residual scores: The residual scores are normally distributed (checked with histograms, qq-plots or boxplots)
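Assumptions 3 and 4 are usually checked on the residuals after fitting. A minimal sketch of two crude numerical checks on simulated data (illustrative Python; in practice the plots named above are used):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=300)
y = 2.0 + 1.0 * x + rng.normal(size=300)

# Fit the model and extract residuals.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b

# With an intercept in the model, residuals sum to (essentially) zero.
assert abs(resid.mean()) < 1e-10

# Crude homoscedasticity check: residual spread should be similar
# for low and high scores on the IV.
low, high = resid[x < np.median(x)], resid[x >= np.median(x)]
ratio = low.std() / high.std()   # should be near 1 under constant variance
print(ratio)
```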