lecture 3 - simple linear regression Flashcards

1
Q

Correlation: how do you calculate Pearson's r?

A
  1. Calculate the covariance between the X and Y variables, then standardize it (divide by the SD of X times the SD of Y)
  2. Or: convert the X and Y scores to z-scores (standard scores), multiply each pair, add up the products, then divide by n - 1
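The covariance route can be sketched in Python as follows. This is a minimal illustration rather than lecture material: the data are made up, the function name `pearson_from_covariance` is hypothetical, and the sketch assumes the sample (n - 1) versions of covariance and SD.

```python
from statistics import mean, stdev

# Made-up illustrative data (an assumption, not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def pearson_from_covariance(x, y):
    """Pearson's r as the sample covariance divided by the product of the SDs."""
    n = len(x)
    mx, my = mean(x), mean(y)
    # Sample covariance: average cross-product of deviations, using n - 1
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    # Standardize by both sample standard deviations
    return cov / (stdev(x) * stdev(y))

r = pearson_from_covariance(x, y)
print(round(r, 4))
```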
2
Q

What is r (the correlation coefficient)?

A

r is the change in SD units of Y that occurs for every 1 SD change in X
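One way to check this interpretation is to fit a least-squares slope to standardized scores. The sketch below uses made-up data and hypothetical helper names (`ls_slope`, `zscores`); it assumes sample SDs (n - 1).

```python
from statistics import mean, stdev

# Made-up illustrative data (an assumption, not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def ls_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def zscores(values):
    """Convert raw scores to z-scores using the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

b_raw = ls_slope(x, y)                      # change in Y per 1 raw unit of X
b_std = ls_slope(zscores(x), zscores(y))    # change in SD units of Y per 1 SD of X
r_from_raw = b_raw * stdev(x) / stdev(y)    # rescaling the raw slope gives the same value

print(round(b_std, 4), round(r_from_raw, 4))
```

The standardized slope `b_std` is the value that r takes for these data.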

3
Q

correlation vs regression

A

Correlation: is there a relationship between 2 variables?

Regression: how well does one variable predict the other variable?

4
Q

What is required for prediction in regression?

A

Prediction requires calculating a line of best fit (an equation)

You can then use this equation to obtain a best-fit estimate for any new data point (X) within the range of the original data set.
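A small sketch of using a fitted line for prediction while refusing to extrapolate. The function name `predict`, the coefficients, and the X range are all hypothetical values chosen for illustration.

```python
def predict(x_new, a, b, x_min, x_max):
    """Return the best-fit estimate a + b * x_new, only inside the original X range."""
    if not (x_min <= x_new <= x_max):
        raise ValueError(
            f"x = {x_new} is outside the original data range [{x_min}, {x_max}]"
        )
    return a + b * x_new

# Hypothetical fitted line Y = 2.2 + 0.6X, estimated from X values between 1 and 5
estimate = predict(3, a=2.2, b=0.6, x_min=1, x_max=5)
print(round(estimate, 2))
```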

5
Q

Variables: DV and IV; simple linear vs. multiple regression

A

Dependent variable (DV), also called criterion variable or response variable
= the variable that you are trying to predict

Independent variable (IV), also called predictor variable or regressor
= the variable that you are predicting from

Simple linear regression = 1 predictor variable
Multiple regression = 2 or more predictor variables

6
Q

what is the regression equation / straight line

A

Y = a + bX

a = intercept
b = slope

7
Q

error in regression

A

In psychological research, we never get such perfect relationships (because determinants of any psychological variable are very complex and because of error in instruments).
A more likely scenario: Y = a + bX + error

Goal: find a regression line that provides the best prediction possible
i.e., a regression line that minimizes error

8
Q

what is the best regression line in terms of sum of squared deviations

A

The best regression line is the line that minimizes the sum of squared deviations
(i.e., a line that satisfies the “least squares” criterion)

9
Q

best fit line / regression line steps

A

Deviations (i.e., residuals) = observed value minus predicted value.
Step 1: for each data point, calculate the deviation, then square it.
Step 2: across the dataset, add up all squared deviations (→ sum of squared deviations, SSERROR).
Best fit: the equation that produces the smallest SSERROR
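The two steps can be sketched by scoring a few candidate lines and keeping the one with the smallest sum of squared deviations. The data and the three candidate (a, b) pairs are made up for illustration; real least-squares fitting solves for a and b directly rather than comparing a short list.

```python
# Made-up illustrative data (an assumption, not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def ss_error(a, b, x, y):
    """Sum of squared deviations between observed Y and predicted Y = a + b * X."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical candidate lines as (intercept, slope) pairs
candidates = {
    "flat line at mean of Y": (4.0, 0.0),
    "steeper line": (1.0, 1.0),
    "least-squares line": (2.2, 0.6),
}

sse = {name: ss_error(a, b, x, y) for name, (a, b) in candidates.items()}
best = min(sse, key=sse.get)  # candidate with the smallest SSERROR

for name, value in sse.items():
    print(f"{name}: SSERROR = {value:.2f}")
print("best fit:", best)
```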

10
Q

smaller least squares = ….

A

Smaller sum of squared deviations = less prediction error = relatively better fit.

11
Q

poor fitting lines generate ______ predictions

A

Remember: Poor-fitting lines generate poor predictions.

12
Q

how to calculate r

A

Step 1: convert X and Y into z-scores
Step 2: multiply z(X) by z(Y) for each data point
Step 3: add up the products
Step 4: divide by n - 1
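The four steps map onto code almost line for line. In this sketch the data are made up, and the z-scores use the sample SD (`statistics.stdev`, which divides by n - 1), which is the assumption that pairs with dividing by n - 1 in the last step.

```python
from statistics import mean, stdev

# Made-up illustrative data (an assumption, not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Step 1: convert X and Y into z-scores
zx = [(xi - mean(x)) / stdev(x) for xi in x]
zy = [(yi - mean(y)) / stdev(y) for yi in y]

# Step 2: multiply z(X) by z(Y) for each data point
products = [a * b for a, b in zip(zx, zy)]

# Step 3: add up the products
total = sum(products)

# Step 4: divide by n - 1
r = total / (len(x) - 1)
print(round(r, 4))
```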

13
Q

What are the values of a (the intercept) and b (the slope) that satisfy the Least Squares criterion

A
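A hedged sketch of the standard closed-form least-squares solution, stated here as a well-known result rather than as the lecture's own wording: b is the sum of cross-products of deviations divided by the sum of squared X deviations (equivalently r times SD of Y over SD of X), and a is the mean of Y minus b times the mean of X. The data and the function name `least_squares` are made up for illustration.

```python
from statistics import mean

# Made-up illustrative data (an assumption, not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def least_squares(x, y):
    """Return (a, b) for the line Y = a + bX that minimizes the sum of squared deviations."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # cross-products of deviations
    sxx = sum((xi - mx) ** 2 for xi in x)                     # squared X deviations
    b = sxy / sxx      # slope
    a = my - b * mx    # intercept: the line passes through (mean X, mean Y)
    return a, b

a, b = least_squares(x, y)
print(f"Y = {a:.2f} + {b:.2f}X")
```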
14
Q

explained variance

A

How “good” are these predictions? How good is the fit of the regression model?

How much of the variance in Y does X explain?

15
Q

how do we see how good the predictions are / the fit of the regression model

A

Calculate how much variance there is in Y in total: Sum of squares Y (SSTOTAL)
Calculate how much variance X can explain: Sum of squares X (SSREGRESSION)
Calculate how much variance is not explained: Sum of squares Residual (SSERROR)

SSTOTAL (SSY) = SSREGRESSION (SSX) + SSERROR (SSRESIDUAL)

Goal:
SSREGRESSION (SSX) as high as possible and SSERROR as low as possible
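The partition can be checked numerically. This sketch uses made-up data and fits the line with the usual least-squares formulas (an assumption, since the deck does not show them); the variable names are hypothetical.

```python
from statistics import mean

# Made-up illustrative data (an assumption, not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Fit the least-squares line Y = a + bX
mx, my = mean(x), mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
y_hat = [a + b * xi for xi in x]  # predicted values

ss_total = sum((yi - my) ** 2 for yi in y)                    # total variation in Y
ss_regression = sum((yh - my) ** 2 for yh in y_hat)           # variation the line explains
ss_error = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # variation left unexplained

print(round(ss_total, 2), round(ss_regression, 2), round(ss_error, 2))
```

For a least-squares line with an intercept, the explained and unexplained parts add up to the total.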

16
Q

how to calculate regression using explained variance

A

Step 1: calculate the difference (deviation) between each score and the mean
Step 2: square the deviations
Step 3: add up the squared deviations
use equation

17
Q

What is the easier way to assess how much variance X explains?

A

Calculate R² (the coefficient of determination): R² = SSREGRESSION / SSTOTAL
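As a rough sketch, R² can be computed as the explained share of the total sum of squares, which is the same as 1 minus the unexplained share. The sums of squares below are hypothetical example values and the function name `r_squared` is an assumption.

```python
def r_squared(ss_regression, ss_error):
    """Proportion of the variance in Y explained by the model."""
    ss_total = ss_regression + ss_error
    return ss_regression / ss_total

# Hypothetical sums of squares for illustration
ss_regression, ss_error = 3.6, 2.4

r2 = r_squared(ss_regression, ss_error)
r2_alt = 1 - ss_error / (ss_regression + ss_error)  # equivalent form
print(f"R^2 = {r2:.2f} ({r2:.0%} of the variance in Y explained)")
```

In simple linear regression, R² equals the square of Pearson's r.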

18
Q

F statistic

A

After you compute a regression equation and R², you still have to determine whether the model accounts for a statistically significant amount of variance.
→ SPSS output: you get an F statistic, which tells you whether the predictions of the model are better than if you were to use only the sample mean to predict performance.

F = (what the regression can explain, i.e., SSREGRESSION / dfREGRESSION) / (what the regression cannot explain, i.e., SSERROR / dfERROR)
For simple linear regression, dfREGRESSION = 1 and dfERROR = n - 2.
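A minimal sketch of the F ratio as a ratio of mean squares, where each sum of squares is divided by its degrees of freedom. The function name `f_statistic` and the input values are hypothetical; looking up the p-value is left to the statistics software.

```python
def f_statistic(ss_regression, ss_error, n, n_predictors=1):
    """Return (F, df_regression, df_error) for a regression with an intercept."""
    df_regression = n_predictors
    df_error = n - n_predictors - 1             # n - 2 for simple linear regression
    ms_regression = ss_regression / df_regression  # explained variance per df
    ms_error = ss_error / df_error                 # unexplained variance per df
    return ms_regression / ms_error, df_regression, df_error

# Hypothetical sums of squares from a sample of n = 5
f, df1, df2 = f_statistic(ss_regression=3.6, ss_error=2.4, n=5)
print(f"F({df1}, {df2}) = {f:.2f}")
```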

19
Q

Reporting linear regression from SPSS output

A

“The number of years of education [i.e., variable X] significantly predicted memory scores [i.e., variable Y], b = 3.98, accounting for 57% of the variance in memory scores, F(1, 15) = 19.21, p < .05. This supports/confirms the hypothesis that…”
If there is no relationship:
“There was no relationship between [variable X] and [variable Y], b = xxx, p > .05, R² = xxx, suggesting that…”