Week 7 Ch. 19 Bivariate Regression Hills Flashcards

0
Q

What is the outcome variable?

A

The outcome variable is the variable that we want to predict, also called the criterion.
E.g. knowledge of environmental issues. It is often called the DV, although that is not a strictly correct term in regression.

The variable used to make the prediction, e.g. age, is the predictor.

1
Q

What is prediction based on in bivariate regression?

A

Prediction is based on creating the line of best fit, or regression line: a line that is as close as possible to all the points on a scatter plot.

2
Q

Scatter plot axis

The predictor is plotted on the … axis.

A

The predictor is plotted on the X axis (horizontal), e.g. age.

3
Q

Scatter plot axis

The criterion or DV is plotted on the … axis.

A

The criterion is plotted on the vertical axis or Y axis.

E.g. Knowledge.

4
Q

The line of best fit indicates what?

A

The line of best fit - regression line - is the line on the scatter plot that is closest on average to all observation points. It is the line that allows the best possible prediction of Y scores from knowledge of X scores.

5
Q

Perfect correlation - how does it look on the scatter plot?

A

A perfect correlation of +/-1 has all of the points falling on the regression line.
The smaller the correlation, the more inaccurate the prediction.
P.249 Hills

6
Q

How do we make a prediction using the regression line?

A

A line is drawn up from the person's X score (e.g. their age), perpendicular to the x-axis.
At the point where this line meets the regression line, another line is drawn across, perpendicular to the y-axis, to give the best possible prediction (Y hat) of the person’s knowledge score.
P.249 Hills.
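
Reading a prediction off the regression line is the same as plugging the X score into the line's equation. A minimal Python sketch, assuming a hypothetical intercept and slope (the numbers and the `predict` helper are made up for illustration and are not taken from Hills):

```python
# Hypothetical regression line for predicting knowledge from age.
# These values are illustrative assumptions, not figures from Hills.
INTERCEPT = 6.4   # a: where the line crosses the y-axis
SLOPE = 0.29      # b: change in predicted knowledge per extra year of age


def predict(x, a=INTERCEPT, b=SLOPE):
    """Return Y hat, the point on the regression line directly above x."""
    return a + b * x


# "Go up from age 35 to the line, then across to the y-axis."
y_hat = predict(35)
print(f"Predicted knowledge score at age 35: {y_hat:.2f}")
```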

7
Q

Residuals…. what are they?

Another annoying term for a basic concept.

A

Residuals are errors in prediction.
They are the difference between the actual Y scores and the predicted Y scores:
(Y - Y hat).

Y hat is the predicted Y score: the value read off the y-axis at the point where the line drawn up from X meets the regression line. The residual is the gap between that prediction and the person's actual Y score.
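
As a rough illustration of (Y - Y hat), the sketch below computes a residual for each person. The five ages and knowledge scores are invented for this example, and the intercept and slope are simply assumed to be the least squares values for those invented data:

```python
# Hypothetical data: ages (X) and knowledge scores (Y) for five people.
ages = [20, 30, 40, 50, 60]
knowledge = [12, 15, 19, 20, 24]

# Assumed least squares line for these made-up data (a = 6.4, b = 0.29).
a, b = 6.4, 0.29

# Y hat for each person, then the residual Y - Y hat.
predicted = [a + b * x for x in ages]
residuals = [y - y_hat for y, y_hat in zip(knowledge, predicted)]

for x, y, y_hat, e in zip(ages, knowledge, predicted, residuals):
    print(f"age={x}  actual={y}  predicted={y_hat:.1f}  residual={e:+.1f}")
```

A positive residual means the person scored above the line (under-predicted); a negative residual means they scored below it (over-predicted).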

8
Q

What criterion is used when calculating the regression line?

A

When calculating the regression line, the LEAST SQUARES CRITERION is used.

If all the residuals are added, they will sum to 0, because the positive residuals (points above the regression line) exactly cancel the negative residuals (points below it).
Squaring the residuals removes this cancelling out, so the least squares regression line is calculated so that it minimises the sum of the squared residuals.
P.249
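
Both properties can be checked numerically. The sketch below uses invented data and compares the least squares line with a few arbitrarily chosen nearby lines; the data, the helper names and the alternative lines are all assumptions made for illustration:

```python
# Hypothetical data: ages (X) and knowledge scores (Y).
ages = [20, 30, 40, 50, 60]
knowledge = [12, 15, 19, 20, 24]


def residuals(a, b):
    """Residuals (Y - Y hat) for the line Y hat = a + bX."""
    return [y - (a + b * x) for x, y in zip(ages, knowledge)]


def sum_of_squares(a, b):
    """Sum of squared residuals: the quantity least squares minimises."""
    return sum(e ** 2 for e in residuals(a, b))


# Least squares line for these made-up data.
best = (6.4, 0.29)
print("sum of residuals:", round(abs(sum(residuals(*best))), 6))
print("sum of squared residuals:", round(sum_of_squares(*best), 2))

# Any other line gives a larger sum of squared residuals.
nearby_lines = [(6.4, 0.30), (6.0, 0.29), (7.0, 0.28)]
for a, b in nearby_lines:
    print(f"a={a}, b={b}: {sum_of_squares(a, b):.2f}")
```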

9
Q

The equation for a regression line.

A

The equation for the line of best fit:

Y(hat) = a + bX

a is the intercept (constant): the predicted value of Y when X = 0
b, the b-weight (or regression coefficient), is the slope of the line (the amount by which Y changes for every 1-unit increase in X)

Linear regression is a technique for calculating the values a and b for the least squares regression line.
P.250
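
SPSS does this calculation for you, but the standard textbook formulas are short enough to sketch by hand: b is the sum of cross-products of deviations divided by the sum of squared X deviations, and a is then chosen so the line passes through the two means. The data and the `fit_line` helper below are hypothetical:

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y hat = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Sum of squared deviations of X from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope (b-weight)
    a = y_bar - b * x_bar    # intercept (constant)
    return a, b


# Hypothetical data: ages (X) and knowledge scores (Y).
ages = [20, 30, 40, 50, 60]
knowledge = [12, 15, 19, 20, 24]

a, b = fit_line(ages, knowledge)
print(f"Y hat = {a:.2f} + {b:.2f}X")
```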

10
Q

BETA

A

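Note that Beta in the SPSS table is the ‘standardised b-weight’,
calculated using standard (z) scores, not raw scores!
With standardised scores, the regression line passes through the origin (the intercept is 0 on the y-axis). In bivariate regression, Beta is equal to the correlation r.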
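
A small sketch of what "standardised" means here, using invented data: convert both variables to z-scores, fit the line again, and the slope that comes out is Beta. It should match the raw b-weight rescaled by the two standard deviations. The helper names and data are assumptions made for illustration:

```python
from statistics import mean, stdev


def z_scores(values):
    """Convert raw scores to standard (z) scores."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y hat = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data: ages (X) and knowledge scores (Y).
ages = [20, 30, 40, 50, 60]
knowledge = [12, 15, 19, 20, 24]

# Regression on z-scores: the intercept is 0 and the slope is Beta.
intercept_z, beta = fit_line(z_scores(ages), z_scores(knowledge))

# The same Beta obtained by rescaling the raw b-weight.
_, b = fit_line(ages, knowledge)
beta_from_b = b * stdev(ages) / stdev(knowledge)

print(f"intercept on z-scores: {abs(intercept_z):.4f}")
print(f"Beta: {beta:.4f}  (from raw b: {beta_from_b:.4f})")
```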

11
Q

SPSS table headed COEFFICIENTS

A

The column for ‘Unstandardised Coefficients: Std. Error’

gives the standard error of the ‘b-weight’ and ‘constant’ respectively.

12
Q

ASSUMPTIONS of Bivariate Regression (Linear Regression)

A
Similar assumptions to correlation analysis:
# the relationship between the variables must be linear
# the spread of Y scores should be equal across the range of X scores: homoscedastic, NOT heteroscedastic
# there should be no restricted range on one or both variables (see p.237 Hills)
# outliers are a serious problem as they distort the correlation; they usually need to be deleted, but any deletions must be reported
# be aware of extreme groups, or of combining groups with different means
# participants should be randomly sampled (p.238) and independent of one another
13
Q

The STANDARD ERROR of the ESTIMATE (Standard error of prediction)

A

The standard error of the estimate, which is the final figure given in the SPSS Model Summary table, is similar to the standard deviation in univariate distributions, and corresponds to the average amount of error in predicted Y scores.
When the errors are normally distributed, we can conclude that about 68% of actual Y scores will be within one standard error (i.e. +/- 4.31 points in the example) of the predicted Y (hat) score.

P.252 Hills
The higher the correlation, the smaller the standard error of prediction.
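
For a feel of where the figure comes from, here is a minimal sketch using the usual formula for a single predictor: the square root of the sum of squared residuals divided by (n - 2). The data are invented, the line is assumed to be the least squares line for them, and Hills may present the calculation differently:

```python
from math import sqrt

# Hypothetical data: ages (X) and knowledge scores (Y).
ages = [20, 30, 40, 50, 60]
knowledge = [12, 15, 19, 20, 24]

# Assumed least squares line for these made-up data.
a, b = 6.4, 0.29

# Sum of squared residuals (Y - Y hat) squared.
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, knowledge))

# Standard error of the estimate; n - 2 because a and b were both estimated.
n = len(ages)
standard_error = sqrt(ss_residual / (n - 2))

print(f"Standard error of the estimate: {standard_error:.2f}")
```

With real data, roughly 68% of actual Y scores would be expected to fall within this distance of their predicted score.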

14
Q

Percentage of Variance

A

Scores on any variable vary about the mean.
When two variables are correlated, we can ‘explain’ or ‘account for’ (or, more correctly, predict) part of the variance in one from knowledge of the other, and vice-versa.
P.253 Hills.

15
Q

What is a valuable way to interpret the strength of a correlation?

A

A valuable way to interpret the strength of a correlation is through the use of r² (r squared).
This indicates the proportion of variance in one variable explained by the other, and vice-versa.
P. 253 Hills
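
The link between r squared and "variance explained" can be checked on a small invented data set: squaring r gives the same number as 1 minus (residual variation / total variation in Y). The data and variable names below are assumptions made for illustration:

```python
from math import sqrt
from statistics import mean

# Hypothetical data: ages (X) and knowledge scores (Y).
ages = [20, 30, 40, 50, 60]
knowledge = [12, 15, 19, 20, 24]

x_bar, y_bar = mean(ages), mean(knowledge)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, knowledge))
sxx = sum((x - x_bar) ** 2 for x in ages)
syy = sum((y - y_bar) ** 2 for y in knowledge)   # total variation in Y

# Pearson correlation and its square.
r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

# The same proportion from the regression: 1 - residual / total variation.
b = sxy / sxx
a = y_bar - b * x_bar
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, knowledge))
proportion_explained = 1 - ss_residual / syy

print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
print(f"proportion of variance explained = {proportion_explained:.3f}")
```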