Simple Regression Flashcards

1
Q

Explain what the regression line can be used for

A
  • Prediction
  • Estimating the magnitude of the effects of the predictor on the outcome

2
Q

Define the regression line

A

A straight line drawn through a scatterplot of two variables that comes as close to the data points as possible.

Also known as the line of best fit.

3
Q

Method of least squares

A

The method used to find the regression line: it chooses the line that minimises the sum of squared residuals (SSr).

4
Q

What is the intercept in regression analysis?

A

Point at which the regression line cuts through the Y-axis; a in the regression equation. It is the predicted value of Y when X = 0.

E.g. with no practice at all on a test, the predicted score would be 2.45.

5
Q

Slope

A

Another name for the regression coefficient, or b in the regression equation.

The number of units that the regression line moves on the Y-axis for each unit it moves along the X-axis.

6
Q

What is the linear regression equation and what can it be used for?

A

y = a + b * x

The value of y is equal to a (the intercept) plus b (the slope) multiplied by the value of x for the given case.

Used to predict how a case with a given score on X will score on Y.
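For example, a minimal Python sketch with an assumed intercept and slope (the 2.45 intercept from the practice-score example and a made-up slope):

```python
# Assumed values for illustration only (not from real data).
a = 2.45   # intercept: predicted score with zero hours of practice
b = 0.80   # slope: predicted change in score per extra hour of practice

def predict(x):
    """Predicted Y for a case with a given score on X: y = a + b * x."""
    return a + b * x

print(predict(10))  # predicted score for 10 hours of practice -> 10.45
```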

7
Q

What do you compare the line of best fit with when assessing the significance of the effects of the predictor on the outcome?

A
  • A regression line that is flat
  • A line based on the mean value of the outcome
  • A line indicating that the value of Y is always the same regardless of changes in the value of X
8
Q

What is the value of the regression coefficient when the regression line is flat?

A

0

A flat line, such as the line based on the mean, implies that the two variables have no relationship.

9
Q

Define what is meant by the model sum of squares (SSm)

A

The portion of the total variance that the regression line accounts for.

The difference between the total variance in Y scores (SSt) and the variance in Y scores not accounted for by the regression line (SSr).

Obtained by calculating the difference between the mean of Y and each value of Y as predicted by the regression line, then squaring each difference and finally summing all the squared differences.
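A minimal Python sketch of that calculation, using hypothetical observed Y scores and hypothetical predictions from the regression line:

```python
# Hypothetical data for illustration: observed Y scores and the Y values
# predicted by the regression line for the same cases.
y_obs  = [2, 4, 5, 4, 6]
y_pred = [2.6, 3.4, 4.2, 5.0, 5.8]

y_mean = sum(y_obs) / len(y_obs)   # mean of Y = 4.2

# SSm: difference between the mean and each predicted Y, squared and summed.
ss_m = sum((yhat - y_mean) ** 2 for yhat in y_pred)
print(ss_m)  # ~6.4
```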

10
Q

When performing a simple regression, what does the F-value in the ANOVA table show?

A

The ratio between the portion of the total variance accounted for by the regression line and the variance not accounted for by the regression line, each divided by its degrees of freedom (i.e. the ratio of the mean squares).
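A minimal Python sketch of that ratio with assumed values:

```python
# Assumed sums of squares and sample size, for illustration only.
ss_m = 6.4      # variance accounted for by the regression line (model)
ss_r = 2.4      # variance not accounted for (residual)
n, k = 5, 1     # sample size and number of predictors (1 in simple regression)

ms_m = ss_m / k            # mean square for the model
ms_r = ss_r / (n - k - 1)  # mean square for the residuals
f_value = ms_m / ms_r
print(f_value)  # ~8.0
```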

11
Q

R-square

A
  • Also known as coefficient of determination
  • The proportion of variance in Y explained by X
  • The variance explained by the regression line divided by the total variance in Y to be explained.
  • Proportion of total variance in Y explained by the regression line/model (SSm), relative to how much variation there was to explain in the first place (SSt)
  • Correlation coefficient squared
  • SSm / SSt (see the sketch below)
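A minimal Python sketch of the SSm / SSt calculation with assumed sums of squares:

```python
# Assumed sums of squares, for illustration only.
ss_m = 6.4   # variance explained by the regression line (SSm)
ss_t = 8.8   # total variance in Y to be explained (SSt)

r_squared = ss_m / ss_t   # proportion of variance in Y explained by X
print(r_squared)          # ~0.73

# In simple regression the same value is obtained by squaring the
# correlation coefficient: r_squared == r ** 2.
```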
12
Q

Adjusted R-square

A

An adjusted measure of R-square that accounts for possible overestimation: a reduced value of R-square intended to estimate the value of R-square in the population.

13
Q

Total sum of squares (SSt)

A

The sum of squares based on the line through the mean of the Y scores and its residuals; the total variance in Y to be explained.

Calculated by taking the difference between each actual value of Y and the mean of Y, then squaring each difference and summing the squared differences.
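A minimal Python sketch with hypothetical Y scores:

```python
# Hypothetical observed Y scores, for illustration only.
y_obs = [2, 4, 5, 4, 6]
y_mean = sum(y_obs) / len(y_obs)   # mean of Y = 4.2

# SSt: difference between each observed Y and the mean of Y, squared and summed.
ss_t = sum((y - y_mean) ** 2 for y in y_obs)
print(ss_t)  # ~8.8
```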

14
Q

Explain the difference between SSR, SSM and SST

A

SSR (sum of squared residuals) - Variance in Y that is not explained by the regression line; represents the degree of inaccuracy that remains when the best model is fitted to the data. Uses the differences between the observed data and the model.

SSM (model sum of squares) - Variance in Y that is explained by the regression line. Uses the differences between the mean value of Y and the model.

SST (total sum of squares) - Total variance in Y to be explained; the amount of variation there is before the regression line accounts for any of it. Uses the differences between the observed data and the mean value of Y.
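A minimal Python sketch that fits a least-squares line to hypothetical data, computes all three quantities, and checks that SST = SSM + SSR:

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

n = len(x)
x_mean, y_mean = sum(x) / n, sum(y) / n

# Least-squares slope (b) and intercept (a).
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
b = sxy / sxx
a = y_mean - b * x_mean
y_pred = [a + b * xi for xi in x]

ss_t = sum((yi - y_mean) ** 2 for yi in y)                # observed vs mean
ss_m = sum((yh - y_mean) ** 2 for yh in y_pred)           # model vs mean
ss_r = sum((yi - yh) ** 2 for yi, yh in zip(y, y_pred))   # observed vs model

print(ss_t, ss_m + ss_r)  # the two values match: SST = SSM + SSR
```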

15
Q

When performing a simple regression, what does the coefficients table tell you?

A

Provides further information about the magnitude of the effects of X on Y.

  • Beta = the standardised regression coefficient
  • B on the constant = the value of the intercept
  • B on the predictor variable = the value of the slope (regression coefficient)

16
Q

Beta

A

Refers to how much the value of the outcome increases or decreases as the value of the predictor increases by 1 standard deviation.

In a scatterplot of the two standardised variables, beta is the slope of the regression line.

  • Here the slope refers to the number of SDs that the regression line moves on the Y-axis for each SD it moves along the X-axis.
  • Expressing the effect in SD units makes predictors comparable, so you know which one has the stronger impact (see the sketch below).
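A minimal Python sketch of that idea with hypothetical data: standardise both variables and take the least-squares slope of the standardised scores, which is beta (and, as the next card notes, equals r in simple regression):

```python
from statistics import mean, pstdev

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

def standardise(values):
    """Convert raw scores to z-scores (mean 0, SD 1)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

zx, zy = standardise(x), standardise(y)

# Least-squares slope of the standardised variables = beta.
beta = sum(u * v for u, v in zip(zx, zy)) / sum(u * u for u in zx)
print(beta)  # ~0.85; in simple regression this equals Pearson's r
```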
17
Q

When is the value of beta the same as the value of r?

A

The value of beta is the same as the value of r in simple regression, but not in multiple regression.

18
Q

Provide an example of the relationship between b, r, beta and r-square

A

The value of b depends on the steepness of the slope, while the value of beta depends on how closely the data points are clustered around the line. In simple regression beta equals r, and r-square (the square of r) is the proportion of variance in Y that the model explains.

19
Q

Discuss causality in regard to simple regression

A

If you find that one variable has an effect on another, this does not mean that changes in one determine, or cause, changes in the other variable.

It simply allows us to estimate the effect of one variable on another and whether they are related; it does not establish that one is the determinant of the other.

There is a difference between correlational studies and experiments. A correlational study is one where you measure the variables of interest without any form of manipulation.

In order to infer causality you need to do an experiment, in which the IV is manipulated to observe the effects on the DV. Only then can you say that one variable has caused a change in the other.

20
Q

Provide an example of when you can claim causation from correlational evidence

A

When you are confident that the observed relationship between X and Y is not due to a third variable that determines both X and Y.

21
Q

Residual

A

The difference between the Y value of the actual case and the Y value that the case would take if it lay on the line.

  • The difference between what the model predicts and the observed data.

If you calculate each residual, square it, and add them all up, you obtain the sum of squared residuals (SSr).

The line with the lowest SSr is the line of best fit (as in the sketch below).
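A minimal Python sketch on hypothetical data, comparing the SSr of the least-squares line with that of an arbitrary alternative line:

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

def ssr(a, b):
    """Sum of squared residuals for the candidate line y = a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Least-squares line for these data (a = 1.8, b = 0.8) versus an arbitrary line.
print(ssr(1.8, 0.8))  # ~2.4: the lowest SSr, i.e. the line of best fit
print(ssr(1.0, 1.0))  # ~3.0: a worse-fitting line has a higher SSr
```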

22
Q

Explain why you need to assess the goodness-of-fit of the regression line

A

The regression line is the best line available for fitting the data. This means that it allows you to make the best possible predictions about how a case with a given score on X will score on Y.

But you also need to assess how well the regression line fits the actual data, so that you know how accurately the values of X can predict the values of Y.

23
Q

How do you assess the goodness of fit of the regression line?

A

Look at how much more variability in the outcome variable the regression line is able to explain in comparison with the line based on the mean, then divide this amount by the variance left unexplained by the regression line.

This is a way of assessing how well the model fits the observed data.

24
Q

What is the difference between a simple and multiple regression?

A

Simple regression is when you have one predictor variable, whereas in multiple regression you have several predictors.

25
Q

When performing a simple regression, what is a b-value?

A

Tells us the gradient of the regression line and the strength of the relationship between a predictor and the outcome variable.