W7 - Regression Flashcards

1
Q

What is regression analysis used for?

A

To predict the value of one variable from the score on another variable.

2
Q

What is regression concerned with?

A

Prediction of 1 variable from a RELATED variable

3
Q

In what 3 ways do regression + correlation analysis differ?

A

In its purpose

How variables are described

The inferential tests

4
Q

3 ways in which regression + correlation analysis differ

What is meant by in its purpose

A

Correlation analysis = Talking about relationships + associations between variables.

Regression analysis = Clearly talking about prediction.

5
Q

3 ways in which regression + correlation analysis differ

What is meant by how variables are described

A

For correlation analysis = makes no difference which variable is on X or Y axis of scatterplot.

Regression analysis = Independent variable always on X axis, dependent variable always on Y axis.

6
Q

3 ways in which regression + correlation analysis differ

What is meant by the inferential tests

A

Correlation analysis = Primarily interested in the r value.

Regression analysis = Interested in 3 bits of info;

  1. r^2 value (shared variance between 2 variables)
  2. Intercept (a)
  3. Regression coefficient (b)
7
Q

What does the regression line on a graph do?

A

Minimises the vertical deviations of the points from the line (specifically, the sum of the squared vertical deviations)

8
Q

What is the result of minimising the vertical distances between the line + the points on a scatter plot?

A

We are minimising the error of prediction
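
A minimal Python sketch of this idea, assuming made-up data (the x and y values and the function names are illustrative, not from the course): the fitted line has a smaller sum of squared vertical deviations than a line with a nudged intercept or slope.

```python
# Hypothetical illustrative data (an assumption, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-product deviations and sum of squared X deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_residuals(x, y, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = fit_line(x, y)
best = sum_squared_residuals(x, y, a, b)

# Nudging the intercept or the slope away from the fitted values
# increases the error of prediction.
nudged_intercept = sum_squared_residuals(x, y, a + 0.5, b)
nudged_slope = sum_squared_residuals(x, y, a, b + 0.5)
print(a, b, best, nudged_intercept, nudged_slope)
```

For this data the fitted line is roughly Y = 2.2 + 0.6X, and both nudged lines give a larger squared error.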

9
Q

For a scatter plot with regression analysis, what is the dependent variable on the Y axis also known as?

A

Predicted variable

10
Q

For a scatter plot with regression analysis, what is the independent variable on the X axis also known as?

A

Predictor variable

11
Q

What is the bivariate regression equation?

A

Algebraic equation expressing the prediction of 1 variable by another

12
Q

How is the bivariate regression equation usually written?

A

Y = a + bX

Also written as:

Y = bX + c

^^ Where c = a

13
Q

BIVARIATE REGRESSION EQUATION

What does the Y represent?

A

Dependent variable

14
Q

BIVARIATE REGRESSION EQUATION

What does the a represent?

A

Constant (Y intercept)

Where the line of best fit would cross through the y axis

15
Q

BIVARIATE REGRESSION EQUATION

What does the X represent?

A

Independent variable

16
Q

BIVARIATE REGRESSION EQUATION

What does the b represent?

A

Regression coefficient (slope)

= Change in y / change in x

17
Q

Steps to perform regression analysis

A

1+2. Consider Null + Alternative hypotheses (make sure to include WILL or will NOT predict)

  3. Select level of significance
  4. Collect + summarise data
  5. Check assumptions
  6. Run statistical test
  7. Interpret significance of result
18
Q

Steps to perform regression analysis

Give a null hypothesis example for:

Sum of skin fold measures + body fat %

A

There’s no significant relationship between the variables.

More specifically, the sum of skin folds will NOT predict % body fat.

19
Q

Steps to perform regression analysis

Give an alternative hypothesis example for:

Sum of skin fold measures + body fat %

A

There is a significant relationship between the variables.

More specifically, the sum of 5 skin folds will predict % body fat.

20
Q

Steps to perform regression analysis

  3. Select level of significance

Give an example using:

Sum of skin fold measures + body fat %

A

Significance level set at 0.05. If p < 0.05, we conclude that the variability in DXA % body fat explained by the sum of 5 skin folds is greater than would be expected by chance if there were no explained variability in the population.

21
Q

Steps to perform regression analysis

  5. Assumptions for bivariate regression
A

Need to ensure data is parametric

22
Q

What assumptions must be met for data to be treated as parametric?

A

Normal distribution

Homogeneity of variance

Interval/ratio (continuous)

Independence

Linearity

Residual values are normally distributed

23
Q

What is the vertical distance between the data points + the line in scatter plots also known as?

A

Residual distance

24
Q

Does each point on a scatter plot have a vertical/residual distance?

A

YES

25
Q

What does R represent?

A

Simple Pearson correlation coefficient

26
Q

What does R^2 represent?

A

Coefficient of determination
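
A minimal sketch of how R and R^2 relate, assuming made-up data (values and variable names are illustrative only): R^2 is the share of the variability in Y explained by the fitted line, and in bivariate regression it equals the square of the Pearson r.

```python
import math

# Hypothetical illustrative data (an assumption, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Pearson correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line, then the share of Y's variability it explains.
b = sxy / sxx
a = mean_y - b * mean_x
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_residual / syy

print(round(r, 3), round(r_squared, 3))
```

Here r is roughly 0.77 and R^2 roughly 0.6, i.e. about 60% of the variability in Y is shared with X in this made-up example.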

27
Q

What must you do when running a statistical test for regression analysis?

A

Find:

  • R^2 value
  • a value
  • b value
28
Q

In SPSS, what can be found in the analysis of variance (ANOVA) table?

A

Column titles:

  • F
  • Sig.
29
Q

Where can the a + b values be found in the SPSS output?

A

Under the unstandardised coefficients in the coefficients box

In column B

30
Q

In the coefficients box of the SPSS output, what tells us whether the b-value (slope) is significantly different from 0?

A

The t statistic + sig. value columns at the end.
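
A rough sketch of where such a t statistic comes from, assuming made-up data and the usual textbook formula (t = b divided by the standard error of b, with n - 2 degrees of freedom). This is a hand calculation for illustration, not SPSS code, and the p-value (Sig.) is not computed here.

```python
import math

# Hypothetical illustrative data (an assumption, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)

# Slope and intercept of the least-squares line.
b = sxy / sxx
a = mean_y - b * mean_x

# Spread of the residuals around the line (n - 2 degrees of freedom).
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
see = math.sqrt(ss_residual / (n - 2))

# Standard error of the slope, then the t statistic testing b against 0.
se_b = see / math.sqrt(sxx)
t_stat = b / se_b
print(round(t_stat, 3))
```

The larger the t statistic in absolute size, the smaller the sig. value for a given sample size.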

31
Q

List ways in which regression analysis can be used in Exercise + sport sciences

A

Predicting skill performance from self-efficacy

Predicting obesity risk from daily physical activity (PA) levels

32
Q

Way in which regression analysis can be used in the real world

A

Pre-London 2012, regression techniques were used to predict the no. of GB medals

33
Q

What is the SD of the residual/vertical distances known as?

A

Standard Error of the estimate (SEE)
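
A minimal sketch of the SEE calculation, assuming made-up data. The a and b values are the least-squares values for this particular data, and dividing by n - 2 is the usual textbook convention (an assumption here, since the card does not state the formula).

```python
import math

# Hypothetical illustrative data (an assumption, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Intercept and slope of the least-squares line for this data.
a, b = 2.2, 0.6

# Residual = actual Y minus predicted Y (the vertical distance to the line).
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# n - 2 in the denominator because two values (a and b) were estimated.
see = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
print(round(see, 3))
```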

34
Q

How is a prediction interval created using SEE?

A

Predicted value ± (z-score × SEE), e.g. z = 1.96 for a 95% interval
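
A small sketch of this calculation. The function name and the numbers (a predicted body fat of 20.0 % and an SEE of 2.5 %) are hypothetical, and the interval is the simple approximation on the card, which ignores the uncertainty in a and b.

```python
def prediction_interval(predicted, see, z=1.96):
    """Approximate interval: predicted value plus or minus z x SEE."""
    margin = z * see
    return predicted - margin, predicted + margin


# Hypothetical values: predicted body fat of 20.0 % with an SEE of 2.5 %.
low, high = prediction_interval(20.0, 2.5)
print(round(low, 1), round(high, 1))
```

With these made-up numbers the interval runs from about 15.1 % to 24.9 %.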

35
Q

What is a regression equation used for?

A

To estimate the value of the DV based on the value of the IV

36
Q

In bivariate regression, if the slope of the regression line was 2.1, this would mean….

A

That for every increase of 1 on the X axis there is an increase of 2.1 on the Y axis
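
A quick check of this in Python, assuming a hypothetical intercept of 5.0 (only the slope of 2.1 comes from the card):

```python
def predict(x, a, b):
    """Predicted Y from the bivariate regression equation Y = a + bX."""
    return a + b * x


a = 5.0  # hypothetical intercept, chosen only for illustration
b = 2.1  # slope from the flashcard

# Moving 1 unit along the X axis changes the predicted Y by b.
change = predict(11, a, b) - predict(10, a, b)
print(round(change, 1))
```

The intercept cancels out in the subtraction, so the change in predicted Y is 2.1 whatever value a takes.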

37
Q

What does variance of a single variable represent?

A

The avg squared amount by which the data vary from the mean

38
Q

In the single-variable case (variance), how are the deviations from the mean handled?

A

Square the deviations

  • to eliminate the problem of +ive + -ive deviations cancelling each other out.
39
Q

How do you calculate the exact similarity between the patterns of differences of the 2 variables in the 2-variable case?

A

Multiply the deviation for 1 variable by the corresponding deviation for the 2nd variable.

To get the cross-product deviations.

40
Q

What is covariance?

A

The avg of the cross-product deviations (sum of the cross-products divided by N - 1)

41
Q

What does a +ive covariance indicate?

A

That as 1 variable deviates from the mean, the other variable deviates in the same direction.

42
Q

What does a -ive covariance indicate?

A

That as 1 variable deviates from the mean, the other deviates from the mean in the opposite direction.
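
Covariance and its sign can be sketched as follows, assuming made-up data (the variable names y_same and y_opposite are illustrative):

```python
def covariance(x, y):
    """Sum of cross-product deviations divided by N - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Multiply each X deviation by the corresponding Y deviation.
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)


# Hypothetical illustrative data (an assumption, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_same = [2.0, 4.0, 5.0, 4.0, 5.0]      # tends to rise as x rises
y_opposite = [5.0, 4.0, 5.0, 4.0, 2.0]  # tends to fall as x rises

positive = covariance(x, y_same)
negative = covariance(x, y_opposite)
print(positive, negative)
```

Here the first covariance is 1.5 and the second is -1.5: the same deviations, paired in the opposite direction.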

43
Q

Between what values must the correlation coefficient lie?

A

-1 and +1
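
A minimal sketch of why these are the limits, assuming made-up data: the correlation coefficient is the covariance divided by the product of the two standard deviations, so a perfect straight-line relationship gives exactly +1 or -1 and anything more scattered falls in between.

```python
import math


def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return cov / (sd_x * sd_y)


# Hypothetical illustrative data (an assumption, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect_positive = pearson_r(x, [2 * xi + 1 for xi in x])
perfect_negative = pearson_r(x, [10 - 3 * xi for xi in x])
scattered = pearson_r(x, [2.0, 4.0, 5.0, 4.0, 5.0])
print(round(perfect_positive, 3), round(perfect_negative, 3), round(scattered, 3))
```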

44
Q

What is Spearman’s correlation coefficient?

A

Non-parametric statistic

Useful to minimise the effects of extreme scores or the effects of violations of the assumptions.
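
A rough sketch of the idea, assuming made-up data with one extreme score: Spearman's coefficient can be computed as a Pearson correlation on the ranks of the scores, so an extreme value counts only as "the highest rank". Function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation from sums of deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank from 1 (lowest); tied scores share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the ranked scores."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data with one extreme score (50) in y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 4.0, 5.0, 50.0]
pearson = pearson_r(x, y)
spearman = spearman_rho(x, y)
print(round(pearson, 3), round(spearman, 3))
```

In this made-up example the extreme score pulls Pearson's r down to roughly 0.74, while Spearman's coefficient stays at 1 because the rank order is perfectly consistent.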

45
Q

In regression analysis, what assumptions do we need to check to see if the data is parametric?

A

Normality

Linearity

Homogeneity of variance

Independence

Interval/ratio data

Normally distributed residuals

46
Q

Which assumptions can be examined using a scatterplot graph?

A

Linear relationship

Homogeneity of variance