Regression and Correlation Flashcards

1
Q

What is correlation?

A

Correlation quantifies the strength and direction of the association between two quantitative variables

2
Q

What is Pearson’s correlation coefficient a measure of?

A

The amount of scatter around a linear trend between two quantitative variables, i.e. the strength and direction of their linear relationship
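
As a minimal sketch of the idea, Pearson's coefficient can be computed by hand from the sums of squares and cross-products. The function name `pearson_r` and the data below are hypothetical, chosen only to contrast a tight linear trend with a more scattered one.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired samples xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: same x values, different amounts of scatter in y
x = [1, 2, 3, 4, 5]
tight = [2.1, 3.9, 6.2, 7.8, 10.1]  # points lie close to a straight line
loose = [4.0, 1.0, 7.0, 5.0, 9.0]   # same upward trend, more scatter

r_tight = pearson_r(x, tight)
r_loose = pearson_r(x, loose)
print(round(r_tight, 3), round(r_loose, 3))
```

The less the points scatter around the line, the closer the coefficient is to 1 (or to -1 for a downward trend).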

3
Q

What is linear regression?

A

It studies the linear relationship between two quantitative variables when one (dependent variable) is modelled as depending on the other (independent variable)

4
Q

A linear regression model allows predictions about the dependent variable to be made for individuals. T/F?

A

True

5
Q

Correlation allows predictions about the dependent variable to be made for individuals. T/F?

A

False

6
Q

Correlation is quantified on a scale from -1 to 1. T/F?

A

True

7
Q

What are the assumptions in calculating correlation?

A

Independent observations
Bivariate Normal distribution
Relationship between X and Y is linear

8
Q

The values and units of measurement of the variables are unimportant for measuring correlation. T/F?

A

True

9
Q

The values and units of measurement of the variables are unimportant in regression models. T/F?

A

False - they matter, because the intercept and slope are expressed in the units of the variables
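
The contrast between the last two cards can be checked numerically. The sketch below, using hypothetical weight and height data and a hypothetical helper `slope_and_r`, rescales X from kilograms to grams: the correlation stays the same, while the regression slope changes by the conversion factor.

```python
import math

def slope_and_r(xs, ys):
    """Return (least squares slope of y on x, Pearson's r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Hypothetical data
weight_kg = [50.0, 60.0, 70.0, 80.0, 90.0]
height_cm = [155.0, 163.0, 172.0, 174.0, 186.0]
weight_g = [w * 1000 for w in weight_kg]  # same weights, different units

slope_kg, r_kg = slope_and_r(weight_kg, height_cm)
slope_g, r_g = slope_and_r(weight_g, height_cm)

print("r (kg) vs r (g):", r_kg, r_g)              # unchanged by the rescaling
print("slope per kg vs per g:", slope_kg, slope_g)  # differs by a factor of 1000
```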

10
Q

Which variable is classified as X and which as Y is significant in correlation. T/F?

A

False

11
Q

Which variable is classified as X and which as Y is significant in regression. T/F?

A

True - Y should be the dependent variable and X should be the independent variable

12
Q

Why is the calculation for a CI or hypothesis test for correlation not the same as for tests of associations?

A

Because the sampling distribution of the correlation coefficient does not follow a Normal distribution
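
The cards do not say which method is used instead. One common approach, shown here only as an illustrative sketch, is Fisher's z-transformation: the coefficient is transformed with the inverse hyperbolic tangent, an approximately Normal interval is built on that scale with standard error 1/sqrt(n - 3), and the limits are transformed back. The function name `fisher_ci` and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Approximate CI for a correlation coefficient via Fisher's z-transformation."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                # transformed scale, approximately Normal
    se = 1 / math.sqrt(n - 3)        # standard error on the transformed scale
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-transform the limits so they fall between -1 and 1
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical example: r = 0.8 from 20 pairs of observations
low, high = fisher_ci(0.8, 20)
print(round(low, 3), round(high, 3))
```

The resulting interval is not symmetric about r, which reflects the skewed sampling distribution.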

13
Q

How can a straight line be described mathematically?

A

Y = alpha + beta X
(alpha = y-intercept)
(beta = gradient of line)

14
Q

In a regression model, what is a residual?

A

The vertical distance of a data point from the line of best fit

15
Q

How is the best fit line plotted in linear regression models?

A

The best fit line is taken as the one which makes the sum of the squares of the residuals as small as possible, i.e. it minimises the variance of the residuals
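
A minimal sketch of this criterion, using the standard closed-form least squares estimates for Y = alpha + beta X. The function names and data are hypothetical. Moving the fitted line in either direction increases the sum of squared residuals.

```python
def least_squares(xs, ys):
    """Return (alpha, beta) minimising the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx                 # gradient
    alpha = mean_y - beta * mean_x   # y-intercept
    return alpha, beta

def sum_sq_residuals(xs, ys, alpha, beta):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta = least_squares(x, y)
best = sum_sq_residuals(x, y, alpha, beta)
shifted = sum_sq_residuals(x, y, alpha + 0.1, beta)  # line moved up
tilted = sum_sq_residuals(x, y, alpha, beta + 0.1)   # line made steeper
print(alpha, beta, best, shifted, tilted)
```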

16
Q

What is the other name for the line of best fit in linear regression models?

A

The Least Squares Linear Regression Line

17
Q

In the Minitab/SPSS output for a linear regression model, what does the row labelled ‘constant’ represent?

A

The estimated intercept, together with a test of the null hypothesis that the intercept of the population regression line is 0

18
Q

In the Minitab/SPSS output for a linear regression model, what does the row labelled ‘weight’ represent?

A

The row for the explanatory variable (here, weight): it gives the estimated slope, together with a test of the null hypothesis that the slope of the population regression line is 0

19
Q

In the Minitab/SPSS output for a linear regression model, what does the column labelled ‘SE coefficient’ represent?

A

The standard errors of the coefficients, allowing calculation of the CIs of the coefficients
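
To illustrate how these columns fit together, the sketch below rebuilds a small coefficient table by hand using the standard simple linear regression formulas. It is not Minitab or SPSS output: the function name, row labels and data are hypothetical, and the t critical value is passed in as an assumption (3.182 is the two-sided 95% value for 3 degrees of freedom).

```python
import math

def coefficient_table(xs, ys, t_crit):
    """Return ({row: coef, se, t, ci}, s) for the line y = alpha + beta * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    se_beta = s / math.sqrt(sxx)
    se_alpha = s * math.sqrt(1 / n + mean_x ** 2 / sxx)
    rows = {}
    for name, est, se in (("constant", alpha, se_alpha), ("slope", beta, se_beta)):
        rows[name] = {
            "coef": est,
            "se": se,
            "t": est / se,  # tests the null hypothesis that the coefficient is 0
            "ci": (est - t_crit * se, est + t_crit * se),
        }
    return rows, s

# Hypothetical data: n = 5, so n - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
rows, s = coefficient_table(x, y, t_crit=3.182)
for name, row in rows.items():
    print(name, row)
```

In this example the interval for the slope excludes 0 while the interval for the constant includes it.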

20
Q

In the Minitab/SPSS output for a linear regression model, what does the quantity ‘s’ represent?

A

The standard deviation of the points around the regression line

21
Q

In the Minitab/SPSS output for a linear regression model, what does the row labelled ‘R-sq’ represent?

A

The coefficient of determination, which gives the percentage of the variability in Y that is explained by the regression on X
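
As a rough sketch, R-sq can be computed as one minus the ratio of the residual sum of squares to the total sum of squares of Y. For a regression with a single X this equals the square of Pearson's correlation coefficient. The function name and data are hypothetical.

```python
def r_squared(xs, ys):
    """Return (R-sq from the sums of squares, squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variability in Y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # Variability left unexplained by the fitted line
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / syy, sxy ** 2 / (sxx * syy)

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r_sq, r_squared_from_r = r_squared(x, y)
print(f"R-sq = {100 * r_sq:.1f}%")
```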

22
Q

What are the assumptions for linear regression?

A

Constant variance - the spread of the response Y, about its average value is the same for all values of X
Linearity - the average of the response, Y, is a linear function of the explanatory X
Independent observations
Normality of residuals
Error free values for x - for each pair of observations, the predictor x needs to be known with no error and the response y is a random observation

23
Q

X and Y must follow a normal distribution in linear regression. T/F?

A

False - only the residuals need to follow a Normal distribution

24
Q

A prediction interval (in linear regression) will be wider than the confidence interval. T/F?

A

True

25
Q

What is a prediction interval?

A

An estimate of the interval in which a future individual observation will fall, with a stated probability (e.g. 95%)
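
The difference between the two intervals can be sketched with the standard simple linear regression formulas. The prediction interval carries an extra term for the scatter of an individual observation about the line, which is why it is always wider than the confidence interval for the mean response. The function name, the data and the t critical value (3.182, the two-sided 95% value for 3 degrees of freedom) are assumptions made for illustration.

```python
import math

def intervals_at(xs, ys, x0, t_crit):
    """Return (fitted value, CI for the mean response, prediction interval) at x0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    fit = alpha + beta * x0
    leverage = 1 / n + (x0 - mean_x) ** 2 / sxx
    half_ci = t_crit * s * math.sqrt(leverage)      # uncertainty in the line only
    half_pi = t_crit * s * math.sqrt(1 + leverage)  # plus individual scatter
    return fit, (fit - half_ci, fit + half_ci), (fit - half_pi, fit + half_pi)

# Hypothetical data: n = 5, so n - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit, ci, pi = intervals_at(x, y, x0=4, t_crit=3.182)
print(fit, ci, pi)
```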