Lecture 36- Correlation Flashcards

1
Q

What does the correlation coefficient (r) summarize?

A

The strength and direction of the linear relationship between two variables

2
Q

How do you interpret r, i.e. which values mean what?

A
  • r is always between -1 and +1
  • A positive r value means Y and X tend to increase together
  • A negative r value means that as Y increases, X tends to decrease (and vice versa: whatever one variable does, the other tends to do the opposite)
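Example (a minimal sketch; the numbers are made up for illustration and are not the lecture data): one response that rises with x and one that falls, to show the sign of r.

```r
# Made-up illustrative data (an assumption, not from the slides)
x <- c(1, 2, 3, 4, 5)
y_up <- c(2.1, 3.9, 6.2, 7.8, 10.1)    # tends to rise as x rises
y_down <- c(9.8, 8.1, 6.0, 4.2, 1.9)   # tends to fall as x rises

r_up <- cor(x, y_up)
r_down <- cor(x, y_down)

print(r_up)    # positive, close to +1
print(r_down)  # negative, close to -1
```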
3
Q

What does r=0 mean?

A

There is no linear relationship between the variables
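Example (a sketch with made-up data): r = 0 rules out a linear relationship only. Here y is completely determined by x through a curve, yet r comes out as 0.

```r
# Made-up data, symmetric about x = 0 (an assumption for illustration)
x <- c(-3, -2, -1, 0, 1, 2, 3)
y <- x^2            # perfect, but non-linear, relationship

r <- cor(x, y)
print(r)            # 0: no linear relationship, despite the clear pattern
```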

4
Q

What does a strong or weak relationship look like visually?

A
  • Weak = more scatter
  • Strong = points clustered tightly around the line of best fit

5
Q

Calculate the correlation coefficient using the data on slide 694 and the equation found here (you don’t need to memorize it).

A

Answers on slide

6
Q

What function in R calculates the correlation coefficient?

A

cor(x,y)

7
Q

How do you set up data in R?

A

x = c(data)
y = c(data)

Note: you can assign with either = or the backwards arrow <-
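Example (a minimal sketch; the values are made up and stand in for the slide data): setting up two vectors with each assignment style, then calling cor().

```r
# Made-up values (an assumption, not the data from the slides)
x <- c(1, 2, 3, 4, 5)   # arrow assignment
y = c(2, 4, 5, 4, 5)    # equals-sign assignment also works here

r <- cor(x, y)          # Pearson correlation by default
print(r)                # about 0.775
```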

8
Q

What is S_xy?

A

The sample covariance between x and y

9
Q

What can the correlation coefficient ‘r’ be rewritten as?

A

r = S_xy / (S_x × S_y)

Note: S_x and S_y are the sample standard deviations of the x and y variables
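Example (a sketch with made-up data): building r from the sample covariance and the two sample standard deviations, then checking it against cor().

```r
# Made-up values (an assumption, not the data from the slides)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

s_xy <- cov(x, y)   # sample covariance between x and y
s_x <- sd(x)        # sample standard deviation of x
s_y <- sd(y)        # sample standard deviation of y

r_manual <- s_xy / (s_x * s_y)
r_builtin <- cor(x, y)

print(all.equal(r_manual, r_builtin))   # TRUE
```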

10
Q

Can a correlation coefficient be used for prediction? Why or why not?

A

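No, because it's not a model: it summarizes the strength and direction of a linear relationship but gives no equation for predicting one variable from the other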

11
Q

What is meant by the statement that the correlation coefficient is symmetric in variables?

A

The correlation between x and y is the same as the correlation between y and x
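Example (a sketch with made-up data): swapping the arguments of cor() leaves r unchanged. For contrast, the fitted slope from lm() does depend on which variable is treated as the response.

```r
# Made-up values (an assumption, not the data from the slides)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

r_xy <- cor(x, y)
r_yx <- cor(y, x)
print(all.equal(r_xy, r_yx))   # TRUE: correlation is symmetric

# A regression slope is not symmetric in the variables
slope_y_on_x <- unname(coef(lm(y ~ x))[2])   # 0.6
slope_x_on_y <- unname(coef(lm(x ~ y))[2])   # 1
print(c(slope_y_on_x, slope_x_on_y))
```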

12
Q

What is R^2?

A
  • The coefficient of determination: a measure of how well our regression model describes the data
  • It is the squared correlation between the observed and predicted responses
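Example (a sketch with made-up data, assuming a simple linear regression fitted with lm()): R^2 as reported by summary() matches the squared correlation between the observed and fitted responses.

```r
# Made-up values (an assumption, not the data from the slides)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

fit <- lm(y ~ x)                       # simple linear regression

r2_summary <- summary(fit)$r.squared   # R^2 reported by R
r2_corr <- cor(y, fitted(fit))^2       # squared cor(observed, predicted)

print(r2_summary)                      # 0.6
print(all.equal(r2_summary, r2_corr))  # TRUE
```

With a single predictor this also equals cor(x, y)^2.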
13
Q

How do you interpret R^2, i.e. what do the numbers mean?

A
  • Close to 1 = the regression model describes the data well
  • Close to 0 = the regression model describes the data poorly

(R^2 can only be between 0 and 1; there is no such thing as a negative R^2 value, because squaring removes negative signs)

14
Q

What does the total sum of squares describe in contrast to R^2?

A
  • Total sum of squares (TSS) = overall variation in the response variable
  • R^2 is instead the proportion of the variation in the response that is explained by the predictor variable, i.e. how good our model is
15
Q

What is the residual sum of squares (RSS)?

A
  • The total variation of the data points about the regression line, i.e. how far our measured y values are from the values predicted by our fitted model
  • In other words, RSS is the variation not explained by the regression model
16
Q

What is ESS equal to? What does ESS stand for?

A

ESS = TSS - RSS

ESS = explained sum of squares: the amount of variation explained by the regression model
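Example (a sketch with made-up data, assuming a simple linear regression fitted with lm()): computing TSS, RSS and ESS by hand and recovering R^2 as the explained proportion ESS / TSS.

```r
# Made-up values (an assumption, not the stress data from the slides)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

fit <- lm(y ~ x)

tss <- sum((y - mean(y))^2)    # total variation in the response: 6
rss <- sum(residuals(fit)^2)   # variation left unexplained by the model: 2.4
ess <- tss - rss               # variation explained by the model: 3.6

r2 <- ess / tss                # proportion of variation explained: 0.6
print(c(TSS = tss, RSS = rss, ESS = ess, R2 = r2))
print(all.equal(r2, summary(fit)$r.squared))   # TRUE
```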

17
Q

Complete the sentence: correlation does not equal…

A

causation

18
Q

Find R squared for the stress data on slide 702

A

Answers on slide