Chapter 4 - Correlation Flashcards
What is Bivariate Data?
✯ Bivariate data is data made up of paired values of two variables: an explanatory variable and a response variable.
What is an Explanatory Variable?
✯ The explanatory variable is the independent variable. This is plotted on the horizontal axis (x).
What is a Response Variable?
✯ The response variable is the dependent variable. This is plotted on the vertical axis (y).
What is a Causal Relationship?
✯ Two variables have a causal relationship when a change in one variable directly causes a change in the other.
Why is it important to distinguish between causal relationships and correlation?
✯ Correlated variables do not necessarily have a causal relationship: correlation alone does not show that one variable causes the other to change.
When can we use a linear model to model bivariate data?
✯ A linear model can be used to model bivariate data if it has:
- Strong/Perfect Positive Correlation
or…
- Strong/Perfect Negative Correlation.
What is a Regression Line?
✯ A regression line is the line of best fit.
What is the form of a regression line?
✯ A regression line of y on x is of the form:
y = ax + b (where a and b are constants: a is the gradient and b is the y-intercept)
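As a rough illustration of where the constants a and b come from, the sketch below fits a least-squares line to a small made-up data set. The function name `fit_line` and the data are assumptions for illustration only, and least squares is assumed to be the meaning of "best fit" here.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    a = s_xy / s_xx            # gradient
    b = mean_y - a * mean_x    # y-intercept
    return a, b

# Hypothetical data lying exactly on y = 5x + 2
xs = [1, 2, 3, 4]
ys = [7, 12, 17, 22]
a, b = fit_line(xs, ys)
print(a, b)  # 5.0 2.0
```

With real data the points will not lie exactly on a line, so a and b will only approximate the underlying trend.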
In which scenario can we use the regression line of y on x to estimate values?
- Estimate y given x
or
- Estimate x given y
✯ It can only be used to estimate y given x, and never x given y.
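One way to see why the line of y on x should not be rearranged to estimate x: fitting x on y generally gives a different line. The sketch below compares the two estimates on made-up data. The helper `fit_line`, the data, and the target value are all assumptions for illustration.

```python
def fit_line(us, vs):
    """Return (gradient, intercept) of the least-squares line v = gradient*u + intercept."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    s_uu = sum((u - mean_u) ** 2 for u in us)
    gradient = s_uv / s_uu
    return gradient, mean_v - gradient * mean_u

# Hypothetical data with some scatter
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)  # regression line of y on x: y = a*x + b
c, d = fit_line(ys, xs)  # regression line of x on y: x = c*y + d

y_target = 4.5
x_from_rearranged = (y_target - b) / a  # rearranging the y-on-x line
x_from_x_on_y = c * y_target + d        # using the x-on-y line

print(x_from_rearranged, x_from_x_on_y)  # the two estimates disagree
```

The two estimates only agree when the data lies exactly on a straight line.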
What does the coefficient a tell us?
(y = ax + b)
✯ The coefficient a is the gradient: it tells us the change in y for each unit change in x.
How does the sign of the coefficient a relate to the correlation?
(y = ax + b)
✯ If the data is positively correlated, a will be positive.
✯ If the data is negatively correlated, a will be negative.
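The link between the sign of the gradient (the coefficient of x) and the sign of the correlation can be checked numerically. The sketch below is a minimal illustration on made-up data. It measures correlation with the product moment correlation coefficient r, which these cards do not define, so treat that choice and the name `gradient_and_r` as assumptions.

```python
import math

def gradient_and_r(xs, ys):
    """Return the least-squares gradient and the correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    gradient = s_xy / s_xx
    r = s_xy / math.sqrt(s_xx * s_yy)
    return gradient, r

# Hypothetical data: one rising trend, one falling trend
rising = gradient_and_r([1, 2, 3, 4], [2, 4, 5, 8])
falling = gradient_and_r([1, 2, 3, 4], [9, 7, 4, 2])

print(rising)   # both values positive
print(falling)  # both values negative
```

Both quantities share the numerator s_xy and have positive denominators, which is why their signs always match.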
Consider y = 5x + 2, 1 < x < 10.
If x = 2, will our value be interpolated or extrapolated?
What does this tell us about the reliability of the estimate?
✯ When we substitute x = 2 into our equation, we get y = 12. This is an interpolated value because x = 2 lies inside the interval 1 < x < 10.
✯ Interpolated values are generally reliable because they lie within the range of the data.
Consider y = 5x + 2, 1 < x < 10.
If x = 20, will our value be interpolated or extrapolated?
What does this tell us about the reliability of the estimate?
✯ When we substitute x = 20 into our equation, we get y = 102. This is an extrapolated value because x = 20 lies outside the interval 1 < x < 10.
✯ Extrapolated values are unreliable because the linear trend may not continue outside the range of the data.
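The two cards above can be condensed into a short check. This is a minimal sketch using the same line y = 5x + 2 and interval 1 < x < 10. The function name `estimate` and its parameters are assumptions for illustration.

```python
def estimate(x, x_min=1, x_max=10):
    """Estimate y from y = 5x + 2 and say whether x is inside the data range."""
    y = 5 * x + 2
    # Strict inequalities match the interval 1 < x < 10 used on the cards
    kind = "interpolated" if x_min < x < x_max else "extrapolated"
    return y, kind

print(estimate(2))   # (12, 'interpolated')
print(estimate(20))  # (102, 'extrapolated')
```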