Chapter 4 Describing the Relationship Between Two Variables Flashcards
Least Squares Regression
The procedure used to determine whether there is a linear relationship (correlation) between two variables and to describe that relationship with a best-fit line.
Least Squares Regression Line
The straight line that “best fits” the data points plotted in a scatter diagram: the line with the smallest sum of squared errors (residuals).
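As a rough illustration of "smallest sum of squared errors", the sketch below compares the fitted line with a few nearby lines. The data values and the helper names `sse` and `least_squares_line` are made up for this example, and the slope and intercept formulas are the standard textbook ones rather than anything stated on these cards.

```python
def sse(xs, ys, m, b):
    """Sum of squared errors of the line y = m*x + b over the data."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = least_squares_line(xs, ys)
best = sse(xs, ys, m, b)

# Nudge the slope and intercept; every nudged line should fit worse.
others = [sse(xs, ys, m + dm, b + db)
          for dm in (-0.2, 0.2) for db in (-0.5, 0.5)]
print(best, others)
```

Any other choice of slope and intercept gives a larger sum of squared errors than the fitted line, which is what "best fits" means here.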
Least Squares Regression Procedure
Step 1: Construct a scatter diagram (explanatory variable on horizontal axis, response variable on the vertical axis)
Step 2: Determine the mathematical measures of linearity and the equation of the Least Squares Regression Line
- Correlation Coefficient
- Coefficient of Determination
Step 3: Plot and examine the residuals (the differences between the observed values and the values predicted by the regression line)
Positively Associated (or Correlated)
Occurs if, whenever the value of one variable increases, the value of the other variable also increases.
- In other words, if the trend has a positive SLOPE, the variables are positively associated (or correlated).
Negatively Associated (or Correlated)
Occurs if, whenever the value of one variable increases, the value of the other variable decreases.
- In other words, if the trend has a negative SLOPE, the variables are negatively associated (or correlated).
NOT Linearly Associated (or Correlated)
Occurs if the trend of the data points shows neither a positive nor a negative slope, but rather a more or less random pattern.
Linear Correlation Coefficient (LCC)
A measure of the strength and direction of the linear relation between two quantitative variables. Denoted “r” for samples.
PROPERTIES of the LCC
-1 ≤ r ≤ 1
If r = +1, a perfect positive linear relation exists. (The closer r is to +1, the stronger the positive association.)
If r = -1, a perfect negative linear relation exists. (The closer r is to -1, the stronger the negative association.)
If r = 0, there is no evidence of a linear relation.
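The properties above can be checked numerically. This is a minimal sketch using the usual sample formula for r (the cards do not state the formula); the function name `linear_correlation` and the data are assumptions made for illustration.

```python
import math


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]

# Points exactly on a rising line: perfect positive linear relation.
r_up = linear_correlation(xs, [2 * x + 1 for x in xs])

# Points exactly on a falling line: perfect negative linear relation.
r_down = linear_correlation(xs, [10 - 3 * x for x in xs])

# Hypothetical scattered data: r lands strictly between -1 and 1.
r_mixed = linear_correlation(xs, [2, 4, 5, 4, 5])

print(r_up, r_down, r_mixed)
```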
Coefficient of Determination
Measures the proportion of total variation in the response variable that is explained by the least-squares regression line. Denoted by R^2.
R^2 = r^2 (that is, r times r), the square of the linear correlation coefficient.
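For a regression with a single explanatory variable, squaring r gives the same number as computing the explained proportion of variation directly. The sketch below checks this on made-up data; the variable names and the "1 - SSE/SST" form are standard textbook choices, not taken from these cards.

```python
import math

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y (SST)

# Route 1: square the linear correlation coefficient.
r = sxy / math.sqrt(sxx * syy)
r_squared_from_r = r * r

# Route 2: fit the line, then take 1 - (unexplained variation / total variation).
m = sxy / sxx
b = y_bar - m * x_bar
sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
r_squared_from_variation = 1 - sse / syy

print(r_squared_from_r, r_squared_from_variation)
```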
Residual
On a scatter plot, the difference between the observed value of y and the value predicted by a candidate regression line (y^). Denoted (y - y^).
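A short sketch of the (y - y^) calculation follows. The data, the candidate slope and intercept, and the helper name `residuals` are all hypothetical.

```python
def residuals(xs, ys, m, b):
    """Observed y minus predicted y^ = m*x + b, one residual per point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical data and a candidate line y^ = 0.6x + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys, m=0.6, b=2.2)

# A positive residual means the point lies above the line,
# a negative residual means it lies below.
for x, y, e in zip(xs, ys, res):
    print(f"x={x}  y={y}  residual={e:+.1f}")
```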
The Slope-Intercept Form of the Equation of a Straight Line Applied to Linear Regression
y^ = mx + b, where “x” is the explanatory variable, “y^” is the estimate (or prediction) of the response variable, “m” is the slope of the line, and “b” is the y-intercept of the line.
Properties of the Coefficient of Determination (R^2)
0 ≤ R^2 ≤ 1
If R^2 = 1, the regression line explains 100% of the variation in the response variable.
If R^2 = 0, the regression line explains none of the variation in the response variable (it has no explanatory value).
Is the Linear Correlation Coefficient a resistant measure of linear association?
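No. The LCC is not a resistant measure of linear association: a single observation that does not follow the overall pattern can change its value substantially.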
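To see what a lack of resistance looks like, the sketch below adds one outlying point to otherwise perfectly linear data. The data and the function name `linear_correlation` are assumptions made for illustration.

```python
import math


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Five points lying exactly on the line y = x.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]
r_clean = linear_correlation(xs, ys)

# The same data plus one hypothetical outlier at (10, 0).
r_outlier = linear_correlation(xs + [10], ys + [0])

print(r_clean, r_outlier)
```

In this made-up case the single extra point is enough to turn a perfect positive correlation into a negative one.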
Does an LCC near zero mean that there is no relation between two variables?
No. It means only that there is little or no linear relation between the variables; they may still be related in a nonlinear way.
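A small example makes the point concrete: below, y is completely determined by x (y = x^2), yet r comes out as zero because the relation is not linear. The data and the function name `linear_correlation` are chosen for illustration only.

```python
import math


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# x values symmetric about 0, with y an exact (nonlinear) function of x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = linear_correlation(xs, ys)
print(r)
```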
Properties of the Linear Correlation Coefficient
The linear correlation coefficient is always between −1 and 1, inclusive. That is, −1 ≤ r ≤ 1.
If r = +1, then a perfect positive linear relation exists between the two variables. See Figure 4(a).
If r = −1, then a perfect negative linear relation exists between the two variables. See Figure 4(d).
The closer r is to +1, the stronger is the evidence of positive association between the two variables. See Figures 4(b) and 4(c).
The closer r is to −1, the stronger is the evidence of negative association between the two variables. See Figures 4(e) and 4(f).
If r is close to 0, then little or no evidence exists of a linear relation between the two variables. So a value of r close to 0 does not imply no relation, just no linear relation. See Figures 4(g) and 4(h).
The linear correlation coefficient is a unitless measure of association. So the unit of measure for x and y plays no role in the interpretation of r.
The correlation coefficient is not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient.
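The unitless property can be checked by converting the same measurements to different units and recomputing r. The heights and weights below are invented for the example, and `linear_correlation` is a hypothetical helper name.

```python
import math


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (inches) and weights (pounds).
heights_in = [60, 62, 65, 68, 70]
weights_lb = [115, 120, 140, 155, 165]

# The same measurements expressed in centimeters and kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_imperial = linear_correlation(heights_in, weights_lb)
r_metric = linear_correlation(heights_cm, weights_kg)

# Changing the units rescales x and y but leaves r unchanged
# (up to floating-point rounding).
print(r_imperial, r_metric)
```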