Chapter 4 Describing the Relationship Between Two Variables Flashcards

1
Q

Least Squares Regression

A

The procedure used to determine if there is a linear relationship or correlation between two variables.

2
Q

Least Squares Regression Line

A

The straight line that “best fits the data points” plotted in a scatter diagram. It is the line with the smallest sum of squared errors (residuals).
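As a rough illustration, the slope and intercept of such a line can be computed from the standard least-squares formulas. The data values and the helper name `least_squares_line` below are made up for this sketch and do not come from the chapter.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Illustrative data only (assumed for this example)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
print(f"y^ = {slope:.4f}x + {intercept:.4f}")  # y^ = 0.6000x + 2.2000
```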

3
Q

Least Squares Regression Procedure

A

Step 1: Construct a scatter diagram (explanatory variable on horizontal axis, response variable on the vertical axis)
Step 2: Determine the mathematical measures of linearity and the equation of the Least Squares Regression Line
- Correlation Coefficient
- Coefficient of Determination
Step 3: Plot and examine the residuals (the differences between the observed values of the response variable and the values predicted by the regression line)

4
Q

Positively Associated (or Correlated)

A

Occurs if, whenever the value of one variable increases, the value of the other variable also increases.
- In other words, if the trend has a positive SLOPE, the variables are positively associated (or correlated).

5
Q

Negatively Associated (or Correlated)

A

Occurs if whenever the value of one variable increases, the value of the other decreases.
- In other words, if the trend has a negative SLOPE, the variables are negatively associated (or correlated).

6
Q

NOT Linearly Associated (or Correlated).

A

Occurs if the trend of the data points shows neither a positive nor a negative slope, but rather a more or less random pattern.

7
Q

Linear Correlation Coefficient (LCC)

A

A measure of the strength and direction of the linear relation between two quantitative variables. Denoted as “r” for samples

8
Q

PROPERTIES of the LCC

A

-1 ≤ r ≤ 1
If r = +1, a perfect positive linear relation exists. (The closer r is to +1, the stronger the positive association.)
If r = -1, a perfect negative linear relation exists. (The closer r is to -1, the stronger the negative association.)
If r = 0, there is no evidence of a linear relation.
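A minimal sketch of how r might be computed by hand, to see these properties on small examples. The function name `correlation` and all data values are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample linear correlation coefficient r."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # illustrative values

r_up = correlation(xs, [2 * x + 1 for x in xs])     # points exactly on a rising line
r_down = correlation(xs, [10 - 3 * x for x in xs])  # points exactly on a falling line
r_mixed = correlation(xs, [2, 4, 5, 4, 5])          # positive but imperfect trend

print(round(r_up, 4), round(r_down, 4), round(r_mixed, 4))  # 1.0 -1.0 0.7746
```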

9
Q

Coefficient of Determination

A

Measures the proportion of total variation in the response variable that is explained by the least-squares regression line. Denoted by R^2.
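For a least-squares regression line with one explanatory variable, R^2 = r^2, the square of the linear correlation coefficient.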
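One way to check this relationship numerically is to compute R^2 two ways: as r squared, and as 1 minus the ratio of unexplained variation to total variation. The data and the variable names below are assumptions chosen for this sketch.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]  # illustrative data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y

# Least-squares line
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Route 1: square of the correlation coefficient
r_squared = sxy ** 2 / (sxx * syy)

# Route 2: proportion of total variation explained by the line
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
explained = 1 - sse / syy

print(round(r_squared, 4), round(explained, 4))  # 0.6 0.6
```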

10
Q

Residual

A

On a scatter plot, the difference between the observed value of y and the value of y on a candidate least squares regression line (y^). Denoted (y - y^).
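A short sketch of computing residuals (y - y^) for a fitted line. The data are invented for illustration; for a least-squares line the residuals sum to zero, up to floating-point rounding.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]  # illustrative data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print([round(e, 4) for e in residuals])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```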

11
Q

The Slope-Intercept Form of the Equation of a Straight Line Applied to Linear Regression

A

y^ = mx + b, where “x” is the explanatory variable, y^ the estimation (or prediction) of the response variable, “m” is the slope of the line and “b” is the y-intercept of the line.

12
Q

Properties of the Coefficient of Determination (R^2)

A

0 ≤ R^2 ≤ 1
If R^2 = 1, the regression line explains 100% of the variation in the response variable.
If R^2 = 0, the regression line explains none of the variation in the response variable.

13
Q

Is the Linear Correlation Coefficient a resistant measure of linear association?

A

The LCC is not a resistant measure of linear association.
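To see what "not resistant" can look like in practice, the sketch below compares r for a nearly linear data set with r for the same data plus one observation that breaks the pattern. All values, and the helper name `correlation`, are assumed for illustration.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample linear correlation coefficient r."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Nearly linear illustrative data
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.0, 9.9, 12.1]
r_clean = correlation(xs, ys)

# Same data plus a single point far from the overall pattern
r_with_outlier = correlation(xs + [7], ys + [0.0])

print(f"without outlier: r = {r_clean:.4f}")
print(f"with outlier:    r = {r_with_outlier:.4f}")
```

In this made-up example a single added point pulls r from nearly 1 down to a weak positive value.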

14
Q

Does an LCC near zero mean that there is no relation between two variables?

A

It just means that there is no linear correlation between the variables.
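As a quick check of this idea, the sketch below uses points that lie exactly on the curve y = x^2, so y is completely determined by x, yet r works out to 0. The data are chosen for illustration only.

```python
from math import sqrt
from statistics import mean

# y is an exact (nonlinear) function of x
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

r = sxy / sqrt(sxx * syy)
print(r)  # 0.0
```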

15
Q

Properties of the Linear Correlation Coefficient

A

The linear correlation coefficient is always between −1 and 1, inclusive. That is, −1≤r≤1.

If r=+1, then a perfect positive linear relation exists between the two variables. See Figure 4(a).

If r=−1, then a perfect negative linear relation exists between the two variables. See Figure 4(d).

The closer r is to +1, the stronger is the evidence of positive association between the two variables. See Figures 4(b) and 4(c).

The closer r is to −1, the stronger is the evidence of negative association between the two variables. See Figures 4(e) and 4(f).

If r is close to 0, then little or no evidence exists of a linear relation between the two variables. So a value of r close to 0 does not imply no relation, just no linear relation. See Figures 4(g) and 4(h).

The linear correlation coefficient is a unitless measure of association. So the unit of measure for x and y plays no role in the interpretation of r.

The correlation coefficient is not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient.

16
Q

Using the LCC to Determine Whether a Linear Relation Exists Between Two Variables

A

If the correlation coefficient is positive and greater than the critical value, then the variables are positively associated.

If the correlation coefficient is negative and less than the opposite of the critical value, then the variables are negatively associated.

17
Q

Time-Series Data

A

Data collected over a period of time (say from 2000 through 2017).

18
Q

Lurking Variable

A

An explanatory variable that was not considered in the study, but affects the response variable.

In addition, lurking variables are typically related to explanatory variables considered in the study.

19
Q

Least-Squares Regression Criterion

A

The least-squares regression line minimizes the sum of the squared errors (or residuals). This line minimizes the sum of the squared vertical distances between the observed values of y and those predicted by the line, y^ (read “y-hat”). We represent this as “minimize Σ residuals^2”.
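The criterion can be illustrated by comparing the sum of squared residuals for the least-squares line against a few nearby lines. The data, the helper name `sum_squared_residuals`, and the particular perturbations are assumptions made for this sketch.

```python
from statistics import mean

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between observed and predicted y."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # illustrative data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

best = sum_squared_residuals(xs, ys, b1, b0)

# Nudge the slope or intercept and recompute the sum of squared residuals
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.5), (0.0, -0.5)]
others = [sum_squared_residuals(xs, ys, b1 + ds, b0 + di) for ds, di in nudges]

print(f"least-squares line: {best:.4f}")
for (ds, di), value in zip(nudges, others):
    print(f"slope {ds:+.1f}, intercept {di:+.1f}: {value:.4f}")
```

Every nudged line in this example gives a larger sum than the least-squares line.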

20
Q

Key Ideas about the Least-Squares Regression Line

A

The least-squares regression line, y^ = b1x + b0, always contains the point (x-bar, y-bar).

Because the slope is b1 = r(sy/sx) and the standard deviations sy and sx must both be positive, the sign of the linear correlation coefficient, r, and the sign of the slope of the least-squares regression line, b1, are the same.

The predicted value of y, y^, is an estimate of the mean value of the response variable for any value of the explanatory variable.
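The first two ideas can be checked numerically. The sketch below uses the standard relation b1 = r(sy/sx) and b0 = y-bar - b1(x-bar); the data are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

xs = [1, 2, 3, 4, 5]  # illustrative data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)  # sample standard deviations, both positive

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
r = sxy / sqrt(sxx * syy)

b1 = r * sy / sx          # slope takes its sign from r
b0 = y_bar - b1 * x_bar   # intercept

# The line evaluated at x-bar returns y-bar
at_mean = b1 * x_bar + b0
print(round(b1, 4), round(b0, 4), round(at_mean, 4))  # 0.6 2.2 4.0
```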

21
Q

Rounding the Slope and y-Intercept

A

Throughout the course, we agree to round the slope and y-intercept to four decimal places.

22
Q

Residual Plot

A

A scatter plot where the explanatory variable is plotted on the horizontal axis and the corresponding residual is on the vertical axis.

23
Q

What does it mean if the residual plot shows a discernable pattern?

A

The response and explanatory variables may not be linearly related.
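A small sketch of what a pattern in the residuals can look like: a straight line is fitted to points that actually follow y = x^2, and the residuals come out positive, then negative, then positive again. The data are made up for illustration.

```python
from statistics import mean

# Curved (quadratic) data fitted with a straight line
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

Plotted against x, these residuals trace a U shape rather than scattering randomly around zero.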

24
Q

Univariate Data

A

Data in which a single variable was measured for each individual in the study.

25
Q

Bivariate Data

A

Data in which two variables are measured on each individual in the study.

26
Q

Response Variable

A

The variable whose value can be explained by the value of the explanatory or predictor variable.

27
Q

Explanatory Variable

A

The variable whose value is thought to play a role in determining the value of the response variable.

28
Q

Predictor Variable

A

Another name for the explanatory variable.

29
Q

Scatter Diagram

A

A graph that shows the relationship between two quantitative variables measured on the same individual. The explanatory variable is plotted on the horizontal axis, and the response variable is plotted on the vertical axis.

30
Q

Outside The Scope of the Model

A

Describes predictions made by using the regression model for values of the explanatory variable that are much larger or much smaller than those observed.
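One simple guard against this is to refuse predictions for x values outside the observed range. The sketch below uses a strict min/max check, which is a simplifying assumption (the definition says "much larger or much smaller"); the function name `predict_within_scope`, the data, and the line y^ = 0.6x + 2.2 are hypothetical.

```python
def predict_within_scope(x, observed_xs, slope, intercept):
    """Predict y^ only when x lies within the range of observed x values."""
    lowest, highest = min(observed_xs), max(observed_xs)
    if not lowest <= x <= highest:
        raise ValueError(
            f"x = {x} is outside the observed range [{lowest}, {highest}]"
        )
    return slope * x + intercept

observed_xs = [1, 2, 3, 4, 5]  # illustrative x values
slope, intercept = 0.6, 2.2    # hypothetical fitted line

inside = predict_within_scope(3, observed_xs, slope, intercept)
print(round(inside, 4))  # 4.0

try:
    predict_within_scope(50, observed_xs, slope, intercept)
except ValueError as error:
    print("refused:", error)
```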

31
Q

Deviation

A

The difference between the values of two quantitative variables.

32
Q

Total Deviation

A

The deviation between the observed and mean values of the response variable, denoted y - y-bar.

33
Q

Constant Error Variance

A

A requirement of the least-squares regression model that states the spread of the residuals should remain fairly constant when plotted against the explanatory variable.

34
Q

Outlier

A

An observation that is inconsistent with the overall pattern of the data.