Chapter 4: Scatterplots and Correlation Flashcards
The plan phase of a two-variable scatterplot scenario might be phrased like this:
Make a scatterplot with “[variable 1]” as the explanatory variable and “[variable 2]” as the response variable. Describe the form, direction, and strength of the relationship.
To add a categorical variable to a scatterplot:
use a different plot color or symbol for each category.
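One way to sketch this in Python: group the points by category, then draw each group with its own marker. The data values, the variable names, and the helper names group_by_category and plot_groups are made up for illustration; plot_groups assumes the third-party matplotlib package is installed.

```python
from collections import defaultdict
from itertools import cycle


def group_by_category(xs, ys, categories):
    """Split paired (x, y) values into one pair of lists per category."""
    groups = defaultdict(lambda: ([], []))
    for x, y, c in zip(xs, ys, categories):
        groups[c][0].append(x)
        groups[c][1].append(y)
    return dict(groups)


def plot_groups(groups):
    """Draw each category with a different marker (requires matplotlib)."""
    import matplotlib.pyplot as plt  # third-party; imported only when plotting

    fig, ax = plt.subplots()
    markers = cycle(["o", "s", "^", "D"])
    for marker, (label, (gx, gy)) in zip(markers, sorted(groups.items())):
        ax.scatter(gx, gy, marker=marker, label=label)
    ax.legend()
    return fig


# Hypothetical data: engine size (explanatory) vs. fuel economy (response).
engine_size = [1.6, 2.0, 3.5, 5.0]
mpg = [34, 30, 22, 16]
car_type = ["compact", "compact", "suv", "suv"]

groups = group_by_category(engine_size, mpg, car_type)
```

Calling plot_groups(groups) would then give one scatterplot in which each car type has its own symbol and legend entry.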
True or false: Our eyes can be fooled by changing the plotting scales or the amount of space around the cloud of points in a scatterplot.
True
correlation
measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.
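For reference, the usual textbook formula for r is written out below. This card set does not state it, so treat the notation as an assumption: n is the number of individuals, x̄ and ȳ are the sample means, and s_x and s_y are the sample standard deviations of the two variables.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```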
True or false: Correlation makes no distinction between explanatory and response variables.
True
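A quick check of this in Python, as a minimal sketch: the helper pearson_r and the sample numbers are made up for illustration. Swapping which variable is called x and which is called y leaves r unchanged.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation r of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 79]

r_xy = pearson_r(hours, score)  # hours treated as explanatory
r_yx = pearson_r(score, hours)  # roles swapped
```

Both calls give the same value, because the formula treats the two variables symmetrically.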
Evaluating r.
The correlation r is always a number between −1 and 1. Values of r near 0 indicate a very weak linear relationship. The strength of the linear relationship increases as r moves away from 0 toward either −1 or 1. Values of r close to −1 or 1 indicate that the points in a scatterplot lie close to a straight line.
When examining a relationship between two quantitative variables, we look at:
form, direction, strength, and outliers.
univariate data
data that includes a single variable
multivariate data
data that includes two or more variables
bivariate data
data that has only two variables
You describe a scatterplot with these criteria
direction, form (curved, straight, oscillating), and strength (how closely the points follow the form)
s_x
standard deviation of the x-values (a single variable)
r
correlation
ranges between -1 and +1
-1 ≤ r ≤ 1
values closer to -1 or +1 indicate a stronger negative or positive association
only meaningful for linear relationships