Scatterplots & Correlation Flashcards
Bivariate data
For each individual studied, we record data on two variables.
We then examine whether the two variables are related, for example whether one variable affects the other.
scatterplot is used to display
quantitative bivariate data.
Each variable makes up one axis. Each individual is a point on the graph.
response vs explanatory variable
(bivariate data)
response (dependent) variable measures an outcome of a study.
explanatory (independent) variable may explain or influence changes in a response variable. It is plotted on the x (horizontal) axis of the scatterplot; the response variable is plotted on the y (vertical) axis.
Overall pattern of the relationship in a scatterplot
Form - linear, nonlinear, or no relationship
Direction - positive, negative, or no direction
Strength - how closely the points fit the “form”
Outlier in a scatterplot
Outlier is a data value that has a very low probability of occurrence (i.e., it is unusual or unexpected).
In a scatterplot, outliers are points that fall outside of the overall pattern of the relationship.
Two or more relationships can be compared on a single scatterplot when we use
different symbols and/or colors for groups of points on the graph.
correlation coefficient is a measure of
the direction and strength of a linear relationship between two quantitative variables.
It is calculated using the mean and the standard deviation of both the x and y variables.
r treats x and y the same because
swapping which variable is x and which is y does not change the value of r.
correlation coefficient calculation
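The card does not show its formula, so the sketch below assumes the standard sample formula: convert each x and y to a z-score using its mean and standard deviation, multiply the paired z-scores, add them up, and divide by n - 1. The function name `pearson_r` and the hours/score data are made up for illustration.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample correlation: sum of paired z-score products divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)
    z_products = [
        ((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)
    ]
    return sum(z_products) / (len(x) - 1)


# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

r_xy = pearson_r(hours, score)
r_yx = pearson_r(score, hours)  # x and y swapped
print(round(r_xy, 3), round(r_yx, 3))
```

Both calls give the same value (a strong positive r, a little above 0.98), because the product of z-scores does not depend on which variable is called x.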
r ranges from
−1 to +1
Strength is indicated by the absolute value of r
Direction is indicated by the sign of r (+ or –)
r and outliers
r is not resistant: a single outlier can change its value substantially.
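A small sketch of what "not resistant" means in practice. The data are invented: six points that lie exactly on a line, then the same points plus one point far off the pattern. The helper `pearson_r` is a hypothetical name and uses the usual z-score formula.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample correlation from z-scores (hypothetical helper for this sketch)."""
    x_bar, y_bar, s_x, s_y = mean(x), mean(y), stdev(x), stdev(y)
    total = sum(((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(x, y))
    return total / (len(x) - 1)


# Six points exactly on the line y = 2x.
x_clean = [1, 2, 3, 4, 5, 6]
y_clean = [2, 4, 6, 8, 10, 12]

# Same points plus one outlier at (7, 0), far below the line.
x_out = x_clean + [7]
y_out = y_clean + [0]

r_clean = pearson_r(x_clean, y_clean)  # approximately 1.0
r_out = pearson_r(x_out, y_out)        # approximately 0.25
print(round(r_clean, 2), round(r_out, 2))
```

One added point moves r from a perfect 1 to about 0.25, which is why it helps to look at the scatterplot before trusting r.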
r is only for
linear relationships
Two variables can have a strong relationship that is not linear; r does not describe it.
If two variables are independent, r =
independent: r = 0
BUT r = 0 DOES NOT mean the two variables are independent
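To see why r = 0 does not imply independence, here is a sketch with invented data where y is completely determined by x (y = x squared), yet the relationship is curved and symmetric, so r comes out as 0. The helper name `pearson_r` is hypothetical.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Sample correlation from z-scores (hypothetical helper for this sketch)."""
    x_bar, y_bar, s_x, s_y = mean(x), mean(y), stdev(x), stdev(y)
    total = sum(((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(x, y))
    return total / (len(x) - 1)


# y depends entirely on x, but the pattern is a U-shape, not a line.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(abs(round(r, 6)))  # 0.0, up to floating-point rounding
```

The positive and negative halves of the U cancel in the sum of z-score products, so r measures no linear trend even though x predicts y perfectly.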