Unit 2 Flashcards
what does the response variable measure?
the outcome of a study
also known as dependent variable
what does the explanatory variable attempt to explain?
the observed outcome of the study
also known as independent variable
what is an example of the explanatory variable and a response variable?
explanatory variable: mother's height
response variable: daughter's height
the variation in daughters' heights is explained, in part, by the mothers' heights
where are the explanatory variable and response variable located on a graph?
explanatory variable: x-axis
response variable: y-axis
how can we measure the degree of linear association between two quantitative variables?
by determining the correlation
what is correlation?
a statistical measurement of the relationship between two quantitative variables
what does correlation measure?
the strength and direction of a linear relationship
what happens when two variables are positively associated?
as values of X increase, values of Y also tend to increase
the graph will have points that are plotted from the bottom left to the top right
what happens when two variables are negatively associated?
as values of X increase, values of Y tend to decrease
the graph will have points that are plotted from the top left to the bottom right
what happens when two variables have no association?
points are randomly scattered throughout the scatterplot
what is the correlation coefficient?
r
what does correlation measure?
the direction and strength of a LINEAR relationship
r ranges from -1 to +1
when is direction positive?
when individuals with higher X values tend to have higher Y values
strength
how closely the points follow a straight line
weak positive association
0 –> 0.5
moderate positive association
0.5 –> 0.8
strong positive association
0.8 –> 1
weak negative association
-0.5 –> 0
moderate negative association
-0.8 –> -0.5
strong negative association
-1 –> -0.8
*positive and negative correlation r
can the correlation coefficient r be used if a scatterplot shows a curved relationship?
no,
correlation coefficient only describes linear relationships
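The card above can be checked numerically. This is a minimal sketch, assuming made-up data that follows the curve y = x^2 exactly; the helper name `correlation` is only an illustration, not a library function.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return products / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up data: y depends perfectly on x, but along a curve, not a line
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(round(r, 3))  # r is essentially 0 even though x determines y exactly
```

A value of r near 0 here does not mean "no relationship", only "no linear relationship".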
does changing the units in x or y change the value of r?
no
does r have units?
no
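The two cards above (units do not change r, and r has no units) can be illustrated with a short sketch. The heights are made up, and `correlation` is a hypothetical helper written for this example.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return products / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up heights in inches
mothers_in = [60, 62, 64, 66, 68]
daughters_in = [61, 63, 64, 67, 68]

# The same heights converted to centimeters (1 inch = 2.54 cm)
mothers_cm = [2.54 * h for h in mothers_in]
daughters_cm = [2.54 * h for h in daughters_in]

r_inches = correlation(mothers_in, daughters_in)
r_cm = correlation(mothers_cm, daughters_cm)
print(r_inches, r_cm)  # the two values agree up to floating-point rounding
```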
what is x bar (x̄)?
the sample mean of x
what is y bar (ȳ)?
the sample mean of y
what is Sx?
the sample standard deviation of x
what is Sy?
the sample standard deviation of y
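The four quantities above (x̄, ȳ, Sx, Sy) are the ingredients of the standard formula for r: standardize each x and each y, multiply the pairs, add them up, and divide by n - 1. A minimal sketch of that calculation, using made-up heights; the function name is an assumption for illustration.

```python
from statistics import mean, stdev

def correlation_from_z_scores(xs, ys):
    """r = (1 / (n - 1)) * sum of (z-score of x) * (z-score of y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)   # sample means
    s_x, s_y = stdev(xs), stdev(ys)     # sample standard deviations
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

# Made-up data: mothers' and daughters' heights in inches
mothers = [60, 62, 64, 66, 68]
daughters = [61, 63, 64, 67, 68]

r = correlation_from_z_scores(mothers, daughters)
print(round(r, 3))  # close to +1, so a strong positive linear association
```

Because each z-score is a value minus a mean divided by a standard deviation in the same units, the units cancel and r is a pure number.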
what variables does r require?
quantitative variables
when calculating r, does it make a difference which variable you call x and which you call y?
no
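A quick sketch of why the labels do not matter for r: the formula multiplies the x and y deviations together, and multiplication does not care about order. The data and the helper name are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return products / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_xy = correlation(xs, ys)  # x as explanatory, y as response
r_yx = correlation(ys, xs)  # labels swapped
print(r_xy, r_yx)           # same value either way
```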
what numbers does r lie between?
-1 and +1 inclusive
what does positive r imply?
positive linear association
what does negative r imply?
negative linear association
what does b0 stand for?
y-intercept
what does b1 stand for?
slope (rise/run)
what does the regression line do?
summarizes the relationship between two variables only when one of the variables helps predict or explain the other
what does regression describe?
a relationship between an explanatory variable and a response variable
what does correlation measure?
the direction and strength of a linear relationship between two quantitative variables
do slope and correlation have the same sign? explain
yes,
when r is positive, the slope is also positive because the line rises from left to right
when r is negative, the slope is also negative because the line falls from left to right
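The matching signs follow from the usual least squares slope formula, b1 = r * (Sy / Sx): both standard deviations are positive, so b1 takes its sign from r. A minimal sketch with made-up data; `correlation` and `slope` are hypothetical helper names.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return products / ((n - 1) * stdev(xs) * stdev(ys))

def slope(xs, ys):
    """Least squares slope b1 = r * (Sy / Sx)."""
    return correlation(xs, ys) * stdev(ys) / stdev(xs)

# Made-up data: one rising pattern and one falling pattern
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]
falling = [5, 4, 5, 4, 2]

print(correlation(xs, rising), slope(xs, rising))    # both positive
print(correlation(xs, falling), slope(xs, falling))  # both negative
```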
what is the equation for slope?
(Y2 - Y1) / (X2 - X1)
what is y hat (ŷ)?
the predicted value of the response variable for x, a certain value of the explanatory variable
what is the definition of the intercept?
the predicted value of the response variable y when x = 0; however, this prediction MAY NOT BE RELIABLE
how is the slope interpreted?
for every one-unit increase in the explanatory variable (independent, x), the response variable (dependent, y) is predicted to increase or decrease by the value of b1 (depending on whether b1 is positive or negative)
what is the definition of the least squares regression line?
the line that minimizes the sum of the squares of the residuals
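The definition above can be checked directly: compute the least squares line, then confirm that nudging its intercept or slope only makes the sum of squared residuals larger. A minimal sketch with made-up data; the function names are assumptions for illustration.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (observed y - predicted y)^2 for the line b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares(xs, ys)
best = sum_squared_residuals(b0, b1, xs, ys)

# Try eight nearby lines; each one should do worse than the fitted line
nudged = [
    sum_squared_residuals(b0 + d0, b1 + d1, xs, ys)
    for d0 in (-0.5, 0, 0.5)
    for d1 in (-0.2, 0, 0.2)
    if (d0, d1) != (0, 0)
]
print(best, min(nudged))  # best is smaller than every nudged total
```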
what is the residual?
the vertical distance from an observed y value to the regression line
what does the least squares regression line predict?
what y may be predicted to be for various values of x
what is extrapolation?
using the regression line to predict y for x values outside the range of the data
the equation is only reliable for the range of x values that were included in the study
for example, when you plug an x value that is not within the range of the data into the least squares regression line, the prediction can no longer be trusted
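One way to guard against this in practice is to flag any prediction whose x value falls outside the range seen in the study. A minimal sketch, assuming a made-up line ŷ = 2.2 + 0.6x fitted on x values from 1 to 5; the function and constant names are hypothetical.

```python
def predict(b0, b1, x, x_min, x_max):
    """Return (y_hat, is_extrapolation) for the line y_hat = b0 + b1 * x."""
    y_hat = b0 + b1 * x
    is_extrapolation = x < x_min or x > x_max
    return y_hat, is_extrapolation

# Assumed values for illustration only
B0, B1 = 2.2, 0.6    # intercept and slope of the fitted line
X_MIN, X_MAX = 1, 5  # range of x values in the study

inside = predict(B0, B1, 3, X_MIN, X_MAX)    # x within the data range
outside = predict(B0, B1, 20, X_MIN, X_MAX)  # x far beyond the data range
print(inside, outside)  # the second prediction is flagged as extrapolation
```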
what is the equation for residual?
residual = observed y - predicted y
residual = yi - ŷi
points that fall above our least squares regression line have what sign for a residual?
positive
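The residual formula and its sign can be seen in a few lines. This sketch uses made-up data and the line ŷ = 2.2 + 0.6x, which is the least squares line for that data; the function name is an assumption.

```python
def residuals(b0, b1, xs, ys):
    """Residual for each point: observed y minus predicted y."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Made-up data and its least squares line y_hat = 2.2 + 0.6 * x
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

res = residuals(b0, b1, xs, ys)
for x, y, e in zip(xs, ys, res):
    position = "above" if e > 0 else "below"
    print(f"x={x}, y={y}: residual {e:+.1f}, point is {position} the line")
```

Points above the line have positive residuals and points below it have negative residuals.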
what is r^2?
the coefficient of determination
tells us the fraction of the variation in the response variable that is explained by the explanatory variable
what should we automatically think when we see “percentage of variation” or “fraction of variation”?
r^2
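For a least squares line, r^2 can be computed as 1 minus (unexplained variation / total variation). A minimal sketch with made-up data; the variable names are assumptions for illustration.

```python
from statistics import mean

# Made-up data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit the least squares line y_hat = b0 + b1 * x
x_bar, y_bar = mean(xs), mean(ys)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

# Total variation in y, and the part the line leaves unexplained
total_variation = sum((y - y_bar) ** 2 for y in ys)
unexplained = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - unexplained / total_variation
print(round(r_squared, 2))  # fraction of the variation in y explained by x
```

For this data, about 60% of the variation in y is explained by the linear relationship with x.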
what is an outlier, in respect to residuals?
an observation with an extremely large residual
if there is a point to the extreme left or right of data, what is that called?
influential observation
what are points that are outliers in the y direction known as?
outliers
what are points that are outliers in the x direction known as?
influential observations
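A short sketch of why a point far out in the x direction is called influential: adding a single such point can change the fitted slope dramatically. The data, the extra point (20, 2), and the helper name are all made up for illustration.

```python
from statistics import mean

def slope(xs, ys):
    """Least squares slope b1 for the points (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data with an upward trend
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Add one point far to the right of the rest of the data
xs_with_point = xs + [20]
ys_with_point = ys + [2]

slope_without = slope(xs, ys)
slope_with = slope(xs_with_point, ys_with_point)
print(slope_without, slope_with)  # the single extra point flips the sign
```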
what is a lurking variable?
a variable that is not being studied but it may influence the relationship between the two variables being studied
regression and correlation are used for what type of association?
linear
are regression and correlation affected by extreme outliers?
YES
does extrapolation yield reliable predictions?
no (it's too risky)
what does a strong correlation NOT imply?
a cause and effect relationship
do correlation and regression imply causation?
no
what is confounding?
when the effects of the explanatory variable on the response are mixed up with the effects of other explanatory variables on the response