stats vocab Flashcards
Explanatory Variable
Attempts to explain the observed outcomes (x-independent)
Response Variable
measures the OUTCOME of a study (y-dependent)
Scatterplot
graph of ordered pairs of numbers consisting of the independent variable, x, and the dependent variable, y. A visual way of describing the nature of the relationship between the variables.
Direction
positive or negative correlation
Form
Is it linear? Quadratic? Exponential?
Strength
How closely the points follow the form (connects to form).
Outliers
observation that lies outside the overall pattern of the other observations
Correlation
a statistical method used to determine whether a relationship between 2 variables exists
Correlation coefficient
a measure of
1. direction
2. strength of a LINEAR relationship b/w 2 variables
(it assumes a linear form; it does not tell you the form)
Least Squares Regression Line (LSRL)
Line of best fit that represents the points of a scatterplot
Slope of LSRL
Shows the rate of change.
“for every 1-unit increase in x, the predicted y changes by (slope) units”
Intercept of LSRL
The value of y^ (y hat) when x = 0; called the “starting point” of the LSRL
Coefficient of determination (r^2)
a measure of the proportion of variation in y that is explained by the regression line using x as the predicting variable
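A minimal sketch of computing r^2 as 1 minus (unexplained variation / total variation). The data and the function name r_squared are made up for illustration.

```python
from statistics import mean

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def r_squared(xs, ys):
    """Proportion of the variation in y explained by the LSRL on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Fit the least squares line y^ = a + bx.
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
         / sum((xi - x_bar) ** 2 for xi in xs))
    a = y_bar - b * x_bar
    # Unexplained variation: squared residuals around the line.
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))
    # Total variation: squared deviations of y around its mean.
    sst = sum((yi - y_bar) ** 2 for yi in ys)
    return 1 - sse / sst

r2 = r_squared(x, y)
print(round(r2, 2))  # 0.6
```

Here about 60% of the variation in y is explained by the line; the remaining 40% is unexplained.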
High association does not imply Causation
Just b/c 2 variables tend to increase and decrease together DOES NOT mean a change in ONE is causing a CHANGE IN THE OTHER.
Residual
“left over” variation in the response variable after fitting the regression line
“expectations vs. reality”
Influential observations
Individual points that are EXTREME in the x-direction may have a strong influence on the regression line
can be close to the LSRL
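A rough sketch of how one point that is extreme in the x-direction can change the line. The data, the extra point (20, 2), and the function name slope are all made up for illustration.

```python
from statistics import mean

def slope(xs, ys):
    """Slope b of the least squares line y^ = a + bx."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    return s_xy / s_xx

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope_without = slope(x, y)

# Add one observation far out in the x-direction.
slope_with = slope(x + [20], y + [2])

print(round(slope_without, 2), round(slope_with, 2))
```

With this made-up data, the single extra point is enough to turn a positive slope into a slightly negative one.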
Extrapolation
Making predictions using the LSRL for values outside the observed range
MUST USE CAUTION
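One simple way to flag extrapolation, sketched under the assumption that "observed range" means the smallest to largest observed x. The function name is_extrapolation and the data are hypothetical.

```python
# Hypothetical observed x-values, made up for illustration only.
observed_x = [1, 2, 3, 4, 5]

def is_extrapolation(x_new, xs):
    """True when x_new falls outside the range of the observed x-values."""
    return x_new < min(xs) or x_new > max(xs)

print(is_extrapolation(3.5, observed_x))  # False: inside the observed range
print(is_extrapolation(12, observed_x))   # True: prediction needs caution
```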
Lurking variable
A variable that has an important effect on the response but is not included among the variables studied.
Experiment
y hat
predicted y-value
y-direction
individual points with large residuals are outliers
Residual plot
scatterplot of the regression residuals against the explanatory variable (x)
NO SYSTEMATIC PATTERN!!
Least squares criterion
The sum of the squares of the vertical distances from the points to the line must be made as small as possible
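A small sketch of the criterion: the sum of squared vertical distances is smaller for the least squares line than for other candidate lines. The data is made up, and the line y^ = 2.2 + 0.6x is assumed to be the least squares fit for it.

```python
# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sse(xs, ys, a, b):
    """Sum of squared vertical distances from the points to y^ = a + bx."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))

# Assumed least squares line for this data: y^ = 2.2 + 0.6x.
best = sse(x, y, 2.2, 0.6)

# A few other candidate lines (chosen arbitrarily) for comparison.
others = [sse(x, y, 2.0, 0.6), sse(x, y, 2.2, 0.7), sse(x, y, 1.0, 1.0)]

print(round(best, 2), [round(s, 2) for s in others])
```

Every other line tried here gives a larger sum of squares than the least squares line.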
Regression line
also called the “best-fitting line”; the line for which the sum of the squares of the residuals is a minimum
equation of regression line
y^ = a + bx
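A minimal sketch of finding a (intercept) and b (slope) with the standard least squares formulas b = S_xy / S_xx and a = y_bar - b * x_bar. The data and the function name lsrl are made up for illustration.

```python
from statistics import mean

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def lsrl(xs, ys):
    """Return (a, b) for the least squares line y^ = a + bx."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    b = s_xy / s_xx          # slope: change in y^ per 1-unit increase in x
    a = y_bar - b * x_bar    # intercept: value of y^ when x = 0
    return a, b

a, b = lsrl(x, y)
print(f"y^ = {a:.1f} + {b:.1f}x")  # y^ = 2.2 + 0.6x
```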
Coefficient of non-determination
The unexplained variation, found by subtracting the coefficient of determination from 1.
r^2
coefficient of determination
Residual equation
y-y^
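A short sketch of the residual equation: observed y minus predicted y for each point. The data is made up, and the line y^ = 2.2 + 0.6x is assumed to be the least squares fit for it.

```python
# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Assumed least squares line for this data: y^ = 2.2 + 0.6x.
a, b = 2.2, 0.6

def residuals(xs, ys, a, b):
    """Residual for each point: observed y minus predicted y^."""
    return [yi - (a + b * xi) for xi, yi in zip(xs, ys)]

res = residuals(x, y, a, b)
print([round(e, 1) for e in res])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

Positive residuals are points above the line, negative residuals are points below it, and for a least squares line they sum to 0 (up to rounding).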
Simple relationships
has only 2 variables under study
Multiple relationships
many variables are under study
Subjective
it can be just your opinion or what you think, and not everyone may think alike
r
the average of the products of the standardized values of x and y
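A minimal sketch of this definition. Assumption: the "average" divides by n - 1 rather than n, which matches the usual sample formula. The data and the function name correlation are made up for illustration.

```python
from statistics import mean, stdev

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def correlation(xs, ys):
    """r = sum of (z_x * z_y) over all points, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Standardize each value, then multiply the paired z-scores.
    products = [((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(xs, ys)]
    return sum(products) / (n - 1)

r = correlation(x, y)
print(round(r, 4))  # 0.7746
```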
Range of correlation coefficient
-1 ≤ r ≤ 1
Going away from 0 (on the r number line)
stronger negative (left)
stronger positive (right)
towards center
weaker negative (left)
weaker positive (right)