Chapter 3 Flashcards
variable
measure that can have more than one value
correlation coefficient
is a mathematical index that describes the direction and magnitude of a relationship.
positive correlations
relationships in which, as the value of one variable rises, the value of the other rises also.
negative correlations
when the value of one variable rises, the value of the other falls.
The Regression Line
the best straight line through a set of points in a scatter diagram.
homogeneity of variance
when all random variables in the sequence or vector have the same finite variance.
regression line describes
the best linear relation between the X and Y scores
Covariance
the extent to which two variables vary together, so that knowing the value of one helps predict the value of the other
regression coefficient
The slope of the regression line
slope
This describes how much change is expected in Y each time X changes by one unit
- Y' = a + bX, where Y' is the predicted value of Y and b is the slope
intercept
a, is the value of Y when X = 0. That’s where the regression line crosses the Y axis.
- a = Ȳ - bX̄, where Ȳ and X̄ are the means of Y and X
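As a rough illustration of the slope and intercept cards, the sketch below computes a and b for the line Y' = a + bX. The function name `fit_line` and the scores are made up for this example; the slope is taken as the sum of cross-products divided by the sum of squares of X, which is one common way to write the least-squares slope.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of X, both around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: expected change in Y per one-unit change in X
    a = mean_y - b * mean_x  # intercept: predicted Y when X = 0
    return a, b

# Hypothetical scores that lie exactly on Y = 1 + 2X
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_line(xs, ys)
print(a, b)
```

Because the made-up points fall exactly on a line, the fitted intercept and slope recover that line (a = 1, b = 2).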
residual
the difference between an actual score and the predicted score (predicted by the regression line)
principle of least squares
The best-fitting line is the one that minimizes the sum of the squared residuals
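A small sketch of the least-squares idea, using made-up scores: the sum of squared residuals is computed for the least-squares line and for a slightly different line chosen arbitrarily for comparison. The helper name and all numbers are assumptions for illustration only.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Sum of (actual Y - predicted Y)^2 for the line Y' = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares slope and intercept for these scores
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x

best = sum_squared_residuals(xs, ys, a, b)
# An arbitrary competing line, picked only for comparison
other = sum_squared_residuals(xs, ys, 2.0, 0.7)
print(best < other)
```

Any other straight line through these points should give a sum of squared residuals at least as large as the least-squares line does.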
Correlation
is a special case of regression in which the scores of both variables are in standardized, or Z, units
- the intercept is always 0 (both variables must be continuous)
difference between regression and correlation
In correlation, both scores (X and Y) have been converted to Z scores, so they both have a mean of zero. Thus the regression line always crosses the Y axis at 0, that is, the intercept is always 0.
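The relationship between regression and correlation can be checked numerically. In the sketch below (made-up scores, hypothetical helper names), both variables are converted to Z scores using the population standard deviation, a line is fitted to the Z scores, and the slope is compared with r computed as the mean product of Z scores.

```python
from math import sqrt

def z_scores(values):
    """Convert raw scores to Z scores (population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

zx, zy = z_scores(xs), z_scores(ys)
a, b = fit_line(zx, zy)
# r as the mean of the products of paired Z scores
r = sum(p * q for p, q in zip(zx, zy)) / len(xs)
print(round(a, 6), round(b, 6), round(r, 6))
```

With standardized scores the fitted intercept comes out at zero (up to floating-point rounding) and the slope equals r.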
Pearson product moment correlation coefficient (r)
is a ratio used to determine the degree of variation in one variable that can be estimated from knowledge about variation in the other variable.
null hypothesis
there is no real relationship between the variables in question
criterion validity evidence
the relationship between a test score and some well-defined criterion, such as scores on a job aptitude test and actual job performance
Spearman’s rho (ρ)
is used to find the association between two sets of ranks (first, second, third, etc.)
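A minimal sketch of Spearman's rho for two sets of ranks, assuming there are no tied ranks so that the shortcut formula rho = 1 - 6Σd² / (n(n² - 1)) applies. The function name and the ranks are made up for illustration.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho from two lists of ranks (assumes no tied ranks)."""
    n = len(ranks_a)
    # d is the difference between the two ranks given to the same case
    sum_d_squared = sum((ra - rb) ** 2 for ra, rb in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical ranks given to five cases by two judges
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
rho = spearman_rho(judge_1, judge_2)
print(round(rho, 2))
```

When ranks are tied, this shortcut is not exact; computing the Pearson correlation on the ranks is the usual alternative.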
dichotomous
variables that can have only two values, like yes-no, correct-incorrect
Biserial correlation
expresses the relationship between a continuous variable and an artificial dichotomous variable.
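point biserial correlation, phi (φ) coefficient
The point biserial correlation is used if one variable is continuous and the other is a true dichotomous variable (can have only one of two possible values). The phi coefficient is used if both variables are dichotomous and at least one is a true dichotomy.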
tetrachoric correlation.
if both variables are artificially dichotomous variables
Residual:
The difference between the observed and the predicted values
Standard Error of Estimate:
The standard deviation of the residuals
Coefficient of Determination:
The correlation coefficient squared
Coefficient of Alienation:
is a measure of the nonassociation between two variables, computed as the square root of 1 - r².
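The two coefficients above follow directly from r. The short sketch below uses a hypothetical r of 0.6, chosen only for illustration, and takes the coefficient of alienation to be the square root of 1 - r².

```python
from math import sqrt

r = 0.6  # hypothetical correlation, chosen only for illustration

# Proportion of variation in one variable accounted for by the other
coefficient_of_determination = r ** 2
# Measure of nonassociation: square root of the unexplained proportion
coefficient_of_alienation = sqrt(1 - r ** 2)

print(round(coefficient_of_determination, 2), round(coefficient_of_alienation, 2))
```

Note that the two values do not sum to 1; it is r² and the squared coefficient of alienation that sum to 1.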
Shrinkage
the amount of decrease observed when a regression equation is created for one population and then applied to another.
Cross Validation
Using the regression equation derived using one group of subjects to predict performance in a different group of subjects
The Correlation-Causation Problem:
Just because two variables are correlated does not mean that one of them caused the variation in the other!
Third Variable Explanation
Some third (unobserved) variable caused the variation in both of the other variables.
Restricted Range
If the variability of a variable is extremely restricted, significant correlations may be difficult to find even if they may actually be there.
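The restricted-range effect can be illustrated with made-up data. In the sketch below, Y tracks X with a fixed alternating error of 1.5 points; the correlation is computed once over the full range of X (1 to 10) and once over a narrow slice (5 to 8). The data, the slice, and the helper name are all assumptions for the example, and a different slice or error pattern would give different numbers.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: Y = X plus 1.5 for odd X, minus 1.5 for even X
xs = list(range(1, 11))
ys = [x + (1.5 if x % 2 == 1 else -1.5) for x in xs]

r_full = pearson_r(xs, ys)

# Keep only cases whose X falls in a narrow band (5 through 8)
restricted = [(x, y) for x, y in zip(xs, ys) if 5 <= x <= 8]
r_restricted = pearson_r([x for x, _ in restricted], [y for _, y in restricted])

print(round(r_full, 2), round(r_restricted, 2))
```

The underlying relationship is the same in both cases, but the correlation over the narrow band is much weaker because the variability of X has been cut down while the error has not.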
Multivariate analysis
considers the relationship among combinations of three or more variables
Discriminant analysis
finds the linear combination of variables that provides the maximum discrimination between categories
Factor Analysis:
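a method for studying the interrelationships among a set of variables; it starts from the correlation between every variable and every other variable (the correlation matrix).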