Stats (Correlation/prediction) Flashcards
Correlation techniques are used to determine the degree of ______ between two or more variables and to make predictions about status or scores on one or more criteria based on status or scores on one or more ______
association; predictors
A scattergram illustrates the relationship between two variables. The wider the scatter of data points in the scattergram, the ______ the correlation between the variables
lower
A correlation coefficient is a number that indicates the average degree of association between variables. The choice of a coefficient is based primarily on the scale of measurement of the variables being correlated. For example, the Pearson r is used when both variables are measured on a(n) ______ scale, while _______ is used when both variables are ranks
interval or ratio; Spearman rho
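As a rough illustration of the difference between the two coefficients, the sketch below computes the Pearson r on raw scores and Spearman rho as the Pearson r of the ranks. The function names and the sample data are made up for this example, and tied values are assumed to receive the average of their ranks.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r for two equal-length lists of interval/ratio scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rho: the Pearson r computed on ranks instead of raw scores."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson_r(x, y), spearman_rho(x, y))
```

With these made-up scores the rank orders agree perfectly, so rho is 1.0, while the Pearson r is slightly lower because the relationship is curved.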
the _____ correlation coefficient is appropriate when one variable is a true dichotomy and the other is measured on an interval or ratio scale, and the ______ correlation coefficient is appropriate when one variable is an artificial dichotomy and the other is measured on an interval or ratio scale
point biserial; biserial
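A minimal sketch of the point biserial coefficient, assuming the true dichotomy is coded 0/1 and using the standard formula based on the two group means, the overall standard deviation, and the group proportions. The function name and data are hypothetical. (The biserial coefficient additionally assumes an underlying normal distribution and is not shown here.)

```python
from math import sqrt

def point_biserial(group, scores):
    """Point biserial r for a 0/1 dichotomy and an interval/ratio variable."""
    n = len(scores)
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    p = len(ones) / n          # proportion coded 1
    q = 1 - p                  # proportion coded 0
    mean_all = sum(scores) / n
    # Standard deviation of all scores (dividing by n, not n - 1).
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean1 = sum(ones) / len(ones)
    mean0 = sum(zeros) / len(zeros)
    return (mean1 - mean0) / sd * sqrt(p * q)

# Made-up data: group membership (0 or 1) and a continuous score.
group = [0, 0, 0, 1, 1, 1]
scores = [2, 3, 4, 6, 7, 8]
print(round(point_biserial(group, scores), 3))
```

Computed this way, the result is numerically the same as a Pearson r calculated with the dichotomy entered as 0s and 1s.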
The Pearson r and coefficients derived from it range in value from _______
-1.0 to +1.0
The magnitude of the coefficient indicates the _____ of the relationship, while the sign indicates its ______
strength; direction
Use of the Pearson r is based on three assumptions:
- there must be a _______ relationship between variables
- there must be an ________ range of scores on both variables
- there must be ________, or the same variability of Y scores at every X value
linear; unrestricted; homoscedasticity
A large correlation coefficient for two variables cannot by itself be interpreted as evidence of a(n) ______ relationship between X and Y, but it can be interpreted in terms of shared variability. This is done by _____ the coefficient
causal; squaring
if the correlation coefficient for X and Y is .30, this means that ____% of variability in Y is explained by variability in X
9
Regression analysis is the technique that makes it possible to use a predictor (X) score to predict or estimate a ______ (Y) score
criterion
an assumption underlying the use of regression analysis is that the relationship between X and Y can be described by a ______
straight line
the position of the regression line in a scattergram is identified using the _______ criterion, which locates the regression line so that error in prediction is minimized
least squares
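The least squares criterion can be sketched for a single predictor as follows. The function names and data are invented for illustration; the slope and intercept formulas are the usual closed-form ones for simple linear regression.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the line that minimizes squared prediction error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_error(x, y, slope, intercept):
    """Sum of squared differences between actual and predicted Y scores."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Made-up predictor (X) and criterion (Y) scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(x, y)
print(slope, intercept)

# Any other line, such as one with a slightly steeper slope, predicts worse.
print(sum_squared_error(x, y, slope, intercept))
print(sum_squared_error(x, y, slope + 0.1, intercept))
```

The last two lines compare the fitted line with a slightly altered one; the fitted line has the smaller sum of squared errors, which is what "error in prediction is minimized" means here.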
A high school counselor uses a battery of tests to help high school juniors and seniors choose a college major. Which of the following multivariate techniques would be most helpful in this situation?
discriminant function analysis
(Discriminant function analysis is used to predict or estimate a person’s status on a single nominal criterion from two or more predictors.)
Multiple regression
used to predict status on a single continuous criterion from scores on two or more predictors.
Path analysis
a causal modeling technique that is an extension of multiple regression and is used to test a theory about the causal relationships among a set of variables.
Canonical correlation
the appropriate technique when two or more predictors will be used to predict status on two or more continuous criteria.
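Of the techniques above, multiple regression is the easiest to sketch in a few lines. The example below assumes exactly two predictors and one continuous criterion and solves the least squares equations directly; the function name, variable names, and data are hypothetical, and real analyses would normally use a statistics package.

```python
def fit_two_predictors(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 for exactly two predictors."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross-products around the means.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the two normal equations; det is zero if the predictors
    # are perfectly correlated, in which case no unique solution exists.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Made-up data constructed so that y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]
b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(b0, b1, b2)
```

Because the made-up criterion scores were built from the two predictors without error, the fit recovers the weights used to build them (up to floating-point rounding).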
Eta is used to:
determine the correlation for two variables that have a nonlinear relationship.
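A small sketch of why eta is useful, assuming X takes a few distinct values so that Y scores can be grouped by X. Eta is computed here as the square root of the between-groups sum of squares divided by the total sum of squares; the function names and the data are invented for illustration.

```python
from collections import defaultdict
from math import sqrt

def eta(x, y):
    """Correlation ratio: sqrt(SS_between / SS_total), grouping Y by X value."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    grand_mean = sum(y) / len(y)
    ss_total = sum((yi - grand_mean) ** 2 for yi in y)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return sqrt(ss_between / ss_total)

def pearson_r(x, y):
    """Pearson r, shown only for comparison with eta."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up inverted-U data: Y is high at the middle X value and low at the ends.
x = [1, 1, 2, 2, 3, 3]
y = [2, 3, 8, 9, 2, 3]
print(pearson_r(x, y), eta(x, y))
```

For this inverted-U pattern the Pearson r is 0 even though Y clearly depends on X, while eta is close to 1.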
The correlation coefficient for Test A and Test B is -.40. This means that ___% of variability in Test A scores is shared in common with Test B scores.
16
(A measure of shared variability is obtained by squaring the correlation coefficient: -.40 squared is .16. Therefore, 16% of variability in Test A scores is shared with (or accounted for by) variability in Test B scores.)
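The squaring step is simple enough to check directly. The helper name below is made up for this example.

```python
def shared_variance_percent(r):
    """Percent of variability shared by two variables with correlation r."""
    return (r ** 2) * 100

for r in (0.30, -0.40):
    print(f"r = {r:+.2f} -> {shared_variance_percent(r):.0f}% shared variability")
```

Squaring removes the sign, so a correlation of -.40 and a correlation of +.40 both correspond to 16% shared variability.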
To determine the degree of association between gender and attitude toward abortion when the attitude measure represents an interval scale, you would use which of the following correlation coefficients?
point biserial (The point biserial coefficient is the appropriate coefficient when one variable represents a true dichotomy and the other is measured on an interval or ratio (continuous) scale.)
The Pearson r
the appropriate correlation coefficient when both variables are measured on an interval or ratio scale.
The phi coefficient
used when both variables are true dichotomies.
The contingency coefficient
used to determine the degree of association between two nominal variables.
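For the two coefficients that apply to categorical data, a minimal sketch from a table of counts might look like the following. The function names and the counts are hypothetical; phi is computed from a 2x2 table, and the contingency coefficient uses the usual chi-square-based formula C = sqrt(chi2 / (chi2 + N)), which also works for larger tables.

```python
from math import sqrt

def phi_coefficient(table):
    """Phi for a 2x2 table of counts [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def contingency_coefficient(table):
    """Contingency coefficient C for a table of counts with any number of rows/columns."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (chi2 + n))

# Made-up counts for two dichotomous variables.
table = [[30, 10],
         [10, 30]]
print(phi_coefficient(table))
print(contingency_coefficient(table))
```

On the same 2x2 table the two values differ (0.5 for phi, about 0.45 for C) because C divides by chi-square plus N rather than by N alone.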
When using path analysis, you are:
confirming a model involving one-way causal flow between a set of observed variables.
LISREL
more complex than path analysis and not only includes two-way paths but also takes into account both observed variables and the latent traits those variables are believed to measure.