Correlation Techniques Flashcards
____ ____ are used to describe the degree of association between two or more variables (or, put another way, the degree to which two or more variables co-vary) and are often used to ____ ____ about ____ ____ or ____ of ____ based on ____ or ____ on ____ ____ or ____ of ____.
Correlational Techniques; Make Predictions; One Variable or Set of Variables; Status or Performance; Another Variable or Set of Variables
The psychologist in Study #3, for example, could calculate a correlation coefficient for product knowledge test scores and yearly sales and then, if the coefficient is sufficiently large, use a regression equation to predict the future sales of job applicants from their scores on the product knowledge test. When correlational techniques are used for the purpose of ____, the X (independent) variable is often referred to as the ____, while the Y (dependent) variable is called the ____. Correlational techniques are divided into two basic types — ____ and ____.
Prediction; Predictor; Criterion; Bivariate and Multivariate
____ ____ techniques are used to describe or summarize the degree of association between two variables and include ____ and ____ ____.
Bivariate Correlation; Scattergrams and Correlation Coefficients
____: The degree of association for two variables can be depicted in a ____, which is also known as a scatter diagram or scatterplot. In a scattergram, the X (predictor) variable is placed on the ____ ____, while the Y (criterion) variable is located on the ____ ____. In Study #3, if product knowledge is measured with a 10-item test and sales success is measured in terms of dollar amount of sales, the product knowledge and sales success of 35 salespeople could be presented.
Scattergrams; Scattergram; Horizontal Axis; Vertical Axis
Each data point in the scattergram corresponds to the two scores obtained by a single person. When data points are widely scattered, this means that the variables have a ____ ____. Conversely, when there is a narrow scatter of data points, this indicates a ____ ____.
Weak Relationship; Strong Relationship
A ____ ____ summarizes the degree of association between variables with a single number. As indicated in Table 5, there are several correlation coefficients, and the selection of a coefficient is based on the ____ of ____ of the ____ being ____.
Correlation Coefficient; Scale of Measurement; Variables; Correlated
Correlation Coefficients
Pearson r; Spearman rho; Point Biserial; Biserial; Eta
The Pearson r and correlation coefficients derived from it range in value from __ to __. The magnitude of the coefficient indicates the ____ ____. The closer the coefficient is to __ or __, the ____ the ____. The sign of the correlation coefficient (+ or -) indicates the ____ ____. When there is a positive (direct) correlation between X and Y, the value of Y ____ as the value of X ____. Conversely, when there is a negative (inverse) correlation, the value of Y ____ as the value of X ____.
-1.0 to +1.0; Relationship’s Strength; -1.0 or +1.0; Stronger the Relationship; Relationship’s Direction; Increases; Increases; Decreases; Increases
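A minimal sketch of how the sign and magnitude of the Pearson r can be read off a computed value. The scores are invented for illustration and are not data from Study #3; `pearson_r` is a hypothetical helper using the standard deviation-score formula.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r for two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the two means
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores (assumption, for illustration only)
x = [2, 4, 5, 7, 9]
y_direct = [25, 35, 55, 65, 95]    # Y tends to increase as X increases
y_inverse = [90, 70, 60, 40, 20]   # Y decreases as X increases (exactly linear)

r_direct = pearson_r(x, y_direct)    # positive, close to +1.0
r_inverse = pearson_r(x, y_inverse)  # negative, a perfect inverse relationship

print(round(r_direct, 2), round(r_inverse, 2))
```

The first coefficient is positive and close to +1.0 (a strong direct relationship); the second is -1.0 because every point falls on a descending straight line.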
____: Use of the Pearson r and most other correlation coefficients requires that ____ ____ be met. Violation of one or more of these assumptions can produce an ____ or ____ ____ ____.
Assumptions; Three Assumptions; Inaccurate or Misleading Correlation Coefficient
____: The first assumption is that there is a ____ ____ ____ the ____. In other words, in a scattergram, the relationship between X and Y can be summarized by a ____ ____. If the relationship is nonlinear, the Pearson r will ____ the ____ of ____.
Linearity; Linear Relationship Between the Variables; Straight Line; Underestimate the Degree of Association
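The effect of nonlinearity can be sketched with a deliberately extreme case. In the invented data below, Y is completely determined by X (Y = X squared), yet the Pearson r is 0 because the relationship is U-shaped rather than linear. `pearson_r` is a hypothetical helper, not part of any library.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r computed from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical U-shaped data: Y depends perfectly on X, but not linearly
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curved = pearson_r(x, y)
print(r_curved)
```

The coefficient understates the association here: the variables are perfectly related, but no straight line summarizes that relationship.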
____ ____: The second assumption is that there is an ____ ____ of ____ on ____ ____. This means that the data have been collected from people who are ____ regarding the ____ ____ by _ and _. If there is a restriction in range (if the people are homogeneous), the Pearson r is likely to be an ____.
Unrestricted Range; Unrestricted Range of Scores on Both Variables; Heterogeneous; Characteristics Measured by X and Y; Underestimate
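Restriction of range can be illustrated with a small sketch. The ten score pairs are invented for illustration; the "restricted" sample keeps only the people in the middle of the X distribution, which stands in for a homogeneous group.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r computed from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores for a heterogeneous group (assumption)
x_full = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_full = [2, 1, 4, 6, 4, 7, 5, 9, 10, 9]

# Homogeneous subgroup: only people scoring 4 through 7 on X
pairs = [(a, b) for a, b in zip(x_full, y_full) if 4 <= a <= 7]
x_restricted = [a for a, _ in pairs]
y_restricted = [b for _, b in pairs]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(round(r_full, 2), round(r_restricted, 2))
```

With the full range the coefficient is about .91; within the narrow band it drops to 0, even though the same people and the same measures are involved.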
____: The third assumption is that the range of Y scores is about the same for ____ ____ of _ — i.e., that there is ____. For example, if the range of Y scores is 10 at low values of X, the range should also be about 10 at moderate and high values of X. The difference between homoscedasticity and heteroscedasticity is illustrated in Figure. Violation of the assumption of homoscedasticity does not necessarily result in a coefficient that is too low or too high but produces a coefficient that does not represent the ____ ____ of ____.
Homoscedasticity; All Values of X; Homoscedasticity; Full Range of Scores
You might encounter a question that asks what it means when “the range of Y scores at every value of X is equal to the total range of Y scores.” This statement is similar to the first sentence in the above description of homoscedasticity, but it is not identical to that sentence. Both describe homoscedasticity, but this statement refers to a particular kind of homoscedasticity that occurs either when there is a very wide scatter of data points in the scatterplot or when all of the data points fall on a horizontal line. The answer to this question is that it means that there is a _ (or ____ _) ____ ____.
0 (or Near 0) Correlation Coefficient
____ of a ____ ____: A correlation coefficient can be interpreted in several ways.
Interpretation of a Correlation Coefficient
____ of ____: A correlation coefficient can be interpreted directly in terms of ____ of ____. The closer the coefficient is to either -1.0 or +1.0, the ____ the ____ ____ ____; the closer it is to 0, the ____ the ____.
Degree of Association; Degree of Association; Stronger the Association Between Variables; Weaker the Association
Note that the correlation coefficient is sometimes erroneously interpreted in terms of causality. However, it is the ____ ____ that permits causal inferences, not the way in which the data are analyzed or described. When a ____ ____ ____ has been conducted, a researcher can infer a cause-effect relationship when the correlation coefficient is sufficiently large. However, a large coefficient alone does not mean that variability in one variable ____ variability in the other variable.
Research Method; True Experimental Study; Causes
____ of ____: Whenever a correlation coefficient represents the degree of association between two different variables, it can be squared to obtain a ____ of ____, which provides a measure of ____ ____. Put another way, the squared correlation coefficient indicates the proportion of variability in Y that is ____ ____, or ____ ____ ____, variability in X. For example, if the correlation coefficient for sales success and product knowledge is .60, then 36% (.60 squared = .36) of variability in sales success is accounted for by product knowledge. The remaining 64% is ____ ____, which might be due to such factors as attitude toward the company, work-related motivation, previous sales experience, and sales territory.
Coefficient of Determination; Coefficient of Determination; Shared Variability; Explained by; Accounted For By; Unexplained Variability
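The arithmetic in this card can be sketched in a few lines. The value .60 comes from the example above; the value .30 is an added illustration, and `variance_split` is a hypothetical helper name.

```python
def variance_split(r):
    """Return (explained, unexplained) proportions of variability in Y."""
    r_squared = r ** 2          # coefficient of determination
    return r_squared, 1 - r_squared

explained, unexplained = variance_split(0.60)
print(round(explained, 2), round(unexplained, 2))   # 0.36 0.64

# A coefficient half as large explains only a quarter as much variability
explained_small, _ = variance_split(0.30)
print(round(explained_small, 2))                    # 0.09
```

Because the coefficient is squared, shared variability falls off faster than the coefficient itself: halving r from .60 to .30 cuts explained variability from 36% to 9%.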
Keep in mind that a bivariate correlation coefficient should be ____ to obtain a measure of shared variability only when it indicates the ____ of ____ ____ ____ ____ ____. As noted in the Test Construction chapter, when a correlation coefficient is a ____ ____, which is the correlation of a measure with itself, the coefficient is ____ ____. Instead, it is interpreted directly as a measure of “____ ____ ____.” The ____ of a correlation coefficient indicates whether it is a coefficient for two different variables or a single variable: If the subscript contains two different letters or numbers (e.g., “xy”), it represents the correlation between ____ ____ ____. When the subscript contains the same letters or numbers (e.g., “xx”), it is a ____ ____.
Squared; Degree of Association Between Two Different Variables; Reliability Coefficient; Never Squared; True Score Variability; Subscript; Two Different Variables; Reliability Coefficient
____ ____: Correlation coefficients can be evaluated to determine if they are statistically significant by comparing the ____ ____ to the ____ ____ ____. The magnitude of the critical value is determined by ____ (the level of significance) and the ____ ____. The smaller the sample, the ____ the ____ ____ must be to be ____ ____. For example, when the level of significance is .05 and the number of observations is 10, the correlation coefficient must be at least .63 to be statistically significant. In contrast, when the number of observations is 50, a correlation of only .28 is significant.
Hypothesis Testing; Obtained Coefficient; Appropriate Critical Value; Alpha; Sample Size; Larger the Correlation Coefficient; Statistically Significant
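The two critical values quoted in this card can be reproduced with a short sketch. It assumes a two-tailed test with n - 2 degrees of freedom, which the card does not state explicitly. The critical t values are copied from a standard t table rather than computed, and `critical_r` is a hypothetical helper.

```python
from math import sqrt

# Two-tailed critical t values at alpha = .05, taken from a standard t table
# (assumption: the card's figures refer to a two-tailed test with df = n - 2)
T_CRITICAL_05 = {8: 2.306, 48: 2.011}

def critical_r(n, t_table=T_CRITICAL_05):
    """Smallest |r| that is significant for n observations."""
    df = n - 2
    t = t_table[df]
    # Rearranged from t = r * sqrt(df) / sqrt(1 - r**2)
    return t / sqrt(t ** 2 + df)

r_crit_10 = critical_r(10)
r_crit_50 = critical_r(50)
print(round(r_crit_10, 2), round(r_crit_50, 2))   # 0.63 0.28
```

The critical value shrinks as the sample grows, which is why a correlation of .28 is significant with 50 observations but not with 10.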
____ ____: Investigators are often interested in correlation because their goal is to use a ____ to ____ or ____ ____ on a ____. ____ ____ is the technique that allows such predictions to be made when there is one predictor (X) and one criterion (Y). An assumption underlying regression analysis is that there is a ____ ____ ____ _ and _, and, therefore, that the relationship can be described by a ____ ____. The scattergram for product knowledge test scores and dollar amount of sales (Figure 13) reveals that there is a linear relationship between these variables, and, consequently, their relationship can be described by a ____ ____ (“line of best fit”).
Regression Analysis; Predictor to Predict or Estimate Performance on a Criterion; Regression Analysis; Linear Relationship Between X and Y; Straight Line; Regression Line
The technique used to locate the regression line in a scatterplot is referred to as the ____ ____ ____, which locates the line so that the amount of error in prediction is minimized. The regression line or its formula (the regression equation) is then used to make ____ ____ _ ____ on ____ on _.
Least Squares Criterion; Predictions About Y Based on Information on X
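A minimal sketch of the least squares criterion for one predictor and one criterion. The scores are invented for illustration (they are not the Study #3 data), and the function names are hypothetical. The slope and intercept formulas are the standard ones for simple linear regression.

```python
def fit_regression_line(x, y):
    """Return (intercept, slope) of the least squares line Y' = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_error(x, y, intercept, slope):
    """Total squared prediction error for a given line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical test scores (X) and yearly sales in $1,000s (Y)
x = [2, 4, 5, 7, 9]
y = [25, 35, 55, 65, 95]

a, b = fit_regression_line(x, y)
best_error = sum_squared_error(x, y, a, b)

# Any other line produces more squared error than the fitted one
shifted_error = sum_squared_error(x, y, a + 1.0, b)
tilted_error = sum_squared_error(x, y, a, b + 0.5)

# Predicted sales for a hypothetical applicant who scores 6 on the test
predicted_sales = a + b * 6
print(round(a, 2), round(b, 2), round(predicted_sales, 2))
```

Moving the fitted line up or tilting it increases the total squared error, which is what "minimizing the amount of error in prediction" means here.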
Example: The psychologist in Study #3 assesses the degree of association between product knowledge and yearly sales by administering the product knowledge test to a sample of 35 current salespeople and determining each salesperson’s sales for the previous year from employment records. The correlation coefficient for test scores and sales is statistically significant, so the psychologist decides to use regression analysis to facilitate hiring decisions in the future. She does this by using the ____ ____ to predict the yearly sales of job applicants from their product knowledge test score.
Regression Equation
The degree of predictive accuracy when using a regression equation is directly related to the ____ of the ____ ____. Unless the coefficient is equal to +1.0 or -1.0, there will be some ____ in ____. Consequently, the standard error of estimate is used to construct a ____ ____ around a predicted _ ____ so that the score is not “____.” (The standard error of estimate and confidence intervals are described in the Test Construction chapter.)
Magnitude of the Correlation Coefficient; Error in Prediction; Confidence Interval; Y Score; Overinterpreted
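A rough sketch of how a confidence interval around a predicted Y score could be built. This card defers the details to the Test Construction chapter, so the formula used here (the standard deviation of Y multiplied by the square root of 1 minus r squared) is the commonly taught one and is stated as an assumption. The standard deviation and the predicted score are invented; only r = .60 comes from the earlier example.

```python
from math import sqrt

def standard_error_of_estimate(sd_y, r):
    """Commonly taught formula: SD of Y times sqrt(1 - r squared)."""
    return sd_y * sqrt(1 - r ** 2)

def confidence_interval(predicted_y, see, z=1.96):
    """Interval of +/- z standard errors around a predicted Y score."""
    margin = z * see
    return predicted_y - margin, predicted_y + margin

r = 0.60                    # from the product knowledge example
sd_y = 10_000.0             # hypothetical SD of yearly sales, in dollars
predicted_sales = 50_000.0  # hypothetical predicted Y score

see = standard_error_of_estimate(sd_y, r)
low, high = confidence_interval(predicted_sales, see)
print(round(see), round(low), round(high))

# With a perfect correlation there is no error in prediction
see_perfect = standard_error_of_estimate(sd_y, 1.0)
```

Under these assumed numbers the 95% interval spans roughly $34,000 to $66,000, which shows how much uncertainty remains around a single predicted score even when r is .60.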
Correlational techniques are used to determine the degree of (1) ____ between two or more variables and to make predictions about status or score(s) on one or more criteria based on status or score(s) on one or more (2) ____. A scattergram illustrates the relationship between two variables. The wider the scatter of data points in the scattergram, the (3) ____ the correlation between the variables.
(1) association; (2) predictors; (3) lower