Correlation Flashcards
What is correlation?
a statistic that measures the relationship between two variables
What are the different characteristics of correlation?
- direction (positive or negative)
- form (linear or non-linear)
- strength or consistency (magnitude)
What is the form of the relationship?
whether the data points follow a linear or a non-linear pattern
What is the consistency or strength of the relationship?
how closely the data fit the form; measured by the numerical (absolute) value of the correlation, from 0 to 1.00
What does a higher absolute value of the correlation mean?
the closer the absolute value is to 1.00, the stronger and more consistent the relationship between the variables
What is perfect correlation?
identified by a correlation of +1.00 or -1.00; every data point fits the form exactly
What are the different components of a scatterplot?
- direction (positive or negative)
- strength (weak, moderate, strong)
- linearity (linear or nonlinear)
What does the value of r^2 mean?
r^2 is the coefficient of determination: the proportion of variability in one variable that can be predicted from its relationship with the other variable
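As a minimal sketch of what r^2 measures, the Python below uses made-up scores (an assumption, not data from these cards) and checks that r^2 matches the share of Y's variability accounted for by the least-squares line. All variable names are illustrative.

```python
import math

# Made-up illustrative scores (not from the flashcards)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Sum of products (SP) and sums of squares (SS)
sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((b - mean_y) ** 2 for b in y)

r = sp / math.sqrt(ss_x * ss_y)
r_squared = r ** 2

# Least-squares line predicting Y from X
slope = sp / ss_x
intercept = mean_y - slope * mean_x
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Proportion of Y's variability that the line accounts for
explained = 1 - ss_residual / ss_y

print(round(r, 3), round(r_squared, 3), round(explained, 3))
```

For these scores r is about 0.77, so roughly 60% of the variability in Y is predictable from X.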
What are outliers?
individuals whose X and/or Y values are substantially different from the values obtained for the other individuals in the data set
What are the different types of correlation?
- Pearson
- Spearman rho
- Kendall’s Tau
- Point biserial
- Biserial
- Phi
When do you use Pearson?
both variables are continuous (at least interval or ratio data)
When do you use Spearman rho?
- skewed data, non-linear relationships
- ordinal data, the “Pearson of ranked data”
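One way to read "the Pearson of ranked data" is that Spearman rho is the Pearson correlation computed on ranks. The sketch below assumes made-up data and hypothetical helper names (`ranks`, `pearson_r`, `spearman_rho`); tied values share the mean of their ranks.

```python
import math

def ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman rho as the Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: y = x**3 always increases with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

r = pearson_r(x, y)       # below 1 because the relationship is not linear
rho = spearman_rho(x, y)  # 1.0 because the rank orders agree perfectly
```

Here the Pearson r is about 0.94 while rho is 1.0, which illustrates why Spearman suits consistent but non-linear relationships.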
When do you use Kendall’s Tau?
- ordinal data, better than Spearman for small samples
- better when there are many ties among ranks
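A rough sketch of how Kendall's Tau handles ties: the tau-b variant counts concordant and discordant pairs and adjusts the denominator for pairs tied on only one variable. The function name `kendall_tau_b` and the judge rankings are assumptions made for illustration.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, adjusted for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                 # tied on both variables: not counted
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1          # pair ordered the same way on X and Y
        else:
            discordant += 1          # pair ordered in opposite ways
    untied_x = concordant + discordant + tied_y_only  # pairs not tied on X
    untied_y = concordant + discordant + tied_x_only  # pairs not tied on Y
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)

# Hypothetical rankings of five items by two judges (one swapped pair)
judge_a = [1, 2, 3, 4, 5]
judge_b = [1, 3, 2, 4, 5]
tau_no_ties = kendall_tau_b(judge_a, judge_b)

# Hypothetical rankings that contain tied ranks
tau_with_ties = kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])
```

Both examples come out at 0.8: nine concordant pairs against one discordant pair in the first, and four concordant pairs with two one-sided ties in the second.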
When do you use Point biserial?
one continuous variable (interval or ratio data) and one naturally binary variable (e.g., yes/no coded as 0 and 1)
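With the binary variable coded 0 and 1, the point-biserial correlation gives the same value as an ordinary Pearson r. The sketch below checks this on made-up scores; `pearson_r` and `point_biserial` are illustrative names, and the group-means formula used is the standard one rather than something stated on these cards.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def point_biserial(groups, scores):
    """r_pb = (M1 - M0) / s_n * sqrt(p * q), with s_n the SD computed with n."""
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / len(scores)   # proportion coded 1
    q = 1 - p                     # proportion coded 0
    s_n = statistics.pstdev(scores)
    return (statistics.mean(ones) - statistics.mean(zeros)) / s_n * math.sqrt(p * q)

# Made-up data: 0 = no, 1 = yes, with a continuous score for each person
answer = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]

r_pb = point_biserial(answer, score)
r_pearson = pearson_r(answer, score)  # same value, about 0.93
```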
When do you use biserial?
one continuous variable (interval or ratio data) and one binary variable created by splitting an underlying continuous variable (e.g., a test score converted to pass/fail)
When do you use Phi?
two binary (dichotomous) variables, i.e., two nominal variables with two categories each
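For two binary variables, Phi can be computed directly from the four cell counts of a 2x2 table. This is a minimal sketch using the standard cell-count formula; the counts and the function name `phi_coefficient` are made up for illustration.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]] of cell counts.

    phi = (a*d - b*c) / sqrt((a+b) * (c+d) * (a+c) * (b+d))
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: rows = group (treatment, control),
# columns = outcome (improved, not improved)
phi = phi_coefficient(20, 10, 5, 15)  # about 0.41
```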
What is the Pearson Correlation?
measures the degree and direction of the linear relationship between two continuous variables
What does “r” represent?
correlation as a sample statistic
What does “ρ” (rho) represent?
correlation as a population parameter
What is the sum of products (SP)?
- determines whether a correlation coefficient is positive or negative
- measures the amount of covariability between two variables
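A short sketch of SP, assuming made-up scores: the definitional formula sums the products of deviations from each mean, and the computational formula gives the same number from raw sums. The function names are illustrative.

```python
def sp_definitional(x, y):
    """SP = sum of (X - M_X) * (Y - M_Y)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def sp_computational(x, y):
    """SP = sum(XY) - (sum(X) * sum(Y)) / n."""
    n = len(x)
    return sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

# Made-up scores: Y tends to rise with X, so SP is positive
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]
sp_pos = sp_definitional(x, y_up)    # 6.0

# Made-up scores: Y tends to fall as X rises, so SP is negative
y_down = [5, 4, 5, 4, 2]
sp_neg = sp_definitional(x, y_down)  # -6.0
```

The sign of SP carries straight through to r, because the denominator of r is always positive.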
What happens as the covariance gets larger relative to the total variance?
the data points fall closer to the regression line
What happens when all data points for X and Y fall exactly on a regression line?
the covariance is as large (in absolute value) as the total variance in the denominator, so r equals +1.00 or -1.00
What is the denominator of the formula for r?
the total variance: the variability of X and Y measured separately, √(SS_X × SS_Y)
What is the numerator of the formula for r?
the covariance: the variability that X and Y share, measured by the sum of products (SP)
What happens as the data points fall farther from the regression line?
the smaller the covariance will be compared to the total variance in the denominator, resulting in a value of r closer to 0
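The numerator and denominator described in these cards can be put together as r = SP / √(SS_X × SS_Y). The sketch below, on made-up data with an illustrative function name, shows r reaching +1 or -1 when every point lies on a line and shrinking when the points scatter.

```python
import math

def pearson_r(x, y):
    """r = SP / sqrt(SS_X * SS_Y)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))  # covariability
    ss_x = sum((a - mx) ** 2 for a in x)                 # variability of X alone
    ss_y = sum((b - my) ** 2 for b in y)                 # variability of Y alone
    return sp / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]

# Every point exactly on a rising line: r = +1
r_line_up = pearson_r(x, [2 * v + 1 for v in x])

# Every point exactly on a falling line: r = -1
r_line_down = pearson_r(x, [-3 * v + 10 for v in x])

# Made-up scattered scores: SP is small relative to the denominator
r_scatter = pearson_r(x, [3, 1, 4, 2, 5])  # 0.5
```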
What is partial correlation (first-order partial correlation)?
measures the relationship between two variables while controlling the influence of a third variable by holding it constant
What is zero-order correlation?
the relationship between 2 variables (while ignoring the influence of other variables)
What is semi-partial correlation?
the relationship between 2 variables after removing the influence of a third variable from only one of the two variables
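A minimal sketch of the semi-partial correlation, using the standard formula in terms of the three zero-order correlations (the formula is not given on these cards). The function name and the correlation values are hypothetical.

```python
import math

def semipartial_r(r_xy, r_xz, r_yz):
    """Correlation of Y with X after Z has been removed from X only.

    sr = (r_xy - r_xz * r_yz) / sqrt(1 - r_xz**2)
    """
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)

# Hypothetical zero-order correlations
sr = semipartial_r(r_xy=0.6, r_xz=0.5, r_yz=0.4)  # about 0.46
```

Only X's overlap with Z appears in the denominator, because Z is removed from X and left in Y.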
What does the numerator in the partial correlation formula do?
subtracts from the original correlation the product of the correlations of each member of the pair of interest with the control variable
What does the denominator in the partial correlation formula do?
standardizes the numerator in terms of the amount of variance left “unexplained” in the variables of primary interest after the impact of the control variable on each of them has been taken into account
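The numerator and denominator described above can be sketched as follows. The first-order partial correlation formula is the standard one; the function name and the three zero-order correlations are hypothetical values chosen for illustration.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, holding Z constant."""
    # Numerator: original correlation minus the product of each
    # variable's correlation with the control variable Z
    numerator = r_xy - r_xz * r_yz
    # Denominator: variance left unexplained by Z in X and in Y
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical zero-order correlations
r_partial = partial_r(r_xy=0.6, r_xz=0.5, r_yz=0.4)  # about 0.50

# If Z is unrelated to both X and Y, controlling for it changes nothing
r_unchanged = partial_r(r_xy=0.6, r_xz=0.0, r_yz=0.0)  # 0.6
```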