16- Association and Correlation II Flashcards
What is Covariation?
Means “varying together” or “varying jointly”.
it is the cross product for each observation of deviations from means an
Are signs important for Covariance?
Yes. The sign gives the direction of the covariation (positive or negative), and the magnitude tells us whether it is high or low.
1) Linear points with positive slope = High positive covariation
2) Lin
Characteristics of Covariation
1) The magnitude of covariance depends on the units of x and y
2) Covariances of multiple pairs of variables are not directly comparable
3) To compare multiple pairs you need a standardized form of covariance
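A minimal sketch of the unit dependence described above. The data values are hypothetical, and the `n - 1` (sample) denominator is an assumption; dividing by `n` instead gives the population covariance and shows the same effect.

```python
def covariance(x, y):
    """Sample covariance: summed cross products of deviations from the
    means, divided by n - 1 (assumed here; use n for a population)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)


# Hypothetical observations: distance travelled and travel time.
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
time_min = [12.0, 19.0, 33.0, 41.0, 50.0]

cov_km = covariance(distance_km, time_min)

# Same observations, with distance re-expressed in metres.
distance_m = [d * 1000 for d in distance_km]
cov_m = covariance(distance_m, time_min)

print(cov_km)  # 24.5
print(cov_m)   # 24500.0
```

The association between the two variables has not changed, yet the covariance is 1000 times larger, which is why covariances from different pairs of variables cannot be compared directly.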
What is Correlation?
A dimensionless measure that indicates the nature and degree of linearity between two variables
It is a standardized form of covariance, with values ranging from -1 to +1
What are the Correlation ranges for interpretation?
- Perfect +ve = +1.00
- Moderate +ve = +0.50
- Weak +ve = +0.25
- No Correlation = 0.00
- Weak -ve = -0.25
- Moderate -ve = -0.50
- Perfect -ve = -1.00
What is the Pearson’s Correlation Coefficient?
The covariance of x and y divided by the product of the population standard deviations
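As a rough illustration of that definition, the sketch below divides the population covariance by the product of the population standard deviations. The function name `pearson_r` and the data are hypothetical, chosen only for this example.

```python
import math


def pearson_r(x, y):
    """Pearson's r: population covariance of x and y divided by the
    product of their population standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    return cov_xy / (sd_x * sd_y)


# Hypothetical observations: distance (km) and travel time (minutes).
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
time_min = [12.0, 19.0, 33.0, 41.0, 50.0]

r_km = pearson_r(distance_km, time_min)

# Changing the units of x rescales the covariance and the standard
# deviation by the same factor, so r is unchanged (dimensionless).
r_m = pearson_r([d * 1000 for d in distance_km], time_min)

print(round(r_km, 3), round(r_m, 3))
```

Using `n - 1` in all three places instead of `n` gives the same value of r, because the denominators cancel.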
Purpose of Pearson’s Correlation Coefficient
To help quantify the associations that we see on a scatter plot, chart, or map
Assumptions of Pearson’s Correlation Coefficient
- Variables must be interval or ratio
- Data pairs must be selected randomly from the population
- Relationship between X and Y is linear
- Constant variance (homoscedasticity)
- The variables X and Y must share a joint bivariate normal distribution
Pearson’s Correlation Coefficient Values
- Values of -1 and +1 correspond to a perfect negative and a perfect positive linear relationship between the variables X and Y
- A value of 0 indicates that no linear relationship exists between X and Y (they are uncorrelated; this implies independence only when X and Y are jointly bivariate normal)
Spearman’s Correlation Coefficient
Computes the linear (Pearson) correlation on the ranks of X and Y
Properties of Spearman’s Correlation Coefficient
- Relaxes normality and linearity assumption
- Data can be ordinal
- Based on the differences between the paired ranks of X and Y
- Coefficient is interpreted the same way as Pearson’s (value of 0 = No association)
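The properties above can be sketched by ranking each variable and then applying Pearson's formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the data are hypothetical; giving tied values their average rank is a common convention, assumed here.

```python
import math


def average_ranks(values):
    """Ranks from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but non-linear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(round(spearman_rho(x, y), 3))  # 1.0: the rank orders agree exactly
print(round(pearson_r(x, y), 3))     # below 1: the points are not on a line
```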
Kendall’s Correlation
Measures the strength of the monotonic relationship between X and Y
It is resistant to the effect of a small number of outliers and is used
When to Use Spearman’s?
When…
- Data Types: analyzing relationships between ordinal, interval and ratio variables
- Assumption of Monotonicity: does not require linearity; well suited to detecting monotonic relationships, both positive and negative
- Larger Sample Sizes: more powerful and efficient with larger samples
When to Use Kendall’s?
- Data Types: when data is strictly ordinal
- Ties in Data: effective in handling tied values (when two or more data points have the same rank)
- Smaller Sample Sizes: works well with limited data
- When You Want to Emphasize the Relative Order: assesses association based on the order or ranking of data points rather than the actual values
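To make the "relative order" idea concrete, the sketch below counts concordant and discordant pairs of observations. It assumes the tau-b variant, which adjusts the denominator for tied values; the function name `kendall_tau_b` and the data are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b from pairwise comparisons of (x, y) observations.

    A pair is concordant when x and y order the two observations the same
    way, discordant when they order them oppositely. Pairs tied in only
    one variable enter the denominator; pairs tied in both are ignored.
    Undefined (division by zero) if every pair is tied in x or in y.
    """
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx = x1 - x2
        dy = y1 - y2
        if dx == 0 and dy == 0:
            continue
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denominator


# Hypothetical ordinal rankings of five items by two judges (no ties):
# 8 concordant and 2 discordant pairs out of 10.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
tau_no_ties = kendall_tau_b(judge_a, judge_b)

# Hypothetical ordinal scores containing tied values.
score_a = [1, 2, 2, 3]
score_b = [1, 2, 3, 3]
tau_with_ties = kendall_tau_b(score_a, score_b)

print(round(tau_no_ties, 3))    # 0.6
print(round(tau_with_ties, 3))  # 0.8
```

Only the ordering of the values matters here: replacing the scores with any other values that preserve the same order and ties leaves tau unchanged.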
Possible Issues with Correlations
- Non-Linear Associations
- Correlation does not equal causation
- Spatial Aggregation Impacting Analysis
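The first issue can be illustrated with a small, hypothetical example: y is completely determined by x, yet Pearson's r is zero because the relationship is not linear. The function name `pearson_r` is an assumption made for this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical U-shaped relationship: y = x squared, with x symmetric about 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # 0.0: no linear association, although y depends entirely on x
```

A scatter plot of these points would show the association immediately, which is one reason to inspect the plot before relying on a coefficient.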