Correlation and Linear Regression Flashcards
In what situation are correlation and linear regression useful methods of comparison?
With continuous data
What is correlation?
Correlation tells us whether two continuous variables are associated and how strong that association is
What is linear regression?
Quantifies the relationship between two variables when one of them depends on another. This allows the mean of one variable to be estimated for a given value of the other
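A minimal sketch of the idea, assuming the line is fitted by ordinary least squares (the cards do not name a fitting method). The data values and the helper name `fit_line` are made up for illustration.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: x is the explanatory variable, y the response.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)

# Estimated mean of y for a given value of x.
estimated_mean = intercept + slope * 3.5
print(f"slope={slope:.3f}, intercept={intercept:.3f}, mean of y at x=3.5: {estimated_mean:.3f}")
```

The fitted line always passes through the point (mean of x, mean of y), which is why the intercept is computed from the two means.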
What sort of hypothesis testing can be done with linear regression?
It can be used to carry out a t test, and is easily extended to adjust for baseline imbalances when the outcome is continuous
What is the standard method of calculating correlation?
Pearson’s correlation coefficient / product moment correlation coefficient
Describe the notation of Pearson’s correlation coefficient
ρ (rho) for the population value and r for the estimated value
What is Pearson’s Correlation Coefficient?
Measures the scatter of the points around an underlying linear trend and can take a value from -1 to +1
How do we interpret r?
Positive if higher values of one variable are associated with higher values of the other variable
The correlation is negative if the values of one variable decrease as the values of the other increase.
The closer the points are to an underlying trend the higher the correlation. Conversely the greater the spread of points around an underlying linear trend the lower the correlation
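The sign and size of r can be checked with a short calculation. This is a sketch using the usual product-moment formula; the function name `pearson_r` and the data are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # points lie exactly on an upward line
falling = [10, 8, 6, 4, 2]   # points lie exactly on a downward line
scattered = [2, 9, 4, 10, 5] # upward trend with a lot of scatter

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_scattered = pearson_r(x, scattered)
print(r_rising, r_falling, r_scattered)
```

Points lying exactly on a line give r of +1 or -1 depending on the direction of the slope; adding scatter around the same upward trend pulls r towards 0.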
How does r change with restriction of the range of a variable?
When the range of one of the variables is restricted, r is weaker (closer to 0). This means that comparing r across studies may be misleading if the ranges of the variables are not comparable.
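The effect of range restriction can be illustrated by simulation. This is only a sketch: the data are randomly generated, and the sample size, noise level, seed and cut-offs are arbitrary assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(1)  # fixed seed so the sketch is repeatable

# Simulated data: y follows x along a line, plus random noise.
x = [random.gauss(0, 1) for _ in range(2000)]
y = [xi + random.gauss(0, 1) for xi in x]

r_full = pearson_r(x, y)

# Keep only the pairs whose x value falls in a narrow band.
kept = [(xi, yi) for xi, yi in zip(x, y) if -0.5 < xi < 0.5]
r_restricted = pearson_r([p[0] for p in kept], [p[1] for p in kept])

print(f"full range of x:       r = {r_full:.2f}")
print(f"restricted range of x: r = {r_restricted:.2f}")
```

The underlying relationship is identical in both calculations; only the spread of x differs, yet the restricted sample gives a noticeably smaller r.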
How do you calculate a valid confidence interval for r?
The two variables must have a bivariate normal distribution
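One common way to compute such an interval is Fisher's z transformation, sketched below. The cards do not name a method, so treat this as an assumption; the function name `pearson_ci` and the example values r = 0.6 and n = 50 are hypothetical.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z.

    Assumes bivariate normal data; z_crit = 1.96 gives a 95% interval.
    """
    z = math.atanh(r)            # Fisher's z transformation of r
    se = 1 / math.sqrt(n - 3)    # approximate standard error of z
    lower_z = z - z_crit * se
    upper_z = z + z_crit * se
    # Transform the limits back to the correlation scale.
    return math.tanh(lower_z), math.tanh(upper_z)


lower, upper = pearson_ci(r=0.6, n=50)
print(f"95% CI for r: {lower:.2f} to {upper:.2f}")
```

Because the limits are back-transformed, the interval always stays within -1 to +1 and is generally not symmetric about r.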
How does bivariate normal data look on a scatter graph?
The points form a roughly elliptical cloud
What condition must be satisfied to produce a valid hypothesis test for correlated data?
At least one variable should be normally distributed
How do we calculate the correlation coefficient when the data do not meet the normality assumptions (non-parametric approach)?
Spearman’s rank-order correlation coefficient (denoted as ρs for the population value and rs for the estimate)
How does Spearman’s rank-order correlation coefficient calculate correlation?
It is calculated from the ranks of the observations rather than their raw values. Rank correlation therefore does not specifically assess linear association but assesses more generally whether there is a monotonic relationship between the variables.
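A sketch of the rank-based calculation: Spearman's coefficient is obtained here by applying Pearson's formula to the ranks, with tied values given their average rank. The helper names and the data are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_r(x, y):
    """Spearman's rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# A monotonic but non-linear relationship: y is the cube of x.
x = [1, 2, 3, 4, 5]
y = [xi ** 3 for xi in x]

rs = spearman_r(x, y)
r = pearson_r(x, y)
print(f"Spearman rs = {rs:.3f}, Pearson r = {r:.3f}")
```

Because y always increases with x, the ranks agree perfectly and rs reaches 1, while Pearson's r stays below 1 because the points do not lie on a straight line.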
What are the two variables called in linear regression?
Variable x is the explanatory variable (independent or predictor)
Variable y is the response variable (dependent or outcome)