Correlation and Regression Flashcards
What is the maximum correlation coefficient that is possible?
1.0
the data points lie exactly on a straight line with a positive slope
What is the purpose of Correlation?
evaluate the relationship between two variables (x) and (y), e.g. GFR and age
-a regression equation can then be used to predict GFR in the future
What can be described by the Correlation line?
-quantitatively describes the relationship:
-Strength: described by the correlation coefficient (the closer r is to 1 in magnitude, the closer the data points are to the line)
-Direction: the positive or negative slope of the line
What is the term for the correlation coefficient?
Pearson product-moment coefficient of correlation (r)
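A minimal sketch of how r can be computed from its definition (sum of cross-products of deviations divided by the square root of the product of the sums of squares). The function name `pearson_r` and the age/GFR values are hypothetical illustration data, not taken from the course material.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]          # deviations of x from its mean
    dy = [yi - mean_y for yi in y]          # deviations of y from its mean
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: GFR falls by exactly 10 for every 10 years of age.
age = [30, 40, 50, 60, 70]
gfr = [110, 100, 90, 80, 70]
r = pearson_r(age, gfr)
print(r)  # perfectly linear with a negative slope, so r is -1
```

Here the points lie on a perfect line, so the strength is maximal, and the sign of r reports the direction (negative slope).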
Which type of data are used to describe correlations?
-interval or ratio data
in other words: continuous data
Which type of data is used for Spearman rho (r_s)?
Different type of correlation coefficient
Ordinal data
-Ranked
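One common way to obtain Spearman rho is to convert each variable to ranks (ties get the average rank) and then compute the Pearson coefficient on those ranks. The sketch below follows that approach; the helper names and the pain/satisfaction scores are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman rho computed as Pearson r on the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical ordinal scores: pain rating and satisfaction rating.
pain = [1, 2, 2, 4, 5]
satisfaction = [5, 4, 4, 2, 1]
rho = spearman_rho(pain, satisfaction)
print(average_ranks(pain))  # [1.0, 2.5, 2.5, 4.0, 5.0]
print(rho)                  # ranks are perfectly reversed, so rho is -1
```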
General mantra
Correlation does not equal Causation
Which conditions are required to establish Causality?
-Controlled conditions
-Randomization (RCT), placebo
-9 Bradford Hill criteria (discussed early in the semester), including: biologic credibility of the association, logical time sequence (cause precedes outcome), a dose-response relationship, and consistency of findings across several studies
Interpretation of Correlations: r-value
-r of 0.00 to 0.25 indicates little to no relationship
-r of 0.25 to 0.50 indicates a fair degree of relationship
-r of 0.50 to 0.75 indicates a moderate to good relationship
-r of 0.75 to 1.00 indicates a good to excellent relationship
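The ranges above can be sketched as a small lookup. Two assumptions are made here that the card does not state: the labels are applied to the magnitude of r (so -0.8 is treated like 0.8), and a value sitting exactly on a boundary is assigned to the higher category. The function name `interpret_r` is hypothetical.

```python
def interpret_r(r):
    """Map a correlation coefficient to the descriptive label from the card."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # assumption: strength is judged on the magnitude of r
    if size < 0.25:
        return "little to no relationship"
    if size < 0.50:
        return "fair degree of relationship"
    if size < 0.75:
        return "moderate to good relationship"
    return "good to excellent relationship"

for value in (0.1, -0.3, 0.6, 0.9):
    print(value, "->", interpret_r(value))
```

These cut-offs are rules of thumb, so the output is a starting point for interpretation, not a verdict.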
What is the Coefficient of Determination (r^2)?
-indicates the percentage of the total variance in the Y scores, that can be explained by the X scores
-it is explanatory
e.g.: GFR on the Y-axis and age on the X-axis -> r=0.9, so r^2 is 0.81 -> 81% of the variance in the Y variable (GFR) can be explained by changes in the X variable (age)
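The arithmetic in the GFR example can be checked in a few lines. This is only a sketch of the calculation; the function name `coefficient_of_determination` is hypothetical.

```python
def coefficient_of_determination(r):
    """Square the correlation coefficient to get the explained share of variance."""
    return r * r

r = 0.9
r_squared = coefficient_of_determination(r)
explained_pct = round(r_squared * 100)      # share of Y variance explained by X
unexplained_pct = 100 - explained_pct       # share left to other factors
print(f"r = {r}, r^2 = {r_squared:.2f}")
print(f"explained: {explained_pct}%, unexplained: {unexplained_pct}%")
```

The remaining 19% of the variance in GFR is not accounted for by age in this example.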
Linear VS Curvilinear
Coefficient r is a measure of linear relationships only
-curvilinear relationships are not described accurately by the linear correlation coefficient
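A small illustration of why r can mislead for curvilinear data: below, y is completely determined by x (y = x squared), yet the linear coefficient comes out as zero because the rising and falling halves of the curve cancel. The data are made up for illustration.

```python
import math

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# A perfect U-shaped (curvilinear) relationship, symmetric around x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r_curve = pearson_r(x, y)
print(r_curve)  # 0.0 although y depends entirely on x
```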
Interpretation of Correlation
-the relationship between 2 variables should not be interpreted solely on the basis of the correlation coefficient r
-the variables should be plotted to see whether a linear or curvilinear relationship exists and whether an r value is appropriate
-an association of 0.4 is not twice as strong as one of 0.2
-the difference in association between 0.5 and 0.6 is not the same as that between 0.8 and 0.9
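One way to see the last two points is to compare the explained variance (r^2) instead of r itself. Using r^2 as the yardstick is an assumption made for this sketch, not something the card states; the function name is hypothetical.

```python
def explained_variance(r):
    """Share of variance explained, i.e. r squared."""
    return r * r

pairs = [(0.2, 0.4), (0.5, 0.6), (0.8, 0.9)]
for low, high in pairs:
    low_ev = explained_variance(low)
    high_ev = explained_variance(high)
    print(f"r={low} -> r={high}: r^2 {low_ev:.2f} -> {high_ev:.2f} "
          f"(gain {high_ev - low_ev:.2f})")
```

On this yardstick, r=0.4 explains four times as much variance as r=0.2 (0.16 vs 0.04), and the step from 0.8 to 0.9 adds more explained variance (0.17) than the step from 0.5 to 0.6 (0.11).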
Correlation Matrix
-analyzes several variables at one time
-presents correlation coefficients for all pairs of variables
-each variable is correlated with every other variable to see whether any of the variables are related to one another
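A minimal sketch of building such a matrix by computing r for every pair of variables. The variable names (age, gfr, sbp) and their values are hypothetical illustration data, and `correlation_matrix` is a helper written for this example.

```python
import math

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def correlation_matrix(data):
    """Return variable names and a square matrix of pairwise r values."""
    names = list(data)
    matrix = [[pearson_r(data[a], data[b]) for b in names] for a in names]
    return names, matrix

# Hypothetical measurements on five subjects.
data = {
    "age": [30, 40, 50, 60, 70],
    "gfr": [112, 98, 91, 79, 70],
    "sbp": [118, 125, 121, 134, 140],
}
names, matrix = correlation_matrix(data)

print("     " + " ".join(f"{n:>6}" for n in names))
for name, row in zip(names, matrix):
    print(f"{name:>4} " + " ".join(f"{value:6.2f}" for value in row))
```

The diagonal is always 1 (each variable with itself), and the matrix is symmetric, so only one triangle needs to be read.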
Significance of Correlation Coefficient
-the observed correlation is one of an infinite number of possible correlations -> obtained from a random sample of the population
-> subject to sampling error, or BIAS
-> needs to be tested for statistical SIGNIFICANCE
What would be the Null Hypothesis when determining the Correlation Coefficient?
H0 states that the correlation in the population is 0 (r=0) -> no correlation
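The usual textbook approach tests H0 with a t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. Because the Python standard library has no t distribution, the sketch below swaps in a permutation test instead: it shuffles y many times to simulate "no correlation" and counts how often a shuffled |r| is at least as large as the observed one. The function name, the number of permutations, the seed, and the data are all assumptions for illustration.

```python
import math
import random

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for H0: no correlation between x and y."""
    rng = random.Random(seed)          # fixed seed so the result is reproducible
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break any real pairing between x and y
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    # add-one correction keeps the estimate strictly above zero
    return (extreme + 1) / (n_perm + 1)

# Hypothetical, strongly related data.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]

p_value = permutation_p_value(x, y)
print(f"r = {pearson_r(x, y):.3f}, permutation p = {p_value:.4f}")
```

A small p-value means a correlation this strong would rarely arise from sampling error alone if H0 were true, so H0 is rejected at the chosen significance level.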