Lecture 12 Flashcards
Paired samples
Independent observations of matched data.
Measure a variable twice on the same subject, e.g. before and after treatment.
Matched pairs: two subjects matched on characteristics such as age and sex.
*****More likely to detect a significant difference than with two randomly selected “unpaired” samples.
Unpaired samples
Measure a result once and compare between separate subjects.
Less likely to reveal a significant relationship
LARGER SAMPLE SIZE NEEDED WITH UNPAIRED THAN PAIRED.
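A rough sketch of why pairing helps, using only the Python standard library. The before/after numbers are made up for illustration, and the unpaired statistic uses Welch's form, which is an assumption since the cards do not name a specific test. The same ten measurements are analysed once as pairs and once as if they came from two separate groups.

```python
import math
import statistics

# Hypothetical measurements on the same five subjects (illustrative only).
before = [140, 152, 138, 147, 160]
after = [135, 147, 136, 141, 154]

def paired_t(x, y):
    """t statistic for paired samples: tests the mean of the within-subject differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

def unpaired_t(x, y):
    """t statistic for two independent samples (Welch's unequal-variance form)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se

t_paired = paired_t(before, after)
t_unpaired = unpaired_t(before, after)
print(f"paired t = {t_paired:.2f}, unpaired t = {t_unpaired:.2f}")
```

With these numbers the paired statistic is several times larger than the unpaired one, because pairing removes the subject-to-subject variation from the comparison. An unpaired design would need more subjects to reach the same statistic.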
Paired data with tails
Two-tailed test: More conservative approach. Covers either of the two possible outcomes (a difference in either direction).
One-tailed test: Less robust. Assumes only one outcome is likely, which is rarely a safe assumption.
Paired data with outliers
Extreme data points with no obvious measurement-induced error.
Use common sense when deciding how to handle them.
Relative strength of relationship between variables, from top of pyramid to bottom
Top: Causal (predictors; strongest; parametric). Then Correlation. Then Association. Bottom: Random chance (not repeatable).
Association is not necessarily
Causative
Most common test for correlation (association) and direction
Pearson’s product moment correlation coefficient (r)
Assumptions for Pearson’s coefficient (r)
- The population from which the sample is drawn is normally distributed. (If not, use a non-parametric test of correlation.)
- The two variables are structurally independent. (one not forced to vary with the other)
- Only a single pair of measurements should be made on each subject.
- Every r value (sample) should be accompanied by a p-value or a confidence interval within which the “true” R value (population) is likely to lie.
For a parametric test, use the r value (sample) accompanied by a p-value. For a non-parametric test, what coefficient do you use?
r sub s (Spearman’s rank correlation coefficient) instead of r
Perfect correlation value
1 (+1 for a perfect positive correlation, -1 for a perfect negative correlation)
When to use Pearson’s r vs Spearman’s rank
Pearson’s: r. Normally distributed data. Parametric.
Spearman’s: r sub s. Non-normally distributed data. Non-parametric.
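To make the difference concrete, here is a minimal standard-library sketch of both coefficients. The function names and the sample data are illustrative assumptions. Spearman's coefficient is computed here as Pearson's r applied to the ranks of the data, with tied values given their average rank.

```python
import math

def pearson_r(x, y):
    """Pearson's product moment correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
r = pearson_r(x, y)
rs = spearman_rs(x, y)
print(f"r = {r:.3f}, r_s = {rs:.3f}")
```

Because the ordering of y matches the ordering of x exactly, the rank-based coefficient is 1, while Pearson's r is pulled below 1 by the extreme last value.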
Correlation does not allow for ___ or ___
Prediction or causal relationships. Only shows that there is a relationship between the two.
How to interpret correlation
- 0.9 to 1: very high positive correlation
- -0.9 to -1: very high negative correlation
- 0.7 to 0.9: high positive/negative correlation
- 0.5 to 0.7: moderate positive/negative correlation
- 0.3 to 0.5: low positive/negative correlation
- 0.00 to 0.30: negligible correlation
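A small lookup sketch for this scale, with cut points at 0.3, 0.5, 0.7 and 0.9 on the absolute value of r. The function name is hypothetical, and assigning each boundary value to the higher band is an assumption, since the listed ranges share their endpoints.

```python
def interpret_r(r):
    """Return a verbal label for a correlation coefficient between -1 and 1."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)
    if size < 0.3:
        return "negligible"
    direction = "positive" if r > 0 else "negative"
    # Assumption: a value exactly on a cut point goes to the higher band.
    if size < 0.5:
        strength = "low"
    elif size < 0.7:
        strength = "moderate"
    elif size < 0.9:
        strength = "high"
    else:
        strength = "very high"
    return f"{strength} {direction}"

print(interpret_r(0.95))
print(interpret_r(-0.62))
print(interpret_r(0.12))
```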
Regression analysis
Statistical modeling to estimate relationships among variables.
Used for prediction in parametric tests only.
Linear regression
A mathematical equation that allows the target variable to be predicted from the independent variable. Requires continuous variables and a linear relationship.
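As an illustration, a straight line can be fitted by ordinary least squares and then used for prediction. This is a minimal sketch: the fitting method is an assumption (the cards do not name one), and the dose/response numbers and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: independent variable (dose) and target variable (response).
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(dose, response)
predicted = slope * 6 + intercept  # predicted response at a new dose of 6
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, prediction at 6 = {predicted:.2f}")
```

The prediction is only as trustworthy as the linear assumption. Applying the fitted line far outside the observed range of the independent variable is extrapolation and should be treated with caution.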
Slope intercept equation