Lecture 12 Flashcards
Paired samples
Pairs of matched observations: measurements within a pair are dependent, while different pairs are independent of each other.
Measure a variable twice on the same subject, e.g., before and after treatment.
Matched paired samples: separate subjects matched on characteristics such as age and sex.
More likely to find significance between subjects than with two randomly selected “unpaired” samples.
Unpaired samples
Measure a result once and compare between separate subjects.
Less likely to reveal a significant relationship
LARGER SAMPLE SIZE NEEDED WITH UNPAIRED THAN PAIRED.
Paired data with tails
Two-tailed test: More conservative approach. Ensures that either of two outcomes is covered.
One-tailed test: Less robust. Assumes only one outcome is likely. Rarely a safe assumption.
Paired data with outliers
Extreme data points without obvious measurement-induced errors.
Need common sense.
Relative strength of relationship between variables, top of pyramid to bottom
- Top: Causal (predictors). Strongest. Parametric.
- Correlation / Association
- Bottom: Random chance. Not repeatable.
Association is not necessarily
Causative
Most common test for correlation (association) and direction
Pearson’s product moment correlation coefficient (r)
Assumptions for Pearson’s coefficient (r)
- The population from which the sample is drawn is normally distributed. (If not, use a non-parametric test of correlation.)
- The two variables are structurally independent. (one not forced to vary with the other)
- Only a single pair of measurements should be made on subjects.
- Every r value (sample) should be accompanied by a p-value or a confidence interval within which the “true” r value (population) is likely to lie.
For a parametric test, use the r value (sample) accompanied by a p value. For a non-parametric test, what coefficient do you use
r sub s for non-parametric instead of r
perfect correlation value
1 (or -1 for perfect negative correlation)
When to use Pearson’s r vs spearman’s rank
Pearsons: r. Normally distributed. Parametric.
Spearman’s: r sub s for non-normal distribution. Non parametric.
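The difference between the two coefficients can be sketched in a few lines of Python. This is a minimal hand-rolled illustration on made-up data (tied ranks are not handled); Spearman's r sub s is simply Pearson's r computed on the ranks, which is why it rewards any monotone relationship while Pearson's r rewards only a linear one.

```python
from math import sqrt

def pearson_r(x, y):
    # Pearson's product moment correlation coefficient r.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(v):
    # Rank positions 1..n (no tie correction, for brevity).
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman_rs(x, y):
    # Spearman's r_s = Pearson's r applied to the ranks.
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]             # monotone but non-linear (y = x^2)
print(round(pearson_r(x, y), 3))  # 0.981 — linear fit is imperfect
print(spearman_rs(x, y))          # 1.0 — the ranks agree perfectly
```

In practice you would reach for a library routine (e.g., SciPy's `pearsonr`/`spearmanr`), which also returns the p-value the flashcards call for.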
Correlation does not allow for ___ or ___
Prediction or causal relationships. Only shows that there is a relationship between the two.
How to interpret correlation
- 0.90 to 1.00: very high positive correlation
- -0.90 to -1.00: very high negative correlation
- 0.70 to 0.90: high positive/negative correlation
- 0.50 to 0.70: moderate positive/negative correlation
- 0.30 to 0.50: low positive/negative correlation
- 0.00 to 0.30: negligible correlation
Regression analysis
Statistical modeling to estimate relationships among variables.
Used for prediction in parametric tests only.
Linear regression
A mathematical equation allows the target variable to be predicted from the independent variable. Continuous variables, linear relationship.
Slope intercept equation
When to use the slope intercept equation
When determining linear regression
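A minimal least-squares sketch of the slope-intercept line y = b0 + b1·x, fitted to made-up data (not from the lecture), showing how the fitted line is then used for prediction:

```python
def fit_line(x, y):
    # Ordinary least-squares slope (b1) and intercept (b0).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
         sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return b0, b1

x = [1, 2, 3, 4]
y = [3, 5, 7, 9]            # exactly y = 1 + 2x
b0, b1 = fit_line(x, y)
print(b0, b1)               # 1.0 2.0
predicted = b0 + b1 * 5     # predict the target for x = 5 -> 11.0
```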
Multiple regression
Relationship between a target variable and two or more independent variables (co-variables); not necessarily linear. May be quadratic or higher in nature.
Probability and confidence is defined by
standard deviation. Defines probability limits
SD probability limits
Approx 2 (1.96) SD above and below the mean defines points within which 95% of observations lie.
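The mean ± 1.96 SD limits from the card can be sketched on an illustrative sample (the population SD formula is used here for brevity; these numbers are made up):

```python
from math import sqrt

data = [10, 12, 9, 11, 10, 13, 8, 11, 10, 12]
n = len(data)
mean = sum(data) / n
sd = sqrt(sum((d - mean) ** 2 for d in data) / n)

# Approximately 95% of observations from a normal distribution
# with this mean and SD fall between these two limits.
lower, upper = mean - 1.96 * sd, mean + 1.96 * sd
print(round(lower, 2), round(upper, 2))
```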
Statistically significant P value
p < 0.05
Statistically highly significant P value
p < 0.01
Obtaining a significant or highly significant outcome means you should ___ the null
Reject.
3 reasons you might fail to reject the null
- No difference between groups
- Too few subjects to demonstrate a difference existed. Small sample size.
- Logical fallacy: Arbitrary assumption for cut off values. reality is values fall on a continuum.
Confidence intervals allow for a ___ of response values in the form of a ____, given repetition of the study.
Confidence intervals allow for a continuum of response values in the form of a range of responses expected, given repetition of the study.
The chance of a real difference given a CI lies between the
upper and lower limits. Difference is not statistically significant if either limit overlaps the value in question.
CI can be applied to almost all statistical tests to help us understand if the evidence is:
Strong, weak, or definitive.
Do you want a narrow or wide CI?
Narrow- greater precision.
Wider usually due to small sample size
CI. Differences in populations. How do you know if there is no difference in populations?
The CI crosses zero or contains zero
CI. Differences in ratios. How do you know if there is no difference in ratios?
The interval contains one or crosses one.
CI on left side of null value
Results show statistically significant decline
CI on right side of null value
Results show statistically significant improvement.
Relative risk RR
Risk in treatment group/risk in control group
Relative risk reduction (RRR)
percentage by which the risk of adverse event is reduced in the experimental group compared with the control group.
(risk in controls - risk in experimental)/risk in controls
Ex: reduced the death rate by 20%
Absolute risk reduction (ARR)
Absolute amount by which a negative outcome is reduced comparing experimental with control group.
Ex: over time. Produced absolute reduction in deaths of 3%. Increased survival rate from 84% to 87%
NNT number needed to treat
Number of subjects who would need to be treated to prevent one adverse outcome.
Reciprocal of the ARR (absolute risk reduction)
Presented with a CI
Ex: 34 people needed to enroll to avoid one death
1/ARR=
NNT number needed to treat
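The four risk measures above can be tied together in one short sketch, using figures consistent with the survival example on the earlier card (death risk falling from 16% to 13%, i.e., survival rising from 84% to 87%; the exact numbers are assumed for illustration):

```python
from math import ceil

risk_control = 0.16   # risk of death in the control group
risk_treated = 0.13   # risk of death in the treatment group

rr = risk_treated / risk_control                    # relative risk, ~0.81
rrr = (risk_control - risk_treated) / risk_control  # relative risk reduction, ~19%
arr = risk_control - risk_treated                   # absolute risk reduction, 3 points
nnt = ceil(1 / arr)                                 # number needed to treat
print(nnt)  # 34
```

Note how the same trial gives a ~19% relative reduction but only a 3-percentage-point absolute reduction; the NNT (1/ARR, rounded up) turns the ARR into the more intuitive "treat 34 people to prevent one death."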
Hawthorne effect (observer effect)
Participants know they are being observed.