31-10-23 - Interpreting evidence 1 Flashcards
Learning outcomes
- Be familiar with the normal distribution and its percentiles as well as the concept of skew.
- Understand how to calculate odds and risk ratios, relative and absolute risk reductions, number needed to treat (NNT)
- Understand the meaning of a 95% confidence interval around an estimate and how to use it to interpret research findings.
- Be familiar with the concept of hypothesis testing and statistical significance including the Null Hypothesis and p-values
- Be familiar with basic statistical tests such as t-tests and chi-square tests, when their use is appropriate and how to interpret the results of such tests
- Be aware of the concept of multiple testing and how to use the Bonferroni correction to reduce ‘false positive’ results (i.e. Type I error)
- Be aware of non-parametric tests for comparing means
- Be aware of extensions to the t-test for comparing more than two groups: 1- and 2-way ANOVA.
- Be familiar with the concepts of correlation and regression
Why do we need Statistics to interpret evidence?
Recap types of data (in picture)
Describe how to calculate risk in a control and treatment group
What is the formula for risk?
Comparing risk between groups.
Describe the following formulas:
* Absolute Risk Reduction (ARR)
* Relative risk (risk ratio)
* Number needed to treat (NNT) – number needed to treat for favourable outcome
What is relative risk independent of?
What must we do when using relative risks?
- Relative risk is independent of the original prevalence
- Can be misleading – always state baseline (absolute) risks as well as relative risks
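A minimal sketch of the risk formulas above, assuming the standard definitions (risk = events / group size, ARR = control risk minus treatment risk, RR = treatment risk / control risk, NNT = 1 / ARR). The function name `risk_summary` and all counts are hypothetical, chosen only to show that the same relative risk can hide very different absolute risks.

```python
def risk_summary(control_events, control_total, treated_events, treated_total):
    """Return risks, ARR, RR and NNT for a two-group comparison."""
    risk_control = control_events / control_total   # risk = events / group size
    risk_treated = treated_events / treated_total
    arr = risk_control - risk_treated               # absolute risk reduction
    rr = risk_treated / risk_control                # relative risk (risk ratio)
    nnt = 1 / arr                                   # number needed to treat
    return {"risk_control": risk_control, "risk_treated": risk_treated,
            "ARR": arr, "RR": rr, "NNT": nnt}

# Hypothetical counts, chosen only for illustration.
common = risk_summary(20, 100, 10, 100)   # event is common in the control group
rare = risk_summary(2, 100, 1, 100)       # event is rare in the control group

for label, result in (("common", common), ("rare", rare)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

Both scenarios halve the risk, but the absolute reduction and the NNT differ tenfold, which is why the baseline risk should be reported alongside the relative risk.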
When are odds ratios used?
Odds example part 1 (in picture)
Odds example part 2
Odds example part 3
Describe the formula for odds ratio.
What can this provide association between?
What is baseline risk (in picture)
If odds are equal in case and control group, what does the odds ratio (OR) equal?
What is OR similar to?
When is OR a good approximation to the RR?
What is OR independent of? What is OR used for?
- If odds are equal in case and control group OR=1
- Similar to risks but must remember they are not the same
- If events are rare then OR is a good approximation to the RR
- Like RR they are independent of baseline risk (prevalence)
- Used in some types of regression (logistic) and therefore found in the literature frequently
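A short sketch of the points above, assuming odds = events / non-events and OR = odds in one group / odds in the other. The helper names and counts are hypothetical. It compares OR with RR when events are rare and when they are common.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / total_a) / (events_b / total_b)

def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds in group A divided by odds in group B (odds = events / non-events)."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b

# Hypothetical counts: (events, group size) for group A, then group B.
scenarios = {
    "equal odds": (10, 100, 10, 100),
    "rare events": (1, 100, 2, 100),
    "common events": (40, 100, 80, 100),
}

for label, counts in scenarios.items():
    print(f"{label}: RR={risk_ratio(*counts):.3f} OR={odds_ratio(*counts):.3f}")
```

With rare events the two measures are close; with common events the OR sits much further from 1 than the RR, so it should not be read as a risk ratio.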
What is a population?
What is a sample?
When are samples used?
What must samples do?
- Population
- Theoretical concept to describe the group of individuals of interest to the research question, e.g. 13-year-old girls, diabetics in the UK, men aged 15-25 who attempt suicide
- Sample
- In practice we can’t take measurements on every individual. We take a sample – preferably a random sample – that is representative of the population in which we are interested
- Usually much smaller than the population in which we are interested
- Must summarise the sample using basic statistics
Describing ‘Central Tendency’. Describe the mean, median, and proportion
Describe the median and interquartile range
How do means and medians compare to each other?
- Mean – uses all data but can be influenced by outliers
- Median – not influenced by outliers, but doesn’t use all data (less informative)
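A small illustration of the comparison above, using Python's standard `statistics` module. The values are hypothetical; the second list is the first with one value replaced by an outlier.

```python
import statistics

# Hypothetical measurements; the second list swaps one value for an outlier.
without_outlier = [4, 5, 5, 6, 7]
with_outlier = [4, 5, 5, 6, 70]

for label, data in (("no outlier", without_outlier), ("outlier", with_outlier)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # quantiles(n=4) returns the three quartile cut points; the exact values
    # depend on the interpolation method (the default is "exclusive").
    q1, _, q3 = statistics.quantiles(data, n=4)
    print(f"{label}: mean={mean} median={median} IQR={q3 - q1}")
```

The mean moves a long way when the outlier is introduced, while the median stays put.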
What is standard deviation a measure of?
Mean + 1SD
Describe normal and skewed distributions
What % of observations are within 1SD and 2SD?
- Around 68% of observations within 1SD of mean
- Approx. 95% of observations within 2 SD of mean (actually 1.96)
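These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 1.96, 2):
    print(f"within {k} SD of the mean: {share_within(k):.4f}")
```

This holds only for normally distributed data; for a skewed distribution the proportions will differ.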
Estimating from samples.
What is the mean and prevalence in a sample used for?
How does sample size affect confidence?
- Estimating from samples
- In practice we usually have a sample of individuals
- Use the mean of the sample to ESTIMATE the ‘true’ mean of the population
- Use the prevalence in a sample to ESTIMATE the ‘true’ proportion in the population
- Prevalence is the proportion of a particular population found to be affected by a medical condition at a specific time
- E.g. a sample of 100 patients with asthma used to estimate the rate of inhaler use in Scotland
- We will have more confidence that the sample mean/prevalence is a good estimate of the population mean/prevalence if the sample is large
- Larger samples – more confidence
From sample to population.
How good is a sample mean as an estimate of the population mean?
What is the standard error of mean (SE)?
What does a large and small SE indicate?
- From sample to population
- How good is a sample mean as an estimate of the population mean?
- If we took repeated samples, the variability of the sample means could be measured
- This is called the standard error of the mean (SE)
- A large SE indicates that there is much variability in sample means; that many lie a long way from the population mean
- A small SE indicates there is not much variability between the sample means
Why is SE always smaller than SD?
What can we also calculate the standard error of?
How does sample size affect SE? What is the formula for SE (in picture)?
- SE is always smaller than SD (for any sample of more than one observation) because there is less variability between sample means than between individual values.
- Can also calculate the standard error of a proportion, rate, odds ratio etc
- Larger samples lead to smaller SE
- Formula for SE (in picture)
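The picture is not reproduced here, so the sketch below assumes the usual formula for the standard error of a mean, SE = SD / sqrt(n). The SD of 10 and the sample sizes are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean, assuming SE = SD / sqrt(n)."""
    return sd / math.sqrt(n)

sd = 10  # hypothetical standard deviation of the individual values
for n in (25, 100, 400):
    print(f"n={n}: SE={standard_error(sd, n):.2f}")
```

Each fourfold increase in sample size halves the SE, which is why larger samples give more precise estimates.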
What is a confidence interval?
What is the formula for a 95% confidence interval?
- 95% confidence interval = sample mean ± 1.96 × SE
- We are only 95% confident
- 5% of the time the confidence interval WILL NOT include the true mean (based on a single sample)
- 95% is an arbitrary choice
Calculating upper and lower limit values (in picture)
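A worked sketch of the upper and lower limits, assuming the formula 95% CI = sample mean ± 1.96 × SE with SE = SD / sqrt(n). The sample mean, SD and sample size are hypothetical (for example, a blood pressure reading in mmHg).

```python
import math

def confidence_interval_95(sample_mean, sd, n):
    """Return (lower, upper) limits of an approximate 95% CI for a mean."""
    se = sd / math.sqrt(n)    # standard error of the mean
    margin = 1.96 * se        # half-width of the interval
    return sample_mean - margin, sample_mean + margin

# Hypothetical summary statistics from a single sample.
lower, upper = confidence_interval_95(sample_mean=120, sd=15, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The 1.96 multiplier relies on the normal approximation, which is reasonable for large samples; small samples would normally use a t-distribution multiplier instead.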
What can there be variability between?
- There can be variability between people and within people
Statistical testing and interpretation of results example part 1
Example part 2; Null hypothesis and research hypothesis
Example part 3: Statistical testing and interpretation of results
Example part 4: Statistical testing and interpretation of results
Example part 5: Odds ratio and interpretations
Example part 6: Interpretation of results
Statistical tests for comparing groups.
What 2 comparisons do we conduct?
What is the question we ask?
What is the answer we ideally want?
What test do we need?
What do we need to consider when choosing a test?
- Statistical tests for comparing groups
1) Comparisons
* Comparing our results with a gold standard
* Comparing one sample with another after an intervention
2) Question
* When is a difference STATISTICALLY SIGNIFICANT?
* i.e. when do we reject the Null hypothesis?
3) Answer
* Ideally want a simple Yes/No answer
- Want a test statistic that will allow us to make a decision
- Do we have enough evidence to REJECT the null hypothesis
- Many tests available, skill is in knowing which is appropriate for your outcome
- Important to understand type of data e.g. Categorical or continuous, ordinal etc
- Important to think about the distribution of the outcome – normal or non-normal
What are 2 statistical tests for comparing groups?
- 2 statistical tests for comparing groups:
1) T-test – allows us to statistically compare means between two groups
* 1 dependent continuous variable (e.g. height)
* 1 independent binary categorical variable (e.g. sex)
2) Chi-square-test – allows us to statistically compare frequencies
* 1 dependent categorical variable (e.g. alternative drug types)
* 1 independent categorical variable (e.g. Deprivation category)
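As an illustration of comparing frequencies, the sketch below computes a Pearson chi-square statistic by hand for a hypothetical 2×2 table and compares it with 3.841, the 5% critical value for 1 degree of freedom. No continuity correction is applied, and the counts and names are assumptions made for the example.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows = two deprivation categories, columns = two drug types.
table = [[30, 20],
         [20, 30]]

CRITICAL_VALUE_DF1 = 3.841   # chi-square critical value, 1 df, 5% level
chi2 = chi_square_statistic(table)
reject_null = chi2 > CRITICAL_VALUE_DF1
print(f"chi-square={chi2:.2f}, reject null at 5% level: {reject_null}")
```

In practice a statistics package would also report the p-value; the hand calculation is only meant to show where the test statistic comes from.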
What are T-tests used to determine?
What are they also called?
What does it give a probability for?
- T-test is used to determine whether two means are significantly different from each other
- Also referred to as Student’s t-test.
- Gives a probability (p-value) that such a difference in means (or a greater difference) would be found by chance, IF THE NULL HYPOTHESIS IS TRUE
- E.g. compare the height of men and women, compare the mean from your data with published literature, compare blood pressure readings before and after exercise
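A minimal sketch of a two-sample Student's t-test for independent groups, assuming equal variances in the two groups (the pooled-variance form). The heights are hypothetical, and 2.306 is the two-sided 5% critical value of the t-distribution with 8 degrees of freedom. Paired data, such as readings before and after exercise in the same people, would need a paired t-test instead.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, df

# Hypothetical heights in cm for two small groups.
group_a = [161, 162, 163, 164, 165]
group_b = [164, 165, 166, 167, 168]

CRITICAL_T_DF8 = 2.306   # two-sided 5% critical value for 8 degrees of freedom
t, df = pooled_t_statistic(group_a, group_b)
reject_null = abs(t) > CRITICAL_T_DF8
print(f"t={t:.2f}, df={df}, reject null at 5% level: {reject_null}")
```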
What is a one-sample T-test?
- A one-sample t-test is a comparison of a single sample mean with a hypothesized value
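A sketch of the one-sample case, assuming t = (sample mean − hypothesized value) / SE with SE = SD / sqrt(n). The sample and the hypothesized value of 170 are hypothetical, and 2.262 is the two-sided 5% critical value of the t-distribution with 9 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    t = (statistics.mean(sample) - hypothesized_mean) / se
    return t, n - 1

# Hypothetical heights in cm, compared with a hypothesized mean of 170 cm.
sample = [172, 168, 175, 180, 169, 174, 171, 177, 173, 176]

CRITICAL_T_DF9 = 2.262   # two-sided 5% critical value for 9 degrees of freedom
t, df = one_sample_t(sample, hypothesized_mean=170)
reject_null = abs(t) > CRITICAL_T_DF9
print(f"t={t:.2f}, df={df}, reject null at 5% level: {reject_null}")
```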
T-test and P value example part 1
Part 2: Hypothesis testing
Part 3: P-value
Part 4: Hypothesis testing