Lecture 5 - Key Statistical Concepts Flashcards
What is a population?
All possible observations of an experimental/study variable
What is a sample?
A selection of observations taken from the population
What is the role of statistics?
To select a sample and use the information from it to generalise to the population
What are the 2 types of error that can influence results in a study?
Chance (random error)
Bias (systematic error)
What causes random error and how can it be reduced?
Error due to sampling variation
Reduced as sample size increases
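A minimal simulation sketch of this card, assuming a fair coin as the study variable. The function name, the sample sizes and the seed are illustrative choices, not part of the lecture. It measures how much the sample proportion of tails varies from sample to sample.

```python
import random
import statistics

def spread_of_sample_proportions(sample_size, n_samples=500, seed=0):
    """Standard deviation of the proportion of tails across repeated samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    proportions = []
    for _ in range(n_samples):
        # One sample: flip a fair coin `sample_size` times and count tails.
        tails = sum(rng.random() < 0.5 for _ in range(sample_size))
        proportions.append(tails / sample_size)
    return statistics.pstdev(proportions)

small = spread_of_sample_proportions(10)
large = spread_of_sample_proportions(400)
print(small, large)
```

The spread for samples of 400 should come out much smaller than for samples of 10, which is the sense in which random error shrinks as sample size grows.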
What is bias?
The difference between the true value and the expected value of the estimate
What is the difference between bias and random error?
Bias is a form of systematic error (error in study design)
Increasing sample size does not reduce bias
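A small sketch of why a larger sample does not remove bias. The scenario is hypothetical: a scale that reads 0.5 kg too high, with made-up values for the true weight and the measurement noise.

```python
import random
import statistics

TRUE_VALUE = 70.0  # hypothetical true weight in kg
OFFSET = 0.5       # hypothetical systematic error: the scale reads 0.5 kg too high

def mean_measurement(sample_size, seed=1):
    """Average of `sample_size` noisy readings taken with the biased scale."""
    rng = random.Random(seed)
    readings = [TRUE_VALUE + OFFSET + rng.gauss(0, 2.0) for _ in range(sample_size)]
    return statistics.mean(readings)

# The error of the average settles near OFFSET instead of shrinking to zero.
for n in (10, 1000, 100000):
    print(n, round(mean_measurement(n) - TRUE_VALUE, 3))
```

The random scatter averages out as n grows, but the 0.5 kg offset remains, because it comes from the measuring process and not from sampling variation.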
What is truth vs observed (random variation)?
When what is observed differs from the true probability purely by chance
True probability of getting a tail when flipping a coin is 0.5
But if you flip a coin 10 times you may get 7 tails by random chance
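To put a number on the coin example, the chance of exactly 7 tails in 10 flips of a fair coin can be worked out from the binomial formula. This is a sketch; the function name is an illustrative choice.

```python
from math import comb

def prob_exact_tails(k, n, p=0.5):
    """Binomial probability of exactly k tails in n flips when P(tail) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(prob_exact_tails(7, 10))  # 120/1024, about 0.117
```

So roughly 1 run in 9 gives exactly 7 tails even though the true probability of a tail is 0.5.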
What is a hypothesis?
It is a statement that an underlying truth of scientific interest takes a particular quantitative value
E.g: Prevalence of TB in a given population is 2 per 10,000 people
What is hypothesis testing?
When you calculate the probability of getting an observation as extreme as or more extreme than the one observed, assuming that the hypothesis is true
If this probability is very small, the observation and the stated hypothesis are considered incompatible
Calculated probability = p-value
What is the p-value?
The probability of getting an observation as extreme as or more extreme than the one observed if the hypothesis is true
So if the p-value = 0.05 (two-sided), an extreme value is one that lies in the bottom or top 2.5% of the distribution
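A sketch of a two-sided p-value for the coin example, assuming the hypothesis is that the coin is fair. "As extreme or more extreme" is taken here to mean at least as far from the expected count of n/2 in either direction; that reading and the function name are assumptions made for illustration.

```python
from math import comb

def two_sided_p_value(k, n):
    """P-value for k tails in n flips under the hypothesis of a fair coin."""
    observed_distance = abs(k - n / 2)
    total = 0.0
    for x in range(n + 1):
        # Add up every outcome at least as far from n/2 as the observed one.
        if abs(x - n / 2) >= observed_distance:
            total += comb(n, x) * 0.5**n
    return total

print(two_sided_p_value(7, 10))   # 352/1024, about 0.344
print(two_sided_p_value(10, 10))  # 2/1024, about 0.002
```

With 7 tails in 10 the p-value is well above 0.05, so the result is compatible with a fair coin. With 10 tails in 10 it is far below 0.05.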
At what p-value do we typically reject the null hypothesis at the 5% significance level?
P less than 0.05
There is strong evidence to reject the null hypothesis
What do you do with the null hypothesis if p is greater than 0.05?
Not enough evidence to reject the hypothesis
Doesn’t mean that the hypothesis has been proven
What is a 95% confidence interval?
It means that if 100 samples were taken, about 95 of the resulting confidence intervals would be expected to contain the true value
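This repeated-sampling idea can be checked with a simulation sketch. It assumes normally distributed data with a known standard deviation, so that the interval is simply the sample mean plus or minus 1.96 standard errors. The true mean, standard deviation, sample size and seed are all made-up values.

```python
import random

def coverage(n_samples=2000, sample_size=50, true_mean=10.0, sigma=2.0, seed=2):
    """Fraction of 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sample_size**0.5  # 1.96 standard errors
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, sigma) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        # Does this sample's interval contain the true mean?
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / n_samples

print(coverage())
```

The printed fraction should be close to 0.95, though not exactly 0.95, because the simulation is itself subject to random variation.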
About confidence intervals.
A 95% confidence interval is the range within which we can be 95% confident that the true underlying value lies
The range is usually centred on the observed value because that is our best guess at the true underlying value
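A sketch of one common way to build such an interval for a proportion, using the normal approximation (observed proportion plus or minus 1.96 standard errors). The data, 60 tails in 100 flips, are hypothetical, and the approximation is an assumption that works poorly for very small samples or proportions near 0 or 1.

```python
from math import sqrt

def proportion_ci_95(successes, n):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    p_hat = successes / n                # observed value, the centre of the interval
    se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion
    return p_hat - 1.96 * se, p_hat + 1.96 * se

low, high = proportion_ci_95(60, 100)
print(round(low, 3), round(high, 3))  # 0.504 0.696
```

The interval extends the same distance on either side of the observed proportion of 0.6.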