Statistical Inference Flashcards
Define Normal distribution
A continuous, bell-shaped probability distribution that is symmetric around the mean and fully described by its mean and standard deviation
What is the significance of 2 standard deviations in Normally distributed data?
If data are Normally distributed, approximately 95% of the data points lie within 2 (more precisely, 1.96) standard deviations of the mean
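The 1.96 figure can be checked numerically. The sketch below uses Python's standard-library NormalDist; the helper name coverage_within is made up for illustration.

```python
from statistics import NormalDist


def coverage_within(k: float) -> float:
    """Probability that a Normal variable falls within k SDs of its mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Coverage for the exact multiplier and for the rounded "2 SD" rule of thumb.
within_196 = coverage_within(1.96)
within_2 = coverage_within(2.0)

# Going the other way: the multiplier that leaves 2.5% in each tail.
z_95 = NormalDist().inv_cdf(0.975)

print(f"within 1.96 SD: {within_196:.4f}")
print(f"within 2 SD:    {within_2:.4f}")
print(f"z for 95%:      {z_95:.4f}")
```

Using 2 instead of 1.96 captures slightly more than 95% of the distribution, which is why 2 SDs works as a rule of thumb.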
What are the main assumptions when conducting statistical inference?
That the sample is representative of the population
That the sample follows a Normal distribution
How can the assumption of Normal distribution be assessed?
The mean and the median should be approximately equal
The boxplot should be symmetric
Approximately 95% of the data should lie within 2 SDs of the mean
Normal probability plot should be linear
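Three of the checks above can be sketched numerically (the boxplot check is visual and is left out). This is a rough illustration on simulated data, not a formal test: the function names, the simulated data sets, and the use of a correlation coefficient as a stand-in for "the Normal probability plot is linear" are all assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, median, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def normality_checks(data):
    """Summarise three informal checks of Normality for a list of numbers."""
    n = len(data)
    m, s = mean(data), stdev(data)
    # Normal probability plot: sorted data against theoretical Normal quantiles.
    ordered = sorted(data)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return {
        # Mean and median should be close (gap expressed in SD units).
        "mean_median_gap_in_sds": abs(m - median(data)) / s,
        # Roughly 0.95 is expected for Normal data.
        "fraction_within_2sd": sum(abs(x - m) <= 2 * s for x in data) / n,
        # Close to 1 when the probability plot is nearly a straight line.
        "probability_plot_correlation": pearson(theoretical, ordered),
    }


rng = random.Random(0)  # fixed seed so the illustration is repeatable
normal_data = [rng.gauss(100, 15) for _ in range(2000)]    # simulated Normal data
skewed_data = [rng.expovariate(1.0) for _ in range(2000)]  # simulated skewed data

normal_report = normality_checks(normal_data)
skewed_report = normality_checks(skewed_data)
print("Normal data:", normal_report)
print("Skewed data:", skewed_report)
```

For the skewed data the mean sits well above the median and the probability plot correlation drops, which is the pattern these checks are meant to reveal.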
What is the standard error of the mean?
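The standard error is the standard deviation of the sample means, i.e. how much the sample mean varies from one sample to the next; it is estimated as the sample SD divided by the square root of the sample size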
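A small simulation can make this definition concrete: draw many samples from the same population, take the mean of each, and look at the spread of those means. The population values (mean 50, SD 10) and the sample size of 25 are arbitrary choices for the illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)  # fixed seed so the illustration is repeatable

# Hypothetical population and sample size, chosen only for the example.
POP_MEAN, POP_SD, N = 50.0, 10.0, 25

# Draw 5000 independent samples of size N and record each sample mean.
sample_means = [
    mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(5000)
]

# The SD of the sample means is the standard error, observed empirically.
empirical_se = stdev(sample_means)

# The usual formula: population SD divided by the square root of n.
theoretical_se = POP_SD / N ** 0.5

print(f"SD of sample means: {empirical_se:.3f}")
print(f"SD / sqrt(n):       {theoretical_se:.3f}")
```

The two numbers should agree closely, which is why the formula can stand in for repeated sampling in practice.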
What is meant by a 95% confidence interval?
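A CI is an estimate of how precisely we have measured a mean or other quantity
For Normally distributed data, we expect 95% of sample means to lie within 1.96 standard errors of the true population mean, so the interval from 1.96 standard errors below the sample mean to 1.96 standard errors above it should contain the true mean for 95% of samples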
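The "95% of samples" reading can be illustrated by simulation. The sketch below builds an interval of sample mean plus or minus 1.96 standard errors for many simulated samples and counts how often it contains the true mean. The function name ci95 and the population values are assumptions for the example, and using the sample SD with the 1.96 multiplier is a large-sample approximation, so coverage lands slightly below 95% for modest sample sizes.

```python
import random
from statistics import mean, stdev

Z_95 = 1.96  # Normal multiplier for a 95% interval


def ci95(sample):
    """Approximate 95% CI for the mean: sample mean +/- 1.96 standard errors."""
    m = mean(sample)
    se = stdev(sample) / len(sample) ** 0.5
    return m - Z_95 * se, m + Z_95 * se


rng = random.Random(2)  # fixed seed so the illustration is repeatable

# Hypothetical population and study size, chosen only for the example.
TRUE_MEAN, TRUE_SD, N, REPEATS = 120.0, 15.0, 50, 2000

hits = 0
for _ in range(REPEATS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    low, high = ci95(sample)
    hits += low <= TRUE_MEAN <= high  # True counts as 1

coverage = hits / REPEATS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```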
What is the relevance of the width of a 95% CI?
The width of a CI is a measure of how precisely we have measured a variable - the narrower the interval, the more precise the estimate.
How can we minimise standard error?
By increasing the sample size, since the standard error is the SD divided by the square root of the sample size
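The effect of sample size on both the standard error and the width of a 95% CI follows directly from the square-root relationship. A minimal sketch, assuming an SD of 10 and hypothetical helper names:

```python
def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean for a given SD and sample size."""
    return sd / n ** 0.5


def ci95_width(sd: float, n: int) -> float:
    """Full width of a 95% CI: 1.96 standard errors on each side of the mean."""
    return 2 * 1.96 * standard_error(sd, n)


# The SD of 10 is an arbitrary value chosen for the illustration.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n), ci95_width(10.0, n))
```

Because of the square root, the sample size has to be quadrupled to halve the standard error, and with it the width of the interval.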