Standard errors and confidence intervals Flashcards
Histogram of means vs histogram of individual heights?
The histogram of sample means will be much less dispersed. As the sample size increases (with the same NUMBER of samples), it becomes even less dispersed.
Quantifying how the distribution of sample means becomes more concentrated as the sample size increases?
The standard deviation of the means of samples of size 10 might be 1.54 cm, say, and 0.20 cm for size 1000. This figure measures how precisely the sample mean estimates the population mean and is called the standard error of the mean (SEM) or standard error (SE).
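A minimal simulation sketch of this idea, using only the Python standard library. The population mean (170 cm) and SD (6.5 cm) are assumptions chosen for illustration, not figures from these cards, so the printed values will not match the 1.54 and 0.20 quoted above.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN = 170.0  # assumed population mean height in cm (illustrative)
POP_SD = 6.5      # assumed population SD in cm (illustrative)
N_SAMPLES = 400   # same NUMBER of samples for every sample size


def sd_of_sample_means(sample_size):
    """Draw N_SAMPLES samples of the given size; return the SD of their means."""
    means = []
    for _ in range(N_SAMPLES):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


sd_means = {n: sd_of_sample_means(n) for n in (10, 1000)}
for n, sd in sd_means.items():
    print(f"sample size {n}: SD of sample means = {sd:.2f} cm")
```

The SD of the means for size 1000 should come out roughly ten times smaller than for size 10.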
Equation for standard error?
SE = σ/√n. In practice σ is unknown, so use the sample SD s instead: SE ≈ s/√n. The SE is the SD of the distribution of sample means.
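The formula can be sketched as a small helper. The function name `standard_error` and the height values are hypothetical, made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the SE of the mean as s / sqrt(n), with s the sample SD."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical heights in cm, invented for this example.
heights_cm = [168.2, 171.5, 165.0, 174.3, 169.8, 172.1, 166.7, 170.4]
print(f"mean = {statistics.fmean(heights_cm):.2f} cm")
print(f"SE   = {standard_error(heights_cm):.2f} cm")
```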
Significance of √ in SE?
The SE for sample size 1000 will be 10-fold smaller than for sample size 10, because √(1000/10) = 10. This differs from the SD, which does not systematically get bigger or smaller with sample size but just becomes a better estimate (hence why it, not the range, is used for inference). It also means that to halve the SE you must quadruple the sample size.
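A short arithmetic sketch of the square-root rule. Both helper names and the value of σ are assumptions for illustration.

```python
import math


def theoretical_se(sigma, n):
    """SE of the mean for a known population SD sigma and sample size n."""
    return sigma / math.sqrt(n)


def required_n(sigma, target_se):
    """Smallest sample size n for which sigma / sqrt(n) <= target_se."""
    return math.ceil((sigma / target_se) ** 2)


sigma = 10.0  # assumed population SD (illustrative)
ratio = theoretical_se(sigma, 10) / theoretical_se(sigma, 1000)
print(f"SE(n=10) / SE(n=1000) = {ratio:.1f}")

n_for_se_1 = required_n(sigma, 1.0)     # sample size needed for SE = 1
n_for_se_half = required_n(sigma, 0.5)  # halving the SE
print(n_for_se_1, n_for_se_half)
```

Halving the target SE multiplies the required sample size by four, because n grows with the square of σ/SE.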
Quoting mean ± SE?
Bad practice, as it suggests that μ must lie within m ± SE, when an interval of that width contains μ only about 68% of the time. Much better to use m ± 2SE (about 95%), as the interval will be wider.
Premise of confidence intervals?
Note that about 95% of values from a normal distribution lie within μ ± 2σ. As the (theoretical) distribution of sample means has mean μ and SD equal to the SE, there is a 95% chance that the sample mean m falls within μ ± 2SE, which is equivalent to saying that the interval m ± 2SE will contain μ 95% of the time. The interval will be wider for smaller sample sizes.
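The premise can be checked with a rough coverage simulation: draw many samples, form m ± 2SE for each, and count how often the interval contains μ. The population values, sample size and number of repeats below are assumptions for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 170.0, 6.5   # assumed population mean and SD (illustrative)
N, REPEATS = 50, 2000    # sample size and number of simulated samples


def interval_2se(sample):
    """Return the interval (m - 2SE, m + 2SE), with SE = s / sqrt(n)."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - 2 * se, m + 2 * se


covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    low, high = interval_2se(sample)
    if low <= MU <= high:
        covered += 1

coverage = covered / REPEATS
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The fraction should come out close to 0.95, though not exactly, since 2 is a rounded multiplier and s is itself an estimate.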
Confidence intervals and interval estimates?
A 95% CI is also called an interval estimate of μ, as distinct from m, which is a point estimate. The interval estimate better reflects the uncertainty. The significance is clear: if a trial found that A reduced BP by 1 mmHg more than B, the result is hard to interpret on its own. If the CI turned out to be (-3, 5), no material difference is likely; if the CI were (-30, 32), either treatment could be considerably better.
SE and practicalities of sample size?
This means an experimenter can collect a large enough sample to get an SE as small as they choose. However, it is only worth shrinking the SE to the point where clinically significant differences can be detected, i.e. who cares if BP differs by 1 mmHg?
Sample means of normal and non-normal data?
As shown, means of samples from normal data have a normal distribution (the basis of SE-based intervals). However, means of samples from variables that are not normal often have a very close-to-normal distribution too, increasingly so as the sample size grows. This is described by the Central Limit Theorem.
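A hedged illustration of the Central Limit Theorem, using skewness as a crude measure of non-normality (a normal distribution has skewness 0). The exponential distribution, the sample size of 50 and the `skewness` helper are all choices made for this sketch.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable


def skewness(values):
    """Population skewness: mean of cubed z-scores (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])


# Strongly right-skewed raw data: exponential with rate 1.
raw = [random.expovariate(1.0) for _ in range(5000)]

# Means of 5000 samples, each of size 50, from the same distribution.
means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(50)])
    for _ in range(5000)
]

skew_raw = skewness(raw)
skew_means = skewness(means)
print(f"skewness of raw data:     {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The sample means should show far less skew than the raw values, even though the underlying variable is far from normal.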
Central premise of hypothesis testing?
The null hypothesis: that any difference between the two values (in this case treatment means) is due to chance, i.e. that the populations are identical and the samples therefore differ only by sampling error. Important to note that just because a test says that a difference could be due to chance does not mean that it is due to chance.
What actually is a P value?
The probability of observing a difference at least as large as the one seen, if the null hypothesis is true. When P is small, either an unlikely event has been observed or the null is false.
Type 1 error rate?
The probability of stating that there is a difference between the two groups when there is not, i.e. a false positive: incorrectly rejecting a true null hypothesis. It is not the P value itself but the significance level α (e.g. 0.05) against which the P value is compared.
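A simulation sketch of the false positive rate: both groups are drawn from the same population, so the null is true by construction, and every "significant" result is a Type 1 error. The |t| > 2 cut-off, the group size, and the unpooled form of the SE of the difference are assumptions made for this example.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable


def t_ratio(a, b):
    """(m1 - m2) / SE of the difference, using the unpooled SE (an assumption)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


TRIALS, N = 2000, 30  # number of simulated trials and size of each group
false_positives = 0
for _ in range(TRIALS):
    # Same population for both groups: any "difference" is sampling error.
    a = [random.gauss(0.0, 1.0) for _ in range(N)]
    b = [random.gauss(0.0, 1.0) for _ in range(N)]
    if abs(t_ratio(a, b)) > 2:
        false_positives += 1

false_positive_rate = false_positives / TRIALS
print(f"false positive rate with |t| > 2: {false_positive_rate:.3f}")
```

The rate should land near 0.05, the significance level implied by the cut-off of 2.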
Student’s T-test?
t = (m1 - m2)/SE, where SE is the standard error of the difference m1 - m2. Uses the premise that most values of m1 - m2 will fall within ±2SE of zero if the null is true, so a ratio above 2 or below -2 will be unusual.
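A minimal sketch of the ratio. The card does not say how the SE of the difference is computed; the unpooled form √(s1²/n1 + s2²/n2) is used here as an assumption, and the BP figures are hypothetical.

```python
import math
import statistics


def t_statistic(group1, group2):
    """Return (m1 - m2) / SE, with SE = sqrt(s1^2/n1 + s2^2/n2) (unpooled)."""
    m1, m2 = statistics.fmean(group1), statistics.fmean(group2)
    se = math.sqrt(
        statistics.variance(group1) / len(group1)
        + statistics.variance(group2) / len(group2)
    )
    return (m1 - m2) / se


# Hypothetical BP reductions in mmHg, invented for this example.
treatment_a = [12.0, 9.5, 14.2, 11.1, 10.4, 13.3]
treatment_b = [8.1, 7.4, 10.0, 6.9, 9.2, 8.8]
t = t_statistic(treatment_a, treatment_b)
print(f"t = {t:.2f}")
```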
Type 2 error?
False negative: saying there is no difference between the two groups when there is one, i.e. failing to reject a false null hypothesis.
Understanding the result of a T test?
Get the ratio t; place it on the distribution of t-statistics expected if the null is true; the shaded tail area is the proportion of all t-statistics that are more extreme than the observed value, which is the P value.
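The "shaded area" can be approximated without tables. The sketch below swaps in a permutation approach, which is not the method named on these cards: group labels are shuffled to build a null distribution of t-ratios, and the P value is the fraction at least as extreme as the observed one. The data, the function names and the unpooled SE are assumptions.

```python
import math
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable


def t_ratio(a, b):
    """(m1 - m2) / SE of the difference, using the unpooled SE (an assumption)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def permutation_p_value(a, b, n_perm=5000):
    """Two-sided P: share of label-shuffled t-ratios at least as extreme as observed."""
    observed = t_ratio(a, b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        random.shuffle(pooled)  # under the null, group labels are arbitrary
        t = t_ratio(pooled[:len(a)], pooled[len(a):])
        if abs(t) >= abs(observed):
            extreme += 1
    return extreme / n_perm


# Hypothetical systolic BP readings in mmHg, invented for this example.
group_a = [120, 125, 130, 128, 122, 127, 131, 124]
group_b = [135, 140, 138, 142, 137, 139, 141, 136]
p = permutation_p_value(group_a, group_b)
print(f"approximate two-sided P = {p:.4f}")
```

Because the two hypothetical groups do not overlap at all, very few shuffles produce a ratio as extreme as the observed one, so the estimated P is very small.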