SE, CI and T-test Flashcards
SE
σ/√n. Variability of the sample mean m around the population mean u: the standard error of the mean (analytic tool)
As the sample size increases the dispersion of the sample mean reduces
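A minimal sketch of the SE formula above, assuming the sample SD s is substituted for σ (which is what is done in practice when σ is unknown). The data values and the function name standard_error are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Estimated SE of the mean: s / sqrt(n), with the sample SD s standing in for sigma."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical measurements, for illustration only
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(data)
print(round(se, 3))
```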
CI
If a 95% CI were calculated for each of 100 samples, about 95 of the 100 intervals would contain the true population mean. m = point estimate of u, hence u is estimated using an interval
CI = (m - 2(s/√n), m + 2(s/√n))
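The interval above can be sketched as follows. This assumes the multiplier 2 (an approximation to 1.96) is adequate; for small samples a multiplier from the t distribution is more appropriate. The data and the function name approx_95_ci are illustrative only.

```python
import math
import statistics

def approx_95_ci(sample):
    """Approximate 95% CI for the mean: m +/- 2 * (s / sqrt(n))."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (m - 2 * se, m + 2 * se)

# Hypothetical measurements, for illustration only
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
lower, upper = approx_95_ci(data)
print(round(lower, 2), round(upper, 2))
```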
Why is SE used
SE = SD of the sampling distribution of the mean, a hypothetical distribution which is never observed directly
SE = inferential tool measuring the precision of an estimate of a population parameter. SD = descriptive tool measuring the dispersion of the data
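Why is mean +/- SE incorrect
About 95% of values lie within u +/- 2σ, so an interval of mean +/- 1 SE is too narrow to be a 95% interval (it covers only about 68%); use mean +/- 2 SE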
Precision of SE
It should be noted that the SE decreases as the sample size increases, because the denominator in the ratio σ/√n gets larger.
This is in contrast to the standard deviation, which does not have a tendency to get larger or smaller as n increases: the sample standard deviation, s, simply becomes a better estimate of σ as n increases.
However, the square root in the formula means that the SE does not decrease with sample size as quickly as might be hoped: in order to halve the SE the sample size must quadruple.
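A quick numerical check of the quadrupling rule, assuming a known population SD of 10 (an arbitrary value chosen for illustration):

```python
import math

def se_of_mean(sigma, n):
    """SE of the mean for a population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

sigma = 10.0  # assumed population SD, illustrative only
for n in (25, 100, 400):
    # Each fourfold increase in n halves the SE
    print(n, se_of_mean(sigma, n))
```

Each step multiplies n by four and divides the SE by two.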
Central limit theorem
The distribution of the sample mean gets closer and closer to a normal distribution as the sample size increases, even if the original population isn't normal itself
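A small simulation can illustrate this. The sketch below assumes an exponential population (strongly right-skewed) and uses the skewness of the simulated sample means as a rough measure of non-normality; the sample sizes, replicate count and seed are arbitrary choices.

```python
import random
import statistics

def skewness(values):
    """Population skewness: mean cubed standardised deviation (0 for a symmetric distribution)."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n drawn from an exponential(1) population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 4000, rng)) for n in (1, 5, 50)}
for n, skew in skew_by_n.items():
    print(n, round(skew, 2))
```

The skewness of the sample means should shrink towards 0 as n grows, even though the population itself stays skewed.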
Increasing width of CI
Higher confidence level, i.e. a 99% CI is wider than a 95% CI; smaller sample size = larger width
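Both effects can be checked numerically. This sketch assumes a normal-based interval (z multiplier rather than t) and an illustrative SD of 10; the function name ci_width is hypothetical.

```python
import math
from statistics import NormalDist

def ci_width(s, n, level):
    """Full width of a normal-based CI: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%, about 2.58 for 99%
    return 2 * z * s / math.sqrt(n)

s = 10.0  # assumed SD, illustrative only
print(round(ci_width(s, 25, 0.95), 2))   # baseline
print(round(ci_width(s, 25, 0.99), 2))   # higher confidence level: wider
print(round(ci_width(s, 100, 0.95), 2))  # larger sample: narrower
```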
Null hypothesis
Usually u1 = u2 or u = 0, i.e. no effect of treatment compared with the wild-type control
Populations assumed to be identical
Why hypothesis test
We know the observed difference between the two groups and can compute an SE from the data. Aim to assess whether the difference between the groups could plausibly be down to random chance
NB if hypothesis testing concludes a difference may be due to chance, it doesn't mean it is due to chance
Logic of hypothesis testing
Either the null hypothesis is false, or we have seen an event that occurs with probability p when the null hypothesis is true
If p is small, say 0.05, such an event would happen only 5% of the time by random chance alone
Small p value
p < 0.05 is conventionally taken as evidence against the null hypothesis. Smaller p value = stronger evidence against it
Type I error
Incorrect rejection of a true null hypothesis, hence concluding a relationship exists when in fact there is none. Controlled by the level of significance set for the test
False positive
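A simulation sketch of how the significance level controls the Type I error rate. To stay within the standard library it uses a z-test with known σ rather than a t-test (a deliberate simplification); the sample size, replicate count and seed are arbitrary.

```python
import math
import random

def false_positive_rate(n=20, reps=4000, sigma=1.0, seed=1):
    """Fraction of simulated experiments that reject a TRUE null hypothesis at the 5% level."""
    rng = random.Random(seed)
    se = sigma * math.sqrt(2 / n)  # SE of the difference in means when sigma is known
    rejections = 0
    for _ in range(reps):
        # Both groups come from the same population, so the null hypothesis is true
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (sum(a) / n - sum(b) / n) / se
        if abs(z) > 1.96:  # two-sided test at the 5% significance level
            rejections += 1
    return rejections / reps

rate = false_positive_rate()
print(rate)
```

The printed rate should sit close to 0.05, the chosen significance level.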
Type II error
Failure to reject a false null hypothesis, hence concluding no relationship exists when in fact one does. Controlled by the power of the experiment
False negative
T-test
Test statistic t = difference in means / SE of that difference
NB This ignores: precisely how likely are particular values of the above ratio, how is the SE calculated and how are large positive and large negative values handled.
Paired T-test
Linked (dependent) data: the pairing must be preserved. The signed within-pair differences are used: t = d (mean of the sample differences) / SE of d
Assumes a normal distribution of the differences. SE is based on the variation of the sample differences, hence often much lower than the SE used in the unpaired t-test
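A minimal sketch of the paired t statistic described above. The before/after values are invented for illustration, and the function only returns t; turning it into a p value needs the t distribution with n - 1 degrees of freedom, which is not in the Python standard library.

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic: mean of the within-pair differences divided by its SE."""
    diffs = [a - b for a, b in zip(after, before)]  # pairing is preserved by zip
    d_bar = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return d_bar / se

# Hypothetical measurements on the same five subjects, illustrative only
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 13.0, 11.0, 12.0, 16.0]
t = paired_t(before, after)
print(round(t, 2))
```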
Unpaired t-test
Independent data. Assumes both samples come from normal distributions with a common SD, σ
Unpaired t-test null hypothesis
u1=u2
NB
a) If the populations are non-Normal, but the departure from Normality is slight, then this violation of the assumptions is usually of little consequence.
b) The assumption of equal standard deviations may seem to constitute a rather severe restriction, but it is usually true in practice.
c) It should be remembered that a Normal distribution is characterised by its mean and standard deviation: a test that assessed the equality of means and paid no heed to the standard deviations would not be all that useful. It is often of interest to assess whether or not the samples are from the same population, and assuming that the means are the same does not specify that the populations are the same unless the standard deviations are also the same.
Pooled estimate of SE
Combined estimate of the common variance from two samples whose populations may have different means but the same SD. More precise than the SD (and hence SE) estimated from either sample alone
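A sketch of the pooled SD and the resulting unpaired t statistic, assuming equal population SDs as the unpaired t-test does. The two groups are invented for illustration and the function names are hypothetical; the p value would come from the t distribution with n1 + n2 - 2 degrees of freedom.

```python
import math
import statistics

def pooled_sd(x, y):
    """Pooled SD: each sample variance weighted by its degrees of freedom."""
    nx, ny = len(x), len(y)
    ss = (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    return math.sqrt(ss / (nx + ny - 2))

def unpaired_t(x, y):
    """Unpaired t statistic: difference in means divided by the pooled SE of that difference."""
    se = pooled_sd(x, y) * math.sqrt(1 / len(x) + 1 / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se

# Hypothetical independent groups, illustrative only
control = [10.0, 12.0, 9.0, 11.0, 13.0]
treated = [12.0, 13.0, 11.0, 12.0, 16.0]
s_pooled = pooled_sd(treated, control)
t = unpaired_t(treated, control)
print(round(s_pooled, 3), round(t, 3))
```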
Inferential statistics ie t-tests
Allow findings to be generalised beyond the sample tested
P=0.2
Does not prove the null hypothesis (NH) is true; it just provides no evidence against NH
One sample t-test
Null hypothesis: u equals a specified value, often u = 0
Assumptions of paired t-test
- The dependent variable must be continuous (interval/ratio).
- The pairs are independent of one another (the two observations within a pair are linked).
- The differences between paired observations should be approximately normally distributed.
- The differences should not contain any extreme outliers.
Z-score
Difference between a data point and the mean, expressed in units of SD: z = (x - u)/σ
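A minimal sketch of the z-score calculation. It assumes the data are treated as the whole population (so the population SD is used); with a sample, the sample mean and sample SD would be substituted. The data are illustrative only.

```python
import statistics

def z_scores(data):
    """z = (x - mean) / SD for every data point, using the population SD."""
    m = statistics.mean(data)
    sd = statistics.pstdev(data)  # population SD (n denominator)
    return [(x - m) / sd for x in data]

# Hypothetical values with mean 5 and population SD 2
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_scores(data)
print(z)
```

A z-score of 2.0 for the value 9.0 means it lies two SDs above the mean.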