Data Analysis Flashcards
Why should you graph original data and not averages?
You can see for yourself where the average is
You can see for yourself what the spread looks like
You can see if there is anything odd/interesting about the data, e.g. outliers, evidence of skewing etc.
How can you find true confidence intervals?
Using a t-test
Why do many sets of experimental observations conform approximately to a normal distribution?
The mathematical explanation is the central limit theorem
What kind of data may result in a normal distribution, and why?
If a variable is affected by a lot of different random factors
Each has a small effect
The effects are additive
Then the distribution will approximate to a normal distribution
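A minimal sketch of this idea in Python (assuming numpy is available; the number of factors and observations are arbitrary): summing many small, independent, additive random effects produces an approximately normal distribution.

```python
# A minimal sketch: simulating a variable that is the sum of many small,
# independent, additive random effects (all values here are made up).
import numpy as np

rng = np.random.default_rng(0)

n_factors = 50          # number of small random factors
n_observations = 10_000

# Each factor contributes a small uniform random amount; the observed value
# is the sum of all contributions, so the effects are additive.
contributions = rng.uniform(-1, 1, size=(n_observations, n_factors))
observations = contributions.sum(axis=1)

# The histogram of `observations` is approximately bell-shaped (normal),
# even though each individual factor is uniform, not normal.
print(observations.mean(), observations.std(ddof=1))
```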
SD (standard deviation)
A statistical tool that tells you how tightly (how close together) all the values in a set of data are clustered around the mean
It is a more sophisticated indicator of the precision of a given set of measurements
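A minimal sketch of computing the sample SD in Python (the measurements are made up for illustration):

```python
# A minimal sketch: the sample standard deviation of a set of measurements.
import numpy as np

measurements = np.array([4.8, 5.1, 5.0, 4.7, 5.3, 4.9])  # made-up values

mean = measurements.mean()
sd = measurements.std(ddof=1)   # ddof=1 gives the sample SD (n - 1 divisor)

print(f"mean = {mean:.2f}, SD = {sd:.2f}")
# A small SD means the values are tightly clustered around the mean;
# a large SD means they are widely spread.
```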
What does it mean if effect size is not large enough compared to variation?
It means that random variation in the readings might account for the difference between the treated and control groups
What do statistical tests such as t-tests do?
They ask how the variation compares with the effect size
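A minimal sketch of this comparison in Python, using an unpaired two-sample t-test from scipy (the treated and control values are made up for illustration):

```python
# A minimal sketch: an unpaired two-sample t-test comparing a treated group
# with a control group (data are made up for illustration).
import numpy as np
from scipy import stats

control = np.array([10.1, 9.8, 10.3, 9.9, 10.0, 10.2])
treated = np.array([11.0, 10.7, 11.3, 10.9, 11.1, 10.8])

# The t statistic is essentially the effect size (difference between means)
# divided by a measure of the variation (the pooled standard error).
t_stat, p_value = stats.ttest_ind(treated, control)

print(f"effect size = {treated.mean() - control.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```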
Why should error bars be treated as a rough indication?
With small sample sizes, they underestimate the uncertainty
Error bars are not additive
All statistics only give a rough indication of confidence
Why is it wrong to think that if error bars don’t overlap the result is significant, and vice versa?
Biological significance is best shown by effect size, not by statistical significance
The idea that error bars just touching is equivalent to p = 0.05 does not apply to small sample numbers, and is an unreliable rule of thumb in any case
Confidence intervals
These get round the problem that SEM error bars are not additive.
By using a t-test, we can get the best estimate, even when we have a small number of observations
The t-test can calculate the 95% CI for the difference between the means
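A minimal sketch of calculating that 95% CI for the difference between two means in Python (assuming an unpaired design with equal variances; the data are made up for illustration):

```python
# A minimal sketch: a 95% confidence interval for the difference between two
# means, using the t distribution (made-up data, equal variances assumed).
import numpy as np
from scipy import stats

control = np.array([10.1, 9.8, 10.3, 9.9, 10.0, 10.2])
treated = np.array([11.0, 10.7, 11.3, 10.9, 11.1, 10.8])

diff = treated.mean() - control.mean()
n1, n2 = len(treated), len(control)

# Pooled standard error of the difference between the means.
pooled_var = ((n1 - 1) * treated.var(ddof=1) +
              (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
se_diff = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# 95% critical value from the t distribution with n1 + n2 - 2 degrees of freedom.
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)

ci_low, ci_high = diff - t_crit * se_diff, diff + t_crit * se_diff
print(f"difference = {diff:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```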
What is a 95% CI?
The confidence interval gives us an idea of how much larger or smaller the real effect could be, and tells us that 95% of the time the true answer should lie within this interval
What does a small p value (<0.05) mean?
It indicates strong evidence against the null hypothesis, so you reject the null hypothesis. This means that the probability that the data are due only to chance/random variation is less than 5%
What does a large p value (>0.05) indicate?
It indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis
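A minimal sketch of the reject / fail-to-reject decision in Python, using the conventional 0.05 threshold (the data are made up for illustration):

```python
# A minimal sketch: using the p value from an unpaired t-test to decide
# whether to reject the null hypothesis at the 0.05 threshold (made-up data).
import numpy as np
from scipy import stats

control = np.array([10.1, 9.8, 10.3, 9.9, 10.0, 10.2])
treated = np.array([11.0, 10.7, 11.3, 10.9, 11.1, 10.8])

t_stat, p_value = stats.ttest_ind(treated, control)

if p_value < 0.05:
    print(f"p = {p_value:.4f}: strong evidence against the null hypothesis -> reject it")
else:
    print(f"p = {p_value:.4f}: weak evidence against the null hypothesis -> fail to reject it")
```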
Advantages and disadvantages of confidence intervals
Advantages:
-Combine numerical information on the effect size, the statistical confidence, and the possible variation in the ‘real’ effect size
-ideal for simple comparisons such as treated vs control
-now the preferred approach in clinical research and epidemiology
Disadvantages:
-Harder to apply to more complex experiments, e.g. more than one control or more than one treatment
How can the effect size be compared with the variation, to assess statistical reliability?
a) by eye from the raw data
b) using SEM error bars
c) more precisely using a 95% confidence interval
A CI is more precise than error bars and more informative than a p value
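A minimal sketch in Python comparing the SEM with the half-width of the 95% CI for the same small sample, which illustrates why SEM error bars understate the uncertainty (the sample values are made up for illustration):

```python
# A minimal sketch: SEM vs. 95% CI half-width for one small sample (made-up data).
import numpy as np
from scipy import stats

sample = np.array([10.1, 9.8, 10.3, 9.9, 10.0])
n = len(sample)

sem = sample.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_half_width = t_crit * sem

print(f"SEM = {sem:.2f}, 95% CI half-width = {ci_half_width:.2f}")
# For small n, t_crit is well above 2, so the 95% CI is considerably wider
# than +/- 1 SEM error bars.
```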
For sample sizes above 20, how do we treat the confidence intervals?
We treat them as giving weak to moderate evidence of a real difference, as long as the error bars don’t overlap