Biostatistics Flashcards
Examples of categorical data
Male/female
Pregnant/non-pregnant
Smoker/non-smoker
Blood types
Non-smoker/ex-smoker/light smoker/heavy smoker/etc.
What is the Likert scale
Type of categorical data
“strongly agree, agree, neither, disagree, strongly disagree”
A histogram is used to display categorical data - true or false
False - a bar chart is used because the data is NOT continuous
To describe the central tendency of categorical data, what value is used
Mode, or median if the categories are ordered (NOT MEAN)
What is the measure of central tendency of continuous data
Mean, median and mode
In a box and whisker plot, what percentile does each line/part of the box represent
Bottom line of box - 25th percentile
Middle line - 50th percentile (the median)
Top line of box - 75th percentile
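As a rough illustration of the three box lines, the sketch below finds the 25th, 50th and 75th percentiles with Python's statistics.quantiles. The ages are made-up values, and different software uses slightly different quartile conventions, so results can differ a little on other data.

```python
import statistics

# Hypothetical ages, for illustration only.
ages = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into quarters and returns the three cut points.
q1, median, q3 = statistics.quantiles(ages, n=4)

print("Bottom line of box (25th percentile):", q1)
print("Middle line (50th percentile):", median)
print("Top line of box (75th percentile):", q3)
```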
What is the equation of variance
Variance = sum of (each observation - mean)^2 / (n - 1)
What is the equation for standard deviation
Square root of variance
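A minimal sketch of both calculations, assuming the sample (n - 1) form of the variance: the squared deviations from the mean are summed over all observations, divided by n - 1, and the SD is the square root of the result. The function names and the heights are hypothetical, for illustration only.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical heights in cm, for illustration only.
heights_cm = [160, 165, 170, 175, 180]

print(sample_variance(heights_cm))  # 62.5
print(sample_sd(heights_cm))

# Cross-check against the standard library's own sample variance.
print(statistics.variance(heights_cm))
```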
What % of data will lie within 2 SD of the mean
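About 95% (for normally distributed data)
On a normal distribution graph (bell curve), what point signifies the mean and standard deviation, respectively
The peak of the bell curve is the mean
The SD is the distance either side of the mean to the point where the curve changes from curving downwards to curving outwards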
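The "within k SD" figures can be checked for an ideal normal distribution with the standard library's NormalDist. This is only a sketch: share_within is a hypothetical helper name, and real data that are not normally distributed will not follow these proportions exactly.

```python
from statistics import NormalDist

def share_within(k_sd):
    """Proportion of a normal distribution lying within k_sd SDs of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k_sd) - standard_normal.cdf(-k_sd)

# Proportion of values within 2 and 3 SDs of the mean.
for k in (2, 3):
    print(k, "SD:", round(share_within(k) * 100, 1), "%")
```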
What is a 95% confidence interval
A range calculated from the sample so that, if the study were repeated many times, 95% of such ranges would contain the true population mean
About 99.7% of the data will be within 3 SD of the mean
A positively skewed distribution is skewed in what direction
Right - the long tail points to the right (the bulk of the data sits on the left)
What does the P value describe
The probability of having observed our data (or data more extreme) when the null hypothesis is true
We reject the null hypothesis when the P value is what
Less than 0.05
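In hypothesis testing you are always trying to prove the null hypothesis - true or false
False - you are looking for evidence to reject the null hypothesis; it can never be proven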
What are 2 dangers of P values
Over-reliance on P values
Misinterpretation of P values
We are more likely to find a statistical difference in a one sided test - true or false
True
What is a type 1 error
When you choose to reject the null hypothesis when you shouldn't (it is actually true)
What is a type 2 error
When you fail to reject the null hypothesis when it is actually false - usually occurs due to small sample sizes
What is the main difference between parametric and non-parametric tests
In parametric tests you are assuming a normal distribution.
There are few assumptions in non-parametric tests
What are some assumptions in T tests
Each pair of observations is completely unrelated to every other pair
Data are normally distributed
Roughly equal SD in both groups (ratio of the larger SD to the smaller is less than 2)
What is the null hypothesis usually in parametric tests
No difference between the values
T tests and one way ANOVA are parametric tests - true or false
True
What is the non-parametric equivalent to a T test
Mann-Whitney test
What measure of central tendency is used in Mann-Whitney tests
Median
When is a one way ANOVA test used
To compare a continuous variable in three or more independent groups
How do you calculate risk between two categories
Number of people with the outcome (e.g. people who got flu after the vaccine) / total number of people in that category (e.g. everyone who got the vaccine)
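A small sketch of the risk calculation for each category. The counts are invented for illustration, and risk is a hypothetical helper name, not part of any library.

```python
def risk(events, total):
    """Risk = number with the outcome / everyone in that category."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total

# Hypothetical counts, for illustration only.
flu_vaccinated, n_vaccinated = 10, 200
flu_unvaccinated, n_unvaccinated = 30, 200

risk_vaccinated = risk(flu_vaccinated, n_vaccinated)
risk_unvaccinated = risk(flu_unvaccinated, n_unvaccinated)

print("Risk of flu if vaccinated:", risk_vaccinated)      # 0.05
print("Risk of flu if unvaccinated:", risk_unvaccinated)  # 0.15
```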
When would you use a Fisher's exact test over a chi-square test
If the sample size is not big enough for the chi-square test, use Fisher's
If there is a P value < 0.05 in a chi square test, what does this tell us about the risk between the two values
There is evidence that the two categories are related in risk (e.g. eyestrain is related to work occupation)
What does it mean when you get a P value of 0.32 in a Fisher's exact test
Cannot reject the null hypothesis - no evidence of a difference between the categories
What is the difference between the SEM and SD
SEM estimates how far the sample mean is likely to be from the true population mean
SD measures scatter of data, or dispersion of individual data points from the sample mean
How is SEM calculated
SEM = SD / √n
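A minimal sketch of the SEM formula, assuming SD means the sample (n - 1) standard deviation. The function name and the readings are hypothetical, for illustration only.

```python
import math
import statistics

def standard_error_of_mean(values):
    """SEM = sample SD divided by the square root of n."""
    sd = statistics.stdev(values)  # sample SD, using n - 1
    return sd / math.sqrt(len(values))

# Hypothetical readings, for illustration only.
readings = [2, 4, 4, 4, 5, 5, 7, 9]

print("SD: ", statistics.stdev(readings))
print("SEM:", standard_error_of_mean(readings))
```

Because n is in the denominator, the SEM shrinks as the sample grows, while the SD does not shrink in the same way.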
What type of data would we use an unpaired T test for
2 groups - unpaired, parametric data
What type of data would we use a paired T test for
2 groups - paired data, parametric
What type of data would we use a Mann-Whitney test for
2 groups - independent, non-parametric data
What type of data would we use a one-way ANOVA for
3+ groups - independent, parametric data
What type of data would we use a repeated measures ANOVA for
3+ groups - related, parametric data
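The last five cards can be summarised as a small lookup. This is only a sketch of the choices listed in these flashcards: choose_test is a hypothetical helper, and it returns None for combinations the cards do not cover.

```python
def choose_test(n_groups, related, parametric):
    """Pick a test from the number of groups, pairing and distribution."""
    if n_groups == 2:
        if parametric:
            return "paired T test" if related else "unpaired T test"
        if not related:
            return "Mann-Whitney test"
    elif n_groups >= 3 and parametric:
        return "repeated measures ANOVA" if related else "one-way ANOVA"
    return None  # combination not covered by these flashcards

print(choose_test(2, related=False, parametric=True))   # unpaired T test
print(choose_test(2, related=False, parametric=False))  # Mann-Whitney test
print(choose_test(4, related=True, parametric=True))    # repeated measures ANOVA
```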