Biostats Flashcards
In biostatistics, name the steps in study design. There are 5 steps.
1) Design of studies –> sample size/selection of study participants/role of randomization
2) Data collection variability –> important patterns in data are obscured by variability.
3) Inference –> draw conclusions from limited data
4) Summarize –> what summary measures will best convey the results
5) Interpretation –> what do the results mean in terms of practice, the program and the population
What are the 4 types of data in biostatistics?
1) Binary (Dichotomous) data: yes/no answers
2) Categorical Data: either nominal (no ordering) or ordinal (ordering)
3) Continuous Data: blood pressure, weight, etc
4) Time to event data: time in remission
There are different statistical methods for different types of data. What two methods are used for binary data?
Fisher's Exact Test
Chi-Square Test
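As a rough illustration of the second method, the sketch below computes a Pearson chi-square test for a 2x2 table of binary outcomes using only the Python standard library. The function name `chi_square_2x2`, the counts, and the choice to skip a continuity correction are assumptions made for this example, not part of the lecture. Fisher's exact test is generally preferred when the cell counts are small.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, invented for illustration:
# rows = treatment / control, columns = improved yes / no.
statistic, p_value = chi_square_2x2(10, 20, 20, 10)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}")
```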
What two methods are used for continuous data?
Two-sample t test
Wilcoxon rank sum (nonparametric) test
How would you calculate the mean of a sample (sample average)?
Add up data and then divide by the sample size
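A minimal sketch of that calculation in Python. The function name `sample_mean` and the blood pressure readings are made up for illustration.

```python
def sample_mean(values):
    """Return the sample average: the sum of the data divided by the sample size."""
    if not values:
        raise ValueError("sample must contain at least one observation")
    return sum(values) / len(values)

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 122, 130, 126, 124]
print(sample_mean(readings))  # 620 / 5 = 124.0
```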
What is the difference between population and sample in regards to data?
Population –> the entire group about which you want information (e.g., all women between the ages of 30 and 40)
Sample –> a part of the population from which we actually collect information; used to draw conclusions about the whole population.
How is population vs sample mean differentiated when it comes to statistical symbols?
Population Mean –> Mu (μ)
Sample Mean –> X-bar (x̄)
The median number is the middle number. What happens when the sample size is an even number?
Average the two middle numbers
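The odd and even cases can be sketched as follows. The helper name `sample_median` is hypothetical; Python's built-in `statistics.median` behaves the same way.

```python
def sample_median(values):
    """Return the middle value, averaging the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: a single middle number
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle numbers

print(sample_median([3, 1, 2]))     # odd sample size
print(sample_median([4, 1, 3, 2]))  # even sample size
```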
What are ways in which the spread of a distribution can be described?
Min and Max
Range –> max - min
Sample standard deviation (SD)
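These measures of spread can be computed with the Python standard library, as in the sketch below. The data values are invented for illustration; `statistics.stdev` uses the n - 1 denominator, which is the usual definition of the sample SD.

```python
import statistics

# Hypothetical sample, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

minimum = min(data)
maximum = max(data)
data_range = maximum - minimum        # range = max - min
sample_sd = statistics.stdev(data)    # sample SD (n - 1 in the denominator)

print(f"min = {minimum}, max = {maximum}, range = {data_range}")
print(f"sample SD = {sample_sd:.3f}")
```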
Why would a researcher feel it appropriate to make a histogram?
It is a way of displaying the distribution of a set of data by charting the number of observations whose values fall within predefined numerical ranges
How would one go about making a histogram?
Divide the data into equal intervals
Count the number of observations in each class
Draw the histogram
Label scales
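The four steps above can be sketched in plain Python with a text-based plot. The function name `histogram_counts`, the data, and the rule that the maximum value goes into the last interval are assumptions made for this example.

```python
def histogram_counts(values, num_intervals):
    """Split [min, max] into equal-width intervals and count observations in each."""
    lo, hi = min(values), max(values)
    if num_intervals < 1 or hi == lo:
        raise ValueError("need at least one interval and some spread in the data")
    width = (hi - lo) / num_intervals
    counts = [0] * num_intervals
    for v in values:
        # The largest value would index one past the end, so clamp it
        # into the last interval.
        index = min(int((v - lo) / width), num_intervals - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_intervals + 1)]
    return edges, counts

# Hypothetical data, invented for illustration.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges, counts = histogram_counts(data, 4)

print("interval        count")  # label the scales
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}  {'#' * count} ({count})")
```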
Generally, how many intervals should you have in a histogram?
Depends on the sample size, n
Usually the guideline is the square root of n
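A small sketch of the square-root guideline. Rounding to the nearest whole number is an assumption here, since the card does not say how to handle a non-integer square root, and the name `suggested_intervals` is hypothetical.

```python
import math

def suggested_intervals(n):
    """Suggest a number of histogram intervals as roughly the square root of n."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    # Rounding is an assumption; the guideline only says "square root of n".
    return max(1, round(math.sqrt(n)))

for n in (25, 50, 100):
    print(n, suggested_intervals(n))
```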
What are other types of histograms?
frequency histogram
relative frequency histogram
relative frequency polygon
(note see lecture page 9 for images)
There are several shapes of distribution when plotting data. Explain what right skewed, left skewed, and symmetrical mean.
Symmetrical --> right and left sides are mirror images (mean = median = mode)
Left Skewed (negatively skewed) --> long left tail; mean < median
Right Skewed (positively skewed) --> long right tail; mean > median (ex: hospital stays)
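The mean-versus-median comparison can be turned into a quick check, sketched below. This is only a rough heuristic based on the card, not a formal skewness test; the function name `skew_direction` and the hospital-stay values are invented for illustration.

```python
import statistics

def skew_direction(values):
    """Classify the shape by comparing the mean with the median (rough heuristic)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right skewed"   # a long right tail pulls the mean up
    if mean < median:
        return "left skewed"    # a long left tail pulls the mean down
    return "roughly symmetrical"

# Hypothetical lengths of hospital stay in days: most are short, one is very long.
stays = [1, 2, 2, 3, 3, 4, 5, 30]
print(skew_direction(stays))
```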
Describe in general terms what probability density refers to?
smooth idealized curve that shows the shape of the distribution in the population
What are some features of a normal (Gaussian) distribution?
symmetric
bell shaped
mean = median = mode
(mean is the center) (SD is the spread)
What does the 68-95-99.7 Rule mean?
In any normal distribution, approximately:
68% of the observations fall within one standard deviation of the mean
95% of the observations fall within two standard deviations of the mean
99.7% of the observations fall within three standard deviations of the mean
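These percentages can be checked numerically with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `fraction_within` is hypothetical.

```python
from statistics import NormalDist

def fraction_within(k, mean=0.0, sd=1.0):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}")
```

The result does not depend on the particular mean and SD, which is why the rule holds for any normal distribution.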
What is a Z score?
Tells how many standard deviations from the population mean you are
Z = (observation - population mean) / SD
What are the standard Z scores?
Z = 1 –> observation lies one SD above the mean
Z = 2 –> observation lies two SDs above the mean
Z = -1 –> observation lies one SD below the mean
Z = -2 –> observation lies two SDs below the mean
If female heights have mean = 65 inches and s = 2.5 inches, what is the Z score for 72.5 inches and for 60 inches?
For 72.5 inches: Z = (72.5 - 65) / 2.5 = +3.0 (3 SDs above the mean)
For 60 inches: Z = (60 - 65) / 2.5 = -2.0 (2 SDs below the mean)
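The same Z score arithmetic as a short Python sketch, using the height figures from the card. The function name `z_score` is hypothetical.

```python
def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (observation - mean) / sd

# Female heights from the card: mean = 65 inches, SD = 2.5 inches.
print(z_score(72.5, 65, 2.5))  # (72.5 - 65) / 2.5 = 3.0
print(z_score(60, 65, 2.5))    # (60 - 65) / 2.5 = -2.0
```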