Intro to Biostats Flashcards
a research perspective which states there will be no difference between the comparison groups
null hypothesis H0
statistical perspectives that can be taken by the researcher in their alternative hypothesis
superior
noninferior
equivalent
alpha error
type I error
rejecting the null when it is actually true (you should have failed to reject it)
false positive
beta error
type II error
failing to reject ("accepting") the null when it is actually false
false negative
define power
statistical ability of a study to detect a true difference when it exists
power = 1 - beta (loosely, “accuracy”)
the ______ the sample size, the greater the _________ .
greater
ability to detect a true statistical difference
= increase in power
the smaller the difference between groups that must be detected as statistically significant, the greater the _________ needed.
greater sample size is needed
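The link between difference size and sample size can be sketched with the common normal-approximation formula for comparing two means. This is an illustration only: the function name `n_per_group` and the example values (SD of 10, differences of 5 and 2.5) are assumptions, not from the cards.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the chosen alpha
    z_beta = z.inv_cdf(power)             # value matching the desired power (1 - beta)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Halving the difference to detect roughly quadruples the required sample size.
print(n_per_group(delta=5, sd=10))
print(n_per_group(delta=2.5, sd=10))
```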
when determining sample size you should anticipate what?
dropouts and loss to follow-up
so oversample in the beginning to compensate
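One simple way to oversample is to divide the required number by the share of participants expected to stay. A minimal sketch, assuming a hypothetical helper `enrollment_target` and made-up numbers (64 needed, 20% dropout):

```python
import math

def enrollment_target(n_needed, dropout_pct):
    """Participants to enroll so that about n_needed remain after dropouts."""
    # dropout_pct is a whole-number percentage, e.g. 20 for 20%
    return math.ceil(n_needed * 100 / (100 - dropout_pct))

print(enrollment_target(64, 20))  # 80
```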
bell curve percentages based on standard deviation
±1 SD = 68%
±2 SD = 95%
±3 SD = 99.7%
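These percentages can be checked against the standard normal curve. A short sketch using Python's standard library; the function name `share_within` is just for illustration.

```python
from statistics import NormalDist

def share_within(k):
    """Share of a normal distribution lying within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k) * 100:.1f}%")
```

The card values are the usual rounded figures; exactly 95% corresponds to about ±1.96 SD rather than ±2.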
probability value = ?
p value
the significance level (alpha) is selected before or after the study starts?
before (the p value itself is calculated from the data once the study is done)
if the p value is lower than the alpha value, then we say?
alpha value = 5%, 1%, etc.
we say it is statistically significant
relate p value to a statistically significant test
the p value is lower than alpha
so we reject the null (not accept)
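The decision rule on these cards is a single comparison. A minimal sketch; the function name `decide` and its wording are illustrative only.

```python
def decide(p_value, alpha=0.05):
    """Compare a p value to an alpha chosen before the study."""
    if p_value < alpha:
        return "statistically significant: reject the null"
    return "not statistically significant: fail to reject the null"

print(decide(0.005))  # below alpha
print(decide(0.91))   # far above alpha
```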
relate a p value less than the alpha of 5%, and the risk of type I error
p value is lower, we reject the null
therefore, the risk of a type I error (rejecting a null that is actually true) is acceptably low = capped at alpha, 5%
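A small simulation can show what alpha controls: when the null is actually true, about 5% of studies still come out "significant" at alpha = 5%. This is a sketch under stated assumptions (two groups drawn from the same normal population with a known SD of 1, compared with a z test); the function name and settings are made up for the example.

```python
import random
from statistics import NormalDist, mean

def false_positive_rate(n_sims=2000, n=30, alpha=0.05, seed=0):
    """Fraction of null-is-true studies that reject the null anyway."""
    rng = random.Random(seed)
    z = NormalDist()
    rejections = 0
    for _ in range(n_sims):
        # Both groups come from the same population, so any difference is chance.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        z_stat = (mean(a) - mean(b)) / (2 / n) ** 0.5  # known SD = 1
        p_value = 2 * (1 - z.cdf(abs(z_stat)))         # two-sided p value
        if p_value < alpha:
            rejections += 1
    return rejections / n_sims

print(false_positive_rate())  # expected to land near alpha
```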
at 95% confidence and p value of 0.005, what is the risk of error?
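0.5%: if the null were true, a difference this large would show up by chance only 0.5% of the time, so we reject the null (0.005 < 0.05)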
relate a p value of 0.01% and 3 groups
there is at least one significant difference between the 3 groups
typically between control and the most extreme group
should baseline data be statistically different or not different?
should show no statistical difference
this shows the groups are comparable at the start, so a difference in the final results can be attributed to the intervention
a p value of 0.91, what is your chance of being wrong?
if the null is true, a difference at least this large would occur by chance about 91% of the time
not statistically significant: claiming a difference here would very likely be a type I error
when do we want p values to not be statistically significant?
- when comparing baseline characteristics at start
- when using a levene’s test
3 primary levels of variables (data types)
nominal
ordinal
interval/ratio
3 key attributes of data measurement
order/magnitude
consistency of scale (equal distance)
rational absolute zero
nominal
no order
no consistency of scale
simply words/labels w/ no quantitative characteristics
any question that only has 2 categories is always what type of data?
nominal
ordinal
has order
no consistency of scale
ex. pain scale, stress levels, happiness ratings
disagree, somewhat disagree, neutral, etc.
interval/ratio data
has order
has consistency of scale
ratio has absolute zero
interval data
arbitrary zero value
0 does not mean absence
temperature
ratio data
has an absolute zero
0 = absence
ex. 0 heartbeat = dead
after data is collected, we can appropriately go _____ in specificity/detail of data measurement levels, but never ____ .
go down
but never up
in terms of nominal, ordinal, interval, ratio
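Going "down" means collapsing detailed measurements into coarser categories, which can always be done after collection; the reverse cannot, because the detail is gone. A sketch with made-up cut points (120 and 140 are illustrative only, not clinical guidance) and hypothetical function names:

```python
def to_ordinal(systolic):
    """Collapse a ratio measurement (mmHg) into ordered categories."""
    if systolic < 120:
        return "lower"
    if systolic < 140:
        return "middle"
    return "higher"

def to_nominal(systolic):
    """Collapse further into two unordered labels."""
    return "higher" if to_ordinal(systolic) == "higher" else "not higher"

readings = [112, 128, 151]                 # ratio data
print([to_ordinal(r) for r in readings])   # ordinal
print([to_nominal(r) for r in readings])   # nominal
# A label like "middle" cannot be turned back into 128.
```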
measures of central tendency and dispersion/spread
central tendency: mean, median, mode
spread: outliers
spread: min/max and range
spread: IQR
difference between variance and standard deviation
variance is the average squared distance of the values from the mean
standard deviation is the square root of the variance, so it is in the same units as the data
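A worked example of the two quantities: variance is the average squared deviation from the mean, and standard deviation is its square root. The data set is made up for the illustration.

```python
from statistics import mean, pvariance, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values

m = mean(data)                                        # 5
by_hand = sum((x - m) ** 2 for x in data) / len(data) # average squared deviation

print(by_hand)           # 4.0
print(pvariance(data))   # same quantity from the library
print(pstdev(data))      # square root of the variance: 2.0
```

These are the population versions (divide by n). For a sample, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.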
relate bell graph to percentiles
broken into 4 25% sections about the median (=50th percentile)
IQR
interquartile range
IQR = Q3 - Q1
75th percentile - 25th percentile (the middle 50% of the data)
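The IQR is the distance from the first quartile up to the third. A short sketch with made-up data; note that software packages use slightly different quartile rules, so results can differ a little between tools.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   # illustrative values

q1, q2, q3 = quantiles(data, n=4)   # 25th, 50th (median), 75th percentiles
iqr = q3 - q1

print(q1, q2, q3)   # 3.0 6.0 9.0
print(iqr)          # 6.0
```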
statistical tests used on normally distributed data are called?
parametric tests
positively skewed
tail pointing to the right/positive direction
mean > median
negatively skewed
tail pointing to the left/negative direction
mean < median
if the data is not skewed then how are the mean and median related?
they should be the same/ almost the same
what are the 3 ways to tell if the data is skewed?
- are the mean and median the same?
- what does the graph look like?
- what is the skew value?
skewness value
if data is not skewed it will be as close to zero as possible
can have pos./neg. values
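Two of the checks (mean vs. median, and the skew value) can be sketched in a few lines. The formula below is the simple moment-based skewness; statistical packages may apply a small-sample correction, so their numbers can differ slightly. Data sets and the function name are made up.

```python
from statistics import mean, median

def skewness(data):
    """Moment-based skewness: third central moment over SD cubed."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

right_tail = [1, 2, 2, 3, 3, 3, 4, 4, 20]   # one large value pulls the tail right
left_tail = [-x for x in right_tail]        # mirror image
symmetric = [1, 2, 3, 4, 5]

for name, d in [("right", right_tail), ("left", left_tail), ("symmetric", symmetric)]:
    print(name, "mean:", round(mean(d), 2), "median:", median(d),
          "skew:", round(skewness(d), 2))
```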
kurtosis
a measure of extent to which the data clusters about the mean
normal distribution, kurtosis = 0 (this is "excess" kurtosis, the version most software reports)
positive kurtosis
= higher clustering about the mean
negative kurtosis
= less clustering about the mean
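A sketch of excess kurtosis using the simple moment formula (fourth central moment over squared variance, minus 3). Packages may use a bias-corrected version, so exact values can differ; the data sets and function name here are made up.

```python
from statistics import mean

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3

peaked = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]   # most values sit on the mean
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]     # values spread evenly

print(excess_kurtosis(peaked))   # positive
print(excess_kurtosis(flat))     # negative
```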
discrete vs. continuous data
discrete is whole numbers (counts) whilst continuous can take any value, including decimals
required assumptions for parametric tests on interval/ratio data
- normally distributed
- equal variances
- randomly derived and independent
levene’s test
tells us whether the groups of interval/ratio data have equal variances or not (it does not test normality)
what if interval data is not normally distributed?
just use a non-parametric test
or transform the data (e.g., log transformation); z-scores only standardize the scale and do not remove skew
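A log transformation pulls in a long right tail, which can bring the mean and median back together. A minimal sketch with made-up values chosen so the effect is obvious:

```python
import math
from statistics import mean, median

raw = [1, 10, 100, 1000, 10000]          # strongly right-skewed
logged = [math.log10(x) for x in raw]    # log10 of each value

print(mean(raw), median(raw))        # mean far above the median
print(mean(logged), median(logged))  # mean and median now agree
```

Real data will rarely become perfectly symmetric, so the skew should be rechecked after transforming.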
variables required when interpreting a p value
- is it significant
- who was higher/lower
- by how much?
include all three, no specific order
______ must be equal in order to pick an interval test.
variances
levene’s test is used to assess whether ______ are equal between ?
variances between all groups
before running a levene’s test, what do you need, and why?
need the null hypothesis stating there is no difference in variances between the groups
we want the p value to come back not significant, which supports treating the variances as equal
if the variances are equal, a parametric (interval/ratio) test can be used
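The statistic behind the test can be sketched by hand: take each value's absolute distance from its group mean, then compare those distances across groups. This sketch computes only the W statistic; the p value needs an F distribution with (k - 1, N - k) degrees of freedom, which the standard library does not provide. SciPy's `scipy.stats.levene` runs the full test (note that it centers on the median by default). The function name and data are made up.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W using absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

a = [1, 2, 3, 4, 5]
b = [11, 12, 13, 14, 15]    # different mean, same spread as a
c = [0, 10, 20, 30, 40]     # much larger spread than a

print(levene_w(a, b))   # 0.0: spreads are identical
print(levene_w(a, c))   # large W: spreads differ
```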
number of siblings is an example of _____ data
interval/ratio data (strictly ratio: 0 siblings = none; it is also discrete)
define confidence interval
an interval around the estimate (e.g., the observed difference) such that we are X% confident (usually 95%) the true difference is within this range
a CI that includes both reduced and increased risk
means that it is not significant: the interval crosses the no-effect value (1 for a ratio, 0 for a difference), so the true effect could go in either direction
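Reading significance off a CI comes down to checking whether the interval contains the no-effect value. A minimal sketch; the function name and the example intervals are made up.

```python
def ci_is_significant(lower, upper, no_effect):
    """True when the interval excludes the no-effect value."""
    # no_effect is 1 for ratios (relative risk, odds ratio), 0 for differences
    return not (lower <= no_effect <= upper)

print(ci_is_significant(0.80, 1.20, no_effect=1))   # False: spans reduced and increased risk
print(ci_is_significant(0.60, 0.90, no_effect=1))   # True: entirely reduced risk
print(ci_is_significant(-2.0, 3.5, no_effect=0))    # False: difference could go either way
```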