research stats midterm Flashcards
what is biostatistics?
the statistics of medicine, health sciences and public health
define target population
larger population to which results will need to be generalized
define accessible population
actual population of subjects available
define sample
subgroup of accessible population which allows results to be generalized
define parameter
statistical characteristic of population
define statistic
statistical characteristic of sample
define descriptive statistic
describes sample shape, central tendency, variability
define inferential statistic
used to make inferences about a population
define central tendency
the central value
best representative value of target population
single value
define variability
spread of the data
define frequency distribution
the pattern of frequencies of a variable
3 measures of central tendency
mean - average
median - two equal halves
mode - most frequent score
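A minimal Python sketch of the three measures, using the standard-library statistics module. The scores list is made-up data for illustration, not from the course.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(scores)      # sum of values divided by count
median_value = statistics.median(scores)  # middle value; averages the two middle values when n is even
mode_value = statistics.mode(scores)      # most frequent value

print(mean_value, median_value, mode_value)
```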
describe skewed to the right
tail faces right
positive skew
mean > median/mode
describe skewed to the left
tail faces left
negative skew
mean < median/mode
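The mean vs. median comparison for skew can be checked with a short Python sketch. Both data sets and the helper name mean_minus_median are invented for illustration; the sign of the difference is only a rough indicator of skew direction.

```python
import statistics

# Hypothetical data sets, invented for illustration only.
right_skewed = [1, 2, 2, 3, 3, 4, 40]      # one extreme high value, tail to the right
left_skewed = [1, 37, 38, 38, 39, 39, 40]  # one extreme low value, tail to the left

def mean_minus_median(values):
    """Positive result suggests right skew, negative suggests left skew."""
    return statistics.mean(values) - statistics.median(values)

print(mean_minus_median(right_skewed))  # positive: mean pulled above the median
print(mean_minus_median(left_skewed))   # negative: mean pulled below the median
```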
when is mean best to use?
numeric, symmetric data
not good for skewed
when is median best to use?
skewed data
not affected by extremes
when is mode best to use?
nominal or ordinal
common in surveys
advantages to mean
easy to calculate and interpret
don't need to arrange values
all values represented
all algebraic formulas possible
disadvantages to mean
can't be used with categorical data
can't calculate if data missing
affected by extremes
advantages to median
easy to calculate
not affected by extremes
can be used with ranked data
disadvantages to median
tedious in large data set
problematic with even number of observations
doesn't account for all values
advantages of mode
easy to understand and find
not affected by extremes
easy to ID in data set and in frequency distribution
mode is useful for categorical data
disadvantages of mode
not defined if no repeats
not based on all values
unstable when data has small number of values
sometimes could have 2+ or no modes
when would you choose median over mode?
distribution is skewed
researcher is using ordinal data
define range, percentiles, quartiles
R - max-min
P - divides into 100 parts
Q - four parts
define interquartile range
difference between the 75th and 25th percentiles (Q3 - Q1)
used with median
describe box plot
min
1st quartile
median
3rd quartile
max
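The five values a box plot draws, plus the range and interquartile range, can be sketched in Python as follows. The data are made up, and the quartiles come from the default rule in statistics.quantiles; other software may interpolate differently and report slightly different quartiles.

```python
import statistics

# Hypothetical data, invented for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# quantiles(n=4) returns the three cut points: Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

five_number_summary = (min(data), q1, q2, q3, max(data))
data_range = max(data) - min(data)  # range = max - min
iqr = q3 - q1                       # interquartile range = Q3 - Q1

print(five_number_summary, data_range, iqr)
```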
define standard deviation
reported same units as raw scores
mean +/- SD
define variance
square of SD
coefficient of variation
used for interval and ratio data only
expressed as percentage
unitless so good for comparing scales
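A short Python sketch relating SD, variance, and the coefficient of variation. The values are invented, and the formula CV = SD / mean x 100 is the usual definition rather than something stated on the card. The sample versions (dividing by n - 1) are assumed here.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = statistics.mean(values)
sd = statistics.stdev(values)           # sample SD (divides by n - 1), same units as the raw scores
variance = statistics.variance(values)  # sample variance, the square of the SD
cv_percent = sd / mean_value * 100      # coefficient of variation as a percentage (unitless)

print(round(sd, 2), round(variance, 2), round(cv_percent, 1))
```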
constant and predictable characteristics of the normal distribution
68% +/- 1SD
95% +/- 2 SD
99.7% +/- 3 SD
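These percentages can be checked against the standard normal curve. A minimal Python sketch using statistics.NormalDist; the helper name share_within is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {share_within(k):.1%}")
```

The exact shares are roughly 68.3%, 95.4%, and 99.7%, which the cards round.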
define a z-score
standardized score based on normal distribution
z = SD units
z = (score - mean) / SD
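The z-score calculation subtracts the mean first and then divides by the SD. A small Python sketch, with a hypothetical function name and made-up numbers:

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in SD units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (score - mean) / sd

# Hypothetical numbers: a score of 85 on a test with mean 70 and SD 10.
z = z_score(85, 70, 10)  # 1.5, i.e. 1.5 SDs above the mean
print(z)
```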
define sampling error
sample mean will not equal the population mean. the difference is called sampling error
how well does the sample represent the population?
z scores for CI calculations
90% = z 1.65
95% = z 1.96
99% = z 2.58
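These z values come from the standard normal distribution, leaving half of the leftover area in each tail. A sketch in Python, where critical_z is a hypothetical helper name:

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided critical z value for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2  # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_z(level):.3f}")
```

The computed values are about 1.645, 1.960, and 2.576, which round to the figures on the card.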
central limit theorem
distribution of sample means approaches normal, centered on the population mean, as N increases
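A rough simulation of the idea in Python, assuming a right-skewed (exponential) population with mean 1.0. The population, the sample sizes, and the helper name sample_means are all choices made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_means(n, repeats=1000):
    """Means of `repeats` random samples of size n from a right-skewed population."""
    # expovariate(1.0) draws from an exponential population whose mean is 1.0
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(repeats)]

small = sample_means(5)
large = sample_means(100)

# With larger N the sample means cluster more tightly around the population mean.
print(statistics.stdev(small), statistics.stdev(large))
```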
define point estimate
single value that is the best estimate
define confidence interval
range of values that we are confident contains parameter
how would you increase precision (narrow) in CI?
larger sample size
less variance (lower SD)
lower selected level of confidence to 90%
CI equation
CI = mean +/- (z) SEM
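A sketch of the CI equation in Python. It assumes SEM = SD / sqrt(n), the usual definition of the standard error of the mean, which the card does not spell out. The function name and the sample numbers are hypothetical.

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) for mean +/- z * SEM, where SEM = SD / sqrt(n)."""
    sem = sd / math.sqrt(n)  # standard error of the mean
    margin = z * sem
    return mean - margin, mean + margin

# Hypothetical sample: mean 100, SD 15.
ci_n25 = confidence_interval(100, 15, 25)         # SEM = 3.0
ci_n100 = confidence_interval(100, 15, 100)       # SEM = 1.5, so a narrower interval
ci_90 = confidence_interval(100, 15, 25, z=1.65)  # lower confidence level, narrower interval

print(ci_n25, ci_n100, ci_90)
```

A larger sample, a smaller SD, or a lower confidence level each shrink the margin and so narrow the interval.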
define null hypothesis
no difference or relationship
will either reject or fail to reject