STATs to learn Flashcards
statistics enables
informed decisions
first step of any medical stats
should be to get a clear understanding of the background info of the study and clarify objectives
second step of medical stats
formulate problem in statistical terms
how is diagnostic uncertainty quantified
using conditional probabilities
example of conditional probabilities if the risk of having breast cancer is 1/10
0.9 probability that the lump is not cancer
0.1 probability that the lump is cancer
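Worked check for the card above: the two probabilities are complements, so they must add up to 1. A minimal sketch; the variable names are made up for illustration, and only the 1/10 risk comes from the card.

```python
from fractions import Fraction

# Risk taken from the card: 1 in 10 lumps is cancer.
p_cancer = Fraction(1, 10)

# Complement rule: P(not cancer) = 1 - P(cancer).
p_not_cancer = 1 - p_cancer

print(float(p_cancer))      # 0.1
print(float(p_not_cancer))  # 0.9
```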
RCTs require
active interventions by investigators
in observational studies, researchers play
a much more passive role
- treatment or exposure is not under the control of the researcher, due to ethical concerns or logistical constraints
observational studies are
cheaper and less time consuming than interventional studies
observational studies allow groups
which would often be excluded from clinical trials to be used
observational studies give a better estimate of
what actually happens in routine practice
i.e. patients in trials may be more compliant with treatment
a variable
is a set of characteristics which can be used to describe an aspect of a participant in a research study
the top of a bell curve shows
average: mean, median, mode
the width of a bell curve shows
variation
- sd
- iqr
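Rough sketch of the centre and spread measures on the two cards above, using Python's standard `statistics` module. The `scores` list is made-up data, not from any study.

```python
import statistics

# Made-up sample, for illustration only.
scores = [2, 4, 4, 5, 7, 9, 11]

# Centre of the distribution (top of the bell curve).
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Spread of the distribution (width of the bell curve).
sd = statistics.stdev(scores)                    # sample standard deviation
q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartile cut points
iqr = q3 - q1                                    # interquartile range

print(mean, median, mode)
print(round(sd, 3), iqr)
```

Different software uses slightly different quartile rules, so the IQR can vary a little between packages.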
histograms
shape of distribution, multiple modes, skewness and tail size
- outliers
box and whisker
compares location and variation across several groups, and shows outliers
scatter plot
displays the general form of the relationship between 2 variables
bar chart
frequencies of a categorical variable and cross-tabulations of categorical variables
correlation
strength of association between two variables, quantified using the Pearson correlation coefficient
- extent to which one variable is linearly related to another
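Sketch of how the Pearson correlation coefficient on the card above can be computed by hand. The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y, scaled by their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3))  # 0.775
```

r always lies between -1 and 1; values near 0 mean little linear association.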
regression
used to describe the relationship between a quantitative outcome and one or more predictor variables
regression can be used to
estimate the mean outcome score for subjects with a specific profile of scores on the predictors
a=
intercept
- mean value of y, when the predictor is 0
- point on y axis which is crossed by the regression line
b=
the slope
-the predicted increase in outcome for each one unit increase in the predictor
what else can be factored into the regression equation
e = error/ residuals
errors/ residuals
are assumed to be normally distributed with equal (constant) variance
- vertical difference between outcome value and the value predicted from the regression line
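Sketch tying the a, b and e cards together, assuming the usual simple regression form y = a + b*x + e (the cards do not write the equation out). The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares estimates of the intercept a and slope b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by variation of x.
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Intercept: value of y predicted when the predictor is 0.
    a = mean_y - b * mean_x
    return a, b

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
predicted = [a + b * xi for xi in x]

# Residual = observed outcome minus the value predicted from the line.
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(round(a, 2), round(b, 2))  # 2.2 0.6
print([round(e, 2) for e in residuals])
```

With a least-squares fit the residuals sum to zero, which is a quick sanity check.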
in small samples it is important that
residuals are normal
normal residuals guarantee
validity of CIs and p-values
normality is checked by
plotting histograms of the residuals
- ideally would be bell shaped
checking for constant variance
the amount of variation of the residuals around the regression line should be constant and not depend on the value of the predictor variable
- checked by scatter plot
linearity checked via
scatterplot
goodness of fit
most subjects won’t fall on the regression line
“the extent to which predicted outcome scores are close to observed scores”
R2
the proportion of variation in the outcome that is explained by the predictor
coefficient of determination
R2
R2=
r x r (the Pearson correlation coefficient squared)
R2= 0
no variability explained
R2=1
100% of variability explained
adjusted R2=
an unbiased estimate of the fraction of variance explained, taking into account the SAMPLE SIZE AND NO. OF VARIABLES
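Sketch of the R2 and adjusted R2 cards above, assuming the common textbook formulas R2 = 1 - SS_residual / SS_total and adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. Function names, data and the fitted line y = 2.2 + 0.6x are made up for illustration.

```python
def r_squared(observed, predicted):
    """Proportion of outcome variation explained by the model."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total

def adjusted_r_squared(r2, n, p):
    """Penalise R2 for sample size n and number of predictors p."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Made-up data; predictions come from the hypothetical line y = 2.2 + 0.6x.
x = [1, 2, 3, 4, 5]
observed = [2, 4, 5, 4, 5]
predicted = [2.2 + 0.6 * xi for xi in x]

r2 = r_squared(observed, predicted)
adj_r2 = adjusted_r_squared(r2, n=len(observed), p=1)

print(round(r2, 3))      # 0.6
print(round(adj_r2, 3))  # 0.467
```

Adjusted R2 is never larger than R2, and the gap grows as predictors are added to a small sample.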
confounding
where a third variable is associated with both the predictor and the response, distorting their apparent relationship
consequences of confounding
- bias
- type 1 error
- type 2 error
bias and confounding
can make the relationship appear stronger or weaker than it really is
regression assumptions
- linearity
- normality
- constant variance (homoscedasticity)