Biostats. Flashcards

1
Q

labels/names

A

nominal data

2
Q

data with only 2 outcomes

ex: yes or no

A

dichotomous data

3
Q

data that consists of names, labels or other nonnumerical data

A

categorical data

4
Q

uses labels in an order

ex: poor, fair, excellent

A

ordinal data

5
Q

data that can take any value

ex: numbers

A

continuous data

6
Q

values that are equally spaced

ex: temperature in °C (equal spacing, no true zero)

A

interval data

7
Q

values that have a true zero point

ex: blood alcohol level

A

ratio data

8
Q

central tendency of continuous data is measured with...

A

mean, median, mode

9
Q

variation/spread of the data

A

dispersion

10
Q

with a normal [Gaussian] distribution, how do the mean, median, and mode relate?

A

mean = mode = median

11
Q

If data has a right/positive skew [tail is to the right], what does this mean?

A

Mean > Median

12
Q

If data has a left/negative skew [tail is to the left], what does this mean?

A

Mean < Median
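
A quick check of the two skew cards, as a minimal sketch. The sample values are made up for illustration, and only Python's standard `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: one large value pulls the tail to the right.
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Mirror image of the same sample, so the tail points to the left.
left_skewed = [-x for x in right_skewed]

right_mean, right_median = mean(right_skewed), median(right_skewed)
left_mean, left_median = mean(left_skewed), median(left_skewed)

print(right_mean > right_median)  # right/positive skew: mean > median
print(left_mean < left_median)    # left/negative skew: mean < median
```

The mean is dragged toward the tail, while the median stays near the bulk of the data.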

13
Q

Measures of dispersion

A

range, variance, standard deviation

14
Q

value below which a particular percent of scores or observations fall

A

percentiles

15
Q

what does 95th percentile mean?

A

95% of values are below this number

16
Q

data from the 25th to 75th percentiles

A

interquartile range

17
Q

why is interquartile range used?

A

helps to ignore outliers
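
The effect of an outlier on the range versus the interquartile range can be sketched as follows. The data are hypothetical, and `iqr` and `value_range` are helper names introduced here for illustration; quartiles come from the standard library's `statistics.quantiles`.

```python
from statistics import quantiles

def iqr(values):
    # quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _q2, q3 = quantiles(values, n=4)
    return q3 - q1

def value_range(values):
    return max(values) - min(values)

# Hypothetical data; the second list swaps the largest value for an outlier.
clean = [2, 4, 4, 5, 6, 7, 8, 9, 10]
with_outlier = [2, 4, 4, 5, 6, 7, 8, 9, 100]

print(value_range(clean), value_range(with_outlier))  # range jumps with the outlier
print(iqr(clean), iqr(with_outlier))                  # IQR is unchanged in this example
```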

18
Q

average of the squared distances between the data points and the mean

A

variance

19
Q

square root of variance

larger = more spread out

A

standard deviation
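
Variance and standard deviation can be worked through by hand on a small made-up data set. This sketch assumes the population form (divide by n); the sample form divides by n - 1 and is available as `statistics.variance` and `statistics.stdev`.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

m = mean(data)
# Population variance: average squared distance of each point from the mean.
variance = sum((x - m) ** 2 for x in data) / len(data)
# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(m, variance, std_dev)
# The standard library gives the same population figures.
print(pvariance(data), pstdev(data))
```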

20
Q

for skewed distributions, what are the best ways to evaluate the data?

A

Median, range, interquartile range

21
Q

For normal distributions, what are the best ways to evaluate the data?

A

mean and standard deviation

22
Q

with a normal distribution, how much of the data should be within 1 SD of the mean?
Within 2 SD of the mean?

A

1 SD: 68.3%
2 SD: about 95% (95.4%)
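
These percentages can be reproduced from the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Share of the distribution lying within k standard deviations of the mean.
within_1_sd = z.cdf(1) - z.cdf(-1)
within_2_sd = z.cdf(2) - z.cdf(-2)
within_1_96_sd = z.cdf(1.96) - z.cdf(-1.96)

print(round(within_1_sd, 3))     # about 0.683
print(round(within_2_sd, 3))     # about 0.954
print(round(within_1_96_sd, 3))  # about 0.950
```

The exact 95% cutoff is 1.96 SD; 2 SD is the usual rounded rule of thumb.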

23
Q

process of using data obtained from a sample to make estimates about the characteristics of a population

A

statistical inference

24
Q

what is the basis of statistical inference?

A

random sampling

25
error that is due to chance and is not systematic
random error
26
with a large number of repeated samples, the distribution of sample means approaches a normal distribution
central limit theorem
27
standard deviation of a sampling distribution
standard error
28
what effect on standard error does a larger sample size have?
larger sample size = smaller standard error
29
95% of the sample means should be within how many units of standard error?
1.96
30
determine how closely the sample reflects the actual population - are 95% of the sample means within 1.96 SE units of the mean?
confidence intervals
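
The standard error and 95% confidence interval cards can be tied together in a short sketch. It assumes the usual formula SE = sample SD / sqrt(n) and the 1.96 multiplier from the cards; the measurements and the helper name `mean_with_ci95` are hypothetical. For small samples a t-based multiplier is generally preferred over 1.96.

```python
from math import sqrt
from statistics import mean, stdev

def mean_with_ci95(sample):
    """Return (mean, standard error, (lower, upper)) using the 1.96 SE rule."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))  # SE = sample SD / sqrt(n)
    return m, se, (m - 1.96 * se, m + 1.96 * se)

small = [4.8, 5.1, 5.0, 4.9, 5.2]  # hypothetical measurements
large = small * 4                  # same values repeated, only to show the effect of n

m_small, se_small, ci_small = mean_with_ci95(small)
m_large, se_large, ci_large = mean_with_ci95(large)

print(m_small, se_small, ci_small)
print(se_large < se_small)  # larger sample size = smaller standard error
```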
31
there is no difference in the outcome between variable groups
null hypothesis
32
there is a difference in the outcome between variable groups
alternative hypothesis
33
when the null hypothesis is true and you reject it | "you say there is a difference but there isn't" | false positive
Type I Error
34
the probability of making a Type I error | typically 0.05
alpha error
35
failing to reject the null hypothesis when there is a difference between groups | false negative
Type II Error/Beta Error
36
probability of correctly rejecting the null | 1 - beta
power
37
the greater the ability to NOT make a Type II error….
the larger the power
38
what increases the power of a study?
increased sample size | larger effect size | decreased variability in the sample data | increased alpha
39
If p < alpha
reject the null hypothesis | there is statistical significance
40
If p > alpha
fail to reject the null | not statistically significant
41
compares the means of a normally distributed continuous variable between two groups; determines if the means of the two groups are significantly different
T-test
42
what is the non-parametric version of a T-test?
Mann-Whitney U Test
43
1 way analysis of variance | compares distribution of continuous variables among more than 2 independent groups
ANOVA
44
Problem with ANOVA?
can determine there is statistical difference among groups but cannot tell which group is different
45
Nonparametric version of ANOVA
Kruskal-Wallis Test
46
compares ranks between groups rather than means | uses the H statistic
Kruskal-Wallis Test
47
T-test performed on a repeated-measures, two-group design | same thing measured on a patient at two different times - pre and post assessments
paired T-test
48
what data is used with paired T-test?
dependent normally distributed continuous variables
49
what statistical test do you use with dependent but ordinal data?
Wilcoxon Matched-Pairs Signed Ranks Test
50
scatterplots are an effective way to convey info for
2 continuous variables
51
this assesses linear relationships between 2 continuous variables
Pearson correlation coefficient
52
(+) Pearson correlation coefficient
positive linear relationship | as x increases y increases
53
in Pearson correlation coefficient, the closer r is to -1 or +1
the stronger the relationship
54
(-) Pearson correlation coefficient
negative linear relationship | as x increases y decreases
55
Pearson correlation coefficient = 0
no linear relationship
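
The sign and size of r can be seen on toy data. This is a sketch of the textbook Pearson formula written out by hand; `pearson_r` and the data lists are illustrative names and values, not from the cards.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance term divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # as x increases y increases
falling = [10, 8, 6, 4, 2]   # as x increases y decreases
v_shaped = [2, 1, 0, 1, 2]   # related to x, but not linearly

print(pearson_r(x, rising))    # close to +1
print(pearson_r(x, falling))   # close to -1
print(pearson_r(x, v_shaped))  # close to 0
```

The V-shaped case shows that r = 0 rules out a linear relationship only, not every relationship.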
56
compares association between rankings of 2 variables (non-parametric) rho
spearman correlation coefficient
57
sum of (observed value - expected value)^2 / expected value
Chi-square analysis
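
The chi-square statistic sums the card's expression over every category. A minimal sketch with made-up counts; `chi_square_statistic` is an illustrative helper, and turning the statistic into a p-value (which needs the chi-square distribution) is left out.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 50 subjects, expected to split evenly between two outcomes.
observed = [30, 20]
expected = [25, 25]

statistic = chi_square_statistic(observed, expected)
print(statistic)  # (5^2 / 25) + (5^2 / 25)
```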
58
if any expected cell count for a chi-square test is < 5, what test should you then do?
Fisher's exact test
59
Used to check the difference between two or more percentages or proportions of categorical outcomes
chi-square test
60
relationship between two variables that is due to the presence of unmeasured variables
confounding
61
what ways can you account for confounding?
stratified analysis | multivariable analysis
62
measure of the relationship between two continuous variables - represented by scatterplots
correlation
63
formula that forms a line - used to quantify a change in y based on a change in x
linear regression
64
3 multivariable methods used to account for confounding
multiple linear regression | logistic regression | proportional hazards modeling
65
This looks at relationship between multiple independent variables and a single dependent variable
Multiple linear regression
66
amount of variance in the dependent variable that is predicted from the independent variable
R^2
67
The closer R^2 is to 1 ….
the better the model
68
same thing as linear regression for multiple continuous and/or categorical variables - dichotomous outcome
logistic regression
69
likelihood that an outcome will occur based on changes in the variables
odds ratio
70
in logistic regression you evaluate the beta coefficient AND ___ for each independent variable
odds ratio
71
this looks at the relationship between multiple variables and the TIME to an event | How long does it take to get a certain outcome?
Cox proportional hazards analysis
72
independent variables in Cox proportional hazards analysis can be either
continuous or categorical
73
In Cox you evaluate the beta coefficient AND ___ for each independent variable
hazard ratio
74
compares the probability of an event occurring over a given time | chance of an event occurring in the treatment arm/chance in the control arm
hazard ratio
75
variation that occurs due to chance with random sampling - affects the study and control groups equally
random error
76
error disproportionately affects one group
bias
77
bias that is introduced in the way in which participants are assigned to groups
selection bias
78
error that is due to differences in the way data is collected
measurement bias
79
participants do not accurately recall information
recall bias
80
Proper study design can control
confounding and limit bias
81
probability is the expression of ___ | = # of times an event occurs/total # of opportunities for occurrence
risk
82
ratio of probability an event occurs vs probability that anything else occurs
odds
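
Risk and odds differ only in the denominator. A small sketch with hypothetical counts; `risk` and `odds` are illustrative helper names, and `Fraction` is used so the results stay exact.

```python
from fractions import Fraction

def risk(events, opportunities):
    # Probability: times the event occurs / total opportunities.
    return Fraction(events, opportunities)

def odds(events, opportunities):
    # Odds: times the event occurs / times anything else occurs.
    return Fraction(events, opportunities - events)

# Hypothetical: the event occurs in 20 of 100 opportunities.
print(risk(20, 100))  # 1/5
print(odds(20, 100))  # 1/4
```

Equivalently, odds = p / (1 - p): here 0.2 / 0.8 = 0.25.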
83
multiplication law of probability
prob of A and B = (prob of A)*(prob of B), when A and B are independent
84
addition law of probability
prob of A or B = (prob of A) + (prob of B) - (prob of (A and B))
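
Both laws can be checked by counting outcomes of a single fair die roll. This is an illustrative sketch; the events A (even roll) and B (roll above four) are chosen here because they happen to be independent.

```python
from fractions import Fraction

die = range(1, 7)  # one roll of a fair six-sided die

def prob(event):
    """Probability of an event, given as a predicate on a single roll."""
    return Fraction(sum(1 for roll in die if event(roll)), len(die))

def is_even(roll):      # event A
    return roll % 2 == 0

def above_four(roll):   # event B
    return roll > 4

p_a, p_b = prob(is_even), prob(above_four)
p_a_and_b = prob(lambda r: is_even(r) and above_four(r))
p_a_or_b = prob(lambda r: is_even(r) or above_four(r))

print(p_a_and_b == p_a * p_b)             # multiplication law (independent events)
print(p_a_or_b == p_a + p_b - p_a_and_b)  # addition law
```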
85
focuses on describing the distribution of health conditions
descriptive epidemiology
86
compares groups to test hypotheses regarding potential causes and contributing factors
analytic epidemiology
87
existing cases of a disease in a given population | = # with disease/total population
prevalence
88
number of people in a population who have a disease over a given time period
period prevalence
89
3 determinants of prevalence
incidence of disease | duration of disease | entry/exit of cases
90
in a stable population what is the formula for prevalence
incidence * duration
91
number of new cases of disease in a given population during a specific time period
incidence
92
proportion of at risk people who get the disease
attack rate
93
proportion of people diagnosed with a given condition who die due to that condition
case fatality ratio
94
proportion of all deaths in a given time period that are due to a specific condition
proportionate mortality
95
# of new events that occur during a time period/average population at risk
rate
96
the number of deaths per 1000
mortality rate
97
(total # of deaths/mid interval population) * 1000
crude mortality rate
98
types of observational epidemiological studies
cross sectional study | cohort study | case control study
99
types of experimental epidemiological studies
randomized controlled trial
100
describes prevalence of potential risk factors (exposures) and conditions (outcomes)
cross sectional study
101
strengths of cross sectional study
quick and not costly; useful for developing hypotheses
102
limits of cross sectional study
can't determine relationships between variables | late look bias (cases of disease are of longer duration)
103
group of people is followed over time to monitor the development of a disease or health condition | can be prospective or retrospective
cohort study
104
strengths of cohort study
useful for rare exposures | can study multiple outcomes | can measure risk of outcome in group
105
limits of cohort study
time and cost requirements | not as good for rare outcomes | loss to follow-up
106
ratio of incidence of disease in exposed persons to the risk in unexposed
relative risk
107
relative risk = 1
no difference in groups
108
relative risk > 1
exposed group has greater risk
109
relative risk < 1
exposed group has less risk
110
measure of how much disease is actually attributable to the risk factor
attributable risk
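
Relative risk and attributable risk can both be read off a cohort 2x2 table. A minimal sketch with hypothetical counts; it assumes the common definition of attributable risk as the risk difference (risk in exposed minus risk in unexposed), and `cohort_measures` is an illustrative name.

```python
from fractions import Fraction

def cohort_measures(a, b, c, d):
    """Assumed 2x2 layout:
    a = exposed with disease,   b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    """
    risk_exposed = Fraction(a, a + b)
    risk_unexposed = Fraction(c, c + d)
    relative_risk = risk_exposed / risk_unexposed
    attributable_risk = risk_exposed - risk_unexposed
    return relative_risk, attributable_risk

# Hypothetical cohort: 100 exposed (30 cases), 100 unexposed (10 cases).
rr, ar = cohort_measures(30, 70, 10, 90)
print(rr)  # 3, so the exposed group has greater risk
print(ar)  # 1/5, i.e. 20 extra cases per 100 exposed
```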
111
participants with a disease are compared to participants without a disease
case control study
112
strengths of case control study
useful for rare outcomes | able to study multiple exposures | less time and cost
113
limitations of case control study
not as good for rare exposures | potential for recall bias | challenges in selecting controls | can only ESTIMATE risk
114
estimates relative risk when disease is relatively uncommon
odds ratio
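
Why the odds ratio only approximates relative risk for uncommon disease can be shown with two made-up 2x2 tables. The layout and function names are assumptions for illustration; the odds ratio uses the standard cross-product ad/bc.

```python
from fractions import Fraction

# Assumed 2x2 layout: a = exposed cases, b = exposed non-cases,
#                     c = unexposed cases, d = unexposed non-cases.

def odds_ratio(a, b, c, d):
    return Fraction(a * d, b * c)

def relative_risk(a, b, c, d):
    return Fraction(a, a + b) / Fraction(c, c + d)

rare = (30, 970, 10, 990)      # disease in 3% vs 1% of each group
common = (300, 700, 100, 900)  # disease in 30% vs 10% of each group

for table in (rare, common):
    print(float(relative_risk(*table)), round(float(odds_ratio(*table)), 2))
```

Both tables have a relative risk of 3, but the odds ratio stays near 3 only in the rare-disease table.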
115
additional amount of disease that is present in a population because of the presence of a risk factor
population attributable risk
116
randomization minimizes
confounding
117
blinding minimizes
selection and measurement bias
118
# of patients who need to receive treatment to prevent one event from occurring
number needed to treat (NNT)
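
NNT is commonly computed as 1 / absolute risk reduction, rounded up to a whole patient. The card does not give a formula, so this sketch assumes that standard definition; the trial counts and the function name are hypothetical.

```python
from fractions import Fraction
from math import ceil

def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    # Absolute risk reduction: control event rate minus treated event rate.
    arr = Fraction(control_events, control_n) - Fraction(treated_events, treated_n)
    if arr <= 0:
        raise ValueError("treatment did not reduce the event rate")
    # Round up, since a fraction of a patient cannot be treated.
    return ceil(1 / arr)

# Hypothetical trial: 20/100 events in the control arm, 15/100 in the treatment arm.
print(number_needed_to_treat(20, 100, 15, 100))  # ARR = 0.05, so NNT = 20
```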