Biostats. Flashcards

1
Q

labels/names

A

nominal data

2
Q

data with only 2 outcomes

ex: yes or no

A

dichotomous data

3
Q

data that consists of names, labels or other nonnumerical data

A

categorical data

4
Q

uses labels in an order

ex: poor, fair, excellent

A

ordinal data

5
Q

data that can take any value within a range

ex: height, weight

A

continuous data

6
Q

values that are equally spaced, with no true zero point

ex: temperature in °C

A

interval data

7
Q

values that have a true zero point

ex: blood alcohol level

A

ratio data

8
Q

central tendency of continuous data is measured with…

A

mean, median, mode

9
Q

variation/spread of the data

A

dispersion

10
Q

with a normal (Gaussian) distribution, how do the mean, median, and mode relate?

A

mean = mode = median

11
Q

If data has a right/positive skew [tail is to the right], what does this mean?

A

Mean > Median

12
Q

If data has a left/negative skew [tail is to the left], what does this mean?

A

Mean < Median
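
The two skew cards can be checked with a minimal Python sketch. The sample values are made up for illustration; one large value creates the right tail, and negating the sample mirrors it into a left tail.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is very large.
right_skewed = [1, 2, 2, 3, 3, 4, 30]
# Mirror image of the same sample, which gives a left (negative) skew.
left_skewed = [-x for x in right_skewed]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

# The tail pulls the mean toward it, while the median barely moves.
print(right_mean > right_median)
print(left_mean < left_median)
```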

13
Q

Measures of dispersion

A

range, variance, standard deviation

14
Q

value below which a particular percentage of scores or observations fall

A

percentiles

15
Q

what does 95th percentile mean?

A

95% of values are below this number

16
Q

data from the 25th to 75th percentiles

A

interquartile range

17
Q

why is interquartile range used?

A

helps to ignore outliers

18
Q

average of the squared distances between each data point and the mean

A

variance

19
Q

square root of variance

larger = more spread out

A

standard deviation
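
The measures of dispersion on the cards above (range, variance, standard deviation, interquartile range) can be sketched with Python's standard `statistics` module. The sample is made up, and the quartiles use the module's default interpolation method, which is only one of several conventions.

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # made-up sample

data_range = max(values) - min(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)       # square root of the sample variance

# Quartile cut points: 25th, 50th (median), and 75th percentiles.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1                            # spread of the middle 50% of the data

print(data_range, variance, std_dev, iqr)
```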

20
Q

for skewed distributions, what are the best ways to evaluate the data?

A

Median, range, interquartile range

21
Q

For normal distributions, what are the best ways to evaluate the data?

A

mean and standard deviation

22
Q

with a normal distribution, how much of the data should be within 1 SD of the mean?
Within 2 SD of the mean?

A

1 SD: 68.3%
2 SD: about 95% (95.4%)
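
These coverage figures can be reproduced from the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8+). It also shows where the 1.96 cutoff for an exact 95% comes from.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

within_1_sd = z.cdf(1) - z.cdf(-1)   # share of values within 1 SD of the mean
within_2_sd = z.cdf(2) - z.cdf(-2)   # share of values within 2 SD of the mean
z_for_95 = z.inv_cdf(0.975)          # cutoff that leaves 2.5% in each tail

print(round(within_1_sd, 3), round(within_2_sd, 3), round(z_for_95, 2))
```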

23
Q

process of using data obtained from a sample to make estimates about the characteristics of a population

A

statistical inference

24
Q

what is the basis of statistical inference?

A

random sampling

25
Q

error that is due to chance and is not systematic

A

random error

26
Q

with a large number of repeated samples, the distribution of the sample means approaches a normal distribution

A

central limit theorem

27
Q

standard deviation of a sampling distribution

A

standard error

28
Q

what effect on standard error does a larger sample size have?

A

larger sample size = smaller standard error

29
Q

95% of the sample means should be within how many units of standard error of the population mean?

A

1.96

30
Q

determines how closely the sample reflects the actual population - are 95% of the sample means within 1.96 SE units of the population mean?

A

confidence intervals
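
A rough sketch of how standard error and a 95% confidence interval fit together, using made-up readings. It assumes the normal cutoff of 1.96 from the cards above; for a sample this small, a t-distribution cutoff would usually be preferred.

```python
import math
import statistics

sample = [120, 132, 118, 125, 140, 128, 135, 122, 130, 127]  # made-up readings

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)
se = sd / math.sqrt(n)            # standard error of the mean

ci_low = mean - 1.96 * se         # lower bound of the 95% confidence interval
ci_high = mean + 1.96 * se        # upper bound of the 95% confidence interval

# Larger sample size = smaller standard error:
# with the same SD, four times the sample size halves the SE.
se_4n = sd / math.sqrt(4 * n)

print(ci_low, ci_high, se, se_4n)
```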

31
Q

there is no difference in the outcome between variable groups

A

null hypothesis

32
Q

there is a difference in the outcome between variable groups

A

alternative hypothesis

33
Q

when the null hypothesis is true and you reject it
“you say there is a difference but there isn’t”
false positive

A

Type I Error

34
Q

the probability of making a Type I error

typically 0.05

A

alpha (significance level)

35
Q

failing to reject the null hypothesis when there is a difference between groups
false negative

A

Type II Error/Beta Error

36
Q

probability of correctly rejecting the null

1 - beta

A

power

37
Q

the greater the ability to NOT make a Type II error….

A

the larger the power

38
Q

what increases the power of a study?

A

increased sample size
larger effect size
decreased variability in the sample data
increased alpha
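
The four factors above can be illustrated with a normal-approximation power calculation. This is a sketch under stated assumptions, not a substitute for a proper power analysis: it assumes a two-sided comparison of two equal-sized groups with a known common SD. The function name `approx_power` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(n_per_group, effect, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test."""
    z = NormalDist()
    se = sd * math.sqrt(2 / n_per_group)   # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)      # rejection cutoff for the chosen alpha
    # Ignores the negligible chance of rejecting in the wrong direction.
    return z.cdf(abs(effect) / se - z_crit)


baseline = approx_power(n_per_group=50, effect=5, sd=15)
bigger_n = approx_power(n_per_group=200, effect=5, sd=15)
bigger_effect = approx_power(n_per_group=50, effect=10, sd=15)
less_variability = approx_power(n_per_group=50, effect=5, sd=10)
bigger_alpha = approx_power(n_per_group=50, effect=5, sd=15, alpha=0.10)

print(baseline, bigger_n, bigger_effect, less_variability, bigger_alpha)
```

Each of the four variations gives a higher power than the baseline.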

39
Q

If p < alpha

A

reject the null hypothesis

there is statistical significance

40
Q

If p > alpha

A

fail to reject the null

not statistically significant

41
Q

compares the means of a normally distributed continuous variable between two groups;
determines whether the means of the two groups differ significantly

A

T-test
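
A minimal sketch of the test statistic for two independent groups, with made-up values. The card does not say which variant is meant; this assumes Welch's form, which does not require equal variances. Turning the statistic into a p-value needs the t-distribution, for example via `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)`.

```python
import math
import statistics

group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]  # made-up measurements
group_b = [4.2, 4.8, 4.5, 5.0, 4.4, 4.6]

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Welch's t statistic: difference in means divided by its standard error.
se_diff = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
t_stat = (mean_a - mean_b) / se_diff

print(t_stat)
```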

42
Q

what is the non-parametric version of a T-test?

A

Mann-Whitney U Test

43
Q

1 way analysis of variance

compares distribution of continuous variables among more than 2 independent groups

A

ANOVA

44
Q

Problem with ANOVA?

A

can determine there is statistical difference among groups but cannot tell which group is different

45
Q

Nonparametric version of ANOVA

A

Kruskal-Wallis Test

46
Q

compares ranks between groups rather than means

uses the H statistic

A

Kruskal-Wallis Test

47
Q

T-test performed on a repeated-measures, two-group design

same thing measured on a patient at two different times - pre and post assessments

A

paired T-test

48
Q

what data is used with paired T-test?

A

dependent
normally distributed
continuous variables

49
Q

what statistical test do you use with dependent but ordinal data?

A

Wilcoxon Matched-Pairs Signed Ranks Test

50
Q

scatterplots are an effective way to convey info for

A

2 continuous variables

51
Q

this assesses linear relationships between 2 continuous variables

A

Pearson correlation coefficient

52
Q

(+) Pearson correlation coefficient

A

positive linear relationships

as x increases y increases

53
Q

in Pearson correlation coefficient, the closer r is to -1 or +1

A

the stronger the relationship

54
Q

(-) Pearson correlation coefficient

A

negative linear relationship

as x increases y decreases

55
Q

Pearson correlation coefficient = 0

A

no linear relationship
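
The Pearson cards above can be checked with a short, hand-rolled calculation. The helper name `pearson_r` and the data are made up for illustration. Note that r = 0 only rules out a linear relationship; the third example is perfectly U-shaped yet has r = 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises as x rises
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises
r_zero = pearson_r(x, [4, 1, 0, 1, 4])   # curved relationship, not linear

print(r_pos, r_neg, r_zero)
```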

56
Q

compares association between rankings of 2 variables (non-parametric)
rho

A

Spearman correlation coefficient

57
Q

sum over all cells of (observed value - expected value)^2 / expected value

A

Chi-square analysis
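
The statistic adds up (observed - expected)^2 / expected over every cell of the table. A minimal sketch for a made-up 2x2 table, where each expected count is computed as row total * column total / grand total:

```python
# Made-up 2x2 table: rows are groups, columns are outcome yes / no.
observed = [[30, 20],
            [15, 35]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count expected in this cell if group and outcome were unrelated.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(chi_square)
```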

58
Q

if any expected cell count for the chi-square test is < 5, what test should you do instead?

A

Fisher's exact test

59
Q

Used to check the difference between two or more percentages or proportions of categorical outcomes

A

chi-square test

60
Q

relationship between two variables that is due to the presence of unmeasured variables

A

confounding

61
Q

what ways can you account for confounding?

A

stratified analysis

multivariable analysis

62
Q

measure of the relationship between two continuous variables - represented by scatterplots

A

correlation

63
Q

formula that forms a line - used to quantify a change in y based on a change in x

A

linear regression

64
Q

3 ways to adjust for confounding with multivariable analysis

A

multiple linear regression
logistic regression
proportional hazards modeling

65
Q

This looks at relationship between multiple independent variables and a single dependent variable

A

Multiple linear regression

66
Q

proportion of variance in the dependent variable that is predicted from the independent variable(s)

A

R^2

67
Q

The closer R^2 is to 1 ….

A

the better the model

68
Q

equivalent of linear regression for a dichotomous outcome - handles multiple continuous and/or categorical independent variables

A

logistic regression

69
Q

likelihood that an outcome will occur based on changes in the variables

A

odds ratio

70
Q

in logistic regression you evaluate the beta coefficient AND ___ for each independent variable

A

odds ratio

71
Q

this looks at the relationship between multiple variables and the TIME to an event
How long does it take to get a certain outcome?

A

Cox proportional hazards analysis

72
Q

independent variables in Cox proportional hazards analysis can be either

A

continuous or categorical

73
Q

In Cox you evaluate the beta coefficient AND ___ for each independent variable

A

hazard ratio

74
Q

compares the probability of an event occurring over a given time
chance of an event occurring in the treatment arm/chance in the control arm

A

hazard ratio

75
Q

variation that occurs due to chance with random sampling - affects the study and control groups equally

A

random error

76
Q

error that disproportionately affects one group

A

bias

77
Q

bias that is introduced in the way in which participants are assigned to groups

A

selection bias

78
Q

error that is due to differences in the way data is collected

A

measurement bias

79
Q

participants do not accurately recall information

A

recall bias

80
Q

Proper study design can control

A

confounding and limit bias

81
Q

probability is the expression of ___

= # of times an event occurs/total # of opportunities for occurrence

A

risk

82
Q

ratio of probability an event occurs vs probability that anything else occurs

A

odds

83
Q

multiplication law of probability

A

prob of A and B = (prob of A) * (prob of B), when A and B are independent

84
Q

addition law of probability

A

prob of A or B = (prob of A) + (prob of B) - (prob of A and B)
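
A small worked example of the two probability laws, plus the odds from the earlier card. It assumes two independent events, one roll of a fair die and one flip of a fair coin. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Made-up independent events: a fair die roll and a fair coin flip.
p_a = Fraction(1, 6)   # A: the die shows a six
p_b = Fraction(1, 2)   # B: the coin lands heads

p_a_and_b = p_a * p_b               # multiplication law (independent events)
p_a_or_b = p_a + p_b - p_a_and_b    # addition law

odds_a = p_a / (1 - p_a)            # odds = probability / (1 - probability)

print(p_a_and_b, p_a_or_b, odds_a)
```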

85
Q

focuses on describing the distribution of health conditions

A

descriptive epidemiology

86
Q

compares groups to test hypotheses regarding potential causes and contributing factors

A

analytic epidemiology

87
Q

existing cases of a disease in a given population

= # with disease/total population

A

prevalence

88
Q

number of people in a population who have a disease over a given time period

A

period prevalence

89
Q

3 determinants of prevalence

A

incidence of disease
duration of disease
entry/exit of cases

90
Q

in a stable population what is the formula for prevalence

A

incidence * duration

91
Q

number of new cases of disease in a given population during a specific time period

A

incidence

92
Q

proportion of at risk people who get the disease

A

attack rate

93
Q

proportion of people diagnosed with a given condition who die due to that condition

A

case fatality ratio

94
Q

proportion of all deaths in a given time period that are due to a specific condition

A

proportionate mortality

95
Q

# of new events that occur during a time period/average population at risk

A

rate

96
Q

the number of deaths per 1000

A

mortality rate

97
Q

(total # of deaths/mid interval population) * 1000

A

crude mortality rate

98
Q

types of observational epidemiological studies

A

cross sectional study
cohort study
case control study

99
Q

types of experimental epidemiological studies

A

randomized controlled trial

100
Q

describes prevalence of potential risk factors (exposures) and conditions (outcomes)

A

cross sectional study

101
Q

strengths of cross sectional study

A

quick and not costly; useful for developing hypotheses

102
Q

limits of cross sectional study

A

can’t determine cause-and-effect relationships between variables

late-look bias (detected cases tend to be those of longer duration)

103
Q

group of people are followed over time to monitor the development of disease or health condition
can be prospective or retrospective

A

cohort study

104
Q

strengths of cohort study

A

useful for rare exposure
can study multiple outcomes
can measure risk of outcome in group

105
Q

limits of cohort study

A

time and cost requirements
not as good for rare outcomes
loss to follow-up

106
Q

ratio of the risk (incidence) of disease in exposed persons to the risk in unexposed persons

A

relative risk

107
Q

relative risk = 1

A

no difference in groups

108
Q

relative risk > 1

A

exposed group has greater risk

109
Q

relative risk < 1

A

exposed group has less risk

110
Q

measure of how much disease is actually attributable to the risk factor

A

attributable risk
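
Relative risk and attributable risk can both be read off a cohort's 2x2 counts. A minimal sketch with made-up numbers, taking attributable risk as the risk difference between the exposed and unexposed groups:

```python
# Made-up cohort counts of disease by exposure status.
exposed_cases, exposed_total = 30, 100
unexposed_cases, unexposed_total = 10, 100

risk_exposed = exposed_cases / exposed_total         # incidence in the exposed
risk_unexposed = unexposed_cases / unexposed_total   # incidence in the unexposed

relative_risk = risk_exposed / risk_unexposed        # > 1: exposed group has greater risk
attributable_risk = risk_exposed - risk_unexposed    # excess risk linked to the exposure

print(relative_risk, attributable_risk)
```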

111
Q

participants with a disease are compared to participants without a disease

A

case control study

112
Q

strengths of case control study

A

useful for rare outcomes
able to study multiple exposures
less time and cost

113
Q

limitations of case control study

A

not as good for rare exposures
potential for recall bias
challenges in selecting controls
can only ESTIMATE risk

114
Q

estimates relative risk when disease is relatively uncommon

A

odds ratio
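
A quick numerical check of why the odds ratio only approximates relative risk when the disease is uncommon. The helper name and the counts are hypothetical; both tables are built to have a relative risk of 2.

```python
def risk_and_odds_ratio(a, b, c, d):
    """a, b = exposed with / without disease; c, d = unexposed with / without disease."""
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio


# Rare disease: 0.4% of the exposed vs 0.2% of the unexposed.
rare_rr, rare_or = risk_and_odds_ratio(4, 996, 2, 998)
# Common disease: 40% of the exposed vs 20% of the unexposed.
common_rr, common_or = risk_and_odds_ratio(400, 600, 200, 800)

print(rare_rr, rare_or)      # odds ratio stays close to the relative risk
print(common_rr, common_or)  # odds ratio overstates the relative risk
```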

115
Q

additional amount of disease that is present in a population because of the presence of a risk factor

A

population attributable risk

116
Q

randomization minimizes

A

confounding

117
Q

blinding minimizes

A

selection and measurement bias

118
Q

# of patients who need to receive treatment to prevent one event from occurring

A

number needed to treat (NNT)
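
NNT is commonly computed as 1 / absolute risk reduction. A minimal sketch with made-up event rates:

```python
import math

control_event_rate = 0.20     # made-up: 20% of control patients have the event
treatment_event_rate = 0.15   # made-up: 15% of treated patients have the event

absolute_risk_reduction = control_event_rate - treatment_event_rate
nnt = 1 / absolute_risk_reduction

# NNT is conventionally rounded up to a whole patient; the inner round()
# guards against floating-point noise before taking the ceiling.
nnt_rounded = math.ceil(round(nnt, 6))

print(nnt_rounded)
```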