Biostatistics Flashcards

1
Q

Lecture objectives

A

1) Review sampling, variables, and basic descriptive statistics (including measures of location and measures of spread)
2) Review the statistical distributions that are commonly used in biostatistics
3) Understand the application and usefulness of the t-test and the Chi-squared test
4) Understand the usefulness of the 2x2 contingency table and how to derive various epidemiological measures of association from it

2
Q

Lecture objectives part 2

A

5) Understand the usefulness and application of the 2x2 contingency table as a means to derive sensitivity, specificity, and related measures
6) Review scatter plots and their relation to the correlation coefficient
7) Understand the basics of simple linear regression, multiple linear regression, logistic regression and meta-analysis
8) Understand how to read and interpret a multiple linear regression table

3
Q

Who was the first African American professional to practice in Memphis, known as the "hero of the yellow fever epidemic"?

A

Dr. R H Tate

4
Q

What was used as a yellow fever hospital?

A

Peabody Hotel

5
Q

Why do I need to know this stuff?

A

prevalent in research

6
Q

What is a population?

A

the collection of persons or things to which we want to generalize a set of findings; the largest collection of persons in which we have an interest at a particular time

7
Q

What is a sample?

A

part of a population; a smaller collection of persons or things drawn from a population and used to draw general conclusions about that population

8
Q

What is a variable?

A

a characteristic that takes on different values in different persons, places, or things

9
Q

What are variable descriptors?

A

numeric, categorical, dichotomous

10
Q

What is a numeric variable?

A

a variable that has values that describe a measurable quantity as a number

11
Q

What are the 2 categories of numeric variables?

A

discrete and continuous

12
Q

What is discrete?

A

a numeric variable that can only take on certain values and is characterized by gaps or interruptions in the values it can assume, usually integers; ex: patients seen in a day, number of medications

13
Q

What is continuous?

A

a numeric variable that can technically be measured with unlimited precision and that is not characterized by gaps in the values it can assume; ex: intraocular pressure (IOP)

14
Q

What is a categorical variable?

A

a variable that is made up of groups of objects and that names distinct entities

15
Q

What are two categories of categorical variables?

A

ordered and unordered

16
Q

What is an ordered (aka ordinal) variable?

A

a categorical variable whose values can be placed in a logical order, sequence, or rank; ex: exercise

17
Q

What is an unordered (aka nominal) variable?

A

a categorical variable with a value that is not able to be organized in a logical order, sequence or rank ex: iris color

18
Q

What is dichotomous?

A

a variable that consists of only two categories ex: diabetic or not diabetic

19
Q

What is an independent variable?

A

the variable that is manipulated by the experimenter and that does not depend on any other variable; aka predictor variable

20
Q

What is a dependent variable?

A

the variable that is not manipulated by the experimenter and that depends on the independent variable; aka outcome variable

21
Q

What are descriptive statistics/measures of location?

A

mean, median, mode

22
Q

What are descriptive statistics/measures of spread?

A

range, variance, standard deviation
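
As a quick illustration of the measures of location and spread named on the two cards above, here is a minimal sketch using Python's standard `statistics` module. The IOP readings are made-up values chosen for the example, not data from the lecture.

```python
import statistics

# Hypothetical IOP readings in mmHg (illustrative values only)
iop = [12, 14, 15, 15, 16, 17, 19]

# Measures of location
mean_iop = statistics.mean(iop)      # arithmetic average
median_iop = statistics.median(iop)  # middle value of the sorted data
mode_iop = statistics.mode(iop)      # most frequent value

# Measures of spread
range_iop = max(iop) - min(iop)          # largest minus smallest value
variance_iop = statistics.variance(iop)  # sample variance (divides by n - 1)
sd_iop = statistics.stdev(iop)           # sample SD, the square root of the variance

print(mean_iop, median_iop, mode_iop)
print(range_iop, variance_iop, sd_iop)
```

Note that `statistics.variance` and `statistics.stdev` are the sample versions; `pvariance` and `pstdev` would be used if the data were the whole population.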

23
Q

What is the normal IOP and the mean IOP?

A

normal range 10-21 mmHg; mean 15.5 mmHg

24
Q

What is the standard deviation for IOP?

A

2.75 mmHg

25
What percent of the population falls within 1 standard deviation (SD) of the mean in a normal distribution?
68%
26
What percent of the population falls within 2 standard deviations of the mean?
95%
27
What percent of the population falls within 3 SDs of the mean?
99.7%
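
The three percentages above can be checked numerically, and they also tie back to the IOP cards: mean plus or minus 2 SD is 15.5 ± 5.5, which gives the 10-21 mmHg normal range. The sketch below assumes IOP is normally distributed with the mean and SD from those cards, which is a simplification made for illustration.

```python
from statistics import NormalDist

# Assumption: IOP follows a normal distribution with the card values
iop = NormalDist(mu=15.5, sigma=2.75)

def share_within(k):
    """Proportion of a normal population within k SDs of the mean."""
    upper = iop.cdf(iop.mean + k * iop.stdev)
    lower = iop.cdf(iop.mean - k * iop.stdev)
    return upper - lower

for k in (1, 2, 3):
    low = iop.mean - k * iop.stdev
    high = iop.mean + k * iop.stdev
    print(f"within {k} SD: {low:.2f} to {high:.2f} mmHg, {share_within(k):.1%}")
```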
28
What are noteworthy distribution examples?
normal and t (there are many many distributions)
29
What is normal distribution?
symmetrical with a central peak ("bell curve"); defined solely and completely by the mean and the standard deviation/variance
30
What is t distribution?
similar in appearance to normal distribution; utilizes degrees of freedom (distribution changes with number of degrees of freedom)
31
The smaller the degree of freedom...
the lower the peak and the heavier (higher) the tails
32
Where does a t distribution approach normal distribution?
approaches normal distribution with degrees of freedom greater than 30
33
What allows us to make inferences based on small sample sizes?
t distribution
34
What are "other" distributions?
chi-square, binomial, Poisson
36
What is the P-value?
the probability of observing data at least as extreme as the data actually observed, given that the null hypothesis is true
37
If the p value is larger than the pre-determined criteria, then we...
do not have evidence to reject the null hypothesis (aka the data is consistent with the null hypothesis)
38
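What is the significance level (the cutoff for the p-value) usually set at?
0.05, aka 5% (the area lying beyond about 2 SDs from the mean of a normal distribution)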
39
A p-value is the probability of an observation...
arising by chance alone, assuming the null hypothesis is true
41
What is the t-test used for?
to test whether two group means are different
42
If p value of trial is higher than chosen p value...
you cannot reject null hypothesis
43
If p value of trial is lower than chosen p value...
you can reject the null hypothesis
44
What p value is more conservative?
0.01
45
When is an independent t test used?
used when there are two experimental conditions w/ different participants assigned to each condition
46
What does an independent t test show?
establishes whether two means collected from independent samples differ significantly
47
What are other names for independent t test?
independent measures or independent samples t test
48
When is a dependent t test used?
used when there are two experimental conditions w/ same participants assigned to each condition
49
What does a dependent t test establish?
whether two means collected from the same sample differ significantly
50
What are other names for dependent t test?
matched pairs or paired samples t test
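
To make the independent versus dependent distinction concrete, here is a minimal sketch that computes both t statistics with the standard library. The function names and IOP values are hypothetical, and the independent version assumes equal variances in the two groups (the classic pooled Student's t test). Each function returns the t statistic and its degrees of freedom, which would then be compared against a t table to obtain a p-value.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t for two independent samples (assumes equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, n_a + n_b - 2

def paired_t(first, second):
    """Paired t for the same participants measured under two conditions."""
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical IOP values in mmHg
old_drug = [18, 20, 19, 21, 22]
new_drug = [16, 17, 18, 17, 19]

# Treat the lists as two different groups of patients
t_ind, df_ind = independent_t(old_drug, new_drug)
# Treat the lists as the same five patients measured on each drug
t_pair, df_pair = paired_t(old_drug, new_drug)

print(t_ind, df_ind)
print(t_pair, df_pair)
```

With the same numbers, the paired analysis gives a larger t statistic here because differencing removes the between-patient variability.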
51
What can be derived from a 2x2 contingency table?
cumulative incidence, relative risk, odds, odds ratio, chi squared test for independence, attributable risk, population attributable risk
52
What is relative risk?
aka risk ratio (RR); compares the risk of a health event (disease, injury, risk factor, or death) among one group with the risk among another group
53
What is odds ratio?
OR compares the odds of a health event (disease, injury, risk factor, or death) among one group with odds among another group
54
What general 2 things is a 2x2 contingency table comparing?
exposures and outcomes
55
What are the two most widely used measures of association in epidemiology?
relative risk and odds ratio
56
What measure of association does a cohort study use?
relative risk
57
What measure of association does a case-control study use?
odds ratio assuming incidence is not known
58
T/F the odds ratio always underestimates the relative risk
false; the odds ratio exaggerates the RR (it lies farther from 1, so it overestimates when RR is greater than 1), and the exaggeration is greatest when the outcome is common
59
When may relative risk and odds ratio be close/similar?
when the outcome is rare
60
What is a chi-squared test for independence?
tests the association between categorical variables using chi-squared distribution
61
What is the cumulative incidence in the exposed?
a/(a+b)
62
What is the cumulative incidence in the unexposed?
c/(c+d)
63
What is the relative risk for the outcome?
(a/(a+b)) / (c/(c+d)), aka cumulative incidence in the exposed divided by cumulative incidence in the unexposed
64
What is the odds in the exposed?
a/b
65
What is the odds in the unexposed?
c/d
66
What is the odds ratio?
(a/b) / (c/d) = ad/bc, aka the cross-product of the table (odds in the exposed divided by odds in the unexposed)
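
The formulas on the preceding cards can be collected into one small function. This is a sketch under the conventional cell layout (a = exposed with the outcome, b = exposed without, c = unexposed with, d = unexposed without), which the cards imply but do not spell out. The function name and the counts are hypothetical.

```python
def two_by_two_measures(a, b, c, d):
    """Measures of association from a 2x2 table.

    Assumed layout:
                   outcome   no outcome
      exposed         a          b
      unexposed       c          d
    """
    ci_exposed = a / (a + b)      # cumulative incidence in the exposed
    ci_unexposed = c / (c + d)    # cumulative incidence in the unexposed
    odds_exposed = a / b
    odds_unexposed = c / d
    return {
        "ci_exposed": ci_exposed,
        "ci_unexposed": ci_unexposed,
        "relative_risk": ci_exposed / ci_unexposed,
        "odds_exposed": odds_exposed,
        "odds_unexposed": odds_unexposed,
        "odds_ratio": (a * d) / (b * c),
    }

# Hypothetical counts: a common outcome, then a rare one
common = two_by_two_measures(a=30, b=70, c=10, d=90)
rare = two_by_two_measures(a=3, b=997, c=1, d=999)

print(common["relative_risk"], common["odds_ratio"])
print(rare["relative_risk"], rare["odds_ratio"])
```

In the common-outcome table the odds ratio is noticeably larger than the relative risk, while in the rare-outcome table the two are nearly equal, which matches the cards on when RR and OR are similar.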
67
What do you do with the chi-squared "statistic"?
look up the corresponding p-value in a chi-squared table (using the degrees of freedom)
68
If P value is less than .05 what happens?
reject the null hypothesis
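
As a sketch of what the table lookup is doing, the function below computes the Pearson chi-squared statistic for a 2x2 table and its p-value. It assumes no continuity correction, and it relies on the fact that with 1 degree of freedom (the case for a 2x2 table) the upper-tail probability can be written with the complementary error function. The function name and the counts are hypothetical.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for independence on a 2x2 table."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if exposure and outcome were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts
chi2, p_value = chi_squared_2x2(a=30, b=70, c=10, d=90)
print(chi2, p_value)
if p_value < 0.05:
    print("reject the null hypothesis of independence")
```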
69
Type I error
occurs when one rejects the null hypothesis when the null hypothesis is actually true, aka rejection of a true null hypothesis (a false-positive conclusion; flagged in lecture as the "bad" error)
70
Optometry Type I error example
you conclude that a new glaucoma drug lowers IOP better than an old glaucoma drug, when in fact it does not
71
Type II error
occurs when one fails to reject the null hypothesis when the alternative hypothesis is actually true, aka not rejecting a false null hypothesis
72
Optometry Type II error example
you conclude that a new glaucoma drug does not lower IOP better than an old glaucoma drug when in fact it does
73
False positive VF
patient says it's there but it's not; field may look better than it actually is
74
False negative VF
patient says it's not there but it is
75
Sensitivity
the proportion of subjects with the target condition who have a positive test result, aka true positive / (true positive + false negative)
76
Specificity
the proportion of subjects without the target condition who have a negative test result, aka true negative / (true negative + false positive)
77
Positive predictive value
the proportion of subjects who test positive who actually have the target condition, aka true positive / (true positive + false positive)
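
The three definitions above translate directly into code. This is a minimal sketch with a hypothetical function name and made-up counts for a screening test.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, and PPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with the condition
        "specificity": tn / (tn + fp),  # negatives among those without it
        "ppv": tp / (tp + fp),          # true cases among those who test positive
    }

# Hypothetical counts: 100 subjects with the condition, 200 without
result = diagnostic_measures(tp=80, fp=30, fn=20, tn=170)
print(result)
```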
78
Correlation coefficient
a summary value used to assess the strength of the correlation between two continuous variables
79
What is the most commonly used correlation coefficient?
Pearson's correlation coefficient "r"
80
What does a higher correlation mean?
two variables are changing together
81
Review scatter plots for various r values
r = 1.00: the points fall on a straight line with a positive slope
82
What does a larger absolute value of r mean?
stronger correlation
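
For reference, Pearson's r can be computed from its definition: the sum of the cross-products of the deviations from each mean, divided by the square root of the product of the two sums of squared deviations. The sketch below uses a hypothetical function name and made-up data to show a perfect positive line, a perfect negative line, and a noisier positive relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
r_perfect = pearson_r(x, [2, 4, 6, 8, 10])   # points on a rising straight line
r_inverse = pearson_r(x, [10, 8, 6, 4, 2])   # points on a falling straight line
r_noisy = pearson_r(x, [2, 5, 4, 8, 7])      # rising trend with scatter

print(r_perfect, r_inverse, r_noisy)
```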
83
T/F correlation = causation
false, correlation does not equal causation
84
What is simple regression?
a linear model in which one outcome is predicted from a single predictor variable (an expansion of the correlation coefficient)
85
What is the equation of a line?
y = mx + b
86
In the equation of a line, what is y?
dependent variable
87
In the equation of a line, what is x?
independent variable
88
What is multiple regression?
a linear model in which one outcome is predicted from two or more predictor variables (expansion of simple regression)
89
Constant
the value of the dependent variable in a regression equation when all of the independent variables equal zero, aka baseline level
90
What is the constant graphically?
the y-intercept, the point at which the regression line crosses the y-axis
91
Beta-coefficient
the degree of change in the dependent variable for every 1-unit change in a particular independent variable (holding any other independent variables constant)
92
Example of beta-coefficient b1=0.2001
this means a one unit increase in x is associated with a 0.2001 unit increase in y
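
To connect the line equation, the constant, and the beta-coefficient, here is a minimal ordinary least squares sketch for a single predictor. The function name and data are hypothetical; the data were chosen so that the fitted slope is 0.2 and the constant is 1.0.

```python
def fit_line(xs, ys):
    """Least squares fit of y = m*x + b; returns (slope m, intercept b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cross / ss_x                 # beta-coefficient: change in y per unit of x
    intercept = mean_y - slope * mean_x  # constant: predicted y when x is 0
    return slope, intercept

# Hypothetical data lying exactly on a line
x = [0, 1, 2, 3, 4]
y = [1.0, 1.2, 1.4, 1.6, 1.8]
slope, intercept = fit_line(x, y)
print(slope, intercept)
```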
93
Coefficient P value
tells us whether or not an independent variable is statistically significant
94
R^2 coefficient of determination
a way to measure how well the linear regression line fits the data; the proportion of the variance in the dependent variable that can be explained by the independent variables
95
What does a coefficient of determination range between?
0 to 1; 0 indicates the response variable cannot be explained by the predictor variable at all, and 1 indicates it is explained perfectly
96
Standard error
measures how well the linear regression line fits the data, the average distance that the observed values fall from the regression line
97
What does a smaller standard error mean?
the model fits the data better
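
Both fit measures can be computed from the residuals of a fitted line. The sketch below assumes "standard error" on the cards above means the standard error of the regression (the residual standard error), since that is what the "average distance from the regression line" wording describes. The function name and data are hypothetical.

```python
import math

def regression_fit_quality(xs, ys):
    """Fit y = m*x + b by least squares; return (R^2, residual standard error)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    predicted = [slope * x + intercept for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained variation
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                # total variation
    r_squared = 1 - ss_res / ss_tot
    # n - 2 because a simple regression estimates two parameters
    std_error = math.sqrt(ss_res / (n - 2))
    return r_squared, std_error

# Hypothetical data with some scatter around a rising line
r_squared, std_error = regression_fit_quality([1, 2, 3, 4, 5], [2, 5, 4, 8, 7])
print(r_squared, std_error)
```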
98
What is useful for calculating the p-value and the confidence interval for its corresponding coefficient?
standard error
99
Logistic regression
used when there is no linear relationship between x and y (or between x and the probability of the outcome)
100
What does a logistic regression model?
the log odds of the outcome; on the log-odds scale the relationship is linear
101
What does the formula for logistic regression do?
the formula predicts the log odds of the dependent variable taking on a value of 1, which is converted to the probability that a given observation takes on a value of 1; a predetermined probability threshold is then used to classify the observation as either 1 or 0
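
The steps on this card can be sketched as follows: compute the linear prediction on the log-odds scale, convert it to a probability with the logistic function, then apply a threshold. The intercept, the coefficient, and the 0.5 threshold are hypothetical values picked for illustration, not estimates from any study.

```python
import math

def predict_probability(x, intercept, beta):
    """Probability that the outcome equals 1 for predictor value x."""
    log_odds = intercept + beta * x        # linear on the log-odds scale
    return 1 / (1 + math.exp(-log_odds))   # logistic function maps log odds to (0, 1)

def classify(x, intercept, beta, threshold=0.5):
    """Label the observation 1 if its probability meets the threshold, else 0."""
    return 1 if predict_probability(x, intercept, beta) >= threshold else 0

# Hypothetical coefficients
intercept, beta = -4.0, 0.2

for x in (10, 20, 30):
    p = predict_probability(x, intercept, beta)
    print(x, round(p, 3), classify(x, intercept, beta))

# exp(beta) is the odds ratio for a 1-unit increase in x
odds_ratio_per_unit = math.exp(beta)
print(odds_ratio_per_unit)
```

The last line shows why the model pairs naturally with odds ratios: exponentiating a coefficient gives the odds ratio for a one-unit increase in that predictor.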
102
Continuous output use
linear regression
103
When is logistic regression popular?
in epidemiology because odds ratio is the natural parameter estimated in a case control study
104
Categorical output use
logistic regression
105
Meta-analysis when and why
used to combine results from different studies to see if overall effect is significant, makes the equivalent of one large study, often used when there are multiple studies with conflicting results
106
Meta-analysis how
decide which studies to include and exclude using objective criteria, find all the studies on the subject, extract the required info, do the meta-analysis statistic, interpret the results
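
The card does not say which meta-analysis statistic is meant. One common choice, shown here purely as an assumed example, is fixed-effect inverse-variance pooling: each study's effect estimate is weighted by 1 divided by its squared standard error, so more precise studies count for more. The function name, effect sizes, and standard errors below are hypothetical.

```python
import math

def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted pooled estimate with an approximate 95% CI."""
    weights = [1 / se ** 2 for se in std_errors]  # precise studies get more weight
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci

# Hypothetical log odds ratios and standard errors from three studies
effects = [0.5, 0.2, 0.8]
std_errors = [0.25, 0.2, 0.5]

pooled, pooled_se, ci = fixed_effect_pool(effects, std_errors)
print(pooled, pooled_se, ci)
```

If the interval excludes 0 on the log scale, the combined effect is statistically significant at the 5% level under this model.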