Biostatistics Flashcards

1
Q

Lecture objectives

A

1) Review sampling, variables, and basic descriptive statistics (including measures of location and measures of spread)
2) Review the statistical distributions that are commonly used in biostatistics
3) Understand the application and usefulness of the t-test and the chi-squared test
4) Understand the usefulness of the 2x2 contingency table and how to derive various epidemiological measures of association from it

2
Q

Lecture objectives part 2

A

5) Understand the usefulness and application of the 2x2 contingency table as a means to derive sensitivity, specificity, and related measures
6) Review scatter plots and their relation to the correlation coefficient
7) Understand the basics of simple linear regression, multiple linear regression, logistic regression and meta-analysis
8) Understand how to read and interpret a multiple linear regression table

3
Q

Who was the 1st African American professional to practice in Memphis, known as the “hero of the yellow fever epidemic”?

A

Dr. R H Tate

4
Q

What was used as a yellow fever hospital?

A

the Peabody Hotel

5
Q

Why do I need to know this stuff?

A

biostatistics is prevalent in research

6
Q

What is a population?

A

collection of persons or things to which we want to generalize a set of findings; largest collection of persons to which we have an interest at a particular time

7
Q

What is a sample?

A

part of population; smaller collection of persons or things from a population used to determine generalities about the population of persons or things

8
Q

What is a variable?

A

a characteristic that takes on different values in different persons, places, or things

9
Q

What are variable descriptors?

A

numeric, categorical, dichotomous

10
Q

What is a numeric variable?

A

a variable that has values that describe a measurable quantity as a number

11
Q

What are the 2 categories of numeric variables?

A

discrete and continuous

12
Q

What is a discrete variable?

A

a numeric variable that can only take on certain values and is characterized by gaps or interruptions in the values that the variable can assume, usually integers, e.g., patients seen in a day, number of medications

13
Q

What is a continuous variable?

A

a numeric variable that can technically be measured with unlimited precision and that is not characterized by gaps in the values that the variable could assume, e.g., intraocular pressure (IOP)

14
Q

What is a categorical variable?

A

a variable that is made up of groups of objects and that names distinct entities

15
Q

What are two categories of categorical variables?

A

ordered and unordered

16
Q

What is an ordered (aka ordinal) variable?

A

a categorical variable whose values can take on a logical order, sequence, or rank, e.g., level of exercise

17
Q

What is an unordered (aka nominal) variable?

A

a categorical variable whose values cannot be organized in a logical order, sequence, or rank, e.g., iris color

18
Q

What is dichotomous?

A

a variable that consists of only two categories ex: diabetic or not diabetic

19
Q

What is an independent variable?

A

the variable that is manipulated by the experimenter and that does not depend on any other variables aka predictor variable

20
Q

What is a dependent variable?

A

the variable that is not manipulated by the experimenter and that does depend on the other variable aka outcome variable

21
Q

What are descriptive statistics/measures of location?

A

mean, median, mode

22
Q

What are descriptive statistics/measures of spread?

A

range, variance, standard deviation
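A minimal sketch of the measures of location and spread named on these cards, using Python's standard `statistics` module. The IOP readings are made-up illustrative values, not data from the lecture.

```python
import statistics

# Hypothetical IOP readings in mmHg (illustrative values only)
iop = [12, 14, 15, 15, 16, 17, 19]

# Measures of location
mean_iop = statistics.mean(iop)      # arithmetic average
median_iop = statistics.median(iop)  # middle value of the sorted data
mode_iop = statistics.mode(iop)      # most frequent value

# Measures of spread
value_range = max(iop) - min(iop)    # largest minus smallest value
variance = statistics.variance(iop)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(iop)      # sample SD, the square root of the variance

print(mean_iop, median_iop, mode_iop, value_range, variance, std_dev)
```

Note that `statistics.variance` and `statistics.stdev` treat the data as a sample; `pvariance` and `pstdev` are the population versions.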

23
Q

What is the normal IOP and the mean IOP?

A

normal 10-21 and mean 15.5 mmHg

24
Q

What is the standard deviation for IOP?

A

2.75 mmHg

25
Q

What percent of the population falls within 1 SD?

A

68%

26
Q

What percent of the population falls within 2 standard deviations of the mean?

A

95%

27
Q

What percent of the population falls within 3 SDs of the mean?

A

99.7% (often rounded to 99%)
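As a rough check on the 1, 2, and 3 SD cards, the sketch below applies them to the IOP mean (15.5 mmHg) and SD (2.75 mmHg) given earlier, assuming IOP is normally distributed. It uses `statistics.NormalDist` from the standard library (Python 3.8+).

```python
from statistics import NormalDist

mean_iop, sd_iop = 15.5, 2.75   # values from the IOP cards (mmHg)
standard_normal = NormalDist()  # mean 0, SD 1

coverage = {}
for k in (1, 2, 3):
    low = mean_iop - k * sd_iop
    high = mean_iop + k * sd_iop
    # Share of a normal population lying within k SDs of the mean
    coverage[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {low:.2f} to {high:.2f} mmHg, {coverage[k]:.1%} of values")
```

The 2 SD interval works out to 10 to 21 mmHg, which matches the normal IOP range on the earlier card.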

28
Q

What are noteworthy distribution examples?

A

normal and t (there are many many distributions)

29
Q

What is the normal distribution?

A

symmetrical with a central peak (“bell curve”); defined solely and completely by the mean and the standard deviation/variance

30
Q

What is the t distribution?

A

similar in appearance to normal distribution; utilizes degrees of freedom (distribution changes with number of degrees of freedom)

31
Q

The smaller the degree of freedom…

A

the lower the peak and the heavier (higher) the tails

32
Q

When does a t distribution approach the normal distribution?

A

as the degrees of freedom increase; as a rule of thumb, it is close to normal with more than 30 degrees of freedom

33
Q

What allows us to make inferences based on small sample sizes?

A

t distribution

34
Q

What are “other” distributions?

A

chi-square, binomial, Poisson

36
Q

What is the P-value?

A

the probability of observing data at least as extreme as the data actually observed, given that the null hypothesis is true

37
Q

If the p value is larger than the pre-determined criteria, then we…

A

do not have evidence to reject the null hypothesis (aka the data is consistent with the null hypothesis)

38
Q

What is the p-value threshold (significance level) usually set at?

A

0.05 aka 5% (roughly 2 SDs from the mean on a normal curve, two-sided)

39
Q

A p-value is the probability of an observation…

A

arising by chance alone, assuming the null hypothesis is true

41
Q

What is the t-test used for?

A

to test whether two group means are different

42
Q

If p value of trial is higher than chosen p value…

A

you cannot reject null hypothesis

43
Q

If p value of trial is lower than chosen p value…

A

you can reject the null hypothesis

44
Q

Which p-value threshold is more conservative?

A

0.01

45
Q

When is an independent t test used?

A

used when there are two experimental conditions w/ different participants assigned to each condition

46
Q

What does an independent t test show?

A

establishes whether two means collected from independent samples differ significantly

47
Q

What are other names for independent t test?

A

independent measures or independent samples t test

48
Q

When is a dependent t test used?

A

used when there are two experimental conditions w/ same participants assigned to each condition

49
Q

What does a dependent t test establish?

A

whether two means collected from the same sample differ significantly

50
Q

What are other names for dependent t test?

A

matched pairs or paired samples t test
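A minimal sketch of the two t statistics described on the t-test cards, using only the standard library. The IOP readings are hypothetical, and the independent test shown is the pooled-variance (Student's) form, which assumes both groups have equal variance; the cards do not say which form the lecture used. The p-value would then be looked up in a t table using the returned degrees of freedom.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent samples (assumes equal variances)."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = ((n1 - 1) * statistics.variance(group_a)
                  + (n2 - 1) * statistics.variance(group_b)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, n1 + n2 - 2  # t statistic, degrees of freedom

def paired_t(first, second):
    """t statistic for paired measurements taken on the same participants."""
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1

# Hypothetical IOP readings (mmHg), illustrative only
old_drug = [18, 17, 19, 20, 16]
new_drug = [15, 16, 14, 17, 13]

# Different participants in each group: independent t test
t_ind, df_ind = independent_t(old_drug, new_drug)

# Same participants measured before and after: dependent (paired) t test
t_pair, df_pair = paired_t(old_drug, new_drug)

print(t_ind, df_ind, t_pair, df_pair)
```

The same numbers give a larger t statistic when treated as paired, because pairing removes between-participant variability from the comparison.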

51
Q

What can be derived from 2x2 contingency tables?

A

cumulative incidence, relative risk, odds, odds ratio, chi squared test for independence, attributable risk, population attributable risk

52
Q

What is relative risk?

A

aka risk ratio (RR); compares the risk of a health event (disease, injury, risk factor, or death) among one group with the risk among another group

53
Q

What is odds ratio?

A

OR compares the odds of a health event (disease, injury, risk factor, or death) among one group with odds among another group

54
Q

What general 2 things is a 2x2 contingency table comparing?

A

exposures and outcomes

55
Q

What are the two most widely used measures of association in epidemiology?

A

relative risk and odds ratio

56
Q

What measure of association does a cohort study use?

A

relative risk

57
Q

What measure of association does a case-control study use?

A

odds ratio assuming incidence is not known

58
Q

T/F the odds ratio always underestimates the relative risk

A

false; the odds ratio overestimates the RR (it lies further from 1 than the RR does), and the overestimation is greatest when the outcome is common

59
Q

When may relative risk and odds ratio be close/similar?

A

when the outcome is rare
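The sketch below illustrates why the two measures agree for rare outcomes and diverge for common ones. The counts are hypothetical, and the cell labels assume the usual layout implied by the formula cards: a = exposed with outcome, b = exposed without, c = unexposed with outcome, d = unexposed without.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)

# Rare outcome (hypothetical counts): about 0.1% to 0.2% of each group affected
rr_rare = relative_risk(2, 998, 1, 999)
or_rare = odds_ratio(2, 998, 1, 999)

# Common outcome (hypothetical counts): 30% to 60% of each group affected
rr_common = relative_risk(60, 40, 30, 70)
or_common = odds_ratio(60, 40, 30, 70)

print(f"rare:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"common: RR = {rr_common:.3f}, OR = {or_common:.3f}")
```

In both scenarios the exposed group has twice the risk, but only in the rare-outcome case does the odds ratio stay close to 2.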

60
Q

What is a chi-squared test for independence?

A

tests the association between categorical variables using chi-squared distribution

61
Q

What is the cumulative incidence in the exposed?

A

a/ (a+b)

62
Q

What is the cumulative incidence in the unexposed?

A

c/(c+d)

63
Q

What is the relative risk for the outcome?

A

(a/(a+b)) / (c/(c+d)), aka cumulative incidence in the exposed divided by cumulative incidence in the unexposed

64
Q

What is the odds in the exposed?

A

a/b

65
Q

What is the odds in the unexposed?

A

c/d

66
Q

What is the odds ratio?

A

(a/b) / (c/d) = ad/bc, aka the cross-product of the table
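The formulas on the preceding cards can be collected into one short sketch. The counts are hypothetical, and the layout is the one the formulas imply: a = exposed with outcome, b = exposed without outcome, c = unexposed with outcome, d = unexposed without outcome.

```python
# Hypothetical 2x2 table counts (illustrative only)
a, b = 30, 70   # exposed:   with outcome, without outcome
c, d = 10, 90   # unexposed: with outcome, without outcome

# Cumulative incidence (risk) in each group
ci_exposed = a / (a + b)
ci_unexposed = c / (c + d)

# Relative risk: ratio of the two cumulative incidences
relative_risk = ci_exposed / ci_unexposed

# Odds in each group, and their ratio
odds_exposed = a / b
odds_unexposed = c / d
odds_ratio = (a * d) / (b * c)

print(f"CI exposed = {ci_exposed:.2f}, CI unexposed = {ci_unexposed:.2f}")
print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```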

67
Q

What do you do with the chi-squared “statistic”?

A

identify P value from table

68
Q

If P value is less than .05 what happens?

A

reject the null hypothesis
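A minimal sketch of the chi-squared test for independence on a 2x2 table, with hypothetical counts and the same a, b, c, d layout as the formula cards. It uses the shortcut formula for a 2x2 table without a continuity correction, and instead of a printed table it gets the P value from the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal.

```python
import math

# Hypothetical 2x2 table counts (illustrative only)
a, b = 30, 70   # exposed:   with outcome, without outcome
c, d = 10, 90   # unexposed: with outcome, without outcome
n = a + b + c + d

# Shortcut formula for the 2x2 chi-squared statistic (no continuity correction)
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

# Compare against the usual 0.05 threshold
reject_null = p_value < 0.05

print(f"chi-squared = {chi2:.2f}, p = {p_value:.6f}, reject null: {reject_null}")
```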

69
Q

Type I error

A

BAD; occurs when one rejects the null hypothesis when the null hypothesis is actually true, aka rejection of a true null hypothesis (a false positive)

70
Q

Optometry Type I error example

A

you conclude that a new glaucoma drug lowers IOP better than an old glaucoma drug, when in fact it does not

71
Q

Type II error

A

occurs when one rejects the alternative hypothesis (fails to reject the null) when the alternative hypothesis is actually true, aka not rejecting a false null hypothesis (a false negative)

72
Q

Optometry Type II error example

A

you conclude that a new glaucoma drug does not lower IOP better than an old glaucoma drug when in fact it does

73
Q

False positive VF

A

on a visual field (VF) test, the patient says the stimulus is there but it’s not; the field may look better than it actually is

74
Q

False negative VF

A

on a visual field (VF) test, the patient says the stimulus is not there but it is

75
Q

Sensitivity

A

the proportion of subjects with the target condition who have a positive test result aka true positive / (true positive + false negative)

76
Q

Specificity

A

the proportion of subjects without the target condition who have a negative result aka true negative/ (true negative + false positive)

77
Q

Positive predictive value

A

the proportion of subjects who test positive who actually have the target condition aka true positive/ (true positive + false positive)
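The three definitions above reduce to simple ratios of the cells in a test-versus-condition 2x2 table. The sketch below uses hypothetical counts for a screening test.

```python
# Hypothetical screening results (illustrative only)
true_positive = 80    # have the condition, test positive
false_negative = 20   # have the condition, test negative
true_negative = 90    # do not have the condition, test negative
false_positive = 10   # do not have the condition, test positive

# Sensitivity: share of subjects with the condition who test positive
sensitivity = true_positive / (true_positive + false_negative)

# Specificity: share of subjects without the condition who test negative
specificity = true_negative / (true_negative + false_positive)

# Positive predictive value: share of positive tests that are correct
ppv = true_positive / (true_positive + false_positive)

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}, PPV = {ppv:.2f}")
```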

78
Q

Correlation coefficient

A

a summary value used to assess the strength and direction of the linear relationship between two continuous variables

79
Q

What is the most commonly used correlation coefficient?

A

Pearson’s correlation coefficient “r”

80
Q

What does a higher correlation mean?

A

two variables are changing together

81
Q

Review scatter plots for various R values

A

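r = 1.00 is a perfectly straight line with positive slope; r = -1.00 is a perfectly straight line with negative slope; r near 0 is a cloud of points with no linear trend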

82
Q

What does a larger value of r mean?

A

stronger correlation (judged by the absolute value of r)
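A short sketch of Pearson's r computed from its definition, so the link between the scatter of points and the value of r is visible. The data are hypothetical; `pearson_r` is an illustrative helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from sums of squares and cross-products."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]

# Points scattered around an upward trend: r is positive but below 1
r_scattered = pearson_r(x, [2, 4, 5, 4, 5])

# Points exactly on a straight rising line: r is 1
r_perfect = pearson_r(x, [2 * v + 1 for v in x])

# Points exactly on a straight falling line: r is -1
r_negative = pearson_r(x, [10 - v for v in x])

print(f"{r_scattered:.4f} {r_perfect:.4f} {r_negative:.4f}")
```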

83
Q

T/F correlation = causation

A

false, correlation does not equal causation

84
Q

What is simple regression?

A

a linear model in which one outcome is predicted from a single predictor variable (an expansion of the correlation coefficient)

85
Q

What is the equation of a line?

A

y = mx + b, where m is the slope and b is the y-intercept

86
Q

In the equation of a line, what is y?

A

dependent variable

87
Q

In the equation of a line, what is x?

A

independent variable

88
Q

What is multiple regression?

A

a linear model in which one outcome is predicted from two or more predictor variables (expansion of simple regression)

89
Q

Constant

A

the value of the dependent variable in a regression equation when all of its independent variables equal zero, aka the baseline level

90
Q

What is the constant graphically?

A

the y-intercept, the point at which the regression line crosses the y-axis

91
Q

Beta-coefficient

A

the degree of change in the dependent variable for every 1-unit of change in a particular independent variable

92
Q

Example of beta-coefficient b1=0.2001

A

this means a one unit increase in x is associated with a 0.2001 unit increase in y
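To make the constant and the beta-coefficient concrete, here is a minimal least-squares fit of a simple linear regression on hypothetical data. The helper name `fit_line` is illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = constant + beta * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    beta = s_xy / s_xx                  # change in y per 1-unit change in x
    constant = mean_y - beta * mean_x   # value of y when x is 0 (the y-intercept)
    return constant, beta

# Hypothetical data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

constant, beta = fit_line(x, y)
print(f"y = {constant:.2f} + {beta:.2f} * x")
```

Here a one-unit increase in x is associated with a 0.6-unit increase in y, and the fitted line crosses the y-axis at 2.2.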

93
Q

Coefficient P value

A

tells us whether or not an independent variable is statistically significant

94
Q

R^2 coefficient of determination

A

a way to measure how well the linear regression line fits the data; the proportion of the variance in the dependent variable that can be explained by the independent variables

95
Q

What does a coefficient of determination range between?

A

0 to 1; 0 indicates the response variable cannot be explained by the predictor variable at all, and 1 indicates it is explained perfectly
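A sketch of how the coefficient of determination can be computed for a simple linear regression: fit the line, then compare the leftover (residual) variation with the total variation in y. The data are hypothetical.

```python
# Hypothetical data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

# Variation left unexplained by the line
ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Total variation of y around its mean
ss_total = sum((yi - mean_y) ** 2 for yi in y)

# Proportion of the variance in y explained by x
r_squared = 1 - ss_residual / ss_total

print(f"R^2 = {r_squared:.2f}")
```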

96
Q

Standard error

A

measures how well the linear regression line fits the data, the average distance that the observed values fall from the regression line

97
Q

What does a smaller standard error mean?

A

the model fits the data better

98
Q

What is useful for calculating the p-value and the confidence interval for its corresponding coefficient?

A

standard error

99
Q

Logistic regression

A

used when there is no linear relationship between x and y (or between x and the probability of the outcome), typically because the outcome is categorical

100
Q

What does a logistic regression model?

A

the log(odds) of the outcome; on the log-odds scale the relationship with the predictors is linear

101
Q

What does the formula for logistic regression do?

A

the formula predicts the log odds of the dependent variable taking on a value of 1; the log odds are converted to the probability that a given observation takes on a value of 1; a predetermined probability threshold is then used to classify the observation as either 1 or 0
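The three steps on this card (log odds, probability, classification) can be sketched as follows. The intercept, coefficient, and threshold are hypothetical values chosen for illustration, not fitted to any data.

```python
import math

# Hypothetical model parameters (illustrative only, not estimated from data)
intercept = -4.0
coefficient = 0.2
threshold = 0.5   # predetermined probability cut-off

def predict(x):
    """Return the log odds, probability, and 0/1 class for one observation."""
    log_odds = intercept + coefficient * x         # linear on the log-odds scale
    probability = 1 / (1 + math.exp(-log_odds))    # convert log odds to a probability
    label = 1 if probability >= threshold else 0   # classify against the threshold
    return log_odds, probability, label

log_odds, probability, label = predict(25)
print(f"log odds = {log_odds:.2f}, probability = {probability:.3f}, class = {label}")
```

Exponentiating the coefficient, `math.exp(coefficient)`, gives the odds ratio for a one-unit increase in x, which is one reason this model suits odds-based study designs.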

102
Q

Continuous output use

A

linear regression

103
Q

Where is logistic regression popular, and why?

A

in epidemiology, because the odds ratio is the natural parameter estimated in a case-control study

104
Q

Categorical output use

A

logistic regression

105
Q

Meta-analysis when and why

A

used to combine results from different studies to see if overall effect is significant, makes the equivalent of one large study, often used when there are multiple studies with conflicting results

106
Q

Meta-analysis how

A

decide which studies to include and exclude using objective criteria, find all the studies on the subject, extract the required information, compute the meta-analysis statistic, and interpret the results