Biostat post Q2 Flashcards
censored data
In survival analysis, data for which an individual's survival time is not fully observed, because the individual was still alive at the end of the study or was lost to follow-up.
chi-squared statistic
Sum over all cells of (O - E)^2 / E, where O is the observed count and E is the expected count in each cell.
The test statistic from a chi-squared test.
–> If the chi-squared statistic for a comparison of two proportions (1 degree of freedom) exceeds 3.84, then the difference is statistically significant with p < 0.05.
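As a rough illustration of the formula on this card, the sketch below computes the statistic for a 2x2 table and compares it with the 3.84 cutoff quoted above. The counts and the function name chi_squared_statistic are made up for the example, and expected counts are taken from the row and column totals in the usual way.

```python
def chi_squared_statistic(observed):
    """Pearson chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence: row total * column total / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical data: 30 of 100 events in group A, 15 of 100 events in group B
table = [[30, 70], [15, 85]]
chi2 = chi_squared_statistic(table)
exceeds_cutoff = chi2 > 3.84
```

With these invented counts the statistic is about 6.45, so the difference in proportions would be judged statistically significant by the 3.84 rule.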
What is the primary limitation of the chi-squared test?
The chi-squared test gives a p-value but provides no estimate of the size of the effect.
After using the chi-squared test, how do you estimate the size of an effect?
–> Difference of proportions (risk difference)
–> Ratio of proportions (relative risk)
–> Ratio of odds (odds ratio)
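The three effect-size measures listed above can be sketched from a single 2x2 table. The counts below are hypothetical and the variable names are chosen only for this example.

```python
# Hypothetical 2x2 table: events / non-events in two groups
events_a, non_events_a = 30, 70   # group A
events_b, non_events_b = 15, 85   # group B

risk_a = events_a / (events_a + non_events_a)
risk_b = events_b / (events_b + non_events_b)

# Difference of proportions (risk difference)
risk_difference = risk_a - risk_b

# Ratio of proportions (relative risk)
relative_risk = risk_a / risk_b

# Ratio of odds (odds ratio)
odds_a = events_a / non_events_a
odds_b = events_b / non_events_b
odds_ratio = odds_a / odds_b
```

Here the risk difference is 0.15, the relative risk is 2.0, and the odds ratio is about 2.43. The odds ratio is further from 1 than the relative risk because the event is not rare in these made-up data.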
Contingency table
A rectangular array of numbers, typically cross-classifications of subjects into categories of two measurements
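A minimal sketch of how a contingency table might be built from subject-level data, assuming each subject is recorded as an (exposure, outcome) pair. The records and category labels are invented for illustration.

```python
from collections import Counter

# Hypothetical records: one (exposure, outcome) pair per subject
records = [
    ("exposed", "disease"), ("exposed", "disease"), ("exposed", "no disease"),
    ("unexposed", "disease"), ("unexposed", "no disease"), ("unexposed", "no disease"),
]

counts = Counter(records)
rows = ["exposed", "unexposed"]      # categories of the first measurement
cols = ["disease", "no disease"]     # categories of the second measurement

# Cross-classify: table[i][j] is the number of subjects in row i and column j
table = [[counts[(r, c)] for c in cols] for r in rows]
```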
Independent Samples t-test
A statistical test for comparing the means on a continuous measurement for two separate groups of individuals.
–>For example, this test can be used to compare mean systolic blood pressure in males versus females.
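The sketch below computes the t statistic for two independent groups, assuming the pooled-variance (equal variances) form of the test. The blood pressure values and the function name pooled_t_statistic are hypothetical.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df

# Hypothetical systolic blood pressures (mmHg) in two groups
group_1 = [120, 125, 130, 135, 140]
group_2 = [110, 115, 120, 125, 130]
t, df = pooled_t_statistic(group_1, group_2)
```

The resulting t would then be compared with a t distribution on df degrees of freedom to obtain a p-value.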
Kaplan-Meier method
See Product-limit method

A procedure for estimating a survival function from survival data in the presence of censoring.
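A minimal sketch of the product-limit calculation, assuming each subject has a follow-up time and a flag that is 1 if the event was observed and 0 if the time is censored. The function name kaplan_meier and the data are invented for this example.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t (events and censorings)
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without changing the estimate
        n_at_risk -= removed
    return curve

# Hypothetical follow-up times (months); event flag 0 means censored
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

For these made-up data the estimated survival drops to 0.8 at month 2, 0.6 at month 3, and 0.3 at month 5. The censored subjects at months 3 and 8 reduce the number at risk but cause no drop.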
survival analysis
Statistical evaluation of time-to-event data. Examples include time to death, time to disease progression, and time until onset of disease.
survival curve
A graph of a survival function.
survival function
Describes the probability that an individual will survive beyond a specific point in time.
survival time
Data measured as the time until an event (such as death) occurs.
coefficient of determination
Usually denoted by R^2 in linear regression analysis.
–>This is the proportion of total variation in the dependent variable explained by the independent variable(s).
correlation coefficient
Usually denoted by r.
–>This is a measure of the strength of a linear relationship between two variables.
–>The correlation coefficient is always between –1 and 1, where either extreme denotes a perfect linear relationship and a correlation of zero denotes no linear relationship.
–>Positive values of r denote a positive relationship, and negative values of r denote an inverse relationship.
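As an illustration, the Pearson correlation coefficient can be computed directly from its definition. The data and the function name pearson_r are hypothetical. With a single independent variable, squaring r gives the R^2 of the corresponding simple linear regression.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2   # equals R^2 for simple linear regression of y on x
```

Here r is about 0.77, a fairly strong positive relationship, and r squared is 0.6.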
Cox regression
See Proportional hazards regression.
A regression analysis procedure used in survival analysis with censored data. The effects of independent variables are usually presented as hazard ratios (sometimes loosely called risk ratios).
dependent variable
In a regression analysis, the dependent variable is the variable which is predicted by the model.
–>In linear regression, the dependent variable is continuous.
–>In logistic regression, it is dichotomous.
–>In proportional hazards regression, it is a survival time.
dichotomous variable
A categorical variable having only two levels (e.g., presence or absence of disease)
explanatory variable
Independent variable

A variable which is used to predict the dependent variable in a regression analysis.
independent variable
A variable which is used to predict the dependent variable in a regression analysis.
Indicator variable
An independent variable that takes on the values 0 or 1.
–>For example, to indicate female gender, an indicator variable will be set to 0 for the males and 1 for the females.
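A small sketch of the coding described above, assuming sex is stored as the strings "male" and "female". The data are made up.

```python
# Hypothetical raw values for four subjects
sexes = ["male", "female", "female", "male"]

# Indicator for female gender: 1 for females, 0 for males
female = [1 if s == "female" else 0 for s in sexes]
```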
intercept
In a linear regression analysis, the mean of the dependent variable when the independent variables are all set equal to 0.
least squares
A procedure, based on minimizing the squared error, for estimating the intercept and slopes in a linear regression analysis.
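For a single independent variable, the least squares estimates have a closed form, sketched below with hypothetical data. The function name least_squares_fit is an assumption for this example. The last lines also compute R^2 as one minus the ratio of residual to total variation.

```python
def least_squares_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared errors for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = least_squares_fit(x, y)

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
mean_y = sum(y) / len(y)
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - sse / sst
```

With these invented data the fitted line is y = 2.2 + 0.6x and R^2 is 0.6.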
linear regression
A statistical analysis that predicts a continuous dependent variable using one or more independent variables, based on the equation of a line.
logistic regression
A regression analysis procedure used to predict the value of a dichotomous variable. The effects of independent variables are usually presented as odds ratios.
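The link between a logistic regression coefficient and an odds ratio can be sketched as follows. The coefficients b0 and b1 are hypothetical values, not estimates from any data. The point is only that the odds ratio for a one-unit increase in x equals exp(b1).

```python
import math

def predicted_probability(b0, b1, x):
    """Logistic model: probability of the outcome given x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

def odds(p):
    return p / (1 - p)

# Hypothetical coefficients (assumed, not fitted)
b0, b1 = -2.0, 0.7

p_unexposed = predicted_probability(b0, b1, 0)
p_exposed = predicted_probability(b0, b1, 1)

# Odds ratio comparing x = 1 with x = 0
odds_ratio = odds(p_exposed) / odds(p_unexposed)
same_as_exp_b1 = math.isclose(odds_ratio, math.exp(b1))
```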
multiple linear regression
A linear regression analysis using two or more independent variables.
multivariate regression analysis
A regression analysis (linear, logistic or proportional hazards) using two or more independent variables.
non-linear
A relationship in which the scatterplot of a dependent variable and an independent variable is not well-approximated by a straight line.
proportional hazards regression
A regression analysis procedure used in survival analysis with censored data. The effects of independent variables are usually presented as hazard ratios (sometimes loosely called risk ratios).
regression coefficient
An estimate of the intercept or slope in a regression analysis.
response variable
See Dependent variable.
–>In a regression analysis, the dependent variable is the variable which is predicted by the model.
–>In linear regression, the dependent variable is continuous.
–>In logistic regression, it is dichotomous.
–>In proportional hazards regression, it is a survival time.
scatterplot
A 2-dimensional plot of a dependent variable (usually on the vertical axis) and an independent variable (usually on the horizontal axis). Each point on the plot represents one subject from whom both variables are measured.
simple linear regression
A linear regression analysis using one independent variable.
slope
In a linear regression analysis, the amount of change in the dependent variable when the independent variable increases by 1 unit.
univariate regression analysis
A regression analysis (linear, logistic or proportional hazards) using one independent variable.
Bradford Hill criteria
Characteristics that may indicate causal associations in biology:
–>strong association
–>dose-response relationship
–>consistent association
–>specific association
–>temporally correct association
–>biologically plausible association.
Early detection
Any action that advances the time of awareness that a disease is present.
lead time
The increased time from diagnosis to death (or other outcome) due to earlier diagnosis as opposed to later death.
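A short arithmetic sketch of lead time, using invented ages for a single patient whose age at death is assumed to be the same with or without screening.

```python
# Hypothetical ages (years) for one patient
age_at_screen_detection = 60       # diagnosis if screened
age_at_symptomatic_diagnosis = 63  # diagnosis if not screened
age_at_death = 65                  # assumed unchanged by earlier diagnosis

lead_time = age_at_symptomatic_diagnosis - age_at_screen_detection

survival_with_screening = age_at_death - age_at_screen_detection
survival_without_screening = age_at_death - age_at_symptomatic_diagnosis
apparent_gain = survival_with_screening - survival_without_screening
```

Survival measured from diagnosis appears 3 years longer with screening, yet the whole gain is lead time because the age at death has not changed.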
length-biased sampling
In a screening program, the tendency to detect indolent disease with a relatively good prognosis.
Occult disease
Disease that is detectable by testing but not evident from signs or symptoms.
pre-clinical detection period
The time interval when a disease can be found using screening techniques, but before symptoms would bring it to clinical attention.
Primary prevention
An attempt to avoid any manifestations of disease (e.g., lowering cholesterol in people without heart disease).
pseudodisease
Subclinical disease that would not become overt before the patient dies of other causes, or which would never progress to clinical recognition.
Screening
The systematic examination of those who are apparently well (or who are apparently free of the target disease) to identify and treat subclinical disease (or predictors of future disease).
secondary prevention
An attempt to avoid progression of a disorder among individuals who already have some signs (or symptoms) of the target disease (e.g., lowering cholesterol in heart attack patients).
subclinical disease
See occult disease.
Disease that is detectable by testing but not evident from signs or symptoms.

target disease
A disease or condition that is targeted by a screening program.
target population
A population selected for screening.