Biostats test 4 Flashcards
What is binary logistic regression
prediction of a binary-valued DV on the basis of other variables; the response variable is not continuous but binary-valued, as in:
yes/no, has/does not have, alive/dead, increased/decreased
values of outcome variable
failure (coded 0) or success (coded 1)
success means “has the property”
what do we try to predict with binary logistic regression
the probability of success as a function of covariates
p(success) = f(cov1, cov2, …)
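A minimal sketch of fitting such a model, assuming made-up simulated data and the statsmodels library (all variable names and values below are hypothetical):

```python
# Fit a binary logistic regression predicting a 0/1 outcome from two
# covariates; the data are simulated for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))                      # two continuous covariates
true_logit = -0.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1]  # assumed true model
p = 1 / (1 + np.exp(-true_logit))
y = rng.binomial(1, p)                             # 0 = failure, 1 = success

model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(model.params)                                # b0, b1, b2 on the log-odds scale
```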
Assumptions of binary logistic
- categories must be mutually exclusive (no overlap) and collectively exhaustive (all cases can be assigned)
- if so, for all cases, success or failure can be coded in the data
in logistic regression, for predicted probabilities to be meaningful, they must…
lie between the values of 0 and 1
Link function
transforms the dependent variable so that the outcome range is restricted to lie between 0 and 1
Similarities of BLR with OLS
- model building and its issues (collinearity, order of entry, influential cases)
Dissimilarities of BLR with OLS
- DV is binary (categorical), not continuous
- Interpretation of coefficients
- Assessment of model fit/quality of obtained model
How do we ensure the [0,1]-restricted outcome range for the predicted values
- link function (logit) is used to relate the linear model part to the outcome variable
- it transforms the predicted values so that the outcomes are constrained to fall in the meaningful 0 to 1 range
- Regression techniques that make use of some kind of link function are called Generalized Linear Models (GLM)
The logit function
- Natural logarithm of odds
- Logit = ln(odds) = ln(y hat / (1 - y hat))
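A minimal sketch of the logit and its inverse, assuming plain NumPy (the function names are hypothetical):

```python
# The logit maps probabilities to log-odds; its inverse (the logistic
# function) maps any real-valued linear predictor back into (0, 1).
import numpy as np

def logit(p):
    return np.log(p / (1 - p))      # probability -> log-odds

def inv_logit(eta):
    return 1 / (1 + np.exp(-eta))   # log-odds -> probability

print(inv_logit(-5), inv_logit(0), inv_logit(5))
# ~0.0067, 0.5, ~0.9933 -- always strictly between 0 and 1
```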
The logistic regression model when combined with the logit (link) function
ln(y hat / (1 - y hat)) = b0 + b1X1 + b2X2 + …
So the difference with OLS model is on the dependent variable side of the model
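A worked sketch with hypothetical coefficients, showing how a predicted probability is recovered from the linear predictor:

```python
# Turn the linear predictor (log-odds) back into a probability.
import math

b0, b1 = -0.5, 1.2                  # hypothetical fitted coefficients
x1 = 1.0
log_odds = b0 + b1 * x1             # ln(y_hat / (1 - y_hat)) = 0.7
p_hat = math.exp(log_odds) / (1 + math.exp(log_odds))
print(round(p_hat, 3))              # 0.668
```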
In Logistic Regression, rather than looking at the B coefficients themselves, we look at
- ODDS RATIO = e to the power of b (e^b), where e is the base of the natural log ln
- change in the odds (of success) for a one-unit change in the predictor
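A quick sketch with a hypothetical coefficient:

```python
# The odds ratio is the exponentiated coefficient: a one-unit increase
# in the predictor multiplies the odds of success by e^b.
import math

b1 = 1.2                  # hypothetical coefficient
print(math.exp(b1))       # ~3.32: the odds roughly triple per unit increase
```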
Wald test
Gives the p-value of the odds ratio
Under H0, the odds ratio is
- Odds ratio = 1
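A minimal sketch of the Wald test for a single coefficient, assuming hypothetical values for b and its standard error:

```python
# Wald z = b / SE(b); under H0 (b = 0, i.e. odds ratio = e^0 = 1),
# z follows a standard normal distribution.
from scipy import stats

b, se = 1.2, 0.4                    # hypothetical estimate and SE
z = b / se
p = 2 * stats.norm.sf(abs(z))       # two-sided p-value
print(z, p)                         # 3.0, ~0.0027
```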
What if we want to calculate the odds ratio for a non-unit size
multiply the regression coefficient by the step size before you raise e to that power: OR = e^(c·b) for a change of c units
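A worked sketch with a hypothetical coefficient and a 10-unit step:

```python
# Odds ratio for a 10-unit change in the predictor: e^(10 * b).
import math

b, step = 0.05, 10
print(math.exp(step * b))   # e^0.5 ~ 1.65: odds multiply by ~1.65 per 10 units
```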
Multiplicative effect
The combined effect of predictors on the DV is a product of separate effects, so the effects of odds ratios multiply in binary logistic regression, while they add up in ordinary linear regression
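A worked sketch with hypothetical odds ratios:

```python
# Effects combine multiplicatively on the odds scale and additively on
# the log-odds scale.
import math

or1, or2 = 2.0, 1.5
print(or1 * or2)                      # combined: odds multiplied by 3.0
print(math.log(or1) + math.log(or2))  # ln(3.0) ~ 1.099 on the log-odds scale
```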
From probabilities to classification - what is classification of cases based on?
The predicted probabilities for success. The default setting for classification as ‘success’ is a predicted probability for success > .50.
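A minimal sketch of the default classification rule, with hypothetical predicted probabilities:

```python
# Classify a case as 'success' when its predicted probability exceeds .50.
import numpy as np

p_hat = np.array([0.12, 0.48, 0.51, 0.93])
predicted_class = (p_hat > 0.50).astype(int)
print(predicted_class)   # [0 0 1 1]
```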
confusion matrix
the confusion matrix in your output gives predicted versus observed successes and failures, based on the cut-value (always think of a 2x2 table).
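A minimal sketch using scikit-learn and hypothetical labels:

```python
# 2x2 confusion matrix: rows are observed classes (0, 1), columns are
# predicted classes (0, 1).
from sklearn.metrics import confusion_matrix

observed  = [0, 0, 1, 1, 1, 0]
predicted = [0, 1, 1, 1, 0, 0]
print(confusion_matrix(observed, predicted))
```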
Classification errors
false positives and false negatives
The null model
starting point in the model building, a model without any actual predictors
if you have to predict without knowing anything about the predictors, the LARGEST CATEGORY WINS!
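A small sketch with a hypothetical outcome vector:

```python
# Null-model prediction: every case gets the largest observed category.
import numpy as np

y = np.array([1, 1, 1, 0, 1, 0, 1])   # 5 successes, 2 failures
majority = int(y.mean() > 0.5)        # largest category wins -> 1
accuracy = (y == majority).mean()
print(majority, accuracy)             # 1, ~0.714
```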
How to decide which covariates to include
in the end, we need to classify cases as success or fail cases - two groups
Suppose you have a continuous covariate and the success and fail groups differ in their means → the covariate may help to classify cases as success or fail
Suppose you have a categorical covariate and the success and fail groups have different distributions over its categories; a chi-square test of homogeneity will be significant → the categorical covariate may help to classify cases as success or fail (see the sketch below)
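A minimal screening sketch, assuming made-up data and SciPy:

```python
# t-test for a mean difference on a continuous covariate, chi-square for
# a categorical covariate; significant results suggest the covariate may
# help to classify cases.
from scipy import stats

cont_success = [5.1, 6.0, 5.8, 6.3]   # hypothetical values
cont_fail    = [4.2, 4.8, 4.5, 5.0]
print(stats.ttest_ind(cont_success, cont_fail))

table = [[30, 10],                    # rows: success / fail
         [15, 25]]                    # columns: category A / B
print(stats.chi2_contingency(table)[1])   # chi-square p-value
```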
Receiver operating characteristic curve
a plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied
typically plots the true positive rate (sensitivity) against the false positive rate (1 - specificity) at various cut-value settings
lets you see how different cut-value settings affect the classification results of your classifier (see the sketch after this card)
lenient threshold (low cut value): sensitivity is up!
strict threshold (high cut value): specificity is up!
Optimum typically: a combination of high sensitivity and high specificity
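A minimal ROC sketch with scikit-learn and hypothetical scores:

```python
# ROC curve points (FPR, TPR) across all cut values, plus the AUC.
from sklearn.metrics import roc_curve, roc_auc_score

observed = [0, 0, 1, 1, 0, 1]
p_hat    = [0.2, 0.4, 0.35, 0.8, 0.1, 0.9]
fpr, tpr, thresholds = roc_curve(observed, p_hat)
print(roc_auc_score(observed, p_hat))   # AUC ~ 0.89
```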
A lenient (low) cut value leads to
higher sensitivity! Success easily detected (at the expense of increased false positives)
A strict (high) cut value leads to
higher specificity! Good at keeping false positives down, but less sensitive
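A small sketch of the trade-off, with hypothetical labels and probabilities:

```python
# A lenient cut value raises sensitivity; a strict one raises specificity.
import numpy as np

observed = np.array([0, 0, 1, 1, 0, 1])
p_hat    = np.array([0.2, 0.4, 0.35, 0.8, 0.1, 0.9])

for cut in (0.3, 0.7):
    pred = (p_hat > cut).astype(int)
    sens = (pred[observed == 1] == 1).mean()   # true positive rate
    spec = (pred[observed == 0] == 0).mean()   # true negative rate
    print(cut, sens, spec)   # lenient: sens 1.00 / spec 0.67; strict: 0.67 / 1.00
```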
ROC curve
area under the curve (AUC) is an overall indicator of diagnostic accuracy