Logistic Regression Flashcards
Contingency table assume independence of _______and _________ and compare __________ and __________ values to calculate __________ (for counts use ____ correction for continuity)
Contingency table assumes rows and columns are independent and compare observed and expected value to calculate chi^2
for counts use Yate’s correction for continuity
How is significance of odds ratio indicated using chi squared
- Chi^2 uses ‘expected values’ calculated assuming the two odds (A/C and B/D) are the same.
If odds were the same, the ratio =1 - chi squared assess the probability that odd ratio is one.
if p<0.05 then there is a significant change in the odds
How to calculate relative risk
RR = p(lung cancer)smokers/ p(lung cancer)nonsmokers
How to calculate odds
with a binary outcome variable there are two and only two possible outcomes
each outcome has a probability
the outcomes are mutually exclusive
odds = p/(1-p)
Relative risk indicates ________________, and is more ______ than odds ratio. We cannot easily calculate the _____ of RR
Relative risk indicates the probability of something happening (chances of dying is 6.2 times greater if you smoke than not),
relative risk is more intuitive than odds ratio,
however, can’t easily calculate the significance of Relative risk
what is odd ratio
express effects as changes in the odds. the chances that smokers die by age 60 are 31.6 times greater than non-smokers
Logistic regression use _______ DV and _________IV
binary DV and continuous IV
Why cant we use normal regression for binary DV and continuous IV
can get probability 1, can get impossible data points
on a technical note, expect the residuals to depend on the X value, therefore, cant trust the result
why is logistic regression good for binary DV and continuous IV
value restricted in the range (0..1)
input can take random values
use the logistic regression equation p(event)=1/(1+e^(-z))
Describe the relation between logistic regression and odds
- the logit is the logarithim of the odds
logit(p)=z= ln(p/(1-p)) = a1X1+ a2X2 +… +b
in other words, logistic regression is estimating the odds as a function of predictors - the ratio of the odds per unit increase is constant for all values, because the odds are an exponential (e^z)
How to determine significance of odds ratio in logistic regression
- Because the odds ratio is constant, every estimate is based on the fit of the logistic equation
- use the significance of the logistic regression to determine if the odds ratio is different from unity: non significant fit implies no change and IV has no effect
- significance of odd ratio given by significance of regression