Lecture 7 Flashcards
Why can’t you use regular regression for binary outcomes?
- because you can get values other than 0 or 1
- can have below 0 and above 1 and decimals
- this does not make sense when trying to interpret; cannot extrapolate
What does logistic regression involve?
- model the probability that Y=1 (a continuous function ranging from 0 to 1)
- model: log odds of obtaining Y=1
- predict this as a regression
How do you calculate the odds and probability in logistic regression?
- substitute the predictor values into the regression equation to get log(odds)
- odds = e^(log(odds))
- P(Y=1) = odds/(1+odds)
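A minimal Python sketch of these steps; the intercept, slope and predictor value below are made up, not from the lecture:

```python
import math

# hypothetical fitted values: intercept a, slope b, and one predictor value x
a, b, x = -2.0, 0.8, 3

log_odds = a + b * x          # the regression equation gives log(odds)
odds = math.exp(log_odds)     # odds = e^(log(odds))
p = odds / (1 + odds)         # P(Y=1) = odds/(1+odds)

print(log_odds, odds, p)      # 0.4, ~1.49, ~0.60
```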
How do you interpret odds and log(odds)?
- odds > 1: Y=1 more probable than Y=0
- log(odds) > 0: Y=1 more probable than Y=0
- odds=1 or log(odds)=0: equal chances of each
Why do we use log in logistic regression?
- the regression equation can take any value from -infinity to infinity, yet:
- the function cannot go below 0 or above 1
How do you sub the regression equation into the logistic function?
P(Y=1) = 1 / (1 + e^-(regression equation))
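A quick Python check (with an invented log(odds) value) that this form gives the same probability as the odds route above:

```python
import math

log_odds = 0.4                                   # any output of the regression equation
p_direct = 1 / (1 + math.exp(-log_odds))         # logistic-function form
odds = math.exp(log_odds)
p_via_odds = odds / (1 + odds)                   # odds form from the earlier card

print(p_direct, p_via_odds)                      # both ~0.60: the two forms are identical
```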
What is the link? What are the different types of links?
- link = a function of Y, f(Y), sometimes written as mu
- identity link: mu = Y (linear model)
- logistic link: mu = log(P(Y=1)/P(Y=0)). For binary variables
- logarithmic link: mu = logY. For counts/frequencies, loglinear model
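A small sketch of the three links written as Python functions; the dictionary and its labels are just for illustration:

```python
import math

links = {
    "identity":    lambda mu: mu,                    # linear model: f(Y) = Y
    "logistic":    lambda p: math.log(p / (1 - p)),  # binary: log(P(Y=1)/P(Y=0))
    "logarithmic": lambda y: math.log(y),            # counts/frequencies: loglinear model
}

print(links["logistic"](0.75))   # log(odds) when P(Y=1) = .75 -> ~1.10
```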
Why do we use links/functions?
- GLM allows linear techniques to be used on non-linear data
- when datasets do not conform to the assumptions of linear regression
What are the assumptions of logistic regression? What is not assumed?
- binary outcomes that are MUTUALLY EXCLUSIVE
- independence of observations (as usual)
- IVs can be continuous or categorical
- NOT normality, linearity, homoscedasticity
How do you interpret the SPSS output for logistic regression?
- Block 0: doesn’t tell you much, classification table tells you proportion of Y=0
- Block 1: look at R2 (Nagelkerke)
- % correct > how much correct classification the model has
- Exp(B) = the odds ratio, interpret as: odds increase by a FACTOR of this when the IV increases by one unit
- also look at the CI for Exp(B)
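A hedged illustration of the Exp(B) interpretation, with a made-up intercept and slope (not from any SPSS output): a one-unit increase in the IV multiplies the odds by exp(B).

```python
import math

a, b = -2.0, 0.8                       # hypothetical intercept and slope for one IV
exp_b = math.exp(b)                    # Exp(B), the odds ratio (~2.23)

odds_at_3 = math.exp(a + b * 3)        # odds when the IV equals 3
odds_at_4 = math.exp(a + b * 4)        # odds when the IV equals 4

# a one-unit increase in the IV multiplies the odds by a FACTOR of Exp(B)
print(odds_at_4 / odds_at_3, exp_b)    # both ~2.23
```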
What is the difference between Cox and Snell’s and Nagelkerke’s R2 values?
- C+S: function of the likelihood ratio, does not have a maximum of 1
- N: adjusts C+S by dividing it by its maximum possible value, so it can reach 1
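A minimal sketch, assuming the usual likelihood-ratio formulas for the two statistics; the log-likelihoods and sample size are invented:

```python
import math

LL0, LL1, n = -120.0, -95.0, 200                   # null-model LL, fitted-model LL, sample size

cox_snell = 1 - math.exp((2 / n) * (LL0 - LL1))    # function of the likelihood ratio
cs_maximum = 1 - math.exp((2 / n) * LL0)           # its ceiling, which is below 1
nagelkerke = cox_snell / cs_maximum                # rescaled so the maximum is 1

print(cox_snell, cs_maximum, nagelkerke)           # ~0.22, ~0.70, ~0.32
```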
Why do you have to use loglinear regression rather than X2?
- when there are three or more variables (e.g. a 2x2x2 table), not just a two-way table
- X2 works with two-way tables only
What is Simpson’s paradox?
conclusions drawn from the margins of a table are not necessarily the same as those from the whole table
What are loglinear models based on?
counts or frequencies
3+ categorical variables
What is the formula for loglinear model? What do you actually test?
logF(MD) = sigma + lambda(M) + lambda(D) + lambda(MD)
- tests INTERACTION to see if the variables are associated
- test to see if the NON-SATURATED model is an acceptable fit
How does loglinear regression go about reaching a simpler model?
- it starts with a saturated model
- removes the highest order interaction and sees whether this affects the fit
What measure of fit is used in loglinear regression? What do you do with this?
- G2 (likelihood ratio statistic)
- X2 distribution
- saturated model has df=0, no probability (has a - in table)
- go through the tables and look at the deleted effect significance levels
- take out the non-significant (>.05) ones when you do your model selection
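A small sketch of the fit statistic, assuming the standard likelihood-ratio form G2 = 2 x sum(O x ln(O/E)); the observed and expected counts are made up:

```python
import math

observed = [30, 10, 20, 40]                 # observed cell counts
expected = [27.5, 12.5, 22.5, 37.5]         # hypothetical fitted counts from a reduced model

g2 = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
print(g2)                                   # compare to a chi-square distribution
```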
How do you interpret the loglinear regression?
- estimate value: if >0 then the likelihood increases, if less than 0 then likelihood decreases
- because they are in terms of log(odds)!
What are the assumptions of loglinear regression?
- each case in one cell and one cell only
- 5x as many cases as cells
- all expected cell frequencies should be >1 and no more than 20% should be less than 5
- normal standardised residuals, no obvious pattern when plotted against observed values
What is Wald’s test?
- a significance test
- tests whether an individual coefficient differs from zero (like the t-test for a coefficient in regression)
How do you calculate the EXPECTED cell counts in loglinear regression by hand?
- do e^(x) for each parameter that applies to that cell, then multiply all of these together
- remember to do one for the constant as well!!
OR you can add the relevant B values together, then take the e of this summed total
eg. if Senior, Male, Appraisal A: need parameters senior, male, A, senior×A, male×A, senior×male, senior×male×A (plus the constant)
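A Python sketch of both routes to the expected count for the Senior, Male, Appraisal A cell; every B value here is hypothetical:

```python
import math

# hypothetical parameter estimates (B values, on the log scale)
params = {
    "constant": 2.0,
    "senior": 0.3, "male": 0.5, "A": -0.2,
    "senior*A": 0.1, "male*A": -0.4, "senior*male": 0.2, "senior*male*A": 0.05,
}
relevant = list(params)                     # every listed parameter applies to this cell

# route 1: add the relevant B values, then take e of the summed total
expected = math.exp(sum(params[k] for k in relevant))

# route 2: take e^B for each parameter, then multiply them all together
expected_alt = math.prod(math.exp(params[k]) for k in relevant)

print(expected, expected_alt)               # identical
```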
Why do you need to be careful when looking at logistic regression in terms of probability?
- the log function is not linear
- you cannot interpret the probability in a linear fashion
BUT: you can get linear prediction in terms of log(odds)
What is the key feature of the coding of binary variables in logistic regression?
- it is arbitrary!
- just need to be 0, 1 coded
Why can you not compare R2 values in logistic regression?
- the variance is a function of the proportion (mean)
- cannot be compared with R2 from linear regression
- cannot be compared with R2 for binary outcomes with diff. means
Explain the % correct
- for cases where Y=1: classified correctly if predicted P(Y=1) > .5
- for cases where Y=0: classified correctly if predicted P(Y=1) < .5
- the % correct is the proportion of cases classified correctly
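A minimal illustration of the .5 cut-off with invented predicted probabilities and observed outcomes:

```python
# hypothetical predicted probabilities and observed outcomes
p_hat = [0.81, 0.62, 0.35, 0.48, 0.91, 0.20]
y_obs = [1,    1,    0,    1,    1,    0]

predicted = [1 if p > 0.5 else 0 for p in p_hat]      # classify as Y=1 when P(Y=1) > .5
pct_correct = 100 * sum(p == y for p, y in zip(predicted, y_obs)) / len(y_obs)
print(pct_correct)                                    # 5 of 6 correct -> ~83.3%
```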
Why do we use log in loglinear models?
- because of the properties of logs
- log(AB) = log(A) + log(B)
What is the equation for logistic regression?
log(P(Y=1)/P(Y=0)) = alpha + b1X1 + b2X2 etc.
What do the graphs look like in logistic regression for log(odds), odds and probability?
- log(odds): linear
- odds: exponential
- probability: logistic (s-shaped) curve
What is the equation for the Generalised Linear Model?
f(Y) = alpha + b1X1 + b2X2 etc. + e
How do you calculate proportions in general? how does this translate to the loglinear model?
F(md) = N x p(m) x p(d)
- N = total number, p = proportion
TAKE LOG
Log(F(md)) = log(N x p(m) x p(d))
= log(N) + log(p(m)) + log(p(d))
- then add interaction term (b/w m and d)
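A tiny worked example with made-up N and proportions, showing the product and the log-sum give the same frequency:

```python
import math

N, p_m, p_d = 200, 0.4, 0.3                 # invented total and marginal proportions

F_md = N * p_m * p_d                        # expected frequency under independence
log_F = math.log(N) + math.log(p_m) + math.log(p_d)   # same thing after taking logs

print(F_md, math.exp(log_F))                # both ~24; add an interaction term if m and d are associated
```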
What are the estimate terms in loglinear models?
- in log(odds)!!!!
How do you calculate, for example, “if you are male, odds of getting an A”? And how do you get an odds ratio for males vs. females?
- odds of an A for males = (number of males with A)/(number of males without A, i.e. with B or C)
- odds ratio: divide the males' odds by the females' odds, each calculated with the above equation
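A small sketch of these calculations with invented appraisal counts:

```python
# hypothetical counts of appraisal grades by sex
males   = {"A": 30, "B": 50, "C": 20}
females = {"A": 20, "B": 60, "C": 20}

# odds of an A for males: (males with A) / (males without A, i.e. with B or C)
odds_male   = males["A"] / (males["B"] + males["C"])
odds_female = females["A"] / (females["B"] + females["C"])

odds_ratio = odds_male / odds_female        # males' odds divided by females' odds
print(odds_male, odds_female, odds_ratio)   # ~0.43, 0.25, ~1.71
```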
What is the nature of loglinear modelling? What does this mean?
- hierarchical
- if the 3 way is sig, then keep the 3 way, 2 way and main effects
- always have to keep the lower down effects
- that’s why you can assume that the main effects are present, if you are keeping even just one 2-way effect
Why is the model saturated in loglinear modelling?
- cannot use any more parameters (all main and interaction effects included)
- more parameters are redundant
eg. (2x2 table, m/f by d/nd):
- log(F)md = sigma + (lambda)M + (lambda)D + (lambda)MD
- log(F)m-nd = sigma + (lambda)M
- log(F)fd = sigma + (lambda)D
- log(F)f-nd = sigma
- four cells, four parameters: nothing left to add