Logistic Regression Flashcards
what is logistic regression (LR) for?
Logistic regression is an example of a non-linear regression model, which is what we need when we have a dichotomous or categorical DV
the assumptions of LR are characteristically ….
less severe – relatively assumption free
What are the 3 main reasons for performing a logistic regression rather than a standard multiple regression?
1) DV is categorical, and therefore 2) Line of best fit will be sigmoidal, not linear, and as such 3) There will be non-normality and heteroscedasticity in the residuals if OLS regression is used, which violates important assumptions of this method
how does LR build a model?
by measuring the deviance of predictors, and including them or excluding them based on their contribution to predicting the outcome variable …. LR says: Does an individual predictor increase or decrease the probability of an outcome?
as opposed to MR… LR uses a dichotomous DV, and …
continuous IVs -
LR is also not…
linear,
the predictive model is called XX
P hat (P̂) – the predicted probability of being a case
the residuals are not…..
Residuals are clearly not normal (skewed)
and exhibit …
heteroscedasticity – residuals are all or nothing, and not evenly distributed.
so, instead of the model fit being linear (which rules out modelling probability directly, as a straight line can extend beyond 0 and 1, the range within which probability lies), LR uses
a non-linear (sigmoidal) line of best fit.
Probability means =
0–1 (or a percentage, 0–100) – the likelihood of an event occurring
Odds mean =
Odds = the probability of the event divided by its complement: p / (1 − p)
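As a minimal sketch of the formula on this card (the function name is my own):

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds: p / (1 - p)."""
    return p / (1 - p)

print(odds(0.5))  # even odds: 1.0
print(odds(0.8))  # roughly 4
```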
why does LR use odds?
it unpacks the maths nicely – and the odds are converted back to a probability after the model has been fitted
in LR, instead of using the odds (which are asymmetric), we use the …
natural log of the odds, which is symmetric about 0
what is the natural log called in LR
the logit
what does the odds ratio mean in LR?
Odds ratio: relationship between the odds of an event occurring across levels of another variable (by how much do the odds of Y change as X increases by 1 unit?)
and what does the ‘ratio of ratios’ mean?
Ratio of ratios – the odds of an event occurring as a function of the levels of another variable, e.g. the odds of males having a disease divided by the odds of females having the disease (i.e. the ratio of these two odds)
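A tiny illustration of this ratio of odds, with invented probabilities (names and numbers are mine, not from the cards):

```python
def odds(p):
    return p / (1 - p)

def odds_ratio(p1, p2):
    # the odds of group 1 divided by the odds of group 2
    return odds(p1) / odds(p2)

# e.g. disease probability .20 for males vs .10 for females (made-up numbers)
or_mf = odds_ratio(0.20, 0.10)  # roughly 2.25
```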
why do we present the results in terms of log odds and odds ratio?
as it turns a non-linear relationship into the familiar linear one
this enables us to subsequently …
test whether this coefficient is significantly different from 0 – just like a t-test in MR
the predicted odds range from ?
0 to + ∞
so when p>.50
odds > 1 (p = .50 gives even odds of 1)
the predicted odds varies ….
varies exponentially with the predictor(s)
in comparison the natural logit ranges…..
from - ∞ to + ∞
it reflects odds of being a case but
varies linearly with the predictor(s)
the issue with this is
the logit is not very interpretable – e.g. if p = .8, odds = 4, but logit = 1.386
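The round trip between probability, odds, and logit for the p = .8 example on this card, as a pure-Python sketch:

```python
import math

p = 0.8
odds = p / (1 - p)           # 4
logit = math.log(odds)       # ln(4), roughly 1.386
# the logistic (inverse-logit) function maps the logit back to a probability
p_back = 1 / (1 + math.exp(-logit))
```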
The typical partial regression coefficients (B) indicate ….
increment in the logit given unit increment in predictor
whereas the odds ratios (eB) indicate?
the amount by which odds of being-a-case are multiplied given a unit increment in predictor (or change in level of predictor if predictor is categorical)
if MR uses OLS…. LR uses…..
maximum likelihood estimation: an ITERATIVE solution where the regression coefficients are estimated by trial and error and gradual adjustment. (This seeks to maximise the likelihood (L) of the observed values of Y, given a model and the observed values of the predictors.)
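A bare-bones sketch of that iterative idea, using gradient ascent on the log-likelihood for a single predictor. The data and step size are invented, and real software uses faster schemes (e.g. Newton–Raphson); this just shows "trial and error and gradual adjustment":

```python
import math

xs = [1, 2, 3, 4, 5, 6]   # predictor values (made up)
ys = [0, 0, 1, 0, 1, 1]   # observed 0/1 outcomes (made up)

b0, b1 = 0.0, 0.0         # start from a flat model
step = 0.01
for _ in range(20000):    # iterative gradual adjustment
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))  # current predicted P-hat
        g0 += y - p                             # gradient of the log-likelihood
        g1 += (y - p) * x
    b0 += step * g0       # move the coefficients uphill in likelihood
    b1 += step * g1
```

After the loop, predicted probabilities rise with x, matching the upward trend in the toy data.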
if OLS uses the sum of squares….. LR uses….
uses measures of deviance rather than sums of squares. The focus is on lack of fit – analogous to (1 − R²) – and on minimising it: the same idea as MR, but flipped round.
Null deviance, Dnull (similar to SSTotal) = the amount of variability in the data; the amount of deviance that could potentially be accounted for.
Model deviance, DK (similar to SSResidual) = the amount of variability in the data after accounting for prediction from k predictors.
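A sketch of the two deviances (the function, outcomes, and fitted probabilities are invented for illustration):

```python
import math

def deviance(ys, ps):
    """-2 * log-likelihood of observed 0/1 outcomes given predicted probabilities."""
    ll = sum(y * math.log(p) + (1 - y) * math.log(1 - p)
             for y, p in zip(ys, ps))
    return -2 * ll

ys = [0, 0, 1, 1]
base_rate = sum(ys) / len(ys)                  # null model predicts the overall rate
d_null = deviance(ys, [base_rate] * len(ys))   # like SS_Total
d_model = deviance(ys, [0.2, 0.3, 0.7, 0.9])   # like SS_Residual (made-up fits)
```

A better model leaves less deviance, so d_model comes out smaller than d_null.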
For each model, a xx value is calculated – analogous to the F ratio for the overall model in MR.
log likelihood
LR uses xxx models
nested. So: minimising the lack of fit of the model = maximising the likelihood of the data. We take models and compare sets of them with one another. The simplest way is to compare the model with all the variables in against no model at all; then compare subsets of the model, with and without individual predictors, to get the significance of each predictor. (We compare 2 models, one bigger and one smaller and nested within the bigger model – comparing hierarchically.)
If the xxx model is true then the LRT statistic is distributed as xxx with m df
If the smaller model is true then the likelihood ratio test (LRT) statistic is distributed as χ2 with m df. So it tests whether it is worth having those m extra parameters in the model: if the LRT is no bigger than expected under the χ2 distribution with m df, it is not worth adding the extra parameters, and we prefer the simpler model – the more parsimonious explanation. We only prefer the bigger model if it improves fit.
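A numeric sketch of that comparison with invented deviances. For m = 1 df the χ2 tail probability can be written via the standard normal CDF, which keeps this stdlib-only:

```python
import math
from statistics import NormalDist

d_small = 40.0   # deviance of the smaller (nested) model (made-up number)
d_big = 33.0     # deviance of the bigger model with m = 1 extra parameter
lrt = d_small - d_big   # LRT statistic, ~ chi-square with m df if the small model is true

# P(chi-square with 1 df > x) = 2 * (1 - Phi(sqrt(x)))
p_value = 2 * (1 - NormalDist().cdf(math.sqrt(lrt)))
# here p < .05, so the extra parameter improves fit; keep the bigger model
```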
this LR standard approach is more xxxxx than standard MR – resembles xxxxx MR.
Standard approach is more hierarchical than standard MR – resembles hierarchical MR. We only accept more predictors if they significantly enhance the degree of fit.
TAKE HOME MESSAGE: when you have assessed your model collectively and the 3 predictors together don't enhance prediction, then you ask…
would one predictor on its own be a better fit, etc.?
a limitation of LR is that it is a xxxxxxx procedure
low power
how is it low power?
because the DV is categorical – each case is either a 0 or a 1, so it carries little information – LR needs big sample sizes
what is Pseudo R2 ?
of limited value – analogues of R² (McFadden / Cox & Snell / Nagelkerke) – all problematic: they are not truly "variance accounted for", as the model is not homoscedastic.
how do you calculate DF for categorical DVs?
To calculate DF for a binary DV (has disease: yes vs no)– you need to add up all the main effects and interactions. It is not N-1 as when DV is continuous. Similarly, categorical IVs need (m-1)*(n-1) parameters to capture the effects when there are m levels of the IV and n levels of the DV.
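That parameter count as a one-line helper (the naming is my own):

```python
def params_needed(m_iv_levels, n_dv_levels):
    # (m - 1) * (n - 1) parameters for the effect of an m-level IV on an n-level DV
    return (m_iv_levels - 1) * (n_dv_levels - 1)

print(params_needed(3, 2))  # 3-level IV, binary DV -> 2 parameters
```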