M13 - Tutorial Logistic Regression Flashcards

Question 1

Q

Logistic Regression

basic idea
disadv of linear model
problems in practice: estimation
interpretation of regression parameters

Answer

A

DV is discrete
-> binary problems

-> the DV only takes the values 0 and 1
- disadv of linear model: predicted probabilities can be less than 0 or greater than 1
-> linear regr is only valid if the variables have a linear relationship: a categorical DV does not have this
-> logit regr expresses the multiple regr equation in log terms (log odds) -> overcomes the problem of linearity
predicted Y takes on values between 0 and 1 (a value close to 0 means that y is very unlikely to have occurred, close to 1 means that y is very likely to have occurred)

log regr tries to determine the probability of occurrence of a certain event using a regression approach that considers different influencing variables
regr parameters cannot be interpreted as known from linear regression
estimation: through maximum likelihood
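
A minimal sketch of the contrast described above, assuming one predictor and made-up coefficients (the function names and numbers are illustrative, not from the tutorial). The linear model can return "probabilities" outside [0, 1]; the logit model keeps them strictly between 0 and 1.

```python
import math

def linear_probability(x, intercept, slope):
    """Linear model: the predicted 'probability' is a straight line in x."""
    return intercept + slope * x

def logistic_probability(x, intercept, slope):
    """Logit model: the log odds are linear in x, so p stays inside (0, 1)."""
    log_odds = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, chosen only to make the problem visible.
xs = [-10, 0, 10]
linear_preds = [linear_probability(x, 0.5, 0.1) for x in xs]
logit_preds = [logistic_probability(x, 0.0, 0.5) for x in xs]

print(linear_preds)  # first value is below 0, last value is above 1
print(logit_preds)   # all values lie strictly between 0 and 1
```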

Question 2

Q

Classification table - block 0

what is included?
cut value:
hit rate:

Answer

A

block 0 only includes the constant in the model
it does not include explanatory variables
-> predictions are only based on which category appears most often in the dataset
cut value: the threshold of the estimated probability from which an observation is assigned to Y=1
hit rate: how often is this prediction right? (share of correctly classified observations)
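
The Block 0 logic can be sketched as follows. This assumes a cut value of 0.5 and that a probability equal to the cut value is assigned to Y=1 (the tie rule is an assumption); the data and the function name are made up for illustration.

```python
def block0_classification(y, cut_value=0.5):
    """Constant-only model: every case gets the same estimated probability."""
    p_hat = sum(y) / len(y)                      # share of Y=1 in the data
    predicted = 1 if p_hat >= cut_value else 0   # same prediction for all cases
    hits = sum(1 for yi in y if yi == predicted)
    return predicted, hits / len(y)

# Made-up outcomes: 7 ones and 3 zeros.
y = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
predicted_class, hit_rate = block0_classification(y)
print(predicted_class, hit_rate)  # the majority category and its share
```

Since every case receives the majority category, the Block 0 hit rate equals the share of that category in the sample.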

Question 3

Q

Classification table - Block 1

- what is included?

Answer

A

Block 1 comprises all IV

Question 4

Q

Wald test
- tests…

adv over LR
adv over t-test
disadv

Answer

A

-> tests how far the estimated parameters are from zero, measured in standard errors (SE)

adv over LR: only requires estimating one model
adv over t-test: can test multiple parameters simultaneously
disadv: not standardized

= the Wald test approximates the LR test

tests the H0: bj = 0
-> if H0 cannot be rejected: removing the variable from the model will not substantially harm the fit of the model, since a predictor with a coeff that is very small relative to its SE is generally not doing much to help predict the DV.
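
A hedged sketch of the single-parameter case: the Wald statistic is the squared ratio of the coefficient to its SE, and its large-sample p-value is taken here from a chi² distribution with 1 df (equivalently, a two-sided standard normal). The coefficient and SE are invented numbers, and `wald_test` is a hypothetical helper name.

```python
import math

def wald_test(b, se):
    """Wald test of H0: b = 0 for one coefficient."""
    z = b / se                  # distance from zero in standard errors
    wald = z ** 2               # chi-square form of the statistic, 1 df
    # Two-sided normal p-value; identical to the chi-square(1) tail of wald.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, wald, p_value

# Hypothetical estimate: b = 0.8 with SE = 0.4.
z, wald, p = wald_test(b=0.8, se=0.4)
print(z, wald, p)
```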

Question 5

Q

Interpretation of Metric Variables 
- odds
- odds ratio
 =1
=2
=0.2
label in SPSS

Answer

A

odds: the probability of an event occurring relative to the probability of the event not occurring
odds ratio: “effect size”: how much do the odds (of the event occurring) increase/decrease when there is a one-unit change in the associated IV (OR = odds after one-unit change / original odds)
-> if > 1, then as the IV increases, the odds of the outcome occurring increase (= 2: the odds double)
-> if < 1, then as the IV increases, the odds of the outcome occurring decrease (= 0.2: the odds shrink to one fifth)
-> if = 1, the odds do not change
“the higher blabla, the lower the probability of blabla to occur”

-> statements about changes in probability (in percentage points) can only be made with marginal effects

SPSS: odds ratio : Exp(B)
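
A small worked example of odds and odds ratio, assuming a logit coefficient of ln(2) so that Exp(B) is 2; the starting probability of 0.2 is made up.

```python
import math

def odds(p):
    """Odds = probability of the event / probability of no event."""
    return p / (1.0 - p)

def odds_ratio(b):
    """Odds ratio for a one-unit increase in the IV: Exp(B)."""
    return math.exp(b)

p_before = 0.2                                # hypothetical starting probability
odds_before = odds(p_before)                  # 0.2 / 0.8
exp_b = odds_ratio(math.log(2.0))             # coefficient ln(2) gives OR of 2
odds_after = odds_before * exp_b              # odds double
p_after = odds_after / (1.0 + odds_after)     # back to a probability

print(odds_before, exp_b, odds_after, p_after)
```

The odds double, but the probability does not: it moves from 0.2 to about one third, which is why percentage statements need marginal effects.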

Question 6

Q

z-test and t-test

difference?

Answer

A

t-test: 
ONE: compare sample mean with pop mean
TWO: compare two independent samples
- N < 30
- SD unknown
- student's t distribution
--> does the predictor have explanatory value?

z-test: compare sample mean with pop mean

N > 30
SD known
normally distributed
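
The one-sample versions of both statistics, as a sketch with made-up data. The only difference in the formula is which SD goes into the standard error: the known population SD (z) or the SD estimated from the sample (t, with n - 1 df). Function names are illustrative.

```python
import math
import statistics

def z_statistic(sample, pop_mean, pop_sd):
    """z-test: population SD is known."""
    n = len(sample)
    return (statistics.mean(sample) - pop_mean) / (pop_sd / math.sqrt(n))

def t_statistic(sample, pop_mean):
    """t-test: SD is estimated from the sample; returns (t, df)."""
    n = len(sample)
    sample_sd = statistics.stdev(sample)  # uses the n - 1 denominator
    t = (statistics.mean(sample) - pop_mean) / (sample_sd / math.sqrt(n))
    return t, n - 1

sample = [4, 6, 8, 10, 12]   # made-up data, mean 8
z = z_statistic(sample, pop_mean=6, pop_sd=2)
t, df = t_statistic(sample, pop_mean=6)
print(z, t, df)
```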

Question 7

Q

Pseudo-R²-Measure

Problem for logit
tries to …
types
values

Answer

A

for the logit there is no measure exactly matching the R² of OLS

…tries to quantify the fraction of variance explained by the logistic regression model - how well does the logistic model fit the data?

McFadden R², Cox & Snell R², Nagelkerke R²
values >0.1 acceptable, > 0.2 good
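
The three measures can be computed from the log-likelihoods of the constant-only model and the full model. This is a sketch using the usual textbook definitions; the log-likelihood values and n are invented, and `pseudo_r2` is a hypothetical helper.

```python
import math

def pseudo_r2(ll_null, ll_model, n):
    """Return (McFadden, Cox & Snell, Nagelkerke) pseudo-R² values."""
    mcfadden = 1.0 - ll_model / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Cox & Snell cannot reach 1; Nagelkerke rescales it by its maximum.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    return mcfadden, cox_snell, nagelkerke

# Hypothetical log-likelihoods for n = 100 cases.
mcfadden, cox_snell, nagelkerke = pseudo_r2(ll_null=-60.0, ll_model=-45.0, n=100)
print(mcfadden, cox_snell, nagelkerke)
```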

Question 8

Q

Ordered Logit

ordered logit model
DV

Answer

A

regr model for ordinal DV

–> extension of logistic regr model that applies to dichotomous DV, allowing for more than two ordered response categories (e.g. Likert-scale)

DV: variable with at least three attributes that can be ranked –> ordinally scaled

Question 9

Q

Ordered Logit - Model estimation

- y and y*

Answer

A

y is a recoding of the metric variable y*
values of y* are not observable (latent); threshold values separate the categories
relship between y and y* is modeled using a threshold model -> set upper and lower bounds (e.g. values for hot, medium, cold)
-> if y* <= threshold value O1, the first category will be observed
-> if y* > O1 but <= threshold value O2, the second category will be observed, etc.
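
The threshold rule can be sketched like this. The thresholds O1 = 10 and O2 = 25 and the ordering cold < medium < hot are assumptions made for illustration; in an actual ordered logit the thresholds are estimated and y* is never observed.

```python
from bisect import bisect_left

def observed_category(y_star, thresholds, labels):
    """Map the latent y* to the observed ordinal category y.

    bisect_left counts the thresholds strictly below y*, so a value equal
    to a threshold still falls into the lower category (y* <= O1 -> first).
    """
    return labels[bisect_left(thresholds, y_star)]

thresholds = [10.0, 25.0]            # hypothetical O1 and O2
labels = ["cold", "medium", "hot"]   # ordered categories

for y_star in (3.0, 10.0, 18.0, 25.0, 31.0):
    print(y_star, observed_category(y_star, thresholds, labels))
```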

Question 10

Q

Log-likelihood test

assesses…
is an indicator of …
large values …., because

Answer

A

assesses the goodness of fit in logistic regression
is an indicator of how much unexplained information there is after the model has been fitted
large values indicate poorly fitting models, because the larger the value, the more unexplained observations there are.
H0: all the parameters of the predictors are zero
-> if H0 is rejected: predictors do have influence
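
A sketch of the quantities involved, assuming the statistic is reported as -2LL and the test compares the constant-only model with the fitted model. The outcomes and predicted probabilities are made up; the "fitted" probabilities are not the result of an actual estimation.

```python
import math

def log_likelihood(y, p):
    """Sum of log probabilities that the model assigns to the observed outcomes."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
               for yi, pi in zip(y, p))

y = [1, 0, 1, 1, 0]
p_null = [sum(y) / len(y)] * len(y)    # constant-only model: same p for all
p_model = [0.9, 0.1, 0.8, 0.7, 0.2]    # hypothetical fitted probabilities

dev_null = -2.0 * log_likelihood(y, p_null)    # -2LL of the baseline
dev_model = -2.0 * log_likelihood(y, p_model)  # -2LL of the model
model_chi2 = dev_null - dev_model              # improvement, chi² distributed

print(dev_null, dev_model, model_chi2)
```

A smaller -2LL means less unexplained information; the difference between the two values is the statistic used to test H0.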

Question 11

Q

Maximum-Likelihood estimation

corresponding to … in linear regression
how?

Answer

A

corresponding to OLS to estimate regression parameters (but does not aim at minimizing the residual variance, i.e. the sum of squared errors)
selects coefficients that make the observed values most likely to occur
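
The idea of "making the observed values most likely" can be illustrated with the simplest possible case: a constant-only model with one unknown probability p. This sketch uses a grid search instead of the iterative algorithm real software uses, and the data are invented.

```python
import math

def bernoulli_log_likelihood(p, y):
    """Log-likelihood of the 0/1 outcomes y if every case has probability p."""
    return sum(math.log(p) if yi == 1 else math.log(1.0 - p) for yi in y)

y = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]              # made-up data: 7 ones out of 10
candidates = [i / 100 for i in range(1, 100)]   # p = 0.01, 0.02, ..., 0.99

# Pick the candidate under which the observed data are most likely.
p_ml = max(candidates, key=lambda p: bernoulli_log_likelihood(p, y))
print(p_ml)
```

The winning value coincides with the observed share of ones, which is what maximum likelihood yields for this model.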

Question 12

Q

Chi²-test

tests…
calculates the fit/total error of a model, how?
interpretation

Answer

A

tests if there is a relationship between two categorical variables (does the number of cats that line-dance relate to the type of training they use?)
chi² = sum over all cells ij of [(observed ij - model ij)² / model ij]
-> standardizing the deviation for each cell
-> adding up all those standardized deviations: chi²
chi²: look up the critical value for the df: it is significant if the value is bigger than the critical value
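
The calculation above, sketched for a 2x2 table. The counts are invented (they are not the cat data), the model frequencies are the usual row total x column total / n, and 3.841 is the critical value for df = 1 at the 5% level.

```python
def expected_counts(table):
    """Model frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def pearson_chi2(table):
    """Sum of (observed - model)² / model over all cells; returns (chi², df)."""
    expected = expected_counts(table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

table = [[20, 30],
         [30, 20]]           # hypothetical counts
chi2, df = pearson_chi2(table)
critical_value = 3.841       # chi² critical value, df = 1, alpha = 0.05
print(chi2, df, chi2 > critical_value)
```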

Question 13

Q

Likelihood Ratio test

alternative to…
based on …
the resulting statistic is based on …
interpretation

Answer

A

alternative to chi²
based on maximum-likelihood theory
the statistic is based on comparing observed frequencies with those predicted by the model
also has a chi² distribution: look up the critical value for the df: it is significant if the value is bigger than the critical value
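
A sketch of the likelihood ratio statistic for a contingency table, assuming the common form 2 * sum of observed * ln(observed / model) and no empty cells. The counts are invented and the helper names are illustrative.

```python
import math

def expected_counts(table):
    """Model frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def likelihood_ratio_statistic(table):
    """2 * sum of observed * ln(observed / model); assumes all counts > 0."""
    expected = expected_counts(table)
    return 2.0 * sum(o * math.log(o / e)
                     for obs_row, exp_row in zip(table, expected)
                     for o, e in zip(obs_row, exp_row))

table = [[20, 30],
         [30, 20]]           # hypothetical counts
lr = likelihood_ratio_statistic(table)
critical_value = 3.841       # chi² critical value, df = 1, alpha = 0.05
print(lr, lr > critical_value)
```

For a table like this one the result lands close to Pearson's chi² for the same counts; the two tend to differ more in small samples.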

Question 14

Q

Marginal effects

how?
difference between …
Caution!

Answer

A

set all variables equal to their means and consider the marginal effect of xi on y
difference between the predicted probabilities of Y=1 and Y=0
Caution: the marginal effect depends on the considered variable and on the values of the other IVs
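
A minimal sketch of the marginal effect at the mean for a logit model, using the derivative b_j * p * (1 - p) evaluated at the chosen IV values. Coefficients and means are made up, and `marginal_effect_at` is a hypothetical helper.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effect_at(values, intercept, coefs, j):
    """Change in P(Y=1) per unit of x_j, with all IVs held at `values`."""
    z = intercept + sum(b * v for b, v in zip(coefs, values))
    p = logistic(z)
    return coefs[j] * p * (1.0 - p)

intercept = 0.0
coefs = [0.5, -1.0]   # hypothetical logit coefficients for x0 and x1

# Effect of x0 with both IVs at their (made-up) means.
me_at_means = marginal_effect_at([2.0, 1.0], intercept, coefs, j=0)
# Same variable, same coefficient, but a different value of the other IV.
me_elsewhere = marginal_effect_at([2.0, -1.0], intercept, coefs, j=0)

print(me_at_means, me_elsewhere)
```

The second call shows the caution on this card: the coefficient of x0 is unchanged, yet its marginal effect is smaller because x1 has moved.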

Question 15

Q

Omnibus test

Answer

A

test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall.
F-test

(15 cards)