Week 8 - Binary Logistic Regression Models Flashcards

1
Q

what is the binary outcome variable

A
  • Linear regression can work very well when you have a continuous outcome
  • A binary outcome variable is one that can take only two possible values
2
Q

what are the two possible outcomes

A
  • Each participant has either outcome A or outcome B, e.g. they are either happy or unhappy
3
Q

how do we overcome the violation

A
  • When the outcome is binary (two possible outcomes), the assumption of linearity is ALWAYS violated
  • We can apply a transform to the data to express the non-linear relationship in a linear way
  • Binary logistic regression does this by expressing the linear regression equation in logarithmic terms (the log of the odds of the outcome)
  • This overcomes the issue of violating this assumption
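For reference, the transform described above can be written out as follows. The symbols (b_0 for the intercept, b_1 for the slope, X_1 for a single predictor) are assumed notation rather than taken from the deck, and only one predictor is shown:

```latex
\ln\!\left(\frac{P(Y=1)}{1-P(Y=1)}\right) = b_0 + b_1 X_1
\qquad\Longleftrightarrow\qquad
P(Y=1) = \frac{1}{1+e^{-(b_0 + b_1 X_1)}}
```

The left-hand form is linear in the predictor on the log-odds scale; the right-hand form shows the same model on the probability scale, where the relationship is S-shaped and stays between 0 and 1.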
4
Q

what is binary logistic regression

A
  • Logistic regression is a generalized linear model – a flexible generalisation of linear regression
  • It predicts an outcome variable that has only two possible values
  • Which of two outcomes is an individual likely to have (e.g. happy/not happy, pass/fail)?
  • Predictors can be continuous, categorical, or a combination
5
Q

what is the odds ratio

A
  • An odds ratio is one of the most important outputs of logistic regression
  • Odds ratio = the (multiplicative) change in odds resulting from a unit change in the predictor
  • Measure of association between a predictor and an outcome
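As a rough illustration of the definition above, the sketch below computes odds from probabilities and then an odds ratio for a binary predictor (a "unit change" from 0 to 1). The function names and the two proportions are made up for the example; they are not from the course materials, which use R rather than Python.

```python
def odds(probability):
    """Convert a probability into odds: p / (1 - p)."""
    return probability / (1 - probability)


def odds_ratio(p_predictor_1, p_predictor_0):
    """Odds of the outcome when the predictor is 1, relative to when it is 0."""
    return odds(p_predictor_1) / odds(p_predictor_0)


# Hypothetical proportions of the outcome in each group (assumed values).
p_outcome_when_1 = 0.8  # odds = 0.8 / 0.2 = 4
p_outcome_when_0 = 0.5  # odds = 0.5 / 0.5 = 1

ratio = odds_ratio(p_outcome_when_1, p_outcome_when_0)
print(f"odds ratio = {ratio:.2f}")
```

With these assumed numbers the odds go from 1 to 4, so the odds ratio is about 4.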
6
Q

how do we interpret the odds ratio

A
  • Odds ratio below 1: a unit increase in the predictor is associated with lower odds of the outcome
  • Odds ratio above 1: a unit increase in the predictor is associated with higher odds of the outcome
7
Q

what does the odds ratio number mean

A
  • Example (odds ratio = 4.69): individuals who have a hamster have 4.69 times higher odds of being happy relative to individuals who do not have a hamster
  • You must use the word ‘odds’ when referring to odds ratios
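One way to see why the word "odds" matters: an odds ratio of 4.69 does not mean the probability is 4.69 times higher. The sketch below applies the odds ratio from the card to a baseline probability of 0.30, which is an assumed value chosen only for illustration.

```python
odds_ratio = 4.69   # from the hamster example on the card
p_without = 0.30    # ASSUMED probability of being happy without a hamster

# Probability -> odds, apply the odds ratio, then odds -> probability.
odds_without = p_without / (1 - p_without)
odds_with = odds_without * odds_ratio
p_with = odds_with / (1 + odds_with)

print(f"odds without hamster: {odds_without:.3f}")
print(f"odds with hamster:    {odds_with:.3f}")
print(f"probability with hamster: {p_with:.3f}")
print(f"probability ratio: {p_with / p_without:.2f} (not {odds_ratio})")
```

Under this assumed baseline the odds are 4.69 times higher, but the probability is only a little over twice as high.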
8
Q

what is the independence of errors

A
  • Cases of data should not be related
  • For instance, each case should represent data from a different person
  • We can’t really test for this - we should just know this is true based on the methodology
9
Q

what is failure to converge

A
  • When you run a binary logistic regression model, R starts by estimating the parameters with a best guess
  • It then attempts to estimate the parameters more accurately
  • It stops when the parameters barely change from one attempt to the next (it “converges”)
  • Sometimes it doesn’t converge:
    in that case, ignore the output, because the estimates are not accurate
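The idea of repeated attempts that eventually settle down can be sketched in a few lines. This is an illustrative Newton-Raphson loop for one predictor, written in Python; it is not R's actual routine, and the function name, tolerance, iteration limit, and toy data are all assumptions made for the example.

```python
import math


def fit_logistic(x, y, tol=1e-8, max_iter=25):
    """Estimate intercept and slope iteratively; report whether it converged."""
    b0, b1 = 0.0, 0.0  # the initial "best guess"
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # current prediction
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # curvature (information) terms
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        # Stop once a new attempt barely changes the parameters.
        if max(abs(step0), abs(step1)) < tol:
            return b0, b1, True
    return b0, b1, False  # ran out of attempts: did not converge


# Toy data (assumed): outcome is 1 for 1 of 4 cases at x=0, 3 of 4 at x=1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1, converged = fit_logistic(x, y)
print(b0, b1, converged)

# Toy data where x predicts y perfectly: the slope keeps growing on every
# attempt, so the loop never settles within the allowed attempts.
_, _, converged_separated = fit_logistic([0, 0, 1, 1], [0, 0, 1, 1], max_iter=15)
print(converged_separated)
```

In the first data set the estimates settle after a handful of attempts. In the second, the predictor separates the outcomes perfectly, which is one common situation in which estimation fails to converge.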
10
Q

how do we prepare our data

A
  • The binary outcome should be stored as a numeric value with outcomes coded as 0 and 1
  • Categorical predictors should be stored as factors
  • Run the binary logistic regression model
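The coding step can be illustrated with a small sketch. The deck works in R, where a categorical predictor would be a factor; the Python below only mimics the idea, and the field names, labels, and level order are invented for the example.

```python
# Hypothetical raw records; field names and labels are assumptions.
records = [
    {"mood": "happy", "pet": "hamster"},
    {"mood": "unhappy", "pet": "none"},
    {"mood": "happy", "pet": "none"},
]

# Binary outcome stored as numbers: 0 = unhappy, 1 = happy.
outcome_codes = {"unhappy": 0, "happy": 1}
happy = [outcome_codes[r["mood"]] for r in records]

# Categorical predictor stored as level indices, with the first level
# ("none") acting as the reference category, similar in spirit to a factor.
pet_levels = ["none", "hamster"]
has_hamster = [pet_levels.index(r["pet"]) for r in records]

print(happy)        # outcome coded 0/1
print(has_hamster)  # predictor coded by level
```

Which category is coded 1 determines which outcome the model's odds refer to, so it is worth checking the coding before fitting.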
11
Q

how do we evaluate the model

A
  • To assess the fit of our model, we can compare our specified model to a model containing only the intercept (no predictors)
  • We do this by looking at a measure called the “deviance”; a model that fits better than the intercept-only model has a lower deviance
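A minimal sketch of the comparison, assuming the usual definition of deviance for 0/1 data as -2 times the log-likelihood. The toy data and the fitted probabilities are assumptions made for illustration (the fitted values are simply the observed proportions in each group, which is what a model with one binary predictor would predict).

```python
import math


def deviance(y, p):
    """-2 x log-likelihood for 0/1 outcomes y and predicted probabilities p."""
    log_lik = sum(
        math.log(pi) if yi == 1 else math.log(1 - pi) for yi, pi in zip(y, p)
    )
    return -2 * log_lik


# Toy data (assumed): binary predictor x and binary outcome y.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]

# Intercept-only model: everyone gets the overall proportion of 1s.
null_p = [sum(y) / len(y)] * len(y)
# Model with the predictor: each group gets its own observed proportion.
model_p = [0.75 if xi == 1 else 0.25 for xi in x]

null_deviance = deviance(y, null_p)
model_deviance = deviance(y, model_p)
improvement = null_deviance - model_deviance

print(f"null deviance:  {null_deviance:.3f}")
print(f"model deviance: {model_deviance:.3f}")
print(f"improvement:    {improvement:.3f}")
```

A larger drop in deviance relative to the intercept-only model suggests the predictors improve the fit; whether the drop is statistically significant is a separate test not shown here.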
12
Q

does the binary logistic regression have R2

A
  • R2 in linear regression = the proportion of variance explained by the model
  • In logistic regression, this measure doesn’t exist
13
Q

what is the intercept

A
14
Q

how do we evaluate the individual predictors

A
  • To convert back from the log scale, we exponentiate our log odds (“Estimate”).
  • This gives us our odds ratio
  • We also want a confidence interval around the odds ratio
  • The 95% confidence interval gives a range that is likely to contain the true odds ratio in the population
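The conversion can be sketched as below. The estimate is chosen so that it reproduces the 4.69 odds ratio from the hamster example; the standard error is an assumed value, and the interval is a simple Wald-type interval (estimate ± 1.96 standard errors on the log scale), which may differ slightly from the interval the course's R code reports.

```python
import math

estimate = math.log(4.69)  # log odds ("Estimate"); chosen to match the example
std_error = 0.5            # ASSUMED standard error, for illustration only

# Exponentiate to move from the log-odds scale to the odds-ratio scale.
odds_ratio = math.exp(estimate)

# Build the interval on the log scale, then exponentiate both ends.
lower = math.exp(estimate - 1.96 * std_error)
upper = math.exp(estimate + 1.96 * std_error)

print(f"odds ratio = {odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes 1 (as this hypothetical one does) is consistent with the predictor being associated with the outcome.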
15
Q

what are predicted probabilities

A
  • As well as odds ratios, we can obtain probabilities from our model. For instance:
  • If an individual has a hamster, what’s the probability they will be happy?
  • If an individual does not have a hamster, what’s the probability they will be happy?
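A sketch of how such probabilities follow from the model's coefficients. The slope is chosen to match the 4.69 odds ratio from the hamster example; the intercept is an assumed value, since the deck does not give one, and the function name is hypothetical.

```python
import math


def predicted_probability(intercept, slope, x):
    """Turn log odds (intercept + slope * x) into a probability."""
    log_odds = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-log_odds))


intercept = -0.5        # ASSUMED log odds of being happy without a hamster
slope = math.log(4.69)  # log of the odds ratio from the hamster example

p_no_hamster = predicted_probability(intercept, slope, 0)
p_hamster = predicted_probability(intercept, slope, 1)

print(f"P(happy | no hamster) = {p_no_hamster:.3f}")
print(f"P(happy | hamster)    = {p_hamster:.3f}")
```

Converting both probabilities back to odds and dividing them recovers the odds ratio of 4.69, which is a useful consistency check.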