Binary outcome Flashcards
what is the purpose of logistic regression
- to classify samples
- obese vs not obese
- true vs false
What’s the difference between a simple vs complicated model in logistic regression?
- Simple model: predicts the binary outcome using a single predictor variable (PV), e.g., weight predicts obese vs not obese
- Complicated model: uses more than one PV, e.g., weight + genotype + age predict obese vs not obese
for logistic regression does the PV also need to be binary? What about linear regression?
- No, the PVs can be continuous or discrete data used to predict the binary outcome.
- The same goes for linear regression; the only real difference is that the outcome is continuous, not binary
- WHICH MODEL IS USED DEPENDS ONLY ON THE OUTCOME VARIABLE
how do we know if each variable is usefully contributing to the model?
- if the variable’s coefficient is significantly different from 0, then it is usefully contributing to the model
- Test this with the Wald test
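A minimal sketch of the Wald test, assuming made-up values for a fitted coefficient and its standard error:

```python
from scipy import stats

# Made-up values: a fitted coefficient (a log odds ratio) and its standard error
beta = 0.85
se = 0.30

# Wald z-statistic: how many standard errors is the coefficient from 0?
z = beta / se
p_value = 2 * stats.norm.sf(abs(z))  # two-sided p-value
print(f"z = {z:.2f}, p = {p_value:.4f}")
```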
In linear regression - we have the concept of the residual, why does logistic regression not have this?
- Because the observed outcomes can only be 0 or 1; on the log-odds scale these map to ±infinity, so distances from the fitted curve (residuals) can't be measured. The fit is assessed with maximum likelihood instead, see below
what does logistic regression use to calculate the fit of a model
- Maximum likelihood (curve)
how do we find the line with the maximum likelihood
- First pick a probability curve that estimates the probability of the outcome for different values of weight.
- Then use this curve to calculate the likelihood of observing each obese or non-obese mouse at its weight
- Then multiply all those likelihoods together = the likelihood of the data GIVEN this curve
- Do this for lots of different curves; each gives you a total likelihood
- The curve with the maximum likelihood is selected
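A toy sketch of that search, assuming made-up mouse data; real software maximises the log-likelihood with an optimiser rather than a grid search:

```python
import numpy as np

# Made-up data: mouse weights and obesity status (1 = obese, 0 = not obese)
weights = np.array([18.0, 22.0, 25.0, 28.0, 31.0, 35.0])
obese = np.array([0, 0, 1, 0, 1, 1])

def total_likelihood(c, b):
    """Likelihood of the observed data GIVEN the curve defined by c and b."""
    p = 1 / (1 + np.exp(-(c + b * weights)))    # predicted P(obese) per mouse
    per_mouse = np.where(obese == 1, p, 1 - p)  # likelihood of each observation
    return per_mouse.prod()                     # multiply them all together

# Try lots of candidate curves; keep the one with the maximum likelihood
candidates = [(c, b) for c in np.linspace(-30, 0, 61)
                     for b in np.linspace(0, 1.5, 61)]
best = max(candidates, key=lambda cb: total_likelihood(*cb))
print("best (c, b):", best)
```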
Why is it inappropriate to use linear regression when you have a binary outcome?
because the model will predict not only 0 and 1 but also values in between (e.g., 0.6) and even values outside the 0–1 range.
This produces large residuals, which is bad because the residuals are what's used to do the fitting; large residuals will bias the result
what is the equation of the logistic curve (S-shaped; sigmoidal)
p = 1 / (1 + e^(−(c + bX)))
odds?
can use the prediction from the logistic regression equation to compute the odds
odds = probability of the event happening divided by the probability of the event not happening = p / (1 − p)
This is the same thing as Euler's number raised to the power of the systematic component:
odds = e^(c + bX)
Log odds or logit
simply the natural log of the odds: logit = ln(odds) = c + bX
Taking the log odds transforms the equation into a linear one (it gets rid of Euler's number).
Log odds vary between negative infinity and infinity as the probability moves from 0 to 1. Log odds are linearly related to the independent variable.
Imagine we have the logit but want the odds. How do we calculate the odds?
odds = e^(logit)
imagine we have the odds and want the probability outcome of there being a case or not. How do we calculate this?
Prediction = odds / (1 + odds)
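A quick sketch of these conversions, using a made-up logit value:

```python
import numpy as np

logit = 0.8               # made-up log odds (c + bX)
odds = np.exp(logit)      # odds = e^(logit)
prob = odds / (1 + odds)  # probability = odds / (1 + odds)

# And back again: logit = ln(p / (1 - p))
assert np.isclose(np.log(prob / (1 - prob)), logit)
print(f"odds = {odds:.3f}, probability = {prob:.3f}")
```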
what does the logit tell us
the linear impact of a PV on the DV
moving from a score of 55% to 56% in the PV increases the logit by some constant x
the increase is the same amount if we were looking at the difference between a score of 64% and 65%
if we are looking at odds of a value of 55%- 56% is the amount it changes equal to a difference between 64% and 65%?
No, because the relationship between the PV and the odds isn't linear
if we are looking at the probability of a value of 55%- 56% is the amount it changes equal to a difference between 64% and 65%?
no, because the relationship between the PV and the probability isn't linear
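A numeric sketch of all three answers, assuming a made-up intercept and slope: equal 1-unit steps in the PV shift the logit by a constant amount, but not the odds or the probability; what stays constant for the odds is the ratio of successive odds (the odds ratio, below):

```python
import numpy as np

c, b = -10.0, 0.2  # made-up intercept and slope

def logit(x): return c + b * x
def odds(x): return np.exp(logit(x))
def prob(x): return odds(x) / (1 + odds(x))

print(logit(56) - logit(55), logit(65) - logit(64))  # equal (both 0.2)
print(odds(56) - odds(55), odds(65) - odds(64))      # not equal
print(prob(56) - prob(55), prob(65) - prob(64))      # not equal
print(odds(56) / odds(55), odds(65) / odds(64))      # equal (both e^0.2)
```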
Odds ratio
calculated by dividing the odds at point B by the odds at point A.
e.g., odds at 55% attendance / odds at 54% attendance. The ratio between successive odds remains constant.
Gives an indication of the treatment effect. Tells us the relative increase in the odds as you increase the IV by 1 unit
Example: 13 minutes adherence = odds of 0.2551
Odds ratio = 1.2190
Therefore 14 minutes of adherence = 0.2551 * 1.2190 = 0.3110
Then you can apply this to different contexts. Imagine you fit a logistic regression to a sample and get an odds ratio of 0.1 for getting vs not getting a disease. You can conclude that for every minute n adhered to treatment, their odds of getting the disease were multiplied by 0.1 (a 90% decrease).
LEC: risk
number of people with the event / total population
Relative risk
risk in group of interest (n with event/ total n in group A)
/
risk in reference group (n with event/total n in group B)
Risk difference
risk in group of interest - risk in reference group
odds
number of people who have the event / number of people who don't have the event
Odds ratio
odds in group of interest / odds in reference group
(n with event/n without event; TREATMENT GROUP) / (n with event/n without event; reference group)
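A sketch computing all of these measures from a made-up 2×2 table:

```python
# Made-up 2x2 table:                events   non-events
#   treatment group (interest):       20         80
#   control group (reference):        10         90
a, b = 20, 80
c, d = 10, 90

risk_treat = a / (a + b)        # risk in group of interest
risk_ref = c / (c + d)          # risk in reference group
rr = risk_treat / risk_ref      # relative risk
rd = risk_treat - risk_ref      # risk difference
odds_ratio = (a / b) / (c / d)  # odds ratio
print(f"RR = {rr:.2f}, RD = {rd:.2f}, OR = {odds_ratio:.2f}")
# RR = 2.00, RD = 0.10, OR = 2.25
```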
interpret an RR or odds ratio of:
1
>1
<1
- 1: no association between exposure and outcome
- >1: risk/odds of the outcome is greater in the exposed group
- <1: risk/odds of the outcome is smaller in the exposed group
what is the relationship between the RR and OR if the event is rare vs frequent
If the outcome is rare, the two values will be similar. If the outcome is frequent, they will not be similar
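A quick numeric illustration with made-up group risks:

```python
def rr_and_or(risk_a, risk_b):
    """Relative risk and odds ratio for two made-up group risks."""
    rr = risk_a / risk_b
    odds_ratio = (risk_a / (1 - risk_a)) / (risk_b / (1 - risk_b))
    return rr, odds_ratio

print(rr_and_or(0.02, 0.01))  # rare outcome:     RR = 2.0, OR ~ 2.02 (similar)
print(rr_and_or(0.80, 0.40))  # frequent outcome: RR = 2.0, OR = 6.0 (not similar)
```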
with binary outcomes, what tests can we use to test for differences in the events/non-events between the intervention and control?
Chi-squared test
if we have a small numbers what correction do we apply to the chi-squared test?
Yates's correction for continuity
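A sketch with scipy on a made-up 2×2 table; for 2×2 tables, `chi2_contingency` applies Yates's continuity correction when `correction=True`:

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[12, 5],   # made-up intervention row: events, non-events
                  [4, 13]])  # made-up control row: events, non-events

chi2, p, dof, expected = chi2_contingency(table, correction=True)
print(f"chi2 = {chi2:.3f}, df = {dof}, p = {p:.4f}")
```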
how does fisher’s exact test work?
Enumerates every possible contingency table whose cells give the same row and column totals as the observed table
Then determines the probability of observing each table if the null were true (i.e., by chance)
Then the sum of the probabilities of the tables that are as extreme as or more extreme than the observed table = the p value.
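A sketch with scipy, using a made-up table with small counts:

```python
from scipy.stats import fisher_exact

table = [[2, 8],   # made-up intervention row: events, non-events
         [7, 3]]   # made-up control row: events, non-events

odds_ratio, p = fisher_exact(table, alternative="two-sided")
print(f"OR = {odds_ratio:.3f}, p = {p:.4f}")
```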
with binary outcomes, what do we use to test for differences in the events/non-events between the intervention and control if numbers are small (fewer than 5 events in any cell)?
Fisher’s exact test
what is typically used to test the difference in adverse events between the intervention and control group?
fisher’s exact test
In Stata, what output from a chi-squared test tells us about the difference between groups in getting a case or not?
- Risk difference
- Relative risk/risk ratio
- Odds ratio
- Chi squared result
- P value
what are the assumptions that need to be met prior to conducting logistic regression?
- No assumption that the variables in the model are normally distributed
- Outcomes are independent – whether or not person 1 is a case has no effect on whether person 2 is
What things represent the treatment effect in logistic regression?
- Odds ratio
- Log odds ratio
Interpret the odds ratio of 4.03
The odds of having the event are 4.03 times larger in the treatment group
why might we get a different odds ratio when using chi-squared vs logistic regression?
because the chi-squared test does not adjust for baseline covariates while logistic regression does
if the odds ratio from the chi-squared test and from logistic regression is the same, what does that mean?
the variables adjusted for had no effect on the outcome
what is the coefficient of the model in logistic regression?
The log(odds ratio)
the model coefficient represents the change in the log-odds of the outcome variable associated with a one-unit change in the predictor variable, holding all other predictor variables constant.
The log-odds is the natural logarithm of the odds, which is the probability of an event occurring divided by the probability of the event not occurring.
The log-odds can take on any value from negative infinity to positive infinity, with positive values indicating higher odds of the event occurring and negative values indicating lower odds.
So, when the coefficient of the model is the log-odds ratio, it tells us how the odds of the outcome variable change with a one-unit increase in the predictor variable. A positive coefficient means that the odds of the outcome variable increase as the predictor variable increases, while a negative coefficient means that the odds of the outcome variable decrease as the predictor variable increases.
How do we get the odds ratio and the CI in logistic regression?
take the exponent of the model coefficient, and of the limits of its 95% CI, to get the odds ratio and its CI.
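A sketch with statsmodels on simulated data; the variable names (`treatment`, `baseline_weight`, `event`) are made up:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated data: binary outcome, treatment indicator, baseline covariate
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, 200),
    "baseline_weight": rng.normal(90, 10, 200),
})
true_logits = -1 + 0.5 * df["treatment"] + 0.02 * (df["baseline_weight"] - 90)
df["event"] = (rng.random(200) < 1 / (1 + np.exp(-true_logits))).astype(int)

X = sm.add_constant(df[["treatment", "baseline_weight"]])
fit = sm.Logit(df["event"], X).fit(disp=0)

# Coefficients are log(odds ratios): exponentiate them and their CI limits
print(np.exp(fit.params))      # odds ratios
print(np.exp(fit.conf_int()))  # 95% CIs for the odds ratios
```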
what is the coefficient in log-binomial regression?
log(risk ratio)
how do we get the relative risk and its CI in log-binomial regression?
take the exponent of the coefficient and the exponent of the limits of the 95% CI
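A sketch of the same idea with statsmodels: a log-binomial model is a GLM with a binomial family and a log link (note these models can fail to converge on some data). The data and names are made up:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated data: binary outcome and a treatment indicator
rng = np.random.default_rng(0)
df = pd.DataFrame({"treatment": rng.integers(0, 2, 200)})
df["event"] = (rng.random(200) < 0.15 + 0.10 * df["treatment"]).astype(int)

X = sm.add_constant(df[["treatment"]])
fit = sm.GLM(df["event"], X,
             family=sm.families.Binomial(link=sm.families.links.Log())).fit()

# Coefficients are log(risk ratios): exponentiate them and their CI limits
print(np.exp(fit.params))      # risk ratios
print(np.exp(fit.conf_int()))  # 95% CIs for the risk ratios
```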
interpret: chi-squared test,
Relative risk/Risk ratio [95% CI]: 1.40 [1.20 to 1.64]
The risk of losing ≥5% of initial weight by 12 months is 40% higher in the intervention (support) group than in the advice group.
interpret: Chi squared test,
Odds ratio [95% CI]: 1.59 [1.29 to 1.96]
The odds of losing at least 5% of initial weight by 12 months is higher in the intervention (support) group than in the advice group by a factor of 1.6.
interpret: logistic regression
Odds Ratio [95% CI]: 1.60 [1.29 to 1.97]; P value <0.001
A significant adjusted odds ratio in favour of the support arm was found, indicating that participants had an increased odds of 1.596 (or 1.6) of losing at least 5% of initial weight in the Support group compared to the Advice group
interpret: log-binomial regression
Risk Ratio [95% CI]: 1.41 [1.21 to 1.64]; P value <0.001
A significant adjusted risk ratio in favour of the support arm was found, indicating that participants had a 41% increase in the risk of losing at least 5% of initial weight in the Support group compared to the Advice group, after adjusting for gender and baseline weight.
binary outcome, what are the unadjusted tests and adjusted tests we use?
unadjusted:
> chi-squared test
> Fisher's exact test
adjusted:
> logistic regression
> log-binomial regression
binary outcome, what values are used to determine treatment effect
odds ratio (logistic regression)
relative risk/risk ratio (log-binomial regression)