Logistic Regression Flashcards
What is logistic regression used for?
Logistic regression can be used to analyze binary response as well as ordinal response data.
Binary:
- The response, Y, of a subject can take one of two possible values, denoted by 1 and 2 (for example, Y=1 if a disease is present; otherwise, Y=2).
Ordinal:
- The response, Y, of a subject can take one of m ordinal values, denoted by 1, 2, …, m.
What does "infinite parameters" mean?
The term "infinite parameters" refers to the situation in which the likelihood equation does not have a finite solution.
What is Complete Separation?
There is a complete separation of data points if there exists a vector b that correctly allocates all observations to their response groups.
The maximum likelihood estimate does not exist.
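A minimal Python sketch of why no finite estimate exists under complete separation. The toy data, the 0/1 coding of the response, the no-intercept model, and the name log_likelihood are all assumptions made for illustration. Because every negative x belongs to one group and every positive x to the other, the log-likelihood keeps rising toward 0 as the slope grows, so it has no maximum at a finite slope.

```python
import math

# Hypothetical toy data: x < 0 always gives y = 0 and x > 0 always gives y = 1,
# so the predictor separates the two response groups completely.
x = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
y = [0, 0, 0, 1, 1, 1]

def log_likelihood(slope, xs, ys):
    """Log-likelihood of the no-intercept model P(y=1) = 1 / (1 + exp(-slope * x))."""
    total = 0.0
    for xi, yi in zip(xs, ys):
        z = slope * xi
        # log P(y=1) = -log(1 + exp(-z)) and log P(y=0) = -log(1 + exp(z)).
        signed = z if yi == 1 else -z
        total += -math.log1p(math.exp(-signed))
    return total

slopes = [0.5, 1.0, 2.0, 5.0, 10.0]
values = [log_likelihood(b, x, y) for b in slopes]
for b, ll in zip(slopes, values):
    print(f"slope = {b:5.1f}   log-likelihood = {ll:.6f}")
```

Each larger slope gives a larger (less negative) log-likelihood, which is the sense in which the parameter estimate runs off to infinity.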
What is Quasi-complete Separation?
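There is a quasi-complete separation if a vector b exists that allocates the observations to their response groups except for ties on the boundary, that is, with equality holding for at least one subject in each response group.
The maximum likelihood estimate does not exist.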
What is Overlap?
If neither complete nor quasi-complete separation exists in the sample points, there is an overlap of sample points.
The maximum likelihood estimate exists and is unique.
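The three data configurations (complete separation, quasi-complete separation, overlap) can be sketched for the simplest case of one predictor plus an intercept and a two-group response. The function name classify_separation, the 0/1 coding, and the toy samples are assumptions for illustration; with several predictors the check is harder and this shortcut does not apply.

```python
def classify_separation(xs, ys):
    """Classify a single-predictor, two-group sample as 'complete',
    'quasi-complete', or 'overlap'. Assumes ys holds only 0 and 1
    and that both groups are non-empty."""
    group0 = [xv for xv, yv in zip(xs, ys) if yv == 0]
    group1 = [xv for xv, yv in zip(xs, ys) if yv == 1]
    # Try both orientations: group 0 below group 1, then group 1 below group 0.
    for low, high in ((group0, group1), (group1, group0)):
        if max(low) < min(high):
            return "complete"        # a cut point strictly between the groups exists
        if max(low) == min(high):
            return "quasi-complete"  # the groups touch only at the boundary value
    return "overlap"

# Hypothetical samples, one per configuration.
complete = classify_separation([1, 2, 3, 4], [0, 0, 1, 1])
quasi = classify_separation([1, 2, 3, 3, 4, 5], [0, 0, 0, 1, 1, 1])
overlap = classify_separation([1, 2, 3, 4, 5, 6], [0, 0, 1, 0, 1, 1])
print(complete, quasi, overlap)
```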
When are complete separation and quasi-complete separation normally a problem?
Complete separation and quasi-complete separation are problems typical of small samples.
What is logistic regression?
Logistic regression allows one to predict a discrete outcome such as group membership from a set of variables that may be continuous, discrete, dichotomous, or a mix.
Logistic regression emphasizes the probability of a particular outcome for each case.
The procedure for estimating coefficients is maximum likelihood, and the goal is to find the best linear combination of predictors to maximize the likelihood of obtaining the observed outcome frequencies.
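A minimal sketch of maximum likelihood estimation for one predictor and an intercept. The toy data (chosen so the groups overlap and a finite estimate exists), the 0/1 coding, and the use of plain gradient ascent with a fixed learning rate are assumptions for illustration; statistical packages typically use faster iterative schemes.

```python
import math

# Hypothetical toy data with overlap between the groups.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 1, 0, 1, 1]

def predict(intercept, slope, xi):
    """P(y = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * xi)))

def log_likelihood(intercept, slope):
    """Sum of log P(observed outcome) over all cases."""
    return sum(
        math.log(predict(intercept, slope, xi)) if yi == 1
        else math.log(1.0 - predict(intercept, slope, xi))
        for xi, yi in zip(x, y)
    )

def fit(learning_rate=0.02, steps=20000):
    """Gradient ascent on the log-likelihood (an illustrative optimizer choice)."""
    intercept, slope = 0.0, 0.0
    for _ in range(steps):
        # Gradient: sum of (observed - predicted), weighted by each input.
        residuals = [yi - predict(intercept, slope, xi) for xi, yi in zip(x, y)]
        intercept += learning_rate * sum(residuals)
        slope += learning_rate * sum(r * xi for r, xi in zip(residuals, x))
    return intercept, slope

b0, b1 = fit()
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"log-likelihood at start = {log_likelihood(0.0, 0.0):.3f}")
print(f"log-likelihood at fit   = {log_likelihood(b0, b1):.3f}")
```

The fitted coefficients give a higher log-likelihood than the starting values of zero, which is what "maximizing the likelihood of the observed outcomes" means in practice.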
What assumptions are not relevant in logistic regression?
In logistic regression, the predictors do not have to be normally distributed, linearly related to the DV, or of equal variance within each group.
Mention the different types of logistic regression.
There are three major types of logistic regression: direct (standard), sequential, and statistical.
Explain direct (standard) logistic regression
All predictors enter the equation simultaneously (as long as tolerance is not violated).
This is the method of choice if there are no specific hypotheses about the order or importance of predictor variables.
The method allows evaluation of the contribution made by each predictor over and above that of the other predictors.
This method is difficult to interpret when predictors are correlated. A predictor that is highly correlated with the outcome by itself may show little predictive capability in the presence of the other predictors.
Explain sequential logistic regression.
The researcher specifies the order of entry of predictors into the model.
Explain statistical logistic regression.
Inclusion and removal of predictors from the equation are based solely on statistical criteria.
When statistical analyses are used, it is very easy to misinterpret the exclusion of a predictor; the predictor may be very highly correlated with the outcome but not included in the equation because it was “bumped” out by another predictor or by a combination of predictors.
How do you Interpret Coefficients Using Odds?
The odds ratio is the change in odds of being in one of the categories of outcome when the value of a predictor increases by one unit.
The coefficients, B, for the predictors are the natural logs of the odds ratios; odds ratio = e^B.
Therefore, a change of one unit on the part of a predictor multiplies the odds by e^B.
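A short worked example of the odds-ratio interpretation. The intercept and coefficient values are hypothetical, picked only to make the arithmetic concrete; the point is that the ratio of the odds at two predictor values one unit apart equals e^B.

```python
import math

def odds(probability):
    """Convert a probability to odds: p / (1 - p)."""
    return probability / (1.0 - probability)

def probability_from_logit(logit):
    """Inverse of the logit: the probability that corresponds to a given log-odds."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical coefficients, chosen only for illustration.
intercept = -1.0
B = 0.7

odds_ratio = math.exp(B)

# Odds at two predictor values one unit apart.
odds_at_2 = odds(probability_from_logit(intercept + B * 2.0))
odds_at_3 = odds(probability_from_logit(intercept + B * 3.0))

print(f"odds ratio e^B      = {odds_ratio:.4f}")
print(f"odds(x=3)/odds(x=2) = {odds_at_3 / odds_at_2:.4f}")
```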
What is the dependent variable expressed as in logistic regression?
The natural log of the odds (the logit): the natural log of the probability of being in one group (0) divided by the probability of being in the other group (1).
What does the coefficient stand for in logistic regression?
The change in the logit (log of the odds) of the outcome variable for a one-unit increase in the predictor.
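A small sketch of what "change in the logit" means in practice. The intercept and coefficient are hypothetical values for illustration. Each one-unit step in the predictor shifts the logit by exactly B, while the matching change in probability differs from step to step.

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Probability that corresponds to a logit of z."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only.
intercept, B = -2.0, 0.8

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
probabilities = [inverse_logit(intercept + B * xv) for xv in xs]

# Step-to-step changes on the logit scale and on the probability scale.
logit_steps = [logit(b) - logit(a) for a, b in zip(probabilities, probabilities[1:])]
probability_steps = [b - a for a, b in zip(probabilities, probabilities[1:])]

for ls, ps in zip(logit_steps, probability_steps):
    print(f"change in logit = {ls:.4f}   change in probability = {ps:.4f}")
```

This is why the coefficient is read on the logit (or, after exponentiating, the odds) scale rather than as a fixed change in probability.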