Lecture 7A: Logistic Regression Flashcards

1
Q

What does logistic regression measure?

A

It measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities using the logistic function.

2
Q

When should you use a logistic model?

A
  • The independent variables are continuous
  • The data meets the assumptions of linear regression models
  • The data otherwise fits a linear model (normal distribution), but the target class is binary
3
Q

What does it mean to train the model?

A

It means finding the optimal coefficient values, such that we get the best predictive performance, or the best separation of the Y (1) and N (0) cases.

4
Q

Optimal Coefficients

A

● The optimal coefficients can be used to predict the outcome for unseen feature values (the x values in the equation)
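
As a minimal sketch of how trained coefficients turn unseen x values into a prediction: the coefficient values, the intercept, the sample, and the 0.5 threshold below are all hypothetical, chosen only for illustration.

```python
import math

def predict_proba(coefficients, intercept, x):
    """Return P(y = 1 | x) for one sample under a logistic model."""
    # Linear combination of the features, then the logistic function.
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(coefficients, intercept, x, threshold=0.5):
    """Classify as 1 when the predicted probability reaches the threshold."""
    return 1 if predict_proba(coefficients, intercept, x) >= threshold else 0

# Hypothetical values standing in for coefficients found during training.
coefficients = [0.8, -1.2]
intercept = 0.1

# An unseen sample with two feature values.
new_sample = [2.0, 0.5]
probability = predict_proba(coefficients, intercept, new_sample)
label = predict_label(coefficients, intercept, new_sample)
print(f"P(y=1) = {probability:.3f}, predicted class = {label}")
```

The threshold is a design choice rather than part of the model; 0.5 is only a common default.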

5
Q

Predictive Models

A

● Predictive models only “predict”, so they are expected to have “Errors”
○ The objective is to get as CLOSE as possible to reality
○ Error is the gap between the prediction and the reality
○ Process: Feed the model with labeled data and modify the parameters to minimize the ERROR (the training process)

6
Q

The importance of the model features can be assessed by…

A

○ Multiplying each coefficient by the standard deviation of its feature
○ Or converting the data set to standardized data before getting the coefficients
○ Higher absolute coefficient values indicate a larger influence of the corresponding features on the Outcome (Target Variable)
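
A small sketch of the first approach, multiplying each coefficient by the standard deviation of its feature. The feature names, data, and raw coefficients are hypothetical; in practice the coefficients would come from a fitted model.

```python
import statistics

# Hypothetical raw feature columns (not from the lecture).
features = {
    "age": [20.0, 30.0, 40.0, 50.0],
    "weight_kg": [60.0, 62.0, 64.0, 66.0],
}

# Hypothetical coefficients fitted on the raw (unstandardized) features.
raw_coefficients = {"age": 0.05, "weight_kg": 0.20}

# Scale each coefficient by the standard deviation of its feature so that
# features measured in different units become comparable.
scaled = {
    name: raw_coefficients[name] * statistics.pstdev(column)
    for name, column in features.items()
}

# Rank features by the absolute size of the scaled coefficient.
ranking = sorted(scaled, key=lambda name: abs(scaled[name]), reverse=True)
for name in ranking:
    print(f"{name}: raw={raw_coefficients[name]:.2f}, scaled={scaled[name]:.3f}")
```

In this made-up data, weight has the larger raw coefficient, but age ranks first after scaling because it varies over a much wider range.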

7
Q

Linear regression is similar to logistic regression, except…

A

Logistic regression predicts whether something is true or false, instead of predicting something continuous, like size.

8
Q

Instead of fitting a line, like we do in linear regression, in logistic regression, we fit…

A

An “S”-shaped “logistic function”. The “S” curve goes from zero to one.
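
To illustrate the “S” curve, here is a short sketch of the standard logistic function evaluated at a few points. The function name `logistic` and the sample inputs are choices made for this example.

```python
import math

def logistic(z):
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Outputs rise from near 0, pass through 0.5 at z = 0, and approach 1.
inputs = [-6, -2, 0, 2, 6]
outputs = [logistic(z) for z in inputs]
for z, p in zip(inputs, outputs):
    print(f"z = {z:>2}: {p:.4f}")
```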

9
Q

What is logistic regression usually used for?

A

It is usually used for classification

10
Q

Just like linear regression, logistic regression can work with what type of data?

A

Logistic regression can work with continuous data (like weight and age) and discrete data (like genotype and astrological sign).

11
Q

Logistic regression does not have the same concept of a “residual” that linear regression uses, so it can’t use least squares and it can’t calculate R^2. Instead, it uses…

A

Maximum Likelihood
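
A rough sketch of the idea behind maximum likelihood: score candidate coefficients by the log-likelihood of the observed labels and prefer the candidate with the higher score. The data set and both candidate coefficient pairs are made up for illustration.

```python
import math

def log_likelihood(intercept, slope, xs, ys):
    """Sum of log-probabilities the model assigns to the observed labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
        # log(p) when the label is 1, log(1 - p) when the label is 0.
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical one-feature data set with binary labels.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]

# Candidate 1: ignores x, so every sample gets probability 0.5.
flat = log_likelihood(0.0, 0.0, xs, ys)

# Candidate 2: probability of class 1 increases with x.
rising = log_likelihood(-3.5, 1.0, xs, ys)

print(f"flat: {flat:.3f}, rising: {rising:.3f}")
```

Maximum likelihood estimation searches for the coefficients that make this score as high as possible, rather than comparing only two candidates.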

12
Q

In summary, logistic regression can be used to

A

classify samples, and it can use different types of data (like size and/or genotype) to do that classification. It can also be used to assess what variables are useful for classifying samples

13
Q

How do we find optimal values of the coefficients?

A

By minimizing a cost function (also called a loss function or error function).

14
Q

Cost Function

A

Alternate term: Loss Function

15
Q

Gap between “prediction” and “reality” is prediction “…”

A

“Error”

16
Q

In Logistic Regression, how do we minimize error?

A

We feed the model with “labelled data” and continually modify its parameters (or coefficients) to minimize the ERROR (the training process).
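
The training loop described above can be sketched with plain gradient descent on the average log loss. This is one possible optimizer, not necessarily the one used in the lecture; the data, learning rate, and step count are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(b0, b1, xs, ys):
    """Average log loss of the model b0 + b1 * x on labelled data."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)

def train(xs, ys, learning_rate=0.1, steps=2000):
    """Repeatedly nudge the coefficients in the direction that lowers the loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for each labelled sample.
        errors = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Gradient of the average log loss with respect to b0 and b1.
        b0 -= learning_rate * sum(errors) / n
        b1 -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

# Hypothetical labelled data: one feature, binary label.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]

initial_loss = log_loss(0.0, 0.0, xs, ys)
b0, b1 = train(xs, ys)
final_loss = log_loss(b0, b1, xs, ys)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```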

17
Q

How do we know when we reached the “trained state” where the ERROR is minimal?

A

It’s a mathematical optimization problem, and there are various approaches.
In linear regression, we try to minimize the Mean Square Error (MSE).

Instead of using error values directly, we develop a function that measures the “cost” or “loss” related to the error, and we make that function “continuous”.

We also make the function “convex” so there is a clear “global minimum”.
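
For reference, the two cost functions can be written out as follows. The log loss (cross-entropy) form is the standard convex cost for logistic regression; the card does not name it, so treat it and the symbols ($n$ samples, labels $y_i$, one feature $x_i$, coefficients $\beta_0, \beta_1$) as assumptions of this sketch.

```latex
% Linear regression: mean square error between labels and predictions
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2

% Logistic regression: log loss, which is continuous and convex in the coefficients
J(\beta_0, \beta_1) = -\frac{1}{n} \sum_{i=1}^{n}
  \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right],
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}}
```

Minimizing $J$ is equivalent to maximizing the likelihood of the observed labels.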