Week 3 Part 1 Flashcards

1
Q

Why might we need to extend a model that uses dummy variables?

A

When we used dummy variables, the predicted y differed between the two categories by a fixed amount across all values of x. Sometimes we need an interaction variable, which allows the predicted y to differ between the two categories by an amount that varies with x.
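
As a minimal sketch of the difference, assume a fitted model of the form y(hat) = b0 + b1*x + b2*d + b3*x*d, where d is the dummy variable and xd the interaction variable. The coefficient values below are hypothetical and chosen only for illustration. With b3 = 0 the gap between the two categories is the same at every x; with b3 different from 0 the gap changes with x.

```python
def predict(x, d, b0=1.0, b1=0.5, b2=2.0, b3=0.0):
    """Predicted y for a model with a dummy d and an interaction term x*d.
    All coefficient values are hypothetical."""
    return b0 + b1 * x + b2 * d + b3 * x * d

def gap(x, b3):
    """Difference in predicted y between category d=1 and d=0 at a given x."""
    return predict(x, 1, b3=b3) - predict(x, 0, b3=b3)

for x in (0, 5, 10):
    # First gap: dummy only (constant). Second gap: with interaction (varies with x).
    print(x, gap(x, b3=0.0), gap(x, b3=0.3))
```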

2
Q

How can you assess the significance of the interaction variable?

A
  • A t-test for the individual significance of the dummy variable d and the interaction variable xd.
  • A partial F-test to evaluate the joint significance of d and xd.
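
The partial F-statistic compares the residual sum of squares (SSR) of the restricted model, which omits d and xd, with that of the unrestricted model. The following is a minimal sketch that assumes both sums of squares have already been computed; the numbers are hypothetical, and the p-value would still have to come from an F table or a statistics library.

```python
def partial_f(ssr_restricted, ssr_unrestricted, q, df_resid):
    """F = ((SSR_r - SSR_u) / q) / (SSR_u / df_resid).

    q        : number of coefficients tested jointly (here 2: d and xd)
    df_resid : residual degrees of freedom of the unrestricted model
    """
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df_resid)

# Hypothetical values for illustration only.
f_stat = partial_f(ssr_restricted=120.0, ssr_unrestricted=100.0, q=2, df_resid=96)
print(round(f_stat, 2))
```

A large F-statistic relative to the critical value suggests that d and xd are jointly significant.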
3
Q

Besides models where the explanatory variables are dummy variables, what other types of classification models can we build?

A

Classification models where the response variable is binary (e.g., yes/no, success/failure).

4
Q

What are the outcomes in a linear regression model where y is a binary variable?

A

In that case y is a discrete stochastic variable with only two possible outcomes (0 or 1).

5
Q

What is this linear regression model applied to a binary response variable called?

A

The linear probability model (LPM).

6
Q

What are the pros and cons of LPM?

A
  • Pros: It is simple to estimate and interpret.
  • Cons: The model can predict probabilities greater than 1 or less than 0, which is impossible for a probability.
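
The drawback can be seen in a small sketch. The data below are made up for illustration, and the fit uses the standard closed-form OLS formulas for a single explanatory variable.

```python
# Hypothetical data: binary response y and one explanatory variable x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Simple-regression OLS estimates: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

def lpm_predict(x_new):
    """Predicted 'probability' from the linear probability model."""
    return intercept + slope * x_new

# Predictions at the edges fall outside the [0, 1] range.
print(lpm_predict(0), lpm_predict(8))
```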
7
Q

What is a more suitable model for binary response variables than the LPM?

A

The logistic regression model.

8
Q

Why is the Logistic Regression Model better than the LPM for binary response variables?

A

The logistic regression model ensures that the predicted probabilities lie between 0 and 1 for all values of the explanatory variables.
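
A minimal sketch of why this holds: the model passes the linear predictor b0 + b1*x through the logistic function 1 / (1 + e^(-z)), which maps any real number into the interval between 0 and 1. The coefficients b0 and b1 below are hypothetical.

```python
import math

def logistic(z):
    """Logistic function, written in a numerically stable form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predicted_probability(x, b0=-0.25, b1=0.5):
    """P(y = 1 | x) under a logistic model with hypothetical coefficients."""
    return logistic(b0 + b1 * x)

probs = [predicted_probability(x) for x in range(-20, 21)]
print(min(probs), max(probs))  # both stay inside (0, 1)
```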

9
Q

What is used to estimate logistic regression?

A

Maximum likelihood estimation (MLE) instead of OLS.
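
To illustrate the idea, the sketch below maximizes the log-likelihood of a logistic model with one explanatory variable by plain gradient ascent. The data, step size, and iteration count are assumptions made for this example; statistical software commonly uses faster Newton-type algorithms instead.

```python
import math

# Hypothetical data: binary response y and one explanatory variable x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def prob(b0, b1, xi):
    """P(y = 1 | xi) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))

def log_likelihood(b0, b1):
    """Sum over observations of y*log(p) + (1 - y)*log(1 - p)."""
    return sum(
        yi * math.log(prob(b0, b1, xi)) + (1 - yi) * math.log(1.0 - prob(b0, b1, xi))
        for xi, yi in zip(x, y)
    )

b0, b1 = 0.0, 0.0
step = 0.01  # assumed step size, small enough for this data set
for _ in range(20000):
    residuals = [yi - prob(b0, b1, xi) for xi, yi in zip(x, y)]
    # Gradient of the log-likelihood with respect to b0 and b1.
    b0 += step * sum(residuals)
    b1 += step * sum(r * xi for r, xi in zip(residuals, x))

print(b0, b1, log_likelihood(b0, b1))
```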

10
Q

Is the interpretation the same in a logistic regression as in the linear models?

A

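No, the coefficients have a different interpretation. In a logistic regression a coefficient gives the change in the log-odds of y = 1 for a one-unit increase in x, not the change in y itself.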

11
Q

What is sometimes used to interpret the logistic model?

A

The odds ratio.

12
Q

How do you calculate accuracy?

A
  • First, convert the predicted y(hat) values to binary predictions:
    1 if y(hat) ≥ 0.5 and 0 if y(hat) < 0.5.
  • Then, compare the binary predictions with the observed binary values of the response variable.
  • The accuracy is calculated as:
    Accuracy = (number of correct predictions) / (total number of predictions) × 100
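
The steps above can be sketched as follows. The function name and the example values are hypothetical; the 0.5 cutoff is the one stated in the card.

```python
def accuracy(y_true, y_hat, cutoff=0.5):
    """Percentage of observations whose binary prediction matches the observed value."""
    # Step 1: convert predicted values to binary predictions.
    y_pred = [1 if p >= cutoff else 0 for p in y_hat]
    # Step 2: count the predictions that agree with the observed responses.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    # Step 3: correct predictions as a percentage of all predictions.
    return 100 * correct / len(y_true)

y_true = [1, 0, 1, 1, 0]
y_hat = [0.8, 0.3, 0.4, 0.6, 0.5]
print(accuracy(y_true, y_hat))  # 60.0
```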
12
Q

What is the odds ratio?

A

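The ratio between the probability of success P(y=1) and the probability of failure P(y=0): P(hat) / (1 - P(hat)). Strictly speaking this quantity is the odds; the term odds ratio is also used for the ratio of two such odds.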
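
A short sketch of the calculation, with hypothetical probabilities chosen for illustration:

```python
def odds(p_hat):
    """Probability of success divided by probability of failure."""
    return p_hat / (1.0 - p_hat)

print(odds(0.5))   # 1.0: success and failure are equally likely
print(odds(0.75))  # 3.0: success is three times as likely as failure
```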

13
Q

What is often used as a measure of goodness of fit for binary choice models?

A

Accuracy

14
Q

When can accuracy be misleading?

A

In cases where there are many 0s and few 1s, or vice versa.
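
A small sketch with made-up data shows the problem: when 95 of 100 observations are 0, a model that always predicts 0 scores high accuracy while never identifying a single 1.

```python
# Hypothetical imbalanced sample: 95 zeros and 5 ones.
y_true = [0] * 95 + [1] * 5
# A useless "model" that predicts 0 for every observation.
y_pred = [0] * 100

accuracy = 100 * sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
ones_found = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)

print(accuracy)    # 95.0
print(ones_found)  # 0
```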
