Ch 8 Notes Flashcards

1
Q

8.1 Regression Models with Interaction Variables

Sample regression equation

A

𝑦̂=𝑏_0+𝑏_1 π‘₯_1+𝑏_2 π‘₯_2+…+𝑏_π‘˜ π‘₯_π‘˜.

2
Q

8.1 Regression Models with Interaction Variables

Sometimes it is natural for the partial effect of one predictor variable to depend on the value of another predictor variable

Ex: Price of House

A

-Assumes that an additional bedroom results in the same increase in house price regardless of square footage

-This assumption may be unrealistic: for larger houses, an additional bedroom often results in a larger increase in price

-This suggests an interaction effect between the number of bedrooms and the square footage of the house

3
Q

8.1 Regression Models with Interaction Variables

Interaction Variables

A

-Capture interaction effects by incorporating interaction variables into the regression model

-An interaction variable is the product of two interacting predictor variables

4
Q

8.1 Regression Models with Interaction Variables

Interaction Effect

A

Occurs when the partial effect of a predictor variable on the response depends on the value of another predictor variable

5
Q

8.1 Regression Models with Interaction Variables

Three types of interaction variables

A
  1. Interaction between two dummy variables
  2. Interaction between a dummy variable and numerical variable
  3. Interaction between two numerical variables
6
Q

8.1: Regression Models with Interaction Variables

Regression Model with two dummy variables 𝑑_1 and 𝑑_2, and an interaction variable 𝑑_1 𝑑_2

𝑦= 𝛽_0+𝛽_1 𝑑_1+𝛽_2 𝑑_2+𝛽_3 𝑑_1 𝑑_2+πœ€

A

Estimated model: 𝑦̂=𝑏_0+𝑏_1 𝑑_1+𝑏_2 𝑑_2+𝑏_3 𝑑_1 𝑑_2

**The partial effect of 𝑑_1 on 𝑦̂ is 𝑏_1+𝑏_3 𝑑_2, which depends on 𝑑_2.

𝑑_2=0: the partial effect of 𝑑_1 on 𝑦̂ is 𝑏_1

𝑑_2=1: the partial effect of 𝑑_1 on 𝑦̂ is 𝑏_1+𝑏_3

**The partial effect of 𝑑_2 on 𝑦̂ is 𝑏_2+𝑏_3 𝑑_1, which depends on 𝑑_1.

𝑑_1=0: the partial effect of 𝑑_2 on 𝑦̂ is 𝑏_2

𝑑_1=1: the partial effect of 𝑑_2 on 𝑦̂ is 𝑏_2+𝑏_3
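A minimal numeric sketch of these partial effects, using made-up coefficient values (not from the text):

```python
# Hypothetical estimated coefficients (illustrative values only)
b0, b1, b2, b3 = 20.0, 5.0, 3.0, 2.0

def y_hat(d1, d2):
    """Prediction from y^ = b0 + b1*d1 + b2*d2 + b3*d1*d2."""
    return b0 + b1 * d1 + b2 * d2 + b3 * d1 * d2

# Partial effect of d1 (change in y^ as d1 goes 0 -> 1) depends on d2:
effect_d1_when_d2_is_0 = y_hat(1, 0) - y_hat(0, 0)  # b1 = 5.0
effect_d1_when_d2_is_1 = y_hat(1, 1) - y_hat(0, 1)  # b1 + b3 = 7.0
```

Note that the two differences recover exactly 𝑏_1 and 𝑏_1+𝑏_3, matching the card above.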

7
Q

8.1: Regression Models with Interaction Variables

Regression Model with a numerical variable π‘₯, a dummy variable 𝑑, and an interaction variable π‘₯𝑑

𝑦= 𝛽_0+𝛽_1 π‘₯+𝛽_2 𝑑+𝛽_3 π‘₯𝑑+πœ€

A

The estimated model is 𝑦̂=𝑏_0+𝑏_1 π‘₯+𝑏_2 𝑑+𝑏_3 π‘₯𝑑.

**The partial effect of π‘₯ on 𝑦̂ is 𝑏_1+𝑏_3 𝑑, this depends on 𝑑.

𝑑=0: the partial effect of π‘₯ on 𝑦̂ is 𝑏_1

𝑑=1: the partial effect of π‘₯ on 𝑦̂ is 𝑏_1+𝑏_3

**The partial effect of 𝑑 on 𝑦̂ is 𝑏_2+𝑏_3 π‘₯, which depends on π‘₯.

Difficult to interpret because π‘₯ is numerical

Common to interpret this partial effect at the sample mean π‘₯Μ…
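A small sketch with hypothetical coefficients showing how the dummy switches the slope, and how the effect of 𝑑 is evaluated at a sample mean:

```python
# Hypothetical coefficients for y^ = b0 + b1*x + b2*d + b3*x*d (illustrative only)
b0, b1, b2, b3 = 10.0, 2.0, 4.0, 0.5

def y_hat(x, d):
    return b0 + b1 * x + b2 * d + b3 * x * d

# The slope with respect to x switches with the dummy d:
slope_when_d_is_0 = y_hat(6.0, 0) - y_hat(5.0, 0)  # b1 = 2.0
slope_when_d_is_1 = y_hat(6.0, 1) - y_hat(5.0, 1)  # b1 + b3 = 2.5

# Partial effect of d, evaluated at a hypothetical sample mean x_bar:
x_bar = 5.0
effect_d_at_mean = b2 + b3 * x_bar  # 6.5
```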

8
Q

8.1: Regression Models with Interaction Variables

Regression model with two numerical variables, π‘₯_1 and π‘₯_2, and an interaction variable π‘₯_1 π‘₯_2

𝑦= 𝛽_0+𝛽_1 π‘₯_1+𝛽_2 π‘₯_2+𝛽_3 π‘₯_1 π‘₯_2+πœ€

A

The estimated model is 𝑦̂=𝑏_0+𝑏_1 π‘₯_1+𝑏_2 π‘₯_2+𝑏_3 π‘₯_1 π‘₯_2.

The partial effect of π‘₯_1 on 𝑦̂ is 𝑏_1+𝑏_3 π‘₯_2, this depends on π‘₯_2.

The partial effect of π‘₯_2 on 𝑦̂ is 𝑏_2+𝑏_3 π‘₯_1, this depends on π‘₯_1.

The partial effects of both variables are difficult to interpret.

**Consider the partial effects at the sample means π‘₯Μ…_1 and π‘₯Μ…_2.

At π‘₯Μ…_1, the partial effect of π‘₯_2 on 𝑦̂ is 𝑏_2+𝑏_3 π‘₯Μ…_1.

𝑏_3>0: the partial effect of π‘₯_2 on 𝑦̂ will be greater (smaller) at π‘₯_1 values higher (lower) than π‘₯Μ…_1
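A numeric sketch of this last point, with made-up coefficients and a made-up sample mean:

```python
# Hypothetical coefficients for y^ = b0 + b1*x1 + b2*x2 + b3*x1*x2
b0, b1, b2, b3 = 5.0, 1.5, 2.0, 0.4

# Partial effect of x2, evaluated at a hypothetical sample mean of x1:
x1_bar = 3.0
effect_x2_at_mean = b2 + b3 * x1_bar

# With b3 > 0, the effect of x2 is larger above the mean and smaller below it:
effect_x2_above = b2 + b3 * (x1_bar + 1.0)
effect_x2_below = b2 + b3 * (x1_bar - 1.0)
```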

9
Q

8.2 Regression Models for Nonlinear Relationships

Linear Regression

A

Often justified on the basis of its computational simplicity

10
Q

8.2 Regression Models for Nonlinear Relationships

Implication of a simple linear regression model

A

If π‘₯ goes up by one unit, then the expected 𝑦 changes by 𝛽_1 regardless of π‘₯.

In many applications, the relationship between the variables cannot be represented by a straight line

11
Q

8.2 Regression Models for Nonlinear Relationships

Linearity Assumption

A

Places the restriction of linearity on the parameters and not on the variables

12
Q

8.2 Regression Models for Nonlinear Relationships

We can capture many interesting nonlinear relationships within the framework of the linear regression model

A

By simple transformations of the response and/or predictor variables

13
Q

8.2 Regression Models for Nonlinear Relationships

In microeconomics, a firm’s average cost curve tends to be β€œU-shaped”

A
14
Q

8.2 Regression Models for Nonlinear Relationships

Due to economies of scale, the average cost 𝑦 of a firm

A

Initially decreases as output π‘₯ increases

As π‘₯ increases beyond a certain point, its impact on 𝑦 turns positive.

Other applications show the influence of the predictor variable initially positive but then turning negative, leading to an β€œinverted U-shape”

15
Q

8.2 Regression Models for Nonlinear Relationships

Quadratic regression model is appropriate when

A

The slope capturing the influence of π‘₯ on 𝑦 changes in magnitude as well as in sign.

𝑦=𝛽_0+𝛽_1 π‘₯+𝛽_2 π‘₯^2+πœ€

Model can be extended to include multiple predictor variables

16
Q

8.2: Regression Models for Nonlinear Relationships

B_2 > 0

A

U-shaped curve; 𝑦̂ attains a minimum

17
Q

8.2: Regression Models for Nonlinear Relationships

B_2 < 0

A

Inverted U-shaped curve; 𝑦̂ attains a maximum

18
Q

8.2: Regression Models for Nonlinear Relationships

Use OLS to obtain

A

Sample regression equation
𝑦̂=𝑏_0+𝑏_1 π‘₯+𝑏_2 π‘₯^2

Cannot interpret 𝑏_1 in the usual way; focus on 𝑏_2.

The partial effect of π‘₯ on 𝑦̂ can be approximated by 𝑏_1+γ€–2𝑏〗_2 π‘₯.

19
Q

8.2: Regression Models for Nonlinear Relationships

𝑦̂ reaches a maximum/minimum at

A

π‘₯=(βˆ’π‘_1)/(2𝑏_2 ).

Maximum when 𝑏_2<0

Minimum when 𝑏_2>0

The turning point is a minimum when 2𝑏_2 is positive and a maximum when 2𝑏_2 is negative
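A quick sketch tying the last two cards together, with hypothetical coefficients chosen so that 𝑏_2>0 (a U-shape):

```python
# Hypothetical quadratic fit y^ = b0 + b1*x + b2*x^2 with b2 > 0 (illustrative only)
b0, b1, b2 = 100.0, -8.0, 2.0

def y_hat(x):
    return b0 + b1 * x + b2 * x**2

def partial_effect(x):
    """Approximate partial effect of x on y^: b1 + 2*b2*x."""
    return b1 + 2 * b2 * x

# The turning point is at x = -b1 / (2*b2); a minimum here because b2 > 0
x_star = -b1 / (2 * b2)  # 2.0
```

The partial effect is negative below the turning point, zero at it, and positive above it, which is exactly the U-shape.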

20
Q

8.2: Regression Models for Nonlinear Relationships

Use ____ to compare linear and quadratic models

A

Adjusted R^2 (the quadratic model has an extra parameter)

21
Q

8.2: Regression Models for Nonlinear Relationships

Another common transformation that captures nonlinearity is based on

A

Natural Logarithm

22
Q

8.2: Regression Models for Nonlinear Relationships

Natural Logarithm

A

-Convert changes in a variable into percent changes

-Useful because many relationships are naturally expressed in terms of percentages

Ex: Income, House Prices, Sales

-Use intuitive and statistical measures to determine the appropriate form

23
Q

8.2: Regression Models for Nonlinear Relationships

In a log-log regression model

A

Both the response and predictor are transformed using natural logs:

ln⁑(𝑦)=𝛽_0+𝛽_1 ln⁑(π‘₯)+πœ€

Relationship is a curve that depends on slope coefficient 𝛽_1

24
Q

8.2: Regression Models for Nonlinear Relationships

B_1 > 0

A

Slope up

25
Q

8.2: Regression Models for Nonlinear Relationships

B_1 < 0

A

Slope down

26
Q

8.2: Regression Models for Nonlinear Relationships

𝛽_1 measures the approximate percentage change in 𝐸(𝑦) when

A

**When π‘₯ increases by 1%.

0<𝛽_1<1: positive relationship; 𝐸(𝑦) increases at a slower rate

𝛽_1>1: positive relationship; 𝐸(𝑦) increases at a faster rate

𝛽_1<0: negative relationship; 𝐸(𝑦) decreases at a slower rate

𝛽_1 is a measure of elasticity

27
Q

8.2: Regression Models for Nonlinear Relationships

Log Log Regression Model

A

Nonlinear in the variables but it is still linear in coefficients

Simply applying the anti-log (exponential) transformation to the predicted value of ln⁑(𝑦) underestimates the expected value of 𝑦, so a correction term 𝑠_𝑒^2⁄2 is added

Predictions are made by 𝑦̂=𝑒π‘₯𝑝(𝑏_0+𝑏_1 ln⁑(π‘₯)+(𝑠_𝑒^2)⁄2), use unrounded coefficients.
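A small sketch of the back-transformation, using made-up coefficients and a made-up standard error:

```python
import math

# Hypothetical log-log fit ln(y)^ = b0 + b1*ln(x), with standard error s_e
# (all values are illustrative, not from the text)
b0, b1, s_e = 1.2, 0.8, 0.3

def predict_y(x):
    """Back-transform with the s_e^2 / 2 correction term."""
    return math.exp(b0 + b1 * math.log(x) + s_e**2 / 2)

naive = math.exp(b0 + b1 * math.log(10.0))  # plain anti-log: biased low
corrected = predict_y(10.0)                 # larger by a factor exp(s_e^2 / 2)
```

The correction multiplies the naive prediction by exp(𝑠_𝑒^2⁄2) > 1, which is why skipping it underestimates 𝐸(𝑦).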

28
Q

8.2: Regression Models for Nonlinear Relationships

Logarithmic regression model

A

Semi-log model that transforms only the predictor variable: y=𝛽_0+𝛽_1 ln⁑(π‘₯)+πœ€

Attractive when only the predictor variable is better captured in percentages

𝛽_1βˆ—0.01 measures the approximate change in 𝐸(𝑦) when π‘₯ increases by 1%.

Predictions are made by 𝑦̂=𝑏_0+𝑏_1 ln⁑(π‘₯).

29
Q

8.2: Regression Models for Nonlinear Relationships

Exponential regression model

A

Semi-log model that transforms only the response variable: ln(y)=𝛽_0+𝛽_1 π‘₯+πœ€.

𝛽_1βˆ—100 measures the approximate percentage change in 𝐸(𝑦) when π‘₯ increases one unit

Predictions are made by 𝑦̂=𝑒π‘₯𝑝(𝑏_0+𝑏_1 π‘₯+(𝑠_𝑒^2)⁄2)

30
Q

8.2: Regression Models for Nonlinear Relationships

y = B_0 + B_1x + E

A

Predicted Value: y^ = b_0 + b_1x

Estimated Slope Coefficient: b_1 measures the change in y^ when x increases by one unit

31
Q

8.2: Regression Models for Nonlinear Relationships

ln(y) = B_0 + B_1ln(x) + e

A

Predicted Value: y^ = exp(b_0 + b_1ln(x) + s_e^2/2)

Estimated slope coefficient: b_1 measures the approximate percentage change in y^ when x increases by 1%

32
Q

8.2: Regression Models for Nonlinear Relationships

y = B_0 + B_1ln(x) + E

A

Predicted Value: y^ = b_0 + b_1ln(x)

Estimated Slope Coefficient: b_1 x 0.01 measures the approximate change in y^ when x increases by 1%

33
Q

8.2: Regression Models for Nonlinear Relationships

ln(y) = B_0 + B_1x + E

A

Predicted Value: y^ = exp(b_0 + b_1x + s_e^2/2)

Estimated Slope Coefficient: b_1 x 100 measures the approximate percentage change in y^ when x increases by one unit

34
Q

8.2: Regression Models for Nonlinear Relationships

We cannot use adjusted 𝑅^2 to compare models that use 𝑦 as the response to models that use ln⁑(𝑦) as the response. What can we do instead?

A

Compute the correlation between the observed and predicted values of each model: π‘Ÿ_(𝑦𝑦̂ ).

Then use an alternative way to compute 𝑅^2=(π‘Ÿ_(𝑦𝑦̂ ) )^2.
35
Q

8.3 Cross-Validation Methods

All of the model evaluation measures thus far assess

A

predictability in the sample data that was used to build the model.

These measures do not help gauge how well an estimated model will predict an unseen sample

It is possible a model performs well with the sample data used for estimation, but then performs miserably once a new sample is evaluated

36
Q

8.3: Cross-Validation Methods

Overfitting occurs

A

**When an estimated model describes quirks of the data rather than the relationships between variables.

Model becomes too complex

Fails to describe the behavior in a new sample

Predictive power for new samples is compromised

37
Q

8.3: Cross-Validation Methods

Cross Validation

A

Measure of the predictive power of a model based on a set of data not used in estimation

**Technique that evaluates predictive models by partitioning the sample.

Training set: use to build/train the model

Validation set: use to evaluate/validate the model

Use the root mean square error (RMSE) on the validation set.

𝑅𝑀𝑆𝐸=√(βˆ‘(𝑦_π‘–βˆ’π‘¦Μ‚_𝑖 )^2⁄𝑛^βˆ— )=√(βˆ‘π‘’_𝑖^2⁄𝑛^βˆ— )

𝑦̂ is a true prediction for an observation in the validation set.

𝑛^βˆ— is the number of observations in the validation set.

RMSE is typically lower for the training set than for the validation set.
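The RMSE formula above can be sketched directly, here on a small made-up validation set:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error over n* validation observations."""
    sq_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical validation-set values (illustrative only)
y_actual = [10.0, 12.0, 9.0, 14.0]
y_pred = [11.0, 11.0, 10.0, 13.0]
validation_rmse = rmse(y_actual, y_pred)  # every error is 1, so RMSE = 1.0
```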

38
Q

8.3: Cross-Validation Methods

Holdout Methods

A

**Partitions the sample data set into two independent and mutually exclusive data sets.

Partition the sample into two parts: training and validation sets.

Use the training set to estimate competing models.

Use the estimates from the training set to predict the response in the validation set.

Calculate RMSE (or other performance measures) for each competing model. The preferred model will have the smallest RMSE.

We would like the model with the best performance in the training set to also have the best performance in the validation set
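The holdout steps above can be sketched end to end on simulated data (everything below is synthetic, for illustration only):

```python
import math
import random

# Simulated sample: y = 3 + 2x + noise (all values synthetic)
random.seed(42)
data = [(float(x), 3.0 + 2.0 * x + random.gauss(0, 1)) for x in range(50)]

# Partition into independent training and validation sets (75/25 split)
random.shuffle(data)
split = int(0.75 * len(data))
training, validation = data[:split], data[split:]

# Estimate a simple linear model on the training set only (closed-form OLS)
n = len(training)
mean_x = sum(x for x, _ in training) / n
mean_y = sum(y for _, y in training) / n
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in training)
      / sum((x - mean_x) ** 2 for x, _ in training))
b0 = mean_y - b1 * mean_x

# Use the training-set estimates to predict the validation set, then score RMSE
sq_errors = [(y - (b0 + b1 * x)) ** 2 for x, y in validation]
validation_rmse = math.sqrt(sum(sq_errors) / len(sq_errors))
```

Competing models would each be scored this way, and the one with the smallest validation RMSE preferred.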

39
Q

8.3: Cross-Validation Methods

Conflicting Results are a sign of

A

overfitting to the training set