Lecture 7 Flashcards
Regression basics video Equation
Y = a(intercept) + b(slope) * X
Intercept:
a
Slope:
b
Independent variable: X
Dependent variable: Y
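A minimal sketch of how a and b can be estimated by least squares. The data (day number and number of comments) and the function name `fit_line` are made up for illustration; in practice software does this step.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: day number (X) and number of comments (Y).
days = [1, 2, 3, 4, 5]
comments = [12, 15, 21, 24, 28]

a, b = fit_line(days, comments)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

With this made-up data, each extra day goes with about 4.1 more comments, and the fitted value at X = 0 is about 7.7.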
R^2 =
How much of the variation in the dependent variable is explained by the model?
R^2 = 0.9033 means the model explains 90.33% of the variation in the number of comments
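A small sketch of how R^2 can be computed from observed and predicted values, as 1 minus the ratio of unexplained to total variation. The numbers are hypothetical and are not the data behind the 0.9033 above.

```python
def r_squared(ys, predictions):
    """Share of the variation in ys that the predictions explain."""
    mean_y = sum(ys) / len(ys)
    # Unexplained variation: squared gaps between observed and predicted.
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    # Total variation: squared gaps between observed values and their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical observed comments and predictions from an assumed fitted line.
comments = [12, 15, 21, 24, 28]
predicted = [11.8, 15.9, 20.0, 24.1, 28.2]

r2 = r_squared(comments, predicted)
print(f"R^2 = {r2:.4f}")
```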
Concepts:
1. Slope
A unit change in X leads to a change of β units in Y
Ex. With every new day (whatever the time unit is) the number of comments is
Concepts: 2. Intercept
Average value of Y when X=0
Intercept interpretation not always possible (if the value 0 does not make sense for X)
Dependent variable: Y. Independent variable: X. Interpretation of β (slope) =
A unit change in X leads to a change of β units in Y
Dependent variable: Y. Independent variable: log(X). Interpretation of β (slope) =
A 1% change in X leads to a change of approximately β/100 units in Y
Dependent variable: log(Y). Independent variable: X. Interpretation of β (slope) =
A unit change in X leads to a change of approximately (100 × β)% in Y
Dependent variable: log(Y). Independent variable: log(X). Interpretation of β (slope) =
A 1% change in X leads to a change of approximately β% in Y
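A quick numeric check of the log interpretations, assuming natural logs and a made-up slope of β = 0.05. The rules of thumb are approximations that work well for small β: a 1% change in log(X) moves Y by roughly β/100 units, and a unit change in X moves log(Y) by roughly (100 × β)%.

```python
import math

beta = 0.05  # hypothetical slope, chosen only for illustration

# Y on log(X): change in Y (in units) when X increases by 1%.
level_log_change = beta * math.log(1.01)

# log(Y) on X: percent change in Y when X increases by one unit.
log_level_pct = (math.exp(beta) - 1) * 100

# log(Y) on log(X): percent change in Y when X increases by 1%.
log_log_pct = (1.01 ** beta - 1) * 100

print(f"Y ~ log(X):      {level_log_change:.6f} units (rule of thumb: {beta / 100})")
print(f"log(Y) ~ X:      {log_level_pct:.3f}% (rule of thumb: {100 * beta}%)")
print(f"log(Y) ~ log(X): {log_log_pct:.4f}% (rule of thumb: {beta}%)")
```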
What to look at with a linear regression model? (2X)
- Is the coefficient for the number of website visits statistically significant? A coefficient is statistically significant when its p-value < 0.05 (5% is a standard significance level: we accept a 5% probability of concluding there is an effect when in fact there is none)
- Is there an effect? If yes, what is the direction (+/-) and magnitude of the effect on purchase value?
If no, is there some potential explanation?
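A rough sketch of the significance check for a slope, using made-up data on website visits and purchase value. The function name `slope_significance` is hypothetical. Note the assumption: the p-value here uses a normal approximation, while statistical software normally uses the t distribution, so small-sample results will differ somewhat.

```python
import math
from statistics import NormalDist


def slope_significance(xs, ys):
    """Return (slope, standard error, t statistic, approximate p-value)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance with n - 2 degrees of freedom (two estimated parameters).
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_b = math.sqrt(sigma2 / sxx)
    t_stat = b / se_b
    # Two-sided p-value, normal approximation (an assumption of this sketch).
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return b, se_b, t_stat, p_value


# Hypothetical data: website visits (X) and purchase value (Y).
visits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
purchase_value = [12, 15, 14, 19, 21, 20, 25, 27, 26, 31]

b, se_b, t_stat, p_value = slope_significance(visits, purchase_value)
direction = "positive" if b > 0 else "negative"
print(f"slope = {b:.2f} ({direction}), t = {t_stat:.2f}, significant: {p_value < 0.05}")
```

The sign of the slope gives the direction of the effect and its size gives the magnitude; both are only interpreted when the p-value is below 0.05.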
Regression model workflow: 5 steps
1. Model specification based on theory and logic
   - Which variables to include?
   - Possible interactions?
2. Estimate parameters using software
3. Interpret coefficients (significant coefficients only)
   - Direction and magnitude
4. Evaluate model
   - Overall model significance
   - Model fit
5. Use model for prediction
Model specification: linear regression
Which variables to include?
5X
- Marketing variables
- Customer characteristics
- Product characteristics
- Competitor activity
- Seasonality
What leads to biased results?
Omitting relevant variables or incorrectly specifying the relationship between variables
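A small illustration of omitted-variable bias, under made-up assumptions: Y is generated as 1 + 2 × advertising + 3 × season, and the two predictors move together. Regressing Y on advertising alone then attributes part of the seasonal effect to advertising. All names and numbers are hypothetical.

```python
def simple_slope(xs, ys):
    """Least-squares slope of ys on xs (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical predictors that are positively correlated with each other.
advertising = [1, 2, 3, 4, 5, 6]
season = [2, 1, 4, 3, 6, 5]

# Noise-free outcome with a known advertising effect of 2.
TRUE_ADVERTISING_EFFECT = 2
sales = [1 + TRUE_ADVERTISING_EFFECT * a + 3 * s for a, s in zip(advertising, season)]

# Model that omits the relevant variable "season".
short_slope = simple_slope(advertising, sales)
print(f"true effect = {TRUE_ADVERTISING_EFFECT}, estimated without season = {short_slope:.2f}")
```

Here the estimated slope is well above 2 because advertising also picks up the effect of the omitted seasonal variable.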
ProductSales_i = β_0 + β_1 VolumeOwned_i + β_2 VolumeEarned_i
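A sketch of how a model with two predictors like this one could be estimated by least squares (normal equations), written in plain Python. The data are hypothetical and noise-free, generated with β_0 = 1, β_1 = 2, β_2 = 3 so the result can be checked; the helper names `solve` and `fit_ols` are made up, and real analyses would use statistical software instead.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(rows, ys):
    """Return [beta_0, beta_1, ...] for ys regressed on the predictors in rows."""
    design = [[1.0, *row] for row in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data, generated without noise from known coefficients.
volume_owned = [1, 2, 3, 4, 5, 6]
volume_earned = [2, 1, 4, 3, 6, 5]
product_sales = [1 + 2 * o + 3 * e for o, e in zip(volume_owned, volume_earned)]

betas = fit_ols(list(zip(volume_owned, volume_earned)), product_sales)
print("beta_0, beta_1, beta_2 =", [round(b, 4) for b in betas])
```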