Week 3: Multivariate regression and inference in regression Flashcards
Non-directional hypothesis
H0: β = 0
H1: β ≠ 0
- I expect an effect
- Implies a two-tailed hypothesis test that splits alpha into both ‘tails’ of the distribution
- Less precise (the direction of the effect is not specified)
Directional hypothesis
- H0: β ≥ 0; H1: β < 0 (negative)
- H0: β ≤ 0; H1: β > 0 (positive)
- I expect a negative/positive effect
- Implies a one-tailed hypothesis test with alpha in left (negative) OR right (positive) tail of the distribution
- The critical value is closer to 0 than in a two-tailed test at the same alpha, so the null hypothesis is easier to reject when the effect runs in the expected direction.
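To make the difference between the two kinds of test concrete, here is a minimal sketch that computes the critical value for each. It assumes a large sample, so the standard normal (Z) distribution stands in for the t-distribution; the helper name z_critical is made up for this illustration.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Return the positive critical value for a one- or two-tailed Z test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Two-tailed: alpha is split, alpha/2 in each tail.
    # One-tailed: all of alpha sits in a single tail.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed = z_critical(0.05, tails=2)  # about 1.96
one_tailed = z_critical(0.05, tails=1)  # about 1.645
print(round(two_tailed, 3), round(one_tailed, 3))
```

With alpha = 0.05 the one-tailed critical value (about 1.645) is closer to 0 than the two-tailed one (about 1.96). For a negative directional hypothesis the same value is used with a minus sign.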
T-distribution
Used when the data are approximately normally distributed (they follow a bell shape) but the population variance is unknown and has to be estimated from the sample. The shape of the t-distribution depends on the degrees of freedom of the data set: total number of observations minus number of independent variables minus 1 (n-k-1).
When the degrees of freedom (n-k-1) are above roughly 30, the t and Z distributions start to look the same; SO, when n-k-1 is big, Z can be used as an approximation (it’s much easier!)
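The convergence of t towards Z can be checked numerically. The sketch below is an illustration only: it uses the standard library, integrates the t density with Simpson's rule and finds the two-tailed critical value by bisection. The function names t_pdf, t_cdf and t_critical are made up here; in practice a statistics package or a t-table would be used.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Two-tailed critical value, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)  # two-tailed Z critical value, alpha = 0.05
for df in (5, 10, 30, 100):
    print(df, round(t_critical(0.05, df), 3), round(z, 3))
```

The t critical value is always a little larger than the Z value and shrinks towards it as the degrees of freedom grow: roughly 2.228 at 10 degrees of freedom and 2.042 at 30, against about 1.96 for Z.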
Intercept/constant
The predicted value of the dependent variable when all independent variables are 0
Slope
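How steep the regression line is: the expected change in the dependent variable for a one-unit increase in the independent variable (in a multivariate model, holding the other independent variables constant)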
R-squared
A statistical measure in a regression model that gives the proportion of variance in the dependent variable that is explained by the independent variables.
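As a small worked example, R-squared can be computed as 1 minus the residual sum of squares divided by the total sum of squares. The data below are invented for illustration, and the predictions come from the fitted line y = 2.2 + 0.6x for x = 1 to 5.

```python
def r_squared(observed, predicted):
    """Share of the variance in the observed values explained by the predictions."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_resid / ss_total

# Made-up data: observed outcomes and the values predicted by the regression line
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(round(r_squared(y, y_hat), 2))
```

Here the residual sum of squares is 2.4 and the total sum of squares is 6, so R-squared is 0.6: the model explains 60% of the variance in the dependent variable.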
Dummy variables
A variable that takes a binary value (0 or 1) to indicate the absence or presence of some categorical effect that may be expected to shift the outcome.
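A short sketch of how a dummy variable works in practice, using invented data and a hypothetical helper fit_line for a regression with one independent variable. With a single dummy, the intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data: a categorical variable and an outcome
groups = ["control", "treated", "control", "treated", "treated", "control"]
outcome = [10.0, 14.0, 12.0, 15.0, 16.0, 11.0]

# Code the category as a dummy: 1 = treated (present), 0 = control (absent)
dummy = [1 if g == "treated" else 0 for g in groups]

intercept, slope = fit_line(dummy, outcome)
print(round(intercept, 2), round(slope, 2))
```

The control group averages 11 and the treated group 15, so the intercept is 11 and the dummy's coefficient is 4: being in the treated group shifts the predicted outcome up by 4.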
Why multivariate regression
- Because more than one independent variable influences the dependent variable
- Fuller explanation, better predictions
- Find the separate contribution of each independent variable, holding the others constant
Multivariate hypothesis testing
t = b/SE
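- t: The t-value, or t-score, is the ratio of the estimated coefficient to its standard error; it shows how many standard errors the estimate lies from the value under the null hypothesis (0)
- b: Estimated slope (regression coefficient) of the independent variable being tested
- SE: Standard error of b
If |t| > t critical (two-tailed test), then reject the null hypothesis; otherwise do not reject it. For a one-tailed test, t must also lie in the tail named by H1.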
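The rule t = b/SE can be sketched for the simplest case of one independent variable (k = 1). The data are invented, the function name slope_t_test is hypothetical, and the critical value is passed in by hand from a t-table (3.182 for a two-tailed test at alpha = 0.05 with 3 degrees of freedom).

```python
import math

def slope_t_test(x, y, t_critical):
    """Return (b, se, t, reject) for a two-tailed test of H0: slope = 0."""
    n = len(x)
    k = 1  # number of independent variables in this sketch
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                    # estimated slope
    a = mean_y - b * mean_x          # estimated intercept
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - k - 1                   # degrees of freedom
    se = math.sqrt(ss_resid / df / sxx)  # standard error of the slope
    t = b / se
    reject = abs(t) > t_critical     # two-tailed decision rule
    return b, se, t, reject

# Made-up data with n = 5, so df = 5 - 1 - 1 = 3
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b, se, t, reject = slope_t_test(x, y, t_critical=3.182)
print(round(b, 3), round(se, 3), round(t, 3), reject)
```

In this example b = 0.6 and SE is about 0.283, giving t of about 2.12. That is below the critical value of 3.182, so the null hypothesis is not rejected even though the estimated slope is positive.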