Correlation And Regression Flashcards
Test statistic for significance of correlation coefficient
t = r * sqrt(n - 2) / sqrt(1 - r^2)
r = correlation coefficient
n = number of observations
With n - 2 df
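A minimal sketch of the test statistic above, in Python. The function name and the sample values (r = 0.5, n = 27) are made up for illustration; the result would be compared against a critical t-value with n - 2 df.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t-statistic for H0: population correlation = 0 (n - 2 df)."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    if abs(r) >= 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical sample: r = 0.5 measured on n = 27 observations
t = correlation_t_stat(0.5, 27)
df = 27 - 2
print(f"t = {t:.4f} with {df} df")
```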
What is correlation coefficient
Measure of strength of linear relationship between two variables
r > 0: positive correlation
r < 0: negative correlation
r = 0: no linear correlation
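To make the sign interpretation concrete, here is a rough sketch that computes the sample correlation coefficient by hand. The helper name pearson_r and the toy data are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: covariation of x and y scaled by both spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Toy data (hypothetical): rising, falling, and unrelated patterns
x = [1, 2, 3, 4]
r_pos = pearson_r(x, [2, 4, 6, 8])     # moves together
r_neg = pearson_r(x, [8, 6, 4, 2])     # moves opposite
r_zero = pearson_r(x, [1, -1, -1, 1])  # no linear relationship
```

The last case is a reminder that r = 0 only rules out a linear relationship; the y values there still follow a clear (nonlinear) pattern.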
Regression assumptions
Independent variable:
- linearly related to dependent var
- uncorrelated with residual
Residual term:
- exp value = 0
- constant variance
- independently and normally distributed
Confidence interval of y value
Predicted Y value +/- (critical t-value)*(standard error)
Difference between t-test and F-test
T-test: assesses statistical significance of individual regression parameters
F-test: assesses model effectiveness
How to test for heteroskedasticity
Breusch-Pagan test: regress the squared residuals on the independent variables
Lagrange Multiplier LM = n*R^2, where R^2 comes from that squared-residual regression
LM is a chi-squared random variable w/ df = num independent variables in regression
Alternative: White test
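A small sketch of the Breusch-Pagan decision step, assuming the regression of squared residuals has already been run and its R^2 is known. The function names and the inputs (n = 60, R^2 = 0.08, one independent variable) are hypothetical; the critical value is passed in by the caller from a chi-squared table rather than computed here.

```python
def breusch_pagan_lm(n: int, r2_aux: float) -> float:
    """LM statistic: n times the R^2 of the squared-residual regression."""
    return n * r2_aux

def reject_no_heteroskedasticity(lm: float, chi2_critical: float) -> bool:
    """One-tailed test: reject H0 when LM exceeds the chi-squared critical value."""
    return lm > chi2_critical

# Hypothetical inputs: 60 observations, auxiliary R^2 of 0.08, k = 1
lm = breusch_pagan_lm(60, 0.08)
# 3.841 is the 5% chi-squared critical value for 1 df (from a standard table)
reject = reject_no_heteroskedasticity(lm, 3.841)
print(f"LM = {lm:.2f}, reject H0: {reject}")
```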
How to test for serial correlation
Durbin-Watson test
If DW differs sufficiently from 2, then regression errors have significant serial correlation (well below 2 = positive, well above 2 = negative)
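For reference, a sketch of how the Durbin-Watson statistic is usually computed from the residuals: the sum of squared changes in consecutive residuals divided by the sum of squared residuals. The function name and the toy residual series are assumptions for illustration.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), residuals in time order."""
    if len(residuals) < 2:
        raise ValueError("need at least 2 residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Toy residuals (hypothetical):
dw_alternating = durbin_watson([1, -1, 1, -1])  # sign flips every period
dw_persistent = durbin_watson([1, 1, 1, 1])     # same sign every period
```

Alternating residuals push the statistic above 2 (negative serial correlation), while persistent residuals push it toward 0 (positive serial correlation).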
How to test for multicollinearity
No specific test
Identify whether R^2 is high with significant F-stat despite insignificant t-tests
Fix: drop one of the correlated variables
What are two types of tests for significance in multiple regression?
Whether each independent variable explains dependent (t stat)
Whether some or all independent variables explain dependent (F stat)
What is t stat formula for multiple regression
t = est regression param / std error of regression param
With n - k - 1 df
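A minimal sketch of this calculation. The hypothesized argument is an added assumption: the card's formula corresponds to testing against zero, which is the default here. Names and numbers are illustrative only.

```python
def coefficient_t_stat(estimate: float, std_error: float,
                       hypothesized: float = 0.0) -> float:
    """t = (estimated coefficient - hypothesized value) / standard error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / std_error

def residual_df(n: int, k: int) -> int:
    """Degrees of freedom: observations minus slopes minus the intercept."""
    return n - k - 1

# Hypothetical regression: coefficient 0.6, standard error 0.2, n = 30, k = 3
t_stat = coefficient_t_stat(0.6, 0.2)
df = residual_df(30, 3)
print(f"t = {t_stat:.2f} with {df} df")
```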
What info is included in an ANOVA table? (analysis of variance)
Columns: degrees of freedom, sum of squares, mean square (SS / df)
Rows: regression, error, total
How to calc MSR (mean squared regression)
MSR = regression sum of squares / k
k = number of independent variables
How to calc MSE (mean squared error)
MSE = sum of squared errors / (n - k - 1)
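A sketch of how the ANOVA quantities fit together: MSR divides the regression sum of squares by k, and MSE divides the sum of squared errors by n - k - 1. The function name, the dictionary layout, and the inputs (SST = 500, SSE = 140, n = 30, k = 2) are made up for illustration.

```python
def anova_table(sst: float, sse: float, n: int, k: int) -> dict:
    """Build the regression / error / total rows of an ANOVA table."""
    rss = sst - sse  # regression (explained) sum of squares
    return {
        "regression": {"df": k, "ss": rss, "ms": rss / k},                  # MSR
        "error": {"df": n - k - 1, "ss": sse, "ms": sse / (n - k - 1)},     # MSE
        "total": {"df": n - 1, "ss": sst},
    }

# Hypothetical regression with 30 observations and 2 independent variables
table = anova_table(sst=500.0, sse=140.0, n=30, k=2)
for row, values in table.items():
    print(row, values)
```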
What is coefficient of determination and how to calc (R^2)
% variation in dep var explained by indep vars
R^2 = regression sum of squares / total sum of squares
= (SST - SSE) / SST
What does adjusted R^2 measure
Goodness of fit that adjusts for the number of independent variables included in the model
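A short sketch comparing R^2 with adjusted R^2, using the standard adjustment 1 - [(n - 1) / (n - k - 1)] * (1 - R^2). The card does not state this formula, so treat it as an added assumption; function names and inputs are hypothetical.

```python
def r_squared(sst: float, sse: float) -> float:
    """Share of total variation explained by the regression."""
    return (sst - sse) / sst

def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Penalize R^2 for the number of independent variables k."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)

# Hypothetical regression: SST = 500, SSE = 140, n = 30, k = 2
r2 = r_squared(500.0, 140.0)
adj_r2 = adjusted_r_squared(r2, n=30, k=2)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```

Adjusted R^2 is never larger than R^2 when k >= 1, and adding a variable raises it only if that variable improves the fit enough to offset the lost degree of freedom.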
What is standard error of estimate and how to calc
Measures uncertainty of values of dependent variable around regression line
= Sqrt (mean squared error)
How to calculate F stat
F = mean squared regression / mean squared error
With k and n-k-1 df
Reject H0 if F > Fcritical
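A sketch of the F-test, taking F as mean squared regression over mean squared error. Function names and inputs are hypothetical, and the critical value is supplied by the caller from an F table rather than computed.

```python
def f_statistic(rss: float, sse: float, n: int, k: int) -> float:
    """F = MSR / MSE with k and n - k - 1 degrees of freedom."""
    msr = rss / k
    mse = sse / (n - k - 1)
    return msr / mse

# Hypothetical regression: RSS = 360, SSE = 140, n = 30, k = 2
f_stat = f_statistic(360.0, 140.0, n=30, k=2)
# Approximate 5% critical value for (2, 27) df, read from an F table
f_critical = 3.35
reject_h0 = f_stat > f_critical
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```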
Confidence interval for regression coefficient
Regression coefficient +/- (critical t-value)*(standard error of regression coefficient)
If zero is included in conf interval, slope not statistically significant
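A quick sketch of the interval and the zero check. The names and numbers are illustrative; 2.056 is the two-tailed 5% critical t-value for 26 df taken from a standard table.

```python
def coefficient_confidence_interval(estimate: float, t_critical: float,
                                    std_error: float) -> tuple:
    """Return (lower, upper) bounds: estimate +/- t_critical * std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin

def is_significant(interval: tuple) -> bool:
    """A coefficient is significant when its interval excludes zero."""
    lower, upper = interval
    return not (lower <= 0 <= upper)

# Hypothetical coefficients, both with standard error 0.2 and 26 df
ci_strong = coefficient_confidence_interval(0.6, 2.056, 0.2)
ci_weak = coefficient_confidence_interval(0.3, 2.056, 0.2)
print(ci_strong, is_significant(ci_strong))
print(ci_weak, is_significant(ci_weak))
```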
What is conditional heteroskedasticity
Residual variance related to level of independent variables
What is serial correlation
Residuals are correlated
What is multicollinearity
Two or more independent variables correlated
What are six common misspecifications of a regression model?
- Omitting a variable
- Not transforming a variable that should be transformed
- Incorrectly pooling data
- Using lagged dependent var as an independent var
- Forecasting the past
- Measuring independent vars with error
What do model misspecifications result in?
Biased and inconsistent regression coefficients
No confidence in hypothesis tests of coefficients
No confidence in predictions
When do you use a linear trend model?
Data points equally distributed above and below line
Mean is constant
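One rough way to check the "equally distributed above and below the line" condition is to fit y = b0 + b1*t by least squares and count the observations on each side. This is only a sketch: the function names, the time index starting at 1, and the toy series are assumptions for illustration.

```python
def fit_linear_trend(ys):
    """Least-squares fit of y_t = b0 + b1 * t for t = 1..n."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least 2 observations")
    mean_t = (n + 1) / 2
    mean_y = sum(ys) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(ys, start=1))
    slope /= sum((t - mean_t) ** 2 for t in range(1, n + 1))
    intercept = mean_y - slope * mean_t
    return intercept, slope

def count_above_below(ys, intercept, slope):
    """Count observations strictly above and strictly below the trend line."""
    above = sum(1 for t, y in enumerate(ys, start=1) if y > intercept + slope * t)
    below = sum(1 for t, y in enumerate(ys, start=1) if y < intercept + slope * t)
    return above, below

# Toy series (hypothetical)
series = [1, 3, 2, 4]
b0, b1 = fit_linear_trend(series)
above, below = count_above_below(series, b0, b1)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, above = {above}, below = {below}")
```

If the points sit persistently on one side of the line for long stretches, a linear trend is probably a poor fit for the series.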