Correlation And Regression Flashcards

0
Q

Test statistic for significance of correlation coefficient

A

t = r * sqrt(n - 2)
    ————————————
    sqrt(1 - r^2)

With n - 2 degrees of freedom

r = correlation coefficient
n = number of observations
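The formula above as a sketch in Python (corr_t_stat is a hypothetical helper, not from the deck):

```python
import math

def corr_t_stat(r, n):
    """t-statistic for H0: population correlation = 0, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# e.g. r = 0.5 measured from a sample of n = 30 observations
t = corr_t_stat(0.5, 30)  # ≈ 3.06; compare to the critical t with 28 df
```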
1
Q

What is the correlation coefficient?

A

Measure of the strength of the linear relationship between two variables

r > 0 = positive correlation
r < 0 = negative correlation
r = 0 = no linear correlation

2
Q

Regression assumptions

A

Independent variable:

  • linearly related to dependent var
  • uncorrelated with residual

Residual term:

  • exp value = 0
  • constant variance
  • independently and normally distributed
3
Q

Confidence interval of y value

A

Predicted Y value +/- (critical t-value) * (standard error of the forecast)

4
Q

Difference between t-test and F-test

A

T-test: assesses statistical significance of individual regression parameters

F-test: assesses model effectiveness

5
Q

How to test for heteroskedasticity

A

Breusch-Pagan test: regress the squared residuals on the independent variables

Lagrange multiplier LM = n * R^2
(R^2 from this auxiliary regression)

LM is a chi-square (x^2) random variable with df = number of independent variables in the regression

The White test is an alternative
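A minimal sketch of the LM computation, assuming numpy is available (breusch_pagan_lm is a hypothetical helper; the auxiliary regression is fit with ordinary least squares):

```python
import numpy as np

def breusch_pagan_lm(residuals, X):
    """LM = n * R^2 from regressing squared residuals on the independent
    variables X (shape (n, k), no intercept column).
    Under homoskedasticity, LM is chi-square with df = k."""
    u2 = np.asarray(residuals, dtype=float) ** 2
    n = len(u2)
    Xc = np.column_stack([np.ones(n), X])           # add intercept
    beta, *_ = np.linalg.lstsq(Xc, u2, rcond=None)  # auxiliary OLS fit
    fitted = Xc @ beta
    ss_res = np.sum((u2 - fitted) ** 2)
    ss_tot = np.sum((u2 - u2.mean()) ** 2)
    return n * (1 - ss_res / ss_tot)                # n * R^2
```

If the squared residuals are exactly linear in X, R^2 = 1 and LM equals n.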

6
Q

How to test for serial correlation

A

Durbin-Watson test

If it differs sufficiently from 2, then regression errors have significant serial correlation
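The statistic itself is simple to compute; a sketch (durbin_watson is a hypothetical helper):

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_(t-1))^2) / sum(e_t^2).
    Near 2 => no serial correlation; near 0 => positive serial
    correlation; near 4 => negative serial correlation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

durbin_watson([1, -1, 1, -1])  # -> 3.0: alternating residuals push DW toward 4
```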

7
Q

How to test for multicollinearity

A

No specific test

Identify whether R^2 is high with significant F-stat despite insignificant t-tests

Drop one of correlated variables

8
Q

What are two types of tests for significance in multiple regression?

A

Whether each independent variable explains dependent (t stat)

Whether some or all independent variables explain dependent (F stat)

9
Q

What is t stat formula for multiple regression

A

t = estimated regression parameter
    ———————————————————
    standard error of regression parameter

With n - k - 1 df

10
Q

What info is included in an ANOVA table? (analysis of variance)

A
Degrees of freedom 
Sum of squares 
Mean square (ss / df)

For regression, error, and total

11
Q

How to calc MSR (mean squared regression)

A
MSR = regression sum of squares (RSS)
      ———————————————
      k

k = number of independent variables
12
Q

How to calc MSE (mean squared error)

A
MSE = sum of squared errors (SSE)
      ———————————————
      n - k - 1
13
Q

What is coefficient of determination and how to calc (R^2)

A

% variation in dep var explained by indep vars
R^2 = regression sum of squares (RSS)
      ———————————————
      total sum of squares (SST)

    = SST - SSE
      —————————
      SST
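The equivalence above in code (r_squared is a hypothetical helper):

```python
def r_squared(sst, sse):
    """R^2 = (SST - SSE) / SST, which equals RSS / SST."""
    return (sst - sse) / sst

r_squared(100.0, 25.0)  # -> 0.75: 75% of the variation is explained
```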
14
Q

What does adjusted R^2 measure

A

Goodness of fit that adjusts for the number of independent variables included in the model

15
Q

What is standard error of estimate and how to calc

A

Measures the uncertainty of the dependent variable's values around the regression line

SEE = sqrt(mean squared error)

16
Q

How to calculate F stat

A

F = mean squared regression (MSR)
    ———————————————
    mean squared error (MSE)

With k and n-k-1 df

Reject H0 if F > Fcritical
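Putting the ANOVA pieces together (f_stat is a hypothetical helper; RSS = regression sum of squares, SSE = sum of squared errors):

```python
def f_stat(rss, sse, n, k):
    """F = MSR / MSE, with k and n - k - 1 degrees of freedom."""
    msr = rss / k            # mean squared regression
    mse = sse / (n - k - 1)  # mean squared error
    return msr / mse

# e.g. RSS = 75, SSE = 25, n = 30 observations, k = 3 independent variables
f_stat(75.0, 25.0, 30, 3)  # -> 26.0
```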

17
Q

Confidence interval for regression coefficient

A

Regression coefficient +- (critical t-value)(standard error of regression coefficient)

If zero is included in the confidence interval, the slope is not statistically significant
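A sketch of the interval and the zero check (both helpers are hypothetical):

```python
def coef_conf_interval(b, t_crit, se):
    """Regression coefficient +/- (critical t-value) * (standard error)."""
    return (b - t_crit * se, b + t_crit * se)

def is_significant(b, t_crit, se):
    """The slope is significant only if zero lies outside the interval."""
    lo, hi = coef_conf_interval(b, t_crit, se)
    return not (lo <= 0 <= hi)

coef_conf_interval(2.0, 2.0, 0.5)  # -> (1.0, 3.0); zero excluded, so significant
```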

18
Q

What is conditional heteroskedasticity

A

Residual variance related to level of independent variables

19
Q

What is serial correlation

A

Residuals are correlated

20
Q

What is multicollinearity

A

Two or more independent variables correlated

21
Q

What are six common misspecifications of a regression model?

A
Omitting a variable
Improperly transforming a variable
Incorrectly pooling data
Using a lagged dependent variable as an independent variable
Forecasting the past
Measuring independent variables with error
22
Q

What do model misspecifications result in?

A

Biased and inconsistent regression coefficients
No confidence in hypothesis tests of coefficients
No confidence in predictions

23
Q

When do you use a linear trend model?

A

Data points equally distributed above and below line

Mean is constant

24
Q

When do you use a log-linear model?

A

When the dependent variable grows at a constant rate (compound growth)
Residuals may be serially correlated
Mean is not constant
Seasonality may be present

25
Q

When do you use an autoregressive (AR) model?

A

When the dependent variable is regressed against previous values of itself
Autocorrelations of the residuals should not be statistically significant
Use a t-test to check for correlation between residuals

26
Q

What is autocorrelation?

A

When residuals exhibit serial correlation
Also used to test for seasonality

27
Q

What are the three conditions required to be covariance stationary?

A

Required for time-series models

Three conditions:
- constant and finite expected value
- constant and finite variance
- constant and finite covariance with leading or lagged values

28
Q

Three steps to determine whether a time series is covariance stationary, and how to correct

A

Plot the data: are the mean and variance constant?
Run an AR model and test the residual autocorrelations
Run a Dickey-Fuller test (for a unit root)

Can correct with first differencing

29
Q

What is mean reversion?

A

For an AR(1) model, the mean-reverting level is:

b0 / (1 - b1)

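In code (mean_reverting_level is a hypothetical helper):

```python
def mean_reverting_level(b0, b1):
    """Long-run mean of AR(1): x_t = b0 + b1 * x_(t-1); requires |b1| < 1."""
    return b0 / (1 - b1)

mean_reverting_level(1.0, 0.5)  # -> 2.0
```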
30
Q

What is a unit root and how to identify it?

A

A unit root is when the value of the lag coefficient = 1
Use the Dickey-Fuller test to identify it
A random walk has a unit root and is not covariance stationary

31
Q

How to convert a random walk to covariance stationary

A

Use first differencing (model the change in value rather than the value):

y_t = x_t - x_(t-1) = error term

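First differencing in code (first_difference is a hypothetical helper):

```python
def first_difference(series):
    """y_t = x_t - x_(t-1): models changes instead of levels, turning
    a random walk into a covariance-stationary series."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

first_difference([1, 3, 6, 10])  # -> [2, 3, 4]
```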
32
Q

What is root mean squared error (RMSE) used for?

A

Used to assess the predictive accuracy of autoregressive models
Lower RMSE is better
Out-of-sample RMSE tests forecasting ability

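RMSE as a sketch (rmse is a hypothetical helper):

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of forecasts; lower is better."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / len(errors))
```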
33
Q

What is a structural change?

A

A significant shift in the plotted data, dividing it into two distinct patterns
Results in coefficient instability
Run two different models, one before and one after the change

34
Q

What is cointegration?

A

When two time series are economically linked
(related to the same macro variables)

35
Q

How to test for cointegration

A

Regress one time series on the other
Test the residuals for a unit root; if the test rejects a unit root, the two series are cointegrated
If cointegrated, the error term is covariance stationary and t-tests are reliable

36
Q

What is autoregressive conditional heteroskedasticity (ARCH) and what does it result in?

A

The variance of the residuals in one time period depends on the variance of the residuals in a previous period
Results in invalid regression coefficient standard errors and hypothesis tests

37
Q

How to correct for ARCH

A

Use generalized least squares
Or use an ARCH model to predict the variance of the residuals in following periods