Correlation And Regression Flashcards

0
Q

Test statistic for significance of correlation coefficient

A

t = r * sqrt(n - 2) / sqrt(1 - r^2)

With n - 2 df

r = correlation coefficient
n = number of observations
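
A minimal sketch of this test in Python (illustrative only; a two-tailed 5% test is assumed and scipy supplies the critical value):

import numpy as np
from scipy import stats

def corr_t_test(r, n, alpha=0.05):
    # t-statistic for H0: population correlation = 0
    t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    # two-tailed critical value with n - 2 degrees of freedom
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t_stat, t_crit, abs(t_stat) > t_crit

# example: r = 0.40 from 30 observations
print(corr_t_test(0.40, 30))
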
1
Q

What is correlation coefficient

A

Measure of the strength of the linear relationship between two variables
r > 0 = positive correlation
r < 0 = negative correlation
r = 0 = no linear correlation

2
Q

Regression assumptions

A

Independent variable:

  • linearly related to dependent var
  • uncorrelated with residual

Residual term:

  • expected value = 0
  • constant variance
  • independently and normally distributed
3
Q

Confidence interval of y value

A

Predicted Y value +/- (critical t-value) * (standard error of forecast)
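
A minimal sketch (illustrative names; it assumes the standard error of the forecast is already known and uses a two-tailed critical t-value):

from scipy import stats

def forecast_interval(y_hat, se_forecast, df, alpha=0.05):
    # predicted Y value +/- critical t * standard error of forecast
    t_crit = stats.t.ppf(1 - alpha / 2, df=df)
    return y_hat - t_crit * se_forecast, y_hat + t_crit * se_forecast

# example: predicted Y = 10.5, forecast standard error = 1.2, df = n - 2 = 23
print(forecast_interval(10.5, 1.2, 23))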

4
Q

Difference between t-test and F-test

A

T-test: assesses statistical significance of individual regression parameters

F-test: assesses model effectiveness

5
Q

How to test for heteroskedasticity

A

Breusch-Pagan test: regress the squared residuals on the independent variables
Lagrange multiplier LM = n * R^2 (R^2 from that auxiliary regression)
LM is a chi-square random variable with df = number of independent variables
(The White test is an alternative)
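
A minimal sketch of the Breusch-Pagan calculation (illustrative; resid and X are assumed to come from an already-fitted regression, and statsmodels/scipy are used):

import numpy as np
import statsmodels.api as sm
from scipy import stats

def breusch_pagan(resid, X):
    # regress squared residuals on the original independent variables
    aux = sm.OLS(resid**2, sm.add_constant(X)).fit()
    n, k = len(resid), X.shape[1]
    lm = n * aux.rsquared              # Lagrange multiplier statistic
    p_value = stats.chi2.sf(lm, df=k)  # chi-square with k df
    return lm, p_value

# example with simulated (homoskedastic) data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1 + X @ np.array([0.5, -0.3]) + rng.normal(size=100)
fit = sm.OLS(y, sm.add_constant(X)).fit()
print(breusch_pagan(fit.resid, X))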

6
Q

How to test for serial correlation

A

Durbin-Watson test

If it differs sufficiently from 2, then regression errors have significant serial correlation
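
A minimal sketch (resid is assumed to be the array of regression residuals):

import numpy as np

def durbin_watson(resid):
    # DW = sum of squared changes in residuals / sum of squared residuals
    diff = np.diff(resid)
    return np.sum(diff**2) / np.sum(resid**2)

# values near 2 suggest no serial correlation; well below 2 suggests positive
# serial correlation, well above 2 suggests negative serial correlation
rng = np.random.default_rng(0)
print(durbin_watson(rng.normal(size=200)))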

7
Q

How to test for multicollinearity

A

No specific test

Identify whether R^2 is high with significant F-stat despite insignificant t-tests

Drop one of correlated variables

8
Q

What are two types of tests for significance in multiple regression?

A

Whether each independent variable explains dependent (t stat)

Whether some or all independent variables explain dependent (F stat)

9
Q

What is t stat formula for multiple regression

A

t = estimated regression parameter / standard error of the regression parameter

With n - k - 1 df

10
Q

What info is included in an ANOVA table? (analysis of variance)

A
Degrees of freedom
Sum of squares
Mean square (SS / df)

For regression, error, and total
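
A minimal sketch building these quantities (illustrative; y, y_hat and k are assumed to come from a fitted regression with k independent variables):

import numpy as np

def anova_table(y, y_hat, k):
    n = len(y)
    sst = np.sum((y - np.mean(y))**2)   # total sum of squares
    sse = np.sum((y - y_hat)**2)        # error (residual) sum of squares
    ssr = sst - sse                     # regression sum of squares
    return {"df": (k, n - k - 1, n - 1),          # regression, error, total
            "SS": (ssr, sse, sst),
            "MS": (ssr / k, sse / (n - k - 1))}   # MSR, MSE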

11
Q

How to calc MSR (mean squared regression)

A
MSR = regression sum of squares / k

k = number of independent variables
12
Q

How to calc MSE (mean squared error)

A
MSE = sum of squared errors / (n - k - 1)
13
Q

What is coefficient of determination and how to calc (R^2)

A

% of variation in the dependent variable explained by the independent variables

R^2 = regression sum of squares / total sum of squares
    = (SST - SSE) / SST
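
A minimal sketch using the same sums of squares (names assumed as in the ANOVA sketch above):

def r_squared(sst, sse):
    # share of total variation explained by the regression
    return (sst - sse) / sst
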
14
Q

What does adjusted R^2 measure

A

Goodness of fit that adjusts for the number of independent variables included in the model

15
Q

What is standard error of estimate and how to calc

A

Measures uncertainty of values of dependent variable around regression line

= Sqrt (mean squared error)

16
Q

How to calculate F stat

A

F = mean squared regression (MSR) / mean squared error (MSE)

With k and n-k-1 df

Reject H0 if F > Fcritical
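
A minimal sketch (MSR and MSE as defined above; scipy supplies the p-value):

from scipy import stats

def f_test(msr, mse, k, n):
    f_stat = msr / mse
    p_value = stats.f.sf(f_stat, k, n - k - 1)  # df = k and n - k - 1
    return f_stat, p_value  # reject H0 if f_stat > critical F (p_value < alpha)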

17
Q

Confidence interval for regression coefficient

A

Regression coefficient +- (critical t-value)(standard error of regression coefficient)

If zero is included in conf interval, slope not statistically significant
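
A minimal sketch (illustrative; b and se_b are an estimated coefficient and its standard error):

from scipy import stats

def coef_interval(b, se_b, n, k, alpha=0.05):
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - k - 1)
    lower, upper = b - t_crit * se_b, b + t_crit * se_b
    # if zero lies inside (lower, upper), the slope is not statistically significant
    return lower, upper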

18
Q

What is conditional heteroskedasticity

A

Residual variance related to level of independent variables

19
Q

What is serial correlation

A

Residuals are correlated

20
Q

What is multicollinearity

A

Two or more independent variables correlated

21
Q

What are six common misspecifications of a regression model?

A
Omitting a variable
Improperly transforming a variable (wrong functional form)
Incorrectly pooling data
Using a lagged dependent variable as an independent variable
Forecasting the past
Measuring independent variables with error
22
Q

What do model misspecifications result in?

A

Biased and inconsistent regression coefficients
No confidence in hypothesis tests of coefficients
No confidence in predictions

23
Q

When do you use a linear trend model?

A

Data points equally distributed above and below line

Mean is constant

24
Q

When do you use a log-linear model?

A

When the dependent variable grows at a constant rate (exponential growth)

Residuals may be correlated
Mean is non-constant
Seasonality may be present

25
Q

When do you use an auto regressive model?

A

When dependent variable is regressed against previous values of itself

Autocorrelations of the residuals should not be statistically significant

Use a t-test on the residual autocorrelations to check this
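
A minimal sketch of an AR(1) fit by OLS plus the residual-autocorrelation t-test (illustrative; x is a 1-D numpy array and the t-test uses standard error 1/sqrt(n)):

import numpy as np
import statsmodels.api as sm

def fit_ar1(x):
    # regress the series on its own first lag
    fit = sm.OLS(x[1:], sm.add_constant(x[:-1])).fit()
    resid = fit.resid
    n = len(resid)
    # t-stat on the first-order residual autocorrelation
    rho1 = np.corrcoef(resid[1:], resid[:-1])[0, 1]
    t_stat = rho1 / (1 / np.sqrt(n))
    return fit.params, t_stat  # params = (b0, b1); |t_stat| should be insignificant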

26
Q

What is autocorrelation?

A

When residuals exhibit serial correlation

Used to test for seasonality

27
Q

What are 3 conditions required to be covariance stationary

A

Required for time series models

Three conditions:

  • constant and finite expected value
  • constant and finite variance
  • constant and finite covariance with leading or lagged values
28
Q

Three steps to determine whether a time series is covariance stationary and how to correct

A

Plot data; are mean and variance constant?
Run AR model; test residual autocorrelations
Dickey-Fuller test (for a unit root)

Can correct with first differencing
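
A minimal sketch of the Dickey-Fuller step (statsmodels' adfuller is assumed; rejecting the unit root supports covariance stationarity):

from statsmodels.tsa.stattools import adfuller

def has_unit_root(series, alpha=0.05):
    adf_stat, p_value = adfuller(series)[:2]
    # True = cannot reject the unit root; first-difference the series
    return p_value > alpha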

29
Q

What is mean reversion

A

Tendency of a series to move back toward its mean

Mean-reverting level for an AR(1) model = b0 / (1 - b1)

30
Q

What is a unit root and how to identify

A

Unit root: the lag coefficient equals 1 (b1 = 1)

Use the Dickey-Fuller test to identify

A series with a unit root (e.g. a random walk) is not covariance stationary

31
Q

How to convert random walk to covariance stationary

A

Use first differencing (model change in value vs value)

y_t = x_t - x_(t-1) = error term
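
A minimal sketch (numpy assumed):

import numpy as np

def first_difference(x):
    # y_t = x_t - x_(t-1); for a random walk only the error term remains
    return np.diff(x)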

32
Q

What is root mean squared error used for

A

Used to assess predictive accuracy of autoregressive models

Lower RMSE is better

Out of sample RMSE tests forecasting ability
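
A minimal sketch of an out-of-sample comparison (illustrative; actual and forecast are hold-out-sample arrays):

import numpy as np

def rmse(actual, forecast):
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast))**2))

# the model with the lower out-of-sample RMSE has the better forecasting ability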

33
Q

What is a structural change?

A

A significant shift in the plotted data, dividing it into two distinct patterns

Results in coefficient instability

Run two separate models, one before and one after the change

34
Q

What is cointegration

A

When two time series are economically linked

Related to same macro variable

35
Q

How to test for cointegration

A

Regress one time series on the other

Test residuals for unit root; if test rejects unit root, two series are cointegrated

If cointegrated, results in covariance stationary error term and reliable t tests
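
A minimal sketch of the two-step check in the Engle-Granger spirit (statsmodels assumed; note that in practice the residual test uses Engle-Granger critical values rather than ordinary Dickey-Fuller ones):

import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def cointegrated(y, x, alpha=0.05):
    # step 1: regress one time series on the other
    fit = sm.OLS(y, sm.add_constant(x)).fit()
    # step 2: test the residuals for a unit root
    p_value = adfuller(fit.resid)[1]
    return p_value < alpha  # True = unit root rejected, series are cointegrated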

36
Q

What is autoregressive conditional heteroskedasticity (ARCH) and what does it result in

A

Variance of residuals in one time period dependent on variance of residuals in another time period

Results in invalid reg coeff standard errors and hypothesis tests
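
A minimal sketch of an ARCH(1) check (assumption: regress squared residuals on their own first lag; a significant slope signals ARCH effects):

import statsmodels.api as sm

def arch1_test(resid):
    # resid: numpy array of residuals from the original time-series model
    sq = resid**2
    fit = sm.OLS(sq[1:], sm.add_constant(sq[:-1])).fit()
    # t-stat on the lagged squared residual; significant => ARCH present
    return fit.params[1], fit.tvalues[1]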

37
Q

How to correct for ARCH

A

Use Generalized least squares

Or use model to predict variance of residuals in following periods