Quantitative Methods Flashcards

1
Q

r

A

r = Cov(X, Y) / (sX sY), where sX and sY are the sample standard deviations of X and Y
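A minimal Python sketch (added here as an illustration, not part of the original card; the sample values are made up) showing r as the covariance divided by the product of the sample standard deviations:

```python
import numpy as np

# Hypothetical sample data (illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

cov_xy = np.cov(x, y, ddof=1)[0, 1]        # sample covariance
s_x, s_y = x.std(ddof=1), y.std(ddof=1)    # sample standard deviations

r = cov_xy / (s_x * s_y)
print(round(r, 4))                         # matches np.corrcoef(x, y)[0, 1]
```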

2
Q

r (extended)

A

r = Σ[(Xi - X̄)(Yi - Ȳ)] / ( √Σ(Xi - X̄)² × √Σ(Yi - Ȳ)² )
3
Q

spurious correlation

A
  • Correlation between two variables that reflects chance relationships in a particular data set
  • Correlation induced by a calculation that mixes each of two variables with a third
  • Correlation between two variables arising not from a direct relation between them but from their relation to a third variable
4
Q

CFO

A

NI + non-cash charges - working capital investment

5
Q

Assumptions of the linear regression model

A
  1. The relationship between the dependent variable, Y, and the independent variable, X, is linear in the parameters b0 and b1. This requirement means that b0 and b1 are raised to the first power only and that neither b0 nor b1 is multiplied or divided by another regression parameter (as in b0/b1, for example). The requirement does not exclude X from being raised to a power other than 1
  2. The independent variable, X, is not random
  3. The expected value of the error term is 0: E(ε) = 0
  4. The variance of the error term is the same for all observations: E(εi²) = σε², i = 1, …, n
  5. The error term, ε, is uncorrelated across observations. Consequently, E(εi εj) = 0 for all i ≠ j
  6. The error term, ε, is normally distributed
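A small simulation sketch (illustrative, with made-up parameter values) generating data that satisfy these assumptions (a non-random X and i.i.d. normal errors with constant variance) and fitting the line by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters (illustration only)
b0_true, b1_true, sigma = 1.0, 0.5, 0.2

x = np.linspace(0, 10, 100)                 # non-random independent variable
eps = rng.normal(0.0, sigma, size=x.size)   # E(ε) = 0, constant variance, uncorrelated, normal
y = b0_true + b1_true * x + eps

b1_hat, b0_hat = np.polyfit(x, y, deg=1)    # least-squares slope and intercept
print(round(b0_hat, 3), round(b1_hat, 3))
```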
6
Q

b0

A

The intercept of the regression line (the predicted value of Y when X = 0)
7
Q

b1

A

The slope coefficient of the regression line (the expected change in Y for a one-unit change in X)
8
Q

FCFF

A

CFO + Interest expense (1 - t) - FCInv

/

NI + non-cash charges - WCInv + Interest expense (1 - t) - FCInv
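A quick numeric sketch (hypothetical figures, not from the deck) applying the second form of the formula:

```python
# Hypothetical inputs (illustration only)
ni, noncash, wcinv = 100.0, 30.0, 10.0   # net income, non-cash charges, working capital investment
interest, tax_rate, fcinv = 20.0, 0.25, 40.0

fcff = ni + noncash - wcinv + interest * (1 - tax_rate) - fcinv
print(fcff)   # 100 + 30 - 10 + 15 - 40 = 95.0
```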

9
Q

t-test for the correlation coefficient

A

t = r √(n - 2) / √(1 - r²), with n - 2 degrees of freedom
10
Q

Least squares equation

A

Ŷ = b0 + b1X, where b1 = Cov(X, Y)/Var(X) and b0 = Ȳ - b1X̄; these estimates minimize the sum of squared residuals
11
Q

Coefficient of determination

A

r²

/

1 - (unexplained variation / total variation)
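A short sketch (with hypothetical actual and fitted values) computing the coefficient of determination as 1 minus unexplained over total variation:

```python
import numpy as np

# Hypothetical actual and fitted values (illustration only)
y     = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([3.2, 4.8, 7.1, 8.9])

sse = np.sum((y - y_hat) ** 2)       # unexplained variation
sst = np.sum((y - y.mean()) ** 2)    # total variation
r2 = 1 - sse / sst
print(round(r2, 4))
```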

12
Q

t-test for linear regression

A

t = (estimated coefficient - hypothesized value) / standard error of the coefficient, with n - (k + 1) degrees of freedom
13
Q

t-test for linear regression - utility

A

For hypothesis tests concerning the population mean of a normally distributed population with unknown (known) variance, the theoretically correct test statistic is the t-statistic (z-statistic). In the unknown variance case, given large samples (generally, samples of 30 or more observations), the z-statistic may be used in place of the t-statistic because of the force of the central limit theorem

14
Q

t-test for linear regression - degrees of freedom

A

Number of observations - (number of independent variables + 1) =

n - (k + 1)

15
Q

t-test for linear regression - interval

A

Estimated coefficient ± (critical t-value × standard error of the coefficient)
16
Q

SEE

A

[SSE / (n - 2)]^(1/2)
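A minimal sketch (hypothetical residuals) of the calculation for a one-independent-variable regression:

```python
import numpy as np

# Hypothetical residuals from a fitted simple regression (illustration only)
residuals = np.array([0.2, -0.3, 0.1, -0.1, 0.4, -0.3])
n = residuals.size

sse = np.sum(residuals ** 2)
see = np.sqrt(sse / (n - 2))   # [SSE / (n - 2)]^(1/2)
print(round(see, 4))
```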

17
Q

SEE - relation to unexplained variation

A

Unexplained variation = SSE

18
Q

SEE - definition

A

The standard error of the estimate is a measure of the accuracy of predictions made with a regression line. (Also called the residual standard error)

19
Q

SE of the t-test for linear regression

A
20
Q

Standard error versus standard deviation

A

The standard error of the sample mean is an estimate of how far the sample mean is likely to be from the population mean, whereas the standard deviation of the sample measures the degree to which individuals within the sample differ from the sample mean

21
Q

Type I error : rejecting a true null hypothesis

A

Type II error : failing to reject a false null hypothesis

22
Q

p-value definition

A

Smallest level of significance at which the null hypothesis can be rejected

23
Q

EV

A

Market value of equity + market value of debt - cash and short-term investments

24
Q

IC (Invested Capital)

A

Book value of debt and equity

25
  • R²
  • SST
  • SSR
  • SSE (sometimes residual sum of squares, RSS)
26
F-statistic definition
The F-statistic is the ratio of the average regression sum of squares to the average sum of the squared errors. It measures how well the regression equation explains the variation in the dependent variable
27
Relation of the t-test and the F-test for regression with only one independent variable
In such regressions, the F-statistic is the square of the t-statistic for the regression coefficient
28
F-test for the regression coefficient with one independent variable - formula
F = (RSS / 1) / [SSE / (n - 2)] = mean regression sum of squares / mean squared error
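A brief numeric sketch (hypothetical ANOVA figures) of this formula, also showing the F = t² relation from the previous card:

```python
# Hypothetical ANOVA quantities for a one-independent-variable regression (illustration only)
rss, sse, n = 48.0, 12.0, 26   # regression sum of squares, sum of squared errors, observations

f_stat = (rss / 1) / (sse / (n - 2))   # mean regression SS / mean squared error
t_stat = f_stat ** 0.5                 # |t| on the slope coefficient satisfies t**2 == F
print(f_stat, round(t_stat, 4))        # 96.0  9.798
```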
29
F test for multiple regression coefficients
F = [RSS / k] / [SSE / (n - (k + 1))] = mean regression sum of squares / mean squared error
30
F test for multiple regression coefficients - notation
F(k, n - (k + 1)), where k = number of slope coefficients and n = number of observations
31
sf² - formula (s² being the squared standard error of estimate, SEE = s)
sf² = s² [1 + 1/n + (X - X̄)² / ((n - 1) sx²)]
32
ANOVA
  • Analysis of variance
  • Used to determine the sources of variance of a variable
  • Uses the F-test to verify whether all the slope coefficients are jointly equal to 0
33
ANOVA degrees of freedom
  • SSR: df = number of slope coefficients = k
  • SSE: df = number of observations - (number of independent variables + 1) = n - (k + 1)
  • SST: df = number of observations - 1 = n - 1
34
sf² is the estimated variance of the prediction error. It is used to build a prediction interval around the predicted value Ŷ.
35
Beta
Cov(Ri, Rm) / Var(Rm): the slope coefficient from regressing an asset's returns on the market's returns
36
RANVA
Risk-adjusted net value added
37
Assumptions of the multiple linear regression model
  1. The relationship between the dependent variable, Y, and the independent variables, X1, X2, …, Xk, is linear
  2. The independent variables (X1, X2, …, Xk) are not random. Also, no exact linear relation exists between two or more of the independent variables
  3. The expected value of the error term, conditioned on the independent variables, is 0: E(ε | X1, X2, …, Xk) = 0
  4. The variance of the error term is the same for all observations: E(εi²) = σε²
  5. The error term is uncorrelated across observations: E(εi εj) = 0, j ≠ i
  6. The error term is normally distributed
38
Adjusted R2 - definition
  • A measure of goodness-of-fit of a regression that is adjusted for degrees of freedom and hence does not automatically increase when another independent variable is added to a regression
  • If k ≥ 1, R² is strictly greater than adjusted R²
39
Adjusted R2
Adjusted R² = 1 - [(n - 1) / (n - k - 1)] × (1 - R²)
40
Residual standard error
  • [SSE / (n - (k + 1))]^(1/2)
  • MSSE^(1/2)
41
Breusch-Pagan test
  • A test for conditional heteroskedasticity in the error term of a regression
  • Chi-squared with df = number of independent variables
  • Test statistic = nR²
  • R² = coefficient of determination from the regression of the squared residuals on the independent variables from the original regression (not the R² from the original regression)
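A sketch of the test statistic described above (assumes the statsmodels and scipy packages; the data are simulated for illustration):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)

# Simulated data whose error variance grows with x (illustration only)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.3 * x)

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

# Regress the squared residuals on the independent variable(s) from the original regression
aux = sm.OLS(resid ** 2, X).fit()
n, k = len(y), 1
bp_stat = n * aux.rsquared                     # test statistic = n * R² of the auxiliary regression
p_value = 1 - stats.chi2.cdf(bp_stat, df=k)    # chi-squared with df = number of independent variables
print(round(bp_stat, 2), round(p_value, 4))
```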
42
Generalized least squares
Eliminates heteroskedasticity
43
Robust standard errors
Standard errors adjusted to account for conditional heteroskedasticity (also called White-corrected or heteroskedasticity-consistent standard errors)
44
Conditional versus unconditional heteroskedasticity
  • Conditional heteroskedasticity: heteroskedasticity in the error variance that is correlated with the values of the independent variable(s) in the regression
  • Unconditional heteroskedasticity: heteroskedasticity in the error variance that is not correlated with the values of the independent variable(s) in the regression
45
Heteroskedasticity versus homoskedasticity
  • Heteroskedasticity: the property of having a nonconstant variance; refers to an error term whose variance differs across observations
  • Homoskedasticity: the property of having a constant variance; refers to an error term whose variance is constant across observations
46
Serially correlated
With reference to regression errors, errors that are correlated across observations
47
Positive serial correlation
A positive error for one observation increases the chance of a positive error for another observation
48
First-order serial correlation
Serial correlation between the error in one period and the error in the immediately preceding period; with positive first-order serial correlation, the sign of the error tends to persist from one period to the next
49
Multicollinearity
  • A regression assumption violation that occurs when two or more independent variables (or combinations of independent variables) are highly but not perfectly correlated with each other
  • In order to correct the regression, we need to remove one or more of the highly correlated independent variables
50
Classic symptoms of multicollinearity
  • High R²
  • Significant F-statistic when the t-statistics are not significant
51
Durbin and Watson test - utility
Test used for serial correlation
52
Durbin and Watson test - formula
DW = Σ(et - et-1)² / Σet² ≈ 2(1 - r), where the sums run over the sample residuals and r is the sample correlation between successive residuals
53
Durbin and Watson regression residual for period t
54
Durbin and Watson values
  • No serial correlation: 2
  • Serial correlation of 1: 0
  • Serial correlation of -1: 4
  • If DW > du, then we fail to reject the null hypothesis of no serial correlation
  • If DW < dl, then we reject the hypothesis of no serial correlation
  • Inconclusive between dl and du
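A minimal sketch (hypothetical residuals) of the statistic behind these benchmark values:

```python
import numpy as np

# Hypothetical regression residuals in time order (illustration only)
e = np.array([0.5, 0.4, 0.3, -0.2, -0.4, -0.1, 0.2, 0.3])

dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(round(dw, 3))   # ≈ 2 suggests no serial correlation; near 0 positive, near 4 negative
```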
55
Bias
  • Data-mining
  • Omitted variable bias
  • Multicollinearity (F-test)
  • Serial correlation (DW)
56
Qualitative dependent variable
Use a logit or probit model
57
Covariance-stationary
  • The mean and variance are constant through time
  • We cannot use standard regression analysis on a time series that is not covariance-stationary
58
Convergence of covariance stationary series
They converge to their mean-reverting level: xt = b0/(1 - b1)
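A one-line numeric example (hypothetical AR(1) coefficients) of the mean-reverting level:

```python
b0, b1 = 1.2, 0.4              # hypothetical AR(1) intercept and lag coefficient
print(b0 / (1 - b1))           # mean-reverting level = 1.2 / 0.6 = 2.0
```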
59
Nonstationarity
Variables that contain trends
60
Unit root
If the lag coefficient of an AR(1) model equals 1 (b1 = 1), the time series has a unit root; a series with a unit root is not covariance stationary
61
Mean reversion - formula and context
  • xt = b0/(1 - b1)
  • All covariance stationary time series have a finite mean-reverting level
62
Autocorrelation
Correlation of a time series with its own past values; order of autocorrelation k = number of periods lagged
63
Method to correct autocorrelation
  • The most prevalent method for adjusting **standard errors** was developed by **Hansen** (1982)
  • An additional advantage of Hansen’s method is that it simultaneously corrects for conditional heteroskedasticity
64
*k*th order autocorrelation
ρk = Cov(xt, xt-k) / Var(xt)
65
*k*th order estimated autocorrelation
66
Autocorrelation of the error term
67
Standard error of the residual correlation (for autocorrelation)
1/√T, where T = number of observations
68
In-sample forecast errors - residuals from a fitted time series model
Out-of-sample forecast errors - differences between actual and predicted values outside the time period of the model
69
Root mean squared error (RMSE)
Square root of the average squared error
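A short sketch (hypothetical out-of-sample values) of the calculation:

```python
import numpy as np

# Hypothetical actual vs. forecast values outside the estimation sample (illustration only)
actual   = np.array([1.0, 1.2, 0.9, 1.1])
forecast = np.array([1.1, 1.0, 1.0, 1.3])

rmse = np.sqrt(np.mean((actual - forecast) ** 2))
print(round(rmse, 4))
```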
70
Random walk - formula
xt = xt-1 + εt, where E(εt) = 0 and the errors are uncorrelated
71
Random walk covariance
72
Random walk variance
(t - 1)σ²
73
Dickey and Fuller test - formula
xt - xt-1 = b0 + g1·xt-1 + εt, where g1 = b1 - 1
74
Dickey and Fuller test - utility
  • Test for a unit root using an AR(1) model
  • The null hypothesis is H0: g1 = 0
  • The alternative hypothesis is Ha: g1 < 0
  • g1 = b1 - 1
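A sketch of running the test in practice (assumes the statsmodels package; the series is a simulated random walk, so it has a unit root):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=500))   # simulated random walk (illustration only)

adf_stat, p_value, *_ = adfuller(x)
print(round(adf_stat, 3), round(p_value, 4))   # large p-value: fail to reject H0 of a unit root
```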
75
Seasonality in time-series - formula
76
Autoregressive model (AR)
  • A time series regressed on its own past values
  • AR, MA & ARMA models
  • Should be covariance stationary (Dickey and Fuller test)
77
Autoregressive conditional heteroskedasticity (ARCH) - ARCH (1) model distribution
78
ARCH linear regression equation
The squared residuals are regressed on their own lagged values: εt² = a0 + a1·εt-1² + ut. If the estimate of a1 is statistically significantly different from zero, we conclude that the time series is ARCH(1)
79
ARCH variance of the error
The predicted variance of the error in period t + 1 is σt+1² = a0 + a1·εt²
80
Cointegrated
Two time-series are cointegrated if a long-term financial or economic relationship exists between them such that they do not diverge from each other without bound in the long run
81
Durbin and Watson for lagged value (autoregressive models)
The test cannot be used for a regression that has a lagged value of the dependent variable as one of the explanatory variables. Instead, test whether the residuals from the model are serially correlated
82
Multiple R
  • The correlation between actual and predicted values of the dependent variable
  • = (R²)^(1/2)
83
Nonlinear relation
An association or relationship between variables that cannot be graphed as a straight line
84
Interpretation of the p-value
  • A small p-value (≤ 0.05) indicates strong evidence against the null hypothesis, so it is rejected
  • A large p-value (> 0.05) indicates weak evidence against the null hypothesis (fail to reject)
  • p-values very close to the cutoff (~ 0.05) are considered to be marginal (need attention)
85
p-value for the Beta function (as a reference)
86
p-value for the Lower incomplete beta function (as a reference)
87
p-value for the Regularized lower incomplete beta function (as a reference)
The numerator is the lower incomplete beta function, and the denominator is the beta function
88
p-value for the t-distribution cumulative distribution function (CDF) (as a reference)
v is the degrees of freedom, t is the upper limit of integration, and I is the regularized lower incomplete beta function
89
Heteroskedasticity, serial correlation and multicollinearity - table
  • Conditional heteroskedasticity: detect with the Breusch-Pagan test; correct with robust (White-corrected) standard errors or generalized least squares
  • Serial correlation: detect with the Durbin and Watson test; correct with Hansen's adjusted standard errors
  • Multicollinearity: detect from a high R² and a significant F-statistic with insignificant t-statistics; correct by removing one or more of the correlated independent variables
90
Errors in model specification
  • Data mining
  • Market timing
  • Time-series misspecification
91
Moving-average model of order 1, MA(1)
xt = εt + θ·εt-1, where theta (θ) is the parameter of the MA(1) model
92
Moving-average model of order q, MA(q)
xt = εt + θ1·εt-1 + … + θq·εt-q
93
Autoregressive moving-average model (ARMA)
xt = b0 + b1·xt-1 + … + bp·xt-p + εt + θ1·εt-1 + … + θq·εt-q, where b1, b2, …, bp are the autoregressive parameters and θ1, θ2, …, θq are the moving-average parameters
94
Multiple R versus r
Capital R² (as opposed to r²) generally denotes the coefficient of multiple determination in a multiple regression model. In bivariate linear regression, there is no multiple R, and R² = r². So one difference is applicability: **"multiple R" implies multiple regressors, whereas "R²" doesn't necessarily**. Another difference is interpretation: in multiple regression, the multiple **R is the coefficient of multiple correlation**, whereas its square is the **coefficient of determination**.
95
p-value
  • A small p-value (≤ 0.05) indicates strong evidence against the null hypothesis, so it is rejected
  • A large p-value (> 0.05) indicates weak evidence against the null hypothesis (fail to reject)
  • p-values very close to the cutoff (~ 0.05) are considered to be marginal (need attention)
96
Variables with a correlation close to 0 can nonetheless exhibit a strong relationship—just not a linear relationship
Correlation measures the linear association between two variables
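A quick illustration (made-up data) of a strong nonlinear relationship with near-zero correlation:

```python
import numpy as np

x = np.linspace(-3, 3, 201)
y = x ** 2                    # y is completely determined by x, but not linearly

print(round(np.corrcoef(x, y)[0, 1], 6))   # approximately 0
```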
97
If the p-value is greater than 0.05
Then the test is not significant at the 5% level
98
Significance F
  • Represents the level at which the test is significant
  • An entry of 0.01 for the significance of F means that the regression is significant at the 0.01 level
99
Parameter instability
The problem or issue of population regression parameters that have changed over time