Quant Flashcards
Analysis of Variance (ANOVA)
The analysis of the total variability of a dataset (such as observations on the dependent variable in a regression) into components representing different sources of variation; with reference to regression, ANOVA provides the inputs for an F-test of the significance of the regression as a whole.
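The decomposition can be sketched numerically. This is a minimal example with assumed toy data: fit a one-variable regression, split total variability into explained and unexplained pieces, and form the F-statistic used to test overall significance.

```python
# Assumed toy data for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, k = len(x), 1  # n observations, k independent variables

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for y = b0 + b1*x
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # explained (regression)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained (residual)

# ANOVA identity: SST = SSR + SSE; F tests the regression as a whole
f_stat = (ssr / k) / (sse / (n - k - 1))
```

With an intercept in the model, the identity SST = SSR + SSE holds exactly (up to floating-point rounding).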
Dependent Variable
The variable whose variation about its mean is to be explained by the regression; the left-hand-side variable in a regression equation.
Error Term
The portion of the dependent variable that is not explained by the independent variable(s) in the regression
Estimated Parameters
With reference to a regression analysis, the estimated values of the population intercept and population slope coefficient(s) in a regression
Fitted Parameters
With reference to a regression analysis, the estimated values of the population intercept and population coefficient(s) in a regression
Independent Variable
A variable used to explain the dependent variable in a regression; a right-hand-side variable in a regression equation

Linear Regression
Regression that models the straight-line relationship between the dependent and independent variable(s)
Parameter Instability
The problem or issue of population regression parameters that have changed over time
Regression coefficient
The intercept and slope coefficient(s) of a regression
Adjusted R2
A measure of goodness-of-fit of a regression that is adjusted for degrees of freedom and hence does not automatically increase when another independent variable is added to a regression
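The adjustment can be written as adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of independent variables. A small sketch with assumed values:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Penalize R^2 for the number of independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Assumed values: adding a fourth variable raises R^2 only slightly,
# so adjusted R^2 falls rather than rises.
print(adjusted_r2(0.700, n=60, k=3))  # ~0.6839
print(adjusted_r2(0.705, n=60, k=4))  # ~0.6835
```

This is why adjusted R² does not automatically increase when a variable is added: the degrees-of-freedom penalty can outweigh a small gain in R².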
Breusch-Pagan test
A test for conditional heteroskedasticity in the error term of a regression
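The mechanics can be sketched with assumed toy data: fit the regression, then regress the squared residuals on the independent variable. The Lagrange-multiplier statistic n·R² from that auxiliary regression is compared against a chi-square critical value (not computed here).

```python
# Assumed toy data for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 2.1, 3.5, 3.9, 5.8, 5.9, 8.2, 7.6]
n = len(x)

def ols(xs, ys):
    """Return (intercept, slope) of a simple OLS fit."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b1 = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / \
         sum((a - mx) ** 2 for a in xs)
    return my - b1 * mx, b1

b0, b1 = ols(x, y)
resid_sq = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]

# Auxiliary regression of squared residuals on x, and its R^2
a0, a1 = ols(x, resid_sq)
fitted = [a0 + a1 * xi for xi in x]
m = sum(resid_sq) / n
r2 = sum((f - m) ** 2 for f in fitted) / sum((r - m) ** 2 for r in resid_sq)

bp_stat = n * r2  # compare with a chi-square critical value (df = 1 here)
```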
Categorical dependent variables
An alternative term for qualitative dependent variables
Common size statements
Financial statements in which all elements (accounts) are stated as a percentage of revenue for the income statement or of total assets for the balance sheet
Conditional heteroskedasticity
Heteroskedasticity in error variance that is correlated with the values of the independent variable(s) in the regression
Data Mining
The practice of determining a model by extensive searching through a dataset for statistically significant patterns
Discriminant analysis
A multivariate classification technique used to discriminate between groups, such as companies that either will or will not become bankrupt during some time frame
Dummy variable
A type of qualitative variable that takes on a value of 1 if a particular condition is true and 0 if that condition is false
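A quick sketch of the encoding, using an assumed list of category labels:

```python
# Assumed observations: the quarter each data point falls in
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2"]

# Dummy = 1 if the observation is in the first quarter, else 0
q1_dummy = [1 if q == "Q1" else 0 for q in quarters]
print(q1_dummy)  # [1, 0, 0, 0, 1, 0]
```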
First-Order Serial Correlation
Correlation between adjacent observations in a time series
Generalized least squares
A regression estimation technique that addresses heteroskedasticity of the error term
Assumptions of Linear Regression Model
- The relationship between the dependent variable and the independent variable(s) is linear
- The independent variable is not random
- The expected value of the error term is 0
- The variance of the error term is the same for all observations (homoskedasticity)
- The error term is not correlated across observations
- The error term is normally distributed
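The assumptions above can be illustrated by simulating data that satisfies them: a fixed (non-random) regressor and errors that are independent, normally distributed, mean zero, and of constant variance. All parameters below are assumed.

```python
import random

random.seed(42)
b0, b1, sigma = 1.0, 0.5, 2.0

x = [float(i) for i in range(1, 101)]            # non-random regressor
eps = [random.gauss(0.0, sigma) for _ in x]      # iid N(0, sigma^2) errors
y = [b0 + b1 * xi + e for xi, e in zip(x, eps)]  # linear relationship
```

A dataset that violates one of these assumptions (e.g. error variance growing with x) would call for the diagnostics and corrections covered in the cards below.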
Type I error
Rejecting the null hypothesis when it is true (i.e. null hypothesis should not be rejected)
Type II error
Failing to reject the null hypothesis when it is false (i.e. null should be rejected)
P-value
Smallest level of significance at which the null hypothesis can be rejected
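For a z-test the p-value can be computed from the standard normal CDF. A sketch with an assumed test statistic:

```python
import math

def norm_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = 2.1  # assumed test statistic
p_value = 2.0 * (1.0 - norm_cdf(abs(z)))  # two-sided test
print(round(p_value, 4))  # reject at the 5% level, not at the 1% level
```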
Heteroskedastic
With reference to the error term of regression, having a variance that differs across observations - i.e. non-constant variance
Using heteroskedasticity-consistent (robust) standard errors will correct for this
Log-log regression model
&
Log-linear model
A regression that expresses the dependent and independent variables as natural logarithms
&
A time-series model in which the growth rate of the time series as a function of time is constant
Logistic regression (logit model)
A qualitative-dependent-variable multiple regression model based on the logistic probability distribution
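The key ingredient is the logistic (sigmoid) transform, which maps any linear combination b0 + b1·x into a probability in (0, 1). The coefficients below are assumed for illustration:

```python
import math

def logistic(z: float) -> float:
    """Logistic CDF: maps the real line into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

b0, b1 = -1.0, 2.0  # assumed coefficients
x = 0.8
p = logistic(b0 + b1 * x)  # modeled P(event = 1 | x)
print(round(p, 3))  # 0.646
```

A probit model has the same structure but replaces the logistic CDF with the standard normal CDF.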
Model specification
With reference to regression, the set of variables included in the regression and the regression equation’s functional form
Multicollinearity
Regression assumption violation that occurs when two or more independent variables are highly (but not perfectly) correlated with each other
Negative serial correlation
Serial correlation in which a positive error for one observation increases the chance of a negative error for another observation
Non-stationarity
The property of having characteristics such as mean and variance that are not constant through time
Positive serial correlation
Serial correlation in which a positive error for one observation increases the chance of a positive error for another observation; the same holds for negative errors
Probit regression
A qualitative-dependent-variable multiple regression model based on the normal distribution
Qualitative dependent variables
Dummy variables used as dependent variables rather than as independent variables
Random walk
Time series in which the value of the series in one period is the value of the series in the previous period plus an unpredictable random error
In an AR(1) regression model, a random walk will have an estimated intercept coefficient (b0) near zero and an estimated slope coefficient (b1) near 1
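This can be checked by simulation (all parameters assumed): generate a random walk, then fit an AR(1) regression of the series on its own first lag by OLS.

```python
import random

random.seed(7)
x = [0.0]
for _ in range(500):
    x.append(x[-1] + random.gauss(0.0, 1.0))  # unpredictable random error

# OLS of x_t on x_{t-1}
lag, cur = x[:-1], x[1:]
n = len(lag)
m_lag = sum(lag) / n
m_cur = sum(cur) / n
b1 = sum((a - m_lag) * (b - m_cur) for a, b in zip(lag, cur)) / \
     sum((a - m_lag) ** 2 for a in lag)
b0 = m_cur - b1 * m_lag
print(round(b1, 2))  # typically close to 1 for a true random walk
```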
Robust standard errors (a.k.a. White-corrected standard errors)
Standard errors of the estimated parameters of a regression that correct for the presence of heteroskedasticity in the regression’s error term
Serially Correlated
Errors that are correlated across observations in a regression model
Correlation of a time series with its own past values
Unconditional heteroskedasticity
Heteroskedasticity in error variance that is not correlated with the values of the independent variable(s) in the regression model
Autoregressive model
A time series regressed on its own past values, in which the independent variable is a lagged value of the dependent variable
Chain rule of forecasting
The two-period-ahead forecast is determined by first solving for the one-period-ahead forecast and then substituting it into the model to obtain the two-period-ahead forecast
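A minimal sketch for an AR(1) model x_{t+1} = b0 + b1·x_t, with assumed coefficients and starting value:

```python
b0, b1 = 0.5, 0.8  # assumed AR(1) coefficients
x_t = 10.0         # assumed current value of the series

x_t1 = b0 + b1 * x_t   # one-period-ahead forecast
x_t2 = b0 + b1 * x_t1  # substitute it to forecast two periods ahead
print(x_t1, round(x_t2, 4))
```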
Cointegrated
Two time series that have a long-term financial or economic relationship such that they do not diverge from each other without bound in the long run
Covariance stationary
A time series whose expected value and variance are constant and finite in all periods, and whose covariance with itself for a fixed number of periods in the past or future is constant and finite in all periods
First-differencing
A transformation that subtracts the value of the time series in period t-1 from its value in period t
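The transformation on a short assumed series:

```python
# Assumed trending series
x = [100, 103, 104, 108, 115]

# First differences: x_t minus x_{t-1}
diff = [x[t] - x[t - 1] for t in range(1, len(x))]
print(diff)  # [3, 1, 4, 7]
```

First-differencing is a standard way to convert a random walk into a covariance-stationary series before fitting an autoregressive model.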
In-sample forecast errors
Residuals from a fitted time-series model within the same period used to fit the model