Quant Flashcards

1
Q

Analysis of Variance (ANOVA)

A

The analysis of the total variability of a dataset (such as observations on the dependent variable in a regression) into components representing different sources of variation; with reference to regression, ANOVA provides the inputs for an F-test of the significance of the regression as a whole.
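The F-statistic from an ANOVA table can be sketched in Python; the sums of squares, k, and n below are made-up illustrative numbers, not from the text:

```python
def anova_f(ss_regression, ss_error, k, n):
    # k = number of independent variables, n = number of observations
    msr = ss_regression / k           # mean square regression
    mse = ss_error / (n - k - 1)      # mean square error
    return msr / mse                  # F-statistic for overall significance

f_stat = anova_f(ss_regression=80.0, ss_error=60.0, k=1, n=32)
# (80 / 1) / (60 / 30) = 40.0
```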

2
Q

Dependent Variable

A

The variable whose variation about its mean is to be explained by the regression; the left-hand-side variable in a regression equation.

3
Q

Error Term

A

The portion of the dependent variable that is not explained by the independent variable(s) in the regression

4
Q

Estimated Parameters

A

With reference to a regression analysis, the estimated values of the population intercept and population slope coefficient(s) in a regression

5
Q

Fitted Parameters

A

With reference to a regression analysis, the estimated values of the population intercept and population coefficient(s) in a regression

6
Q

Independent Variable

A

A variable used to explain the dependent variable in a regression; a right-hand-side variable in a regression equation

7
Q

Linear Regression

A

Regression that models the straight-line relationship between the dependent and independent variable(s)
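A minimal sketch of fitting the straight line y = b0 + b1·x by ordinary least squares, using the textbook formulas (the data values are made up for illustration):

```python
def ols_fit(xs, ys):
    # Slope: sum of cross-deviations over sum of squared x-deviations
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar          # intercept from the means
    return b0, b1

b0, b1 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])  # data lie exactly on y = 1 + 2x
```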

8
Q

Parameter Instability

A

The problem of population regression parameters changing over time

9
Q

Regression coefficient

A

The intercept and slope coefficient(s) of a regression

10
Q

Adjusted R²

A

A measure of goodness-of-fit of a regression that is adjusted for degrees of freedom and hence does not automatically increase when another independent variable is added to a regression
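The adjustment can be sketched with the standard formula, where n is the number of observations and k the number of independent variables (the inputs below are illustrative):

```python
def adjusted_r2(r2, n, k):
    # Penalizes R-squared for degrees of freedom used by k regressors
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

adj = adjusted_r2(r2=0.64, n=62, k=1)
# 1 - 0.36 * 61/60 = 0.634
```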

11
Q

Breusch-Pagan test

A

A test for conditional heteroskedasticity in the error term of a regression

12
Q

Categorical dependent variables

A

An alternative term for qualitative dependent variables

13
Q

Common size statements

A

Financial statements in which all elements (accounts) are stated as a percentage of revenue (for the income statement) or of total assets (for the balance sheet)

14
Q

Conditional heteroskedasticity

A

Heteroskedasticity in error variance that is correlated with the values of the independent variable(s) in the regression

15
Q

Data Mining

A

The practice of determining a model by extensive searching through a dataset for statistically significant patterns

16
Q

Discriminant analysis

A

A multivariate classification technique used to discriminate between groups, such as companies that either will or will not become bankrupt during some time frame

17
Q

Dummy variable

A

A type of qualitative variable that takes on a value of 1 if a particular condition is true and 0 if that condition is false
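A quick sketch of encoding a dummy variable in Python; the January-effect condition is a hypothetical example:

```python
# 1 if the observation falls in January, 0 otherwise
months = [1, 2, 3, 12, 1]
jan_dummy = [1 if m == 1 else 0 for m in months]
# jan_dummy == [1, 0, 0, 0, 1]
```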

18
Q

First-Order Serial Correlation

A

Correlation between adjacent observations in a time series

19
Q

Generalized least squares

A

A regression estimation technique that addresses heteroskedasticity of the error term

20
Q

Assumptions of Linear Regression Model

A
  1. The relationship between the dependent variable and the independent variable is linear
  2. The independent variable is not random
  3. The expected value of the error term = 0
  4. The variance of the error term is the same for all observations
  5. The error term is not correlated across observations
  6. The error term is normally distributed
21
Q

Type I error

A

Rejecting the null hypothesis when it is true (i.e. null hypothesis should not be rejected)

22
Q

Type II error

A

Failing to reject the null hypothesis when it is false (i.e. null should be rejected)

23
Q

P-value

A

Smallest level of significance at which the null hypothesis can be rejected

24
Q

Heteroskedastic

A

With reference to the error term of regression, having a variance that differs across observations - i.e. non-constant variance

Using robust (heteroskedasticity-consistent) standard errors will correct for this

25
Q

Log regression model

&

Log-linear model

A

A regression that expresses the dependent and independent variables as natural logarithms

&

A time-series model in which the growth rate of the time series as a function of time is constant

26
Q

Logistic regression (logit model)

A

A qualitative-dependent-variable multiple regression model based on the logistic probability distribution

27
Q

Model specification

A

With reference to regression, the set of variables included in the regression and the regression equation’s functional form

28
Q

Multicollinearity

A

Regression assumption violation that occurs when two or more independent variables are highly (but not perfectly) correlated with each other

29
Q

Negative serial correlation

A

Serial correlation in which a positive error for one observation increases the chance of a negative error for another observation

30
Q

Non-stationarity

A

The property of having characteristics such as mean and variance that are not constant through time

31
Q

Positive serial correlation

A

Serial correlation in which a positive error for one observation increases the chance of a positive error for another observation (and likewise for negative errors)

32
Q

Probit regression

A

A qualitative-dependent-variable multiple regression model based on the normal distribution

33
Q

Qualitative dependent variables

A

Dummy variables used as dependent variables rather than as independent variables

34
Q

Random walk

A

Time series in which the value of the series in one period is the value of the series in the previous period plus an unpredictable random error

In an AR(1) regression model, a random walk will have an estimated intercept coefficient (b0) near zero and an estimated slope coefficient (b1) near 1
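A random walk is easy to simulate; the seed, error distribution, and length below are arbitrary illustrative choices:

```python
import random

# Simulate x_t = x_(t-1) + e_t with standard normal errors
random.seed(1)
x = [0.0]
for _ in range(100):
    x.append(x[-1] + random.gauss(0, 1))

# The period-to-period changes are just the unpredictable errors
changes = [x[t] - x[t - 1] for t in range(1, len(x))]
```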

35
Q

Robust standard errors (a.k.a. White-corrected standard errors)

A

Standard errors of the estimated parameters of a regression that correct for the presence of heteroskedasticity in the regression’s error term

36
Q

Serially Correlated

A

Errors that are correlated across observations in a regression model

Correlation of a time series with its own past values

37
Q

Unconditional heteroskedasticity

A

Heteroskedasticity of the error variance that is not correlated with the values of the independent variable(s) in the regression model

38
Q

Autoregressive model

A

A time series regressed on its own past values, in which the independent variable is a lagged value of the dependent variable

39
Q

Chain rule of forecasting

A

A two-period-ahead forecast is determined by first solving for the one-period-ahead forecast and then substituting it into the two-period-ahead forecast equation

40
Q

Cointegrated

A

Two time series that have a long-term financial or economic relationship such that they do not diverge from each other without bound in the long run

41
Q

Covariance stationary

A

A time series whose expected value and variance are constant and finite in all periods, and whose covariance with itself for a fixed number of periods in the past or future is constant and finite in all periods

42
Q

First-differencing

A

A transformation that subtracts the value of the time series in period t-1 from its value in period t
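The transformation in one line of Python (the series values are hypothetical):

```python
series = [100, 103, 101, 106]            # hypothetical levels
diffed = [series[t] - series[t - 1] for t in range(1, len(series))]
# diffed == [3, -2, 5]
```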

43
Q

In-sample forecast errors

A

Residuals from a fitted time-series model within the same period used to fit the model

44
Q

Linear trend

A

A trend in which the dependent variable changes at a constant rate with time

45
Q

Unit Root Testing for Nonstationarity

A
  1. Run an AR model and examine autocorrelations
  2. Perform the Dickey-Fuller test

46
Q

Seasonality

A

A pattern in a time-series that tends to repeat from year to year.

E.g. monthly sales data for a retailer (Christmas season each year will have similar results all else constant)

47
Q

How to correct for seasonality

A

To adjust for seasonality in an AR model, an additional lag of the dependent variable (corresponding to the same period in the previous year) is added to the original model as another independent variable

48
Q

Steps to determine stock price of target company using relative valuation ratio approach given comparable companies

A
  1. Calculate the relative valuation ratio for the comparable companies to determine their mean
  2. Apply the mean of each ratio to the valuation variables of the target company to get estimated stock price for each valuation variable
  3. Take the mean of the estimated stock price to arrive at your answer
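The steps above can be sketched for a single ratio; the comparable P/E ratios and target EPS are hypothetical, and in practice the estimates from several ratios would then be averaged:

```python
# Step 1: mean relative valuation ratio across comparables
comp_pe = [10.0, 12.0]                   # hypothetical comparables' P/E
mean_pe = sum(comp_pe) / len(comp_pe)    # 11.0

# Step 2: apply the mean ratio to the target's valuation variable
target_eps = 3.00                        # hypothetical target EPS
est_price = mean_pe * target_eps         # 33.0
```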
49
Q

How to arrive at fair acquisition price of target company using comparable transaction approach

A
  1. Calculate the relative valuation ratios based on acquisition price and their mean
  2. Multiply target company valuation variables with the mean multiples calculated in step 1
  3. Calculate the mean estimated stock price to arrive at your answer
50
Q

Autoregressive Conditional Heteroskedasticity (ARCH)

How to Correct?

A

Occurs when examining a single time series, such as an AR model

Exists if the variance of the residuals in one period is dependent on the variance of the residuals in a previous period

Correct by using regression procedures that correct for heteroskedasticity (generalized least squares)

51
Q

When can a linear regression be used?

A

Linear regression can be used if:

  1. Both time series are covariance stationary
  2. Neither time series is covariance stationary but the two series are cointegrated
52
Q

Examples of Supervised Learning

A
  1. Linear/Penalized regression
  2. Logistic regression (logit)
  3. CART
  4. SVM
  5. KNN
  6. Ensemble & Random Forest
53
Q

Types of Unsupervised Learning

A
  1. Principal Components Analysis
  2. Clustering

54
Q

Steps in data analysis project

A
  1. Conceptualization of the modeling task
  2. Data Collection
  3. Data Preparation and wrangling (*critical)
  4. Data Exploration
  5. Model Training
55
Q

Steps to analyze unstructured, text-based data

A
  1. Text problem formulation
  2. Data collection
  3. Text preparation & wrangling (*critical)
  4. Text exploration
56
Q

Activation function

A

Part of a neural network's node that transforms the total net input into the final output.

The activation function operates like a light dimmer switch that decreases/increases the strength of the input.

57
Q

Agglomerative clustering

A

Mnemonic: think “conglomerate”

Clustering method that starts with each observation as its own cluster; the two closest clusters (by distance) are combined into one larger cluster. This process is repeated until all observations are clumped into a single large cluster

58
Q

Backward propagation

A

The process of adjusting weights in a neural network by moving backward through the network's layers to reduce total error

59
Q

Base error

A

Model error due to randomness in the data

60
Q

Bias Error

A

Occurs with underfitting: using one or very few features produces a poor approximation and high in-sample error.

61
Q

Bootstrap aggregating (“Bagging”)

A

Process whereby the original training dataset is used to generate n new training datasets.

Data can overlap between datasets. Bagging helps improve the stability of predictions and reduces the chance of overfitting.

62
Q

Centroid

A

The center of a cluster formed using the K-means clustering algorithm

63
Q

Classification & Regression Tree (CART)

A

Supervised machine learning technique commonly applied to binary classifications or regression

Categorical target variable = classification tree

Continuous target variable = regression tree

64
Q

Composite variable

A

A variable that combines two or more variables that are statistically strongly related to each other.

65
Q

Cross-validation

A

Technique for estimating out-of-sample error directly by determining the error in validation samples

66
Q

Deep Learning

A

Algorithms based on complex neural networks that address highly complex tasks like image classification, face recognition, speech recognition, and natural language processing

67
Q

Dendrogram

A

A type of tree diagram that highlights the hierarchical relationships among the clusters

68
Q

Dimension reduction

A

A set of techniques for reducing the number of features in a dataset while retaining variation across observations to preserve the information contained in that variation

69
Q

Divisive clustering

A

The opposite technique from agglomerative clustering

Clustering method that starts with all observations belonging to a single large cluster, which is then divided into two clusters based on a measure of distance. The process repeats until each cluster contains only one observation

70
Q

Eigenvalue

A

A measure that gives the proportion of total variance in the initial dataset that is explained by each eigenvector

71
Q

Eigenvector

A

A vector that defines new mutually uncorrelated composite variables that are linear combinations of the original features

72
Q

Ensemble learning

A

A supervised learning technique of combining predictions from a collection of models to achieve a more accurate prediction

Two types of ensemble methods:

  1. Aggregation of heterogeneous learners (different algorithms combined together via a voting classifier)
  2. Aggregation of homogeneous learners (same algorithm used on different training data)
73
Q

Fitting curve

A

A curve which shows in- and out-of-sample error rates on the y-axis plotted against model complexity

74
Q

Forward propagation

A

The opposite of backward propagation; the process of adjusting weights in a neural network by moving forward through the network's layers to reduce the total error of the network

75
Q

Generalize

A

When a model retains its explanatory power when predicting out-of-sample

76
Q

Hierarchical clustering

A

An iterative unsupervised learning procedure used for building a hierarchy of clusters

77
Q

Holdout samples

A

Data samples that are not used to train a model

78
Q

Regularization

A

Describes methods that reduce statistical variability in high-dimensional data estimation problems

Can be applied to linear and non-linear models

79
Q

K-fold cross-validation

A

A technique in which data are shuffled randomly and then divided into k equal sub-samples, with k − 1 samples used as training samples and one sample used as a validation sample
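A minimal sketch of generating the k splits by hand, assuming the data divide evenly into the folds (the seed and data are illustrative):

```python
import random

def k_fold_splits(data, k, seed=0):
    # Shuffle, then carve into k folds; each fold serves once as the
    # validation sample while the remaining k-1 folds train the model.
    data = data[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [x for j, f in enumerate(folds) if j != i for x in f]
        yield training, validation

splits = list(k_fold_splits(list(range(10)), k=5))
```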

80
Q

K-means

A

A clustering algorithm that repeatedly partitions observations into a fixed number, k, of non-overlapping clusters
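A tiny 1-D sketch of the K-means loop, assuming starting centroids are given (the points and initial centroids are made up):

```python
def kmeans_1d(points, centroids, iters=10):
    # Repeatedly: assign each point to its nearest centroid, then move
    # each centroid to the mean of its assigned points.
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

kmeans_1d([1.0, 2.0, 9.0, 10.0], centroids=[0.0, 5.0])
# converges to centroids [1.5, 9.5]
```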

81
Q

Labeled data set

A

Dataset that contains matched sets of observed inputs or features (X variable) and the associated output (Y variable).

82
Q

Least Absolute Shrinkage & Selection Operator (LASSO)

A

Type of penalized regression that involves minimizing the sum of squared residuals plus a penalty term that increases in size with the number of included features (λ times the sum of the absolute values of the regression coefficients)

Minimizes the standard error of estimate plus the value of the penalty associated with the number of features (independent variables); think of adjusted R²

83
Q

Principal Component Analysis (PCA)

A

An example of unsupervised learning; summarizes the information in a large number of correlated factors into a smaller number of uncorrelated factors (eigenvectors)

Principal components, or "eigenvectors," are linear combinations of the original dataset and cannot be easily labeled or interpreted

84
Q

Clustering

A

An unsupervised learning technique that groups observations into categories based on similarities in their attributes

Requires human judgement in defining what is similar.

Used in investment management for diversification by investing in assets from multiple clusters, and for analyzing portfolio risk, as evidenced by a large portfolio allocation to a particular cluster

Examples of clustering include:

K-means clustering
Hierarchical clustering

85
Q

Support Vector Machine (SVM)

A

A type of supervised learning technique and a linear classification algorithm that separates data into one of two possible classifications (buy vs. sell, default vs. no default, pass vs. fail)

Maximizes probability of making a correct prediction by determining boundary farthest away from all observations

Used in investment management to classify debt issues, shorting stocks, classifying texts like news articles or company press release as positive or negative

86
Q

Data Cleansing

A

Deals with reducing errors in the raw data. Errors in raw data for structured data include:

i. Missing values
ii. Invalid values
iii. Inaccurate values
iv. Inconsistent format
v. Duplicates

Accomplished via automated, rules-based algorithms and human intervention

87
Q

Data Wrangling & Transformation

A

Data wrangling involves preprocessing data for model use. Preprocessing includes data transformation.

Data transformation types include:

i. Extraction of data based on parameter
ii. Aggregation of related data using appropriate weights
iii. Filtration by removing irrelevant observations and features
iv. Conversion of data of different types (e.g., nominal or ordinal)

88
Q

Steps for Text Preparation or Cleaning

A
  1. Remove HTML tags (if text was collected from web pages)
  2. Remove punctuation
  3. Remove numbers (digits replaced with annotations). If numbers are important for the analysis, the values are extracted via text applications first.
  4. Remove white spaces
89
Q

Steps for Text Wrangling (i.e. normalization)

A
  1. Lowercasing
  2. Removal of stop words (e.g. the, is)
  3. Stemming (converting all variations of a word into a common value)
  4. Lemmatization (similar to stemming)
90
Q

Data Exploration

A

Evaluate the data set and determine the most appropriate way to configure it for model training

Steps include:

  1. Understanding data properties, finding patterns or relationships, and planning modeling
  2. Select the needed attributes of the data for model training (the more features, the higher the complexity and the longer the model training time)
  3. Create new features by transforming or combining multiple features
91
Q

Limitations of Regression Analysis

A
  1. Parameter instability
  2. Public knowledge of regression relationships may negate their future usefulness
  3. If regression assumptions are violated, hypothesis tests and predictions based on linear regression will not be valid. It is often uncertain whether an assumption has been violated.
92
Q

Tasks for model training

A
  1. Method Selection - Choosing the appropriate algorithm given objective and data characteristics (e.g. supervised/unsupervised, type of data, size of data)
  2. Performance Evaluation - Quantify and critique model performance
  3. Tuning - process of implementing changes to improve model performance
93
Q

Variance error

A

Results from overfitting the model with noise-inducing features or too many features, causing high out-of-sample error

94
Q

Precision

A

The ratio of true positives to all predicted positives

High precision is valued when the cost of a Type I error is large.

P = TP / (TP + FP)

95
Q

Recall (a.k.a. True Positive Rate)

A

Ratio of true positives to all actual positives

High recall is valued when the cost of a Type II error is large

R = TP / (TP + FN)

96
Q

F1 score

A

Harmonic mean of precision and recall. Precision and recall together determine model accuracy.

F1: (2 x P x R) / (P + R)
Accuracy: (TP + TN) / (TP + TN + FP + FN)
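All four metrics can be computed together from the confusion-matrix counts; the counts below are made-up illustrative numbers:

```python
def scores(tp, fp, fn, tn):
    # Precision, recall, F1, and accuracy from confusion-matrix counts
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return p, r, f1, accuracy

p, r, f1, acc = scores(tp=8, fp=2, fn=2, tn=8)
# all four equal 0.8 for these symmetric counts
```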

97
Q

Receiver Operating Characteristic (ROC)

A

A curve showing the trade off between False Positives and True Positives. The true positive rate (or recall) is on the Y-axis and false positive rate (FPR) is on the X-axis.

Area under the curve (AUC) is a value from 0 to 1. The closer the AUC to 1, the higher the predictive accuracy of the model.

98
Q

Steps in Simulation

A
  1. Determine probabilistic variables - uncertain input variables that influence the value of an investment
  2. Define probability distributions for the variables and specify parameters for distribution
  3. Check for correlation among variables using historical data
  4. Run simulation
99
Q

3 approaches for specifying distribution

A
  1. Historical data - assumes future values of the variable will be similar to its past
  2. Cross-sectional data - estimate distribution of the variable based on the values of the variable for peers
  3. Subjective specification of a distribution along with related parameters
100
Q

Advantages of Simulations

A
  1. Better input estimation - forces users to think about variability in estimates
  2. A distribution rather than a point estimate - simulations highlight the inherent uncertainty in valuing risky assets and explain divergence in estimates
101
Q

Examples of constraints in Simulations

A
  1. Minimum book value of equity - e.g. maintaining minimal capital under Basel rules
  2. Earnings and cash flow - externally and internally imposed restrictions on profitability
  3. Market value - comparing the value of the business to the value of its debt in all scenarios
102
Q

Problems in Simulation

A
  1. GIGO - “garbage in garbage out”
  2. Real data may not fit specified distribution
  3. Non-stationarity - changes in market events may render model useless
  4. Dynamic correlations - correlations between input variables may not be stable, and if the model does not account for changes, the output may be flawed
103
Q

Benefits of Monte Carlo Simulation

A

Considers all possible outcomes

Better suited for continuous risks, which can be sequential or concurrent

Allows for explicitly modeling correlations of input variables

104
Q

Decision Trees

A

Appropriate tool for measuring risk in an investment when risk is discrete and sequential

It cannot accommodate correlated variables

It can be used as a complement to risk-adjusted valuation or as a substitute for such valuation

105
Q

Mean Reverting Level

A

The value toward which a time series tends to move. If the value of the time series is greater (less) than the mean reverting level, the value is expected to decrease (increase) over time toward its mean reverting level

Calculated as b0 / (1 - b1)
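For an AR(1) model x_t = b0 + b1·x_(t-1) + e_t, the level is a one-liner (the coefficients below are illustrative):

```python
def mean_reverting_level(b0, b1):
    # Solve x = b0 + b1 * x for the long-run level (requires b1 != 1)
    return b0 / (1 - b1)

mean_reverting_level(1.0, 0.5)  # 1.0 / 0.5 = 2.0
```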

106
Q

Non-Uniformity Error

A

Refers to the error that occurs when the data is not presented in an identical format

107
Q

Normalization

A

Process of rescaling numeric variables in the range of [0,1]

Xi(normalized) = (Xi - Xmin) / (Xmax - Xmin)
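The rescaling in Python (the input values are illustrative):

```python
def min_max_normalize(xs):
    # Rescale each value into [0, 1] using the sample min and max
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

min_max_normalize([2, 4, 6, 10])
# [0.0, 0.25, 0.5, 1.0]
```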

108
Q

Lambda

A

A hyperparameter whose value must be set before supervised learning of the regression model begins

It will determine the balance between fitting the model versus keeping the model parsimonious

When = 0, it is equivalent to an OLS regression

109
Q

Sample Covariance

A

Sum {(X - Xbar)(Y - Ybar)} / (n - 1)
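The formula translated directly into Python (the data pairs are illustrative):

```python
def sample_covariance(xs, ys):
    # Sum of cross-deviations from the means, divided by n - 1
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

sample_covariance([1, 2, 3], [2, 4, 6])
# ((-1)(-2) + 0 + (1)(2)) / 2 = 2.0
```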