Quantitative Methods Flashcards

1
Q

Primary areas of fintech

A
  • Increasing functionality to handle large sets of data that may come from many sources and exist in a variety of forms.
  • Tools and techniques for analyzing a very large data set, such as artificial intelligence.
  • Automation of financial functions such as executing trades and providing investment advice.
  • Emerging technologies for financial recordkeeping that may reduce the need for intermediaries.
2
Q

Big Data

A

Big Data refers to all the potentially useful information that is generated in the economy, including data from traditional sources and alternative data.

3
Q

Corporate exhaust

A

Potentially useful data that businesses generate in the course of their operations, such as bank records and retail scanner data

4
Q

Internet of Things

A

Sensors, such as radio frequency identification chips, that are embedded in numerous devices such as smart phones and smart buildings

5
Q

Characteristics of Big Data

A

Volume, velocity, and variety

6
Q

Data processing methods include

A
  • Capture—collecting data and transforming it into usable forms.
  • Curation—assuring data quality by adjusting for bad or missing data.
  • Storage—archiving and accessing data.
  • Search—examining stored data to find needed information.
  • Transfer—moving data from their source or a storage medium to where they are needed.
7
Q

Neural Networks

A

Neural networks are an example of artificial intelligence in that they are programmed to process information in a way similar to the human brain

8
Q

Machine Learning

A

This refers to programming that gives a computer system the ability to improve its performance of a task over time. The machine learning process typically requires vast amounts of data.

  • In supervised learning, the input and output data are labeled, the machine learns to model the outputs from the inputs, and then the machine is given new data on which to use the model.
  • In unsupervised learning, the input data are not labeled and the machine learns to describe the structure of the data.
  • Deep learning is a technique that uses layers of neural networks to identify patterns, beginning with simple patterns and advancing to more complex ones. Deep learning may use supervised or unsupervised learning.

ML can produce models that overfit or underfit the data.

  • Overfitting occurs when the machine learns the input and output data too exactly, treats noise as true parameters, and identifies spurious patterns and relationships.
  • Underfitting occurs when the machine fails to identify actual patterns and relationships, treating true parameters as noise.
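
Not part of the original card, but a minimal Python sketch of how over- and underfitting show up (NumPy assumed; the data and polynomial degrees are hypothetical stand-ins for a supervised learning task):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical labeled data: a quadratic signal plus noise, split into
# training and validation samples (as in supervised learning).
x = rng.uniform(-2, 2, size=100)
y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(0, 0.4, size=100)
x_train, y_train = x[:70], y[:70]
x_val, y_val = x[70:], y[70:]

for degree in (1, 2, 12):            # underfit, reasonable fit, overfit
    coeffs = np.polyfit(x_train, y_train, degree)
    rmse_train = np.sqrt(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    rmse_val = np.sqrt(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
    # An overfit model shows low training error but high validation error
    # (it has learned the noise); an underfit model is poor on both samples.
    print(f"degree {degree:2d}: train RMSE {rmse_train:.3f}, validation RMSE {rmse_val:.3f}")
```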
9
Q

Fintech Applications

A
  • Text analytics refers to the analysis of unstructured data in text or voice forms.
  • Natural language processing refers to the use of computers and artificial intelligence to interpret human language.
  • Algorithmic trading refers to computerized securities trading based on a predetermined set of rules.
    • High-frequency trading identifies and takes advantage of intraday securities mispricings.
  • Robo-advisors are online platforms that provide automated investment advice based on a customer’s answers to survey questions.
    • The primary advantage of robo-advisors is their low cost to customers.
    • A disadvantage of robo-advisors is that the reasoning behind their recommendations might not be apparent.
10
Q

Distributed ledger

A

A distributed ledger is a database that is shared on a network so that each participant has an identical copy. A distributed ledger must have a consensus mechanism to validate new entries into the ledger. Distributed ledger technology uses cryptography to ensure only authorized network participants can use the data.

Distributed ledgers can take the form of permissionless or permissioned networks.

  • In permissionless networks, all network participants can view all transactions. These networks have no central authority, which gives them the advantage of having no single point of failure. The ledger becomes a permanent record visible to all, and its history cannot be altered (short of a majority of the network colluding to manipulate it). This removes the need for trust between the parties to a transaction.
  • In permissioned networks, users have different levels of access. For example, a permissioned network might allow network participants to enter transactions while giving government regulators permission to view the transaction history. A distributed ledger that allowed regulators to view records that firms are required to make available would increase transparency and decrease compliance costs.
11
Q

Blockchain

A

A blockchain is a distributed ledger that records transactions sequentially in blocks and links these blocks in a chain. Each block has a cryptographically secured “hash” that links it to the previous block. The consensus mechanism in a blockchain requires some of the computers on the network to solve a cryptographic problem. These computers are referred to as miners.

12
Q

Financial applications of distributed ledger technology

A
  • Cryptocurrencies are a current example of distributed ledger technology in finance. They allow participants to engage in real-time transactions without a financial intermediary and typically reside on permissionless networks.
  • Initial coin offerings sell cryptocurrency for money or another cryptocurrency. This reduces the cost and time frame compared to carrying out a regulated IPO, and initial coin offerings typically do not come with voting rights. Fraud has occurred with initial coin offerings and they may become subject to securities regulations.
  • Smart contracts are electronic contracts that could be programmed to self-execute based on terms agreed to by the counterparties.
  • Tokenization refers to electronic proof of ownership of physical assets, which could be maintained on a distributed ledger.

Post-trade clearing and settlement is an area of finance to which distributed ledger technology might be productively applied. Distributed ledgers could automate many of the processes currently carried out by custodians and other third parties. On the other hand, the inability to alter past transactions on a distributed ledger is problematic when canceling a trade is required.

13
Q

Covariance

A
  • A statistical measure of the degree to which the two variables move together.
  • Captures the linear relationship between two variables.
  • A positive covariance indicates that the variables tend to move together; a negative covariance indicates that they tend to move in opposite directions.
  • May range from negative to positive infinity, and is presented in terms of squared units.
14
Q

Correlation coefficient (r)

A
  • A measure of the strength of the linear relationship (correlation) between two variables. No unit.
  • –1 ≤ r ≤ +1
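
A small sketch tying this card to the covariance card above (NumPy assumed; the return series are hypothetical): r is just the covariance rescaled by the two standard deviations, which removes the units.

```python
import numpy as np

# Hypothetical monthly returns for two assets.
x = np.array([0.02, -0.01, 0.03, 0.015, -0.005, 0.01])
y = np.array([0.015, -0.02, 0.025, 0.01, 0.000, 0.005])

cov_xy = np.cov(x, y, ddof=1)[0, 1]                   # sample covariance (squared units)
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))  # unitless, -1 <= r <= +1

print(f"covariance:  {cov_xy:.6f}")
print(f"correlation: {r:.4f}")   # matches np.corrcoef(x, y)[0, 1]
```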
15
Q

Spurious Correlation

A

Spurious correlation refers to the appearance of a causal linear relationship when, in fact, there is no relation.

16
Q

Simple Linear Regression

A

The purpose of simple linear regression is to explain the variation in a dependent variable in terms of the variation in a single independent variable.

17
Q

Dependent vs. independent variable

A
  • The dependent variable is the variable whose variation is explained by the independent variable. Also referred to as the explained variable, the endogenous variable, or the predicted variable.
  • The independent variable is the variable used to explain the variation of the dependent variable. Also referred to as the explanatory variable, the exogenous variable, or the predicting variable.
18
Q

Linear Regression Assumptions

A
  • A linear relationship exists between Y and X.
  • No exact linear relationship exists among the X's (violation: multicollinearity).
  • The expected value of the residual term is zero [E(ε) = 0].
  • The variance of the residual term is constant for all observations [E(εi²) = σε²] (violation: heteroskedasticity).
  • The residual term is independently distributed; that is, the residual for one observation is not correlated with that of another observation [E(εiεj) = 0, j ≠ i] (violation: serial correlation).
  • The residual term is normally distributed.
19
Q

Slope Coefficient

A

The slope coefficient of the regression line describes the change in Y for a one-unit change in X.
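
A minimal sketch (NumPy assumed; the data are hypothetical) of estimating the intercept and slope and reading the slope as the change in Y per one-unit change in X:

```python
import numpy as np

# Hypothetical observations of an independent (X) and dependent (Y) variable.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

b1, b0 = np.polyfit(x, y, 1)   # least squares slope and intercept

# b1 is the estimated change in Y for a one-unit change in X.
print(f"estimated line: Y = {b0:.3f} + {b1:.3f} * X")
print(f"a one-unit increase in X changes predicted Y by {b1:.3f}")
```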

20
Q

Sum of squared errors (SSE)

A

The sum of the squared vertical distances between the estimated and actual Y-values is referred to as the sum of squared errors (SSE).

Simple linear regression is frequently referred to as ordinary least squares (OLS) regression, and the values predicted by the estimated regression equation, Ŷi, are called least squares estimates.

21
Q

Standard error of estimate (SEE)

A
  • Measures the degree of variability of the actual Y-values relative to the estimated Y-values from a regression equation.
  • The smaller the standard error, the better the fit.
  • The SEE is the standard deviation of the error terms in the regressions, also referred to as the standard error of the residual, or standard error of the regression.
  • Equal to the square root of the MSE.
22
Q

Coefficient of determination (R2)

A
  • The coefficient of determination (R2) is defined as the percentage of the total variation in the dependent variable explained by the independent variable. R2 = r2 for a regression with one independent variable. This approach is not appropriate when more than one independent variable is used in the regression.
  • R2 by itself may not be a reliable measure of the explanatory power of the multiple regression model, because R2 almost always increases as variables are added to the model, even if the marginal contribution of the new variables is not statistically significant. A relatively high R2 may reflect the impact of a large set of independent variables rather than how well the set explains the dependent variable. This problem is often referred to as overestimating the regression.
  • Adjusted R2 is always less than or equal to R2
23
Q

Analysis of variance (ANOVA)

A

a statistical procedure for analyzing the total variability of the dependent variable.

24
Q

Total sum of squares (SST)

Regression sum of squares (RSS)

Sum of squared errors (SSE)

A
  • Total sum of squares (SST) measures the total variation in the dependent variable.
  • Regression sum of squares (RSS) measures the variation in the dependent variable that is explained by the independent variable.
  • Sum of squared errors (SSE) measures the unexplained variation in the dependent variable.
  • Total variation = explained variation + unexplained variation
  • SST = RSS + SSE
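
A sketch (NumPy assumed; same hypothetical data as the earlier OLS sketch) verifying the identity SST = RSS + SSE and computing R2 and SEE from the pieces:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n, k = len(y), 1                       # k = number of independent variables

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total variation
rss = np.sum((y_hat - y.mean()) ** 2)  # explained variation
sse = np.sum((y - y_hat) ** 2)         # unexplained variation

assert np.isclose(sst, rss + sse)      # SST = RSS + SSE

r_squared = rss / sst                  # % of total variation explained
mse = sse / (n - k - 1)
see = np.sqrt(mse)                     # standard error of estimate
print(f"R^2 = {r_squared:.4f}, SEE = {see:.4f}")
```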
25
Q

Mean regression sum of squares (MSR)

Mean squared error (MSE)

A

The mean regression sum of squares (MSR) and mean squared error (MSE) are simply calculated as the appropriate sum of squares divided by its degrees of freedom.

26
Q

R2

A

The R2 is the percentage of the total variation in the dependent variable explained by the independent variable.

27
Q

Partial Slope Coefficients

A

The slope coefficients in a multiple regression.

Each slope coefficient is the estimated change in the dependent variable for a one-unit change in that independent variable, holding the other independent variables constant.

28
Q

F-statistic

A
  • An F-test assesses how well a set of independent variables, as a group, explains the variation in the dependent variable. In multiple regression, the F-statistic is used to test whether at least one independent variable in a set of independent variables explains a significant portion of the variation of the dependent variable.
  • F = MSR / MSE
  • Always a one-tailed test
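
Continuing the hypothetical example, a sketch of the F-test (SciPy assumed for the p-value), with F = MSR / MSE on k and n − k − 1 degrees of freedom:

```python
import numpy as np
from scipy.stats import f

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n, k = len(y), 1

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x
rss = np.sum((y_hat - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)

msr = rss / k                # mean regression sum of squares
mse = sse / (n - k - 1)      # mean squared error
f_stat = msr / mse

# One-tailed test: reject H0 (no slope coefficient differs from zero)
# when F is large.
p_value = f.sf(f_stat, k, n - k - 1)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4g}")
```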
29
Q

Limitation of Regression

A
  • Relationships change over time (parameter instability).
  • Public knowledge of a relationship may eliminate its usefulness to traders.
  • Assumption violations.
30
Q

p-value

A

The p-value is the smallest level of significance for which the null hypothesis can be rejected.

  • If the p-value is less than the significance level, the null hypothesis can be rejected.
  • If the p-value is greater than the significance level, the null hypothesis cannot be rejected.
31
Q

Dummy Variables

A
  • Independent variables that are qualitative (falling into one of two categories) are called dummy variables and are often used to quantify the impact of qualitative events.
  • Dummy variables are assigned a value of “0” or “1.”
  • Whenever we want to distinguish between n classes, we must use n – 1 dummy variables. Otherwise, the regression assumption of no exact linear relationship between independent variables would be violated (multicollinearity).
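
A sketch of the n − 1 rule (pandas assumed; the quarterly data are hypothetical): four quarters are encoded with three dummies, so the omitted quarter becomes the base case.

```python
import pandas as pd

# Hypothetical data: one observation per quarter.
df = pd.DataFrame({"quarter": ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2"]})

# drop_first=True keeps n - 1 = 3 dummies; Q1 is the omitted base case.
dummies = pd.get_dummies(df["quarter"], drop_first=True, dtype=int)
print(dummies)
# Including all 4 dummies plus an intercept would make the dummy columns
# sum to the constant term: an exact linear relationship (multicollinearity).
```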
32
Q

Heteroskedasticity

A

Heteroskedasticity occurs when the variance of the residuals is not the same across all observations in the sample. This happens when there are subsamples that are more spread out than the rest of the sample.

Unconditional heteroskedasticity occurs when the heteroskedasticity is not related to the level of the independent variables. it usually causes no major problems with the regression.

Conditional heteroskedasticity is heteroskedasticity that is related to the level of (i.e., conditional on) the independent variables. It creates significant problems for statistical inference.

33
Q

Four effects of heteroskedasticity

A
  • The standard errors are usually unreliable estimates.
  • The coefficient estimates (the b̂j) aren't affected.
  • If the standard errors are too small, but the coefficient estimates themselves are not affected, the t-statistics will be too large and the null hypothesis of no statistical significance is rejected too often. The opposite will be true if the standard errors are too large.
  • The F-test is also unreliable.
34
Q

Two methods to detect heteroskedasticity

A
  1. Examining scatter plots of the residuals
  2. Using the Breusch-Pagan chi-square (χ2) test. The more common way to detect conditional heteroskedasticity is the Breusch-Pagan test, which calls for the regression of the squared residuals on the independent variables. If conditional heteroskedasticity is present, the independent variables will significantly contribute to the explanation of the squared residuals
35
Q

Breusch Pagan Test

A

BP chi-square statistic = n × R²resid, with k degrees of freedom, where R²resid is the R² from the regression of the squared residuals on the k independent variables.
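
A sketch of the statistic computed by hand (NumPy and SciPy assumed; the conditionally heteroskedastic data are simulated): regress the squared residuals on the independent variable and form BP = n × R²resid.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
n, k = 200, 1
x = rng.uniform(0, 10, n)
# Hypothetical conditional heteroskedasticity: noise grows with x.
y = 1.0 + 0.5 * x + rng.normal(0, 0.2 * x)

b1, b0 = np.polyfit(x, y, 1)
resid_sq = (y - (b0 + b1 * x)) ** 2

# Second regression: squared residuals on the independent variable(s).
c1, c0 = np.polyfit(x, resid_sq, 1)
fitted = c0 + c1 * x
r2_resid = 1 - np.sum((resid_sq - fitted) ** 2) / np.sum((resid_sq - resid_sq.mean()) ** 2)

bp = n * r2_resid                      # chi-square with k degrees of freedom
print(f"BP = {bp:.2f}, p-value = {chi2.sf(bp, k):.4g}")
```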

36
Q

Two ways of correcting heteroskedasticity

A
  • Robust standard errors (also called White-corrected standard errors or heteroskedasticity-consistent standard errors). These robust standard errors are then used to recalculate the t-statistics using the original regression coefficients.
  • A second method to correct for heteroskedasticity is the use of generalized least squares, which attempts to eliminate the heteroskedasticity by modifying the original equation.
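
A sketch of the first correction using statsmodels (assuming the library is installed; its "HC" covariance types are White-style heteroskedasticity-consistent standard errors):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2 * x)  # heteroskedastic errors

X = sm.add_constant(x)
plain = sm.OLS(y, X).fit()                  # ordinary standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")   # White-corrected standard errors

# The coefficients are identical; only the standard errors
# (and therefore the t-statistics) change.
print("coefficients:", np.round(plain.params, 4))
print("plain SEs:   ", np.round(plain.bse, 4))
print("robust SEs:  ", np.round(robust.bse, 4))
```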
37
Q

Serial Correlation

A

Serial correlation, also known as autocorrelation, refers to the situation in which the residual terms are correlated with one another.

  • Positive serial correlation exists when a positive regression error in one time period increases the probability of observing a positive regression error for the next time period.
  • Negative serial correlation occurs when a positive error in one period increases the probability of observing a negative error in the next period.
38
Q

3 primary assumption violations of regression analysis

A
  1. Heteroskedasticity
  2. Serial correlation (autocorrelation)
  3. Multicollinearity
39
Q

Two methods to detect serial correlation

A
  • Residual plots: a scatter plot of residuals versus time
  • Durbin-Watson Statistic (DW)
40
Q

Durbin-Watson Statistic (DW)

A

If the sample size is very large: DW = 2 ( 1 - r )

where: r = correlation coefficient between residuals from one period and those from the previous period

  • If DW = 2, the error terms are not serially correlated (r = 0).
  • If DW < 2, the error terms are positively serially correlated (r > 0).
  • If DW > 2, the error terms are negatively serially correlated (r < 0).
  • If DW < dl (the lower critical value), the error terms are positively serially correlated (i.e., reject the null hypothesis of no positive serial correlation).
  • If dl < DW < du, the test is inconclusive.
  • If DW > du (the upper critical value), there is no evidence that the error terms are positively correlated (i.e., fail to reject the null of no positive serial correlation).
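
A sketch (NumPy assumed; the positively serially correlated residuals are simulated) computing DW directly and checking the large-sample approximation DW ≈ 2(1 − r):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical serially correlated residuals: e_t = 0.6 e_{t-1} + u_t.
e = np.zeros(500)
for t in range(1, 500):
    e[t] = 0.6 * e[t - 1] + rng.normal()

dw = np.sum(np.diff(e) ** 2) / np.sum(e**2)   # Durbin-Watson statistic
r = np.corrcoef(e[1:], e[:-1])[0, 1]          # lag-1 residual correlation

print(f"DW = {dw:.3f}, 2*(1 - r) = {2 * (1 - r):.3f}")  # both well below 2
```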
41
Q

Correcting Serial Correlation

A
  • Adjust the coefficient standard errors, using the Hansen method, which also corrects for conditional heteroskedasticity. Also called serial correlation consistent standard errors or Hansen-White standard errors.
    • Only use the Hansen method if serial correlation is a problem. The White-corrected standard errors are preferred if only heteroskedasticity is a problem. If both conditions are present, use the Hansen method.
  • Improve the specification of the model. The best way to do this is to explicitly incorporate the time-series nature of the data.
42
Q

Multicollinearity

A
  • Multicollinearity refers to the condition when two or more of the independent variables, or linear combinations of the independent variables, in a multiple regression are highly correlated with each other.
  • Even though multicollinearity does not affect the consistency of slope coefficients, such coefficients themselves tend to be unreliable. Additionally, the standard errors of the slope coefficients are artificially inflated. Hence, there is a greater probability that we will incorrectly conclude that a variable is not statistically significant (i.e., a Type II error).
  • The most common way to detect multicollinearity is the situation where t-tests indicate that none of the individual coefficients is significantly different than zero, while the F-test is statistically significant and the R2 is high.
  • The most common method to correct for multicollinearity is to omit one or more of the correlated independent variables.
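
A sketch of the classic symptom (NumPy and statsmodels assumed; the nearly collinear regressors are simulated): a significant F-test and high R² while the individual t-statistics look weak.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # x2 nearly identical to x1
y = 2.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# High R^2 and a significant F-test, but the standard errors on each
# slope are artificially inflated, so the slope t-stats look weak.
print(f"R^2 = {res.rsquared:.3f}, F p-value = {res.f_pvalue:.2g}")
print("slope t-stats:", np.round(res.tvalues[1:], 2))
```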
43
Q

3 broad categories of model misspecification

A
  1. The functional form can be misspecified.
    • Important variables are omitted.
    • Variables should be transformed.
    • Data is improperly pooled.
  2. Explanatory variables are correlated with the error term in time series models.
    • A lagged dependent variable is used as an independent variable.
    • A function of the dependent variable is used as an independent variable (“forecasting the past”).
    • Independent variables are measured with error.
  3. Other time-series misspecifications that result in nonstationarity.
44
Q

Effects of Model Misspecification

A

Model misspecification leads to biased and inconsistent regression coefficients, which in turn lead to unreliable hypothesis testing and inaccurate predictions.

45
Q

Qualitative dependent variable

A

A dummy variable that takes on a value of either zero or one. An ordinary regression model is not appropriate for situations that require a qualitative dependent variable.

There are several different types of models that use a qualitative dependent variable.

  • Probit model is based on the normal distribution, while a logit model is based on the logistic distribution.
  • Discriminant models are similar to probit and logit models but make different assumptions regarding the independent variables. Discriminant analysis results in a linear function similar to an ordinary regression, which generates an overall score, or ranking, for an observation. The scores can then be used to rank or classify observations.
46
Q

Assessment of Multiple Regression Model

A
47
Q

Penalized regression

A

A special case of the generalized linear model (GLM) is penalized regression. Penalized regression models seek to minimize forecasting errors by reducing the problem of overfitting. They minimize the sum of squared errors (as in multiple regression models) plus a penalty value. This penalty value increases with the number of independent variables (features) used by the model.
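
A sketch using scikit-learn's Lasso (assumed installed) as one concrete penalized-regression implementation; the penalty shrinks coefficients on unhelpful features toward zero:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(5)
n = 200
X = rng.normal(size=(n, 10))            # 10 candidate features
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)  # only 2 matter

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)      # alpha scales the penalty value

# OLS assigns small nonzero weights to all 10 features (fitting noise);
# the penalty drives the 8 irrelevant coefficients to (near) zero.
print("OLS coefs:  ", np.round(ols.coef_, 2))
print("Lasso coefs:", np.round(lasso.coef_, 2))
```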

48
Q

Classification trees

A

Classification trees are appropriate when the target variable is categorical, while regression trees are appropriate when the target is continuous. Most typically, classification trees are used when the target is binary.

Classification trees assign observations to one of two possible classifications at each node. At the top of the tree, the top feature (the one most important in explaining the target) is selected and a cutoff value "c" is estimated. The tree stops when the error cannot be reduced further, resulting in a terminal node.

A random forest is a collection of randomly generated classification trees from the same data set. Because each tree only uses a subset of features, random forests can mitigate the problem of overfitting. Using random forests can increase the signal-to-noise ratio because errors across different trees tend to cancel each other out.

49
Q

Steps in model training

A

  1. Specify the algorithm.
  2. Specify the hyperparameters (before the processing begins).
  3. Divide data into training and validation samples. In the case of cross validation, the training and validation samples are randomly generated every learning cycle.
  4. Evaluate the training using a performance parameter, P, in the validation sample.
  5. Repeat the training until an adequate level of performance is achieved. In choosing the number of times to repeat, the researcher must use caution to avoid overfitting the model.
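
A sketch of steps 3-5 (NumPy assumed; using RMSE as a hypothetical choice of performance parameter P, and a polynomial degree as the hyperparameter):

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.uniform(-2, 2, 120)
y = 1 + 0.5 * x - 0.8 * x**2 + rng.normal(0, 0.4, 120)

best = None
for degree in (1, 2, 3, 5, 8):           # hyperparameters set before training
    # Cross validation: regenerate the train/validation split each cycle.
    idx = rng.permutation(len(x))
    train, val = idx[:90], idx[90:]
    coeffs = np.polyfit(x[train], y[train], degree)
    p = np.sqrt(np.mean((np.polyval(coeffs, x[val]) - y[val]) ** 2))  # P = RMSE
    print(f"degree {degree}: validation RMSE {p:.3f}")
    if best is None or p < best[1]:
        best = (degree, p)
print(f"selected degree {best[0]} (repeating too many times risks overfitting)")
```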

50
Q

Time Series

A

A time series is a set of observations for a variable over successive periods of time. The series has a trend if a consistent pattern can be seen by plotting the data on a graph.

51
Q

Linear Trend

A

A linear trend is a time series pattern that can be graphed using a straight line. A downward-sloping line indicates a negative trend, while an upward-sloping line indicates a positive trend.

yt = b0 + b1(t) + εt

52
Q

Log-Linear Trend Model

A

Time series data, particularly financial time series, often display exponential growth (growth with continuous compounding). When a series exhibits exponential growth, it can be modeled as:

yt = e^(b0 + b1(t))

We take the natural log of both sides of the equation and arrive at the log-linear model.

ln(yt) = ln(e^(b0 + b1(t))) = b0 + b1(t)
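
A sketch (NumPy assumed; the exponentially growing series is simulated) of fitting the log-linear model by regressing ln(y) on t:

```python
import numpy as np

rng = np.random.default_rng(6)
t = np.arange(1, 41)
# Hypothetical series growing about 3% per period, with noise.
y = 100 * np.exp(0.03 * t) * np.exp(rng.normal(0, 0.02, size=t.size))

b1, b0 = np.polyfit(t, np.log(y), 1)   # ln(y_t) = b0 + b1 * t

# b1 estimates the continuously compounded growth rate per period.
print(f"estimated growth rate: {b1:.4f} (true: 0.0300)")
```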

53
Q

Autoregressive Model (AR)

A

When the dependent variable is regressed against one or more lagged values of itself, the resulting model is called an autoregressive (AR) model.

xt = b0 + b1xt–1 + εt

54
Q

Covariance Stationary

A

An autoregressive model is only valid when the time series being modeled is covariance stationary:

  • Constant and finite expected value
  • Constant and finite variance
  • Constant and finite covariance between values at any given lag
55
Q

Chain rule of forecasting

A

It is necessary to calculate a one-step-ahead forecast before a two-step-ahead forecast can be calculated.

56
Q

Serial Correlation in an Autoregressive Model

A
  • Cannot use Durbin Watson to test for serial correlation in AR models
  • Use a t-test on residual autocorrelations
  • If serial correlation exists, the model is incomplete
  • Solution: Increase order of model by adding more lagged variables
57
Q

Mean reverting Level

A

The mean-reverting level is expressed as xt = b0 / (1−b1).

  • If xt > b0 / (1−b1), the AR(1) model predicts that xt+1 will be lower than xt.
  • If xt < b0 / (1−b1), the model predicts that xt+1 will be higher than xt.
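
A sketch (NumPy assumed; simulated AR(1) data with made-up parameters) of estimating the AR(1) coefficients from the earlier card and computing the mean-reverting level b0 / (1−b1):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulate a covariance stationary AR(1): x_t = 1.0 + 0.8 x_{t-1} + eps_t.
x = np.zeros(1000)
for t in range(1, 1000):
    x[t] = 1.0 + 0.8 * x[t - 1] + rng.normal()

b1, b0 = np.polyfit(x[:-1], x[1:], 1)   # regress x_t on x_{t-1}
mean_reverting = b0 / (1 - b1)          # true level: 1.0 / (1 - 0.8) = 5.0

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, mean-reverting level = {mean_reverting:.2f}")
```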
58
Q

Comparing AR Model Accuracy

A
  • In-sample: Data used to develop model
  • Out-of-sample: Any data outside above range
  • Forecasting accuracy is measured by square root of the mean squared error (RMSE). (=SEE)
  • Use the model with the lowest RMSE based on out-of-sample forecasting errors
59
Q

Random Walks

A
  • Defining characteristic: b1 = 1 (Unit Roots)
    • without a drift: b0 = 0
    • with a drift: b0 ≠ 0
  • A random walk is not covariance stationary.
  • If a time series is a random walk, the best forecast of xt that can be made in period t − 1 is xt−1.
60
Q

Dickey and Fuller model transformation (Unit Root Testing)

A
  • (1) xt = b0 + b1xt−1 + εt
  • (2) Subtract xt−1 from both sides: xt − xt−1 = b0 + (b1 − 1)xt−1 + εt
  • H0: b1 − 1 = 0; Ha: b1 − 1 < 0 (failing to reject H0 means the series has a unit root and is nonstationary)
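
statsmodels ships an augmented version of this test (assuming the library is available); a sketch applying it to a simulated random walk:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(8)
random_walk = np.cumsum(rng.normal(size=500))   # x_t = x_{t-1} + eps_t

stat, p_value = adfuller(random_walk)[:2]
# Large p-value: fail to reject H0 of a unit root (consistent with a random walk).
print(f"ADF statistic = {stat:.3f}, p-value = {p_value:.3f}")
```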
61
Q

First Differencing

A

If we believe a time series is a random walk, we can transform the data to a covariance stationary time series using a procedure called first differencing.

yt = xt − xt–1 ⇒ yt = εt

Then, stating y in the form of an AR(1) model:

yt = b0 + b1yt−1 + εt

where:

b0 = b1 = 0

This transformed time series has a finite mean-reverting level of 0 / (1−0) = 0 and is, therefore, covariance stationary.
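
Continuing the random-walk sketch (NumPy and the adfuller test from above assumed), first differencing turns the nonstationary series into one the unit root test no longer flags:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(8)
random_walk = np.cumsum(rng.normal(size=500))

y = np.diff(random_walk)        # y_t = x_t - x_{t-1} (pure noise, eps_t)

stat, p_value = adfuller(y)[:2]
# Small p-value: reject the unit root; the differenced series is
# covariance stationary with mean-reverting level 0 / (1 - 0) = 0.
print(f"ADF on first differences: stat = {stat:.3f}, p-value = {p_value:.4g}")
```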

62
Q

ARCH

A
  • When examining a single time series, such as an AR model, autoregressive conditional heteroskedasticity (ARCH) exists if the variance of the residuals in one period is dependent on the variance of the residuals in a previous period.
    • εt² = a0 + a1εt−1² + μt
    • If a1 is statistically significant, the time series is ARCH
  • ARCH can be used to predict the variance of the residuals in future periods (volatility):
    • σ²t+1 = a0 + a1εt²
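
A sketch of the ARCH(1) check by hand (NumPy and SciPy assumed; the ARCH errors are simulated): regress the squared residuals on their own lag and test whether a1 is significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

# Simulate ARCH(1) errors: variance depends on last period's squared error.
n = 1000
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rng.normal(scale=np.sqrt(0.2 + 0.5 * eps[t - 1] ** 2))

# Regress eps_t^2 on eps_{t-1}^2 and test whether a1 is significant.
res = stats.linregress(eps[:-1] ** 2, eps[1:] ** 2)
print(f"a1 = {res.slope:.3f}, p-value = {res.pvalue:.2g}")  # significant => ARCH

# Forecast next-period variance: a0 + a1 * (latest squared residual).
print(f"forecast variance: {res.intercept + res.slope * eps[-1] ** 2:.3f}")
```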
63
Q

Test whether regression using two time series is reliable

A
  1. Both time series are covariance stationary. (Reliable)
  2. Only the dependent variable time series is covariance stationary. (Not Reliable)
  3. Only the independent variable time series is covariance stationary. (Not Reliable)
  4. Neither time series is covariance stationary and the two series are not cointegrated. (Not Reliable)
  5. Neither time series is covariance stationary and the two series are cointegrated. (Reliable)
64
Q

Cointegration

A

Cointegration means that two time series are economically linked (related to the same macro variables) or follow the same trend and that relationship is not expected to change. If two time series are cointegrated, the error term from regressing one on the other is covariance stationary and the t-tests are reliable.

65
Q

Determine an appropriate time-series model

A
  1. If there is no seasonality or structural shift, use a trend model.
    • If the data plot on a straight line with an upward or downward slope, use a linear trend model.
    • If the data plot in a curve, use a log-linear trend model.
  2. Run the trend analysis, compute the residuals, and test for serial correlation using the Durbin-Watson test.
    • If you detect no serial correlation, you can use the model.
    • If you detect serial correlation, you must use another model (e.g., AR).
  3. If the data has serial correlation, reexamine the data for stationarity before running an AR model. If it is not stationary, treat the data for use in an AR model as follows:
    • If the data has a linear trend, first-difference the data.
    • If the data has an exponential trend, first-difference the natural log of the data.
    • If there is a structural shift in the data, run two separate models.
    • If the data has a seasonal component, incorporate the seasonality in the AR model as discussed in the following.
  4. After first-differencing in step 3, if the series is covariance stationary, run an AR(1) model and test for serial correlation and seasonality.
    • If there is no remaining serial correlation, you can use the model.
    • If you still detect serial correlation, incorporate lagged values of the variable (possibly including one for seasonality, e.g., for monthly data, add the 12th lag of the time series) into the AR model until you have removed (i.e., modeled) any serial correlation.
  5. Test for ARCH. Regress the square of the residuals on squares of lagged values of the residuals and test whether the resulting coefficient is significantly different from zero.
    • If the coefficient is not significantly different from zero, you can use the model.
    • If the coefficient is significantly different from zero, ARCH is present. Correct using generalized least squares.
  6. If you have developed two statistically reliable models and want to determine which is better at forecasting, calculate their out-of-sample RMSE.
66
Q

Step in Simulations

A
  1. Determine the probabilistic variables
  2. Define probability distributions for these variables (3 approaches to specify a distribution):
    • Historical data: Examination of past data may point to a distribution that is suitable for the probabilistic variable. This method assumes that the future values of the variable will be similar to its past.
    • Cross-sectional data: When past data is unavailable (or unreliable), we may estimate the distribution of the variable based on the values of the variable for peers.
    • Pick a distribution and estimate the parameters.
  3. Check for correlations among variables. When there is a strong correlation between variables, we can either:
    • Allow only one of the variables to vary, or
    • Build the rules of correlation into the simulation.
  4. Run the simulations.
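
A sketch of these steps (NumPy assumed; the valuation inputs, distributions, and correlation are all hypothetical), with the correlation built into the simulation per step 3 via a joint distribution:

```python
import numpy as np

rng = np.random.default_rng(10)
n_sims = 10_000

# Steps 1-2: probabilistic variables and their (assumed) distributions.
mean = [0.05, 0.02]                 # hypothetical revenue growth, margin change
cov = [[0.0004, 0.00015],           # step 3: correlation built into the
       [0.00015, 0.0001]]           # simulation via a joint normal draw
growth, margin = rng.multivariate_normal(mean, cov, size=n_sims).T

# Step 4: run the simulations (a toy one-period profit model).
profit = 100 * (1 + growth) * (0.10 + margin)

# The output is a distribution of expected value, not a point estimate.
print(f"mean = {profit.mean():.2f}, 5th-95th pct = "
      f"{np.percentile(profit, 5):.2f} to {np.percentile(profit, 95):.2f}")
```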
67
Q

The number of simulations needed for a good output is driven by:

A
  • The number of uncertain variables. The higher the number of probabilistic inputs, the greater the number of simulations needed.
  • The types of distributions. The greater the variability in types of distributions, the greater the number of simulations needed.
  • The range of outcomes. The wider the range of outcomes of the uncertain variables, the higher the number of simulations needed.
68
Q

2 advantages and 3 constraints of simulation:

A

2 advantages:

  • Better input quality.
  • Provides a distribution of expected value rather than a point estimate.

3 constraints:

  • Book value constraints
    • Regulatory capital requirements
    • Negative equity
  • Earnings and cash flow constraints (internal & external)
  • Market value constraints
69
Q

Limitations of using simulations as a risk assessment tool

A
  • Input quality
  • Inappropriate statistical distributions
  • Non-stationary distributions
  • Dynamic correlations
70
Q

Compare simulations, scenario analysis, and decision trees

A