Quantitative Methods Flashcards

1
Q

Numerical Data (e.g. Discrete, Continuous)

A

Values that can be counted or measured.

We can perform mathematical operations only on numerical data.

2
Q

Categorical Data (e.g. Nominal, Ordinal)

A

Consists of labels that can be used to classify a set of data into groups. Categorical data may be nominal or ordinal.

3
Q

Discrete Data

A

Countable data, such as the number of months, days, or hours in a year

4
Q

Continuous Data

A

Can take any fractional value (e.g., the annual percentage return on an investment).

5
Q

Nominal Data

A

Data that cannot be placed in a logical order

6
Q

Ordinal Data

A

Can be ranked in logical order

7
Q

Structured Data

A

Data that can be organised in a defined way

8
Q

Time series

A

A set of observations taken periodically, most often at equal intervals over time

9
Q

Cross-sectional data

A

Refers to a set of comparable observations all taken at one specific point in time.

10
Q

Panel Data

A

Time series and cross-sectional data combined

11
Q

Unstructured Data

A

A mix of data with no defined structure

12
Q

One-dimensional array

A

represents a single variable (e.g. a time series)

13
Q

Two-dimensional array

A

Represents two variables (e.g. panel data)

14
Q

Contingency table

A

A two-dimensional array that displays the joint frequencies of two variables

15
Q

Confusion matrix

A

A contingency table (two variables) that displays predicted and actual occurrences of an event

16
Q

Relationship between geometric and arithmetic mean

A

The geometric mean is always less than or equal to the arithmetic mean, and the difference increases as the dispersion of the observations increases
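
A quick numerical check of this relationship, as a sketch only. The two series of gross returns below are made-up values, chosen so that they share the same arithmetic mean but differ in dispersion; `mean_gap` is an illustrative helper name.

```python
from statistics import fmean, geometric_mean

# Hypothetical gross returns (1 + r); both series have an arithmetic mean of 1.05.
low_dispersion = [1.04, 1.05, 1.06]
high_dispersion = [0.85, 1.05, 1.25]

def mean_gap(values):
    """Arithmetic mean minus geometric mean for positive values."""
    return fmean(values) - geometric_mean(values)

gap_low = mean_gap(low_dispersion)
gap_high = mean_gap(high_dispersion)

print(f"gap with low dispersion:  {gap_low:.6f}")
print(f"gap with high dispersion: {gap_high:.6f}")
```

The gap is non-negative in both cases and is wider for the more dispersed series.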

17
Q

Trimmed mean

A

Estimate the mean without the effects of a given percentage of outliers.

18
Q

Winsorized mean

A

Decrease the effect of outliers on the mean.
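
To make the difference between a trimmed mean and a winsorized mean concrete, here is a small sketch. The data set and the helper names (`tail_count`, `trimmed_mean`, `winsorized_mean`) are illustrative assumptions, and the stated percentage is assumed to be split equally between the two tails.

```python
def tail_count(n, pct):
    """Observations affected in each tail when pct percent is split across both tails."""
    return n * pct // 200

def trimmed_mean(values, pct):
    """Discard the lowest and highest pct/2 percent of observations, then average."""
    data = sorted(values)
    k = tail_count(len(data), pct)
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)

def winsorized_mean(values, pct):
    """Replace the extreme pct/2 percent in each tail with the nearest retained value."""
    data = sorted(values)
    k = tail_count(len(data), pct)
    if k:
        data[:k] = [data[k]] * k
        data[-k:] = [data[-k - 1]] * k
    return sum(data) / len(data)

observations = [1, 2, 3, 4, 5, 6, 7, 8, 10, 100]  # 100 is an outlier
plain = sum(observations) / len(observations)
trimmed = trimmed_mean(observations, 20)
winsorized = winsorized_mean(observations, 20)
print(plain, trimmed, winsorized)
```

Trimming drops the extreme observations, while winsorizing keeps the sample size and pulls the extremes in to the nearest retained values.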

19
Q

Harmonic mean

A

Calculate the average share cost from periodic purchases in a fixed dollar amount.
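
A minimal sketch of why the harmonic mean fits this case. The investment amount and share prices are hypothetical.

```python
from statistics import fmean, harmonic_mean

periodic_investment = 1_000.0   # fixed dollar amount per purchase (hypothetical)
prices = [8.0, 9.0, 10.0]       # share price at each purchase date (hypothetical)

# Average cost per share = total dollars invested / total shares bought
shares_bought = [periodic_investment / p for p in prices]
average_cost = periodic_investment * len(prices) / sum(shares_bought)

print(f"average cost per share:    {average_cost:.4f}")
print(f"harmonic mean of prices:   {harmonic_mean(prices):.4f}")
print(f"arithmetic mean of prices: {fmean(prices):.4f}")
```

The average cost matches the harmonic mean of the prices and sits below their arithmetic mean, because a fixed dollar amount buys more shares when the price is low.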

20
Q

Empirical probability

A

established by analysing past data (outcomes)

21
Q

A priori probability

A

Determined using reasoning and inspection (not data), e.g. looking at a coin and deciding there is a 50/50 chance of each outcome.

22
Q

Subjective probability

A

Established using personal judgement

23
Q

Unconditional probability (marginal probability)

A

the probability of an event regardless of the past or future occurrence of other events.

24
Q

Conditional probability

A

Where the occurrence of one event affects the probability of the occurrence of another event, e.g. P(A | B), the probability of A given that B has occurred.

25
Q

Multiplication rule of probability

A

P(AB) = P(A | B) × P(B)

26
Q

Addition rule of probability

A

P(A or B) = P(A) + P(B) - P(AB)
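
The multiplication and addition rules can both be checked by counting outcomes. The sketch below assumes a single roll of a fair six-sided die; the events A and B are arbitrary choices for illustration.

```python
from fractions import Fraction

outcomes = range(1, 7)                       # one roll of a fair six-sided die
A = {x for x in outcomes if x % 2 == 0}      # event A: the roll is even
B = {x for x in outcomes if x > 3}           # event B: the roll is greater than 3

def prob(event):
    """Probability as favourable outcomes over total outcomes."""
    return Fraction(len(event), len(outcomes))

p_a, p_b = prob(A), prob(B)
p_ab = prob(A & B)                           # joint probability P(AB)
p_a_given_b = Fraction(len(A & B), len(B))   # conditional probability P(A | B)
p_a_or_b = prob(A | B)

print(p_ab, p_a_given_b * p_b)               # multiplication rule, both sides
print(p_a_or_b, p_a + p_b - p_ab)            # addition rule, both sides
```

Subtracting P(AB) in the addition rule avoids double-counting the outcomes that belong to both events.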

27
Q

Probability distribution

A

The probabilities of all the possible outcomes for a random variable

28
Q

A discrete random variable

A

When the number of possible outcomes can be counted and each possible outcome has a measurable, positive probability, e.g. the number of days it may rain in a month.

29
Q

A continuous random variable

A

When the number of possible outcomes is infinite, even if upper and lower bounds exist, e.g. the amount of rainfall per month.

30
Q

The probability function, p(x)

A

Gives the probability that a discrete random variable will equal x

31
Q

A cumulative distribution function (cdf), F(x)

A

gives the probability that a random variable will be less than or equal to a given value.

32
Q

Binomial Random Variable - E(X) = np

A

Binomial Random Variable - Var(X) = np(1-p)
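
Both formulas can be confirmed from the binomial probability function itself. In this sketch the values of n and p are hypothetical.

```python
from math import comb

n, p = 10, 0.3   # hypothetical number of trials and probability of success

# Binomial probability function p(x) for x = 0..n
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(f"E(X) from p(x):   {mean:.6f}   np      = {n * p:.6f}")
print(f"Var(X) from p(x): {variance:.6f}   np(1-p) = {n * p * (1 - p):.6f}")
```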

33
Q

For a continuous random variable X, the probability of any single value of X is

A

0

34
Q

The normal distribution has the following key properties:

A
  • It is completely described by its mean, μ, and variance, σ², stated as X ~ N(μ, σ²).
    In words, this says that “X is normally distributed with mean μ and variance σ².”
  • Skewness = 0 (symmetrical),
    meaning that the normal distribution is symmetric about its mean, so that P(X ≤ μ) = P(μ ≤ X) = 0.5, and mean = median = mode.
  • Kurtosis = 3;
    this is a measure of how flat the distribution is. Recall that excess kurtosis is measured relative to 3, the kurtosis of the normal distribution.
  • A linear combination of normally distributed random variables is also normally distributed.
  • The probabilities of outcomes further above and below the mean get smaller and smaller but do not go to zero (the tails get very thin but extend infinitely).
35
Q

univariate distribution

A

 the distribution of a single random variable

36
Q

A multivariate distribution

A

the distribution of two or more random variables (takes into account correlation coefficients)

  • specifies the probabilities associated with a group of random variables and is meaningful only when the behavior of each random variable in the group is in some way dependent on the behavior of the others.
37
Q

Number of correlations in a portfolio

A

0.5n*(n-1)

n = no. of assets in portfolio / variables

38
Q

Normal distribution: +/-1 s.d. from the mean

A

68% confidence interval

39
Q

Normal distribution: +/-1.65 s.d. from the mean

A

90% confidence interval

40
Q

Normal distribution: +/-1.96 s.d. from the mean

A

95% confidence interval

41
Q

Normal distribution: +/-2.58 s.d. from the mean

A

99% confidence interval
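
The four normal-distribution intervals (68%, 90%, 95%, 99%) can be reproduced with the standard library. This is a sketch using `statistics.NormalDist`; the dictionary name `coverage` is just an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

coverage = {}
for k in (1.0, 1.65, 1.96, 2.58):
    # Probability of an outcome within k standard deviations of the mean
    coverage[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"+/-{k} s.d.: {coverage[k]:.4f}")
```

The flashcard percentages are rounded versions of these probabilities.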

42
Q

“standardizing a random variable” (finding z)

A

measuring how far it lies from the arithmetic mean

z = the no. of standard deviations the variable is from the mean

43
Q

How to calculate z

how many standard deviations a variable is from the mean

A

z = (x - µ) / σ, where µ is the population mean and σ is the standard deviation

44
Q

shortfall risk

A

probability that a portfolio return or value will be below a target return or value

45
Q

Roy’s safety first ratio (SF ratio)

A

The number of standard deviations the target (threshold) return lies below the expected return: SF ratio = (expected return - threshold return) / standard deviation

The larger the SF ratio, the lower the probability of falling below the minimum threshold.
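
A short sketch of comparing two portfolios by SF ratio. The portfolio figures and the threshold are hypothetical, and converting the ratio into a shortfall probability assumes normally distributed returns.

```python
from statistics import NormalDist

def sf_ratio(expected_return, threshold, std_dev):
    """Roy's safety-first ratio: (E(R) - threshold) / standard deviation."""
    return (expected_return - threshold) / std_dev

threshold = 0.03  # minimum acceptable return (hypothetical)
portfolios = {"A": (0.12, 0.15), "B": (0.09, 0.08)}  # (expected return, s.d.), hypothetical

results = {}
for name, (mu, sigma) in portfolios.items():
    sf = sf_ratio(mu, threshold, sigma)
    # Assumption: returns are normal, so shortfall risk = F(-SF ratio)
    shortfall = NormalDist().cdf(-sf)
    results[name] = (sf, shortfall)
    print(f"{name}: SF ratio = {sf:.2f}, shortfall risk = {shortfall:.4f}")
```

Portfolio B has the lower expected return but the higher SF ratio, so it has the lower probability of falling below the threshold.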

46
Q

For a standard normal distribution, F(0) is:

A

0.5

By the symmetry of the z-distribution, F(0) = 0.5: half the distribution lies on each side of the mean. (LOS 4.j)

47
Q

Holding period return -> Continuously compounded rate

A

ln(1 + holding period return)

ln = natural log

48
Q

Continuously compounded rate -> Holding period return

A

e^(continuously compounded rate) - 1
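
The two conversions are inverses of each other, as this sketch shows. The 10% holding period return is a made-up input and the function names are illustrative.

```python
import math

def hpr_to_continuous(hpr):
    """Continuously compounded rate implied by a holding period return."""
    return math.log(1 + hpr)

def continuous_to_hpr(rate):
    """Holding period return implied by a continuously compounded rate."""
    return math.exp(rate) - 1

hpr = 0.10  # hypothetical holding period return
cc_rate = hpr_to_continuous(hpr)
round_trip = continuous_to_hpr(cc_rate)

print(f"continuously compounded rate: {cc_rate:.6f}")
print(f"back to holding period return: {round_trip:.6f}")
```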

49
Q

t-distribution

A
  • Symmetrical.
  • Defined by degrees of freedom (df), where the degrees of freedom = no. of sample observations - 1 (n – 1), for sample means.
  • More probability in the tails (“fatter tails”) than the normal distribution.
  • As the degrees of freedom (the sample size) get larger, the shape of the t-distribution more closely approaches a standard normal distribution.

Not platykurtic: the t-distribution is less peaked than the normal distribution but has fatter tails.

For t-distribution, the lower the degrees of freedom, the fatter the tails and the greater the probability of extreme outcomes.

50
Q

chi-square distribution (χ²)

used to test the variance of a normally distributed population

A
  • Distribution of the sum of squared values of n independent standard normal random variables (all positive values)
  • Asymmetric
  • Degrees of freedom = n-1
  • As degrees of freedom increase, approaches normal distribution
51
Q

Degrees of freedom, k (in context of distribution charts)

A

The number of values that are free to vary when a statistic is calculated (e.g. n - 1 of the deviations from a sample mean)

52
Q

F-distribution

used to test the equality of two population variances

A
  • Quotient of two chi-square distributions with m and n degrees of freedom (all positive values)
  • Asymmetric
  • As degrees of freedom increase, approaches normal distribution
53
Q

F-stat formula

F-distribution

A

F-stat = (χ1² / m) / (χ2² / n)

= (chi-square for sample 1 / m) / (chi-square for sample 2 / n)

54
Q

Monte Carlo Simulation

used to estimate a distribution of asset prices

A

Generate thousands of simulated outcomes for the asset using its input variables, then calculate the mean and variance of the outcomes and price the asset accordingly
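
A minimal sketch of the idea, not a pricing model. Every parameter is hypothetical, and the lognormal price process is an assumption chosen only to keep the example short.

```python
import math
import random
from statistics import fmean, pvariance

# All parameters are hypothetical; the lognormal price model is an assumption.
start_price = 100.0
drift = 0.05        # expected annual return
volatility = 0.20   # annual standard deviation of returns
horizon = 1.0       # years
n_paths = 10_000

rng = random.Random(42)  # fixed seed so the run is repeatable

def simulate_terminal_price():
    """One simulated end-of-horizon price driven by a standard normal shock."""
    z = rng.gauss(0.0, 1.0)
    growth = (drift - 0.5 * volatility**2) * horizon + volatility * math.sqrt(horizon) * z
    return start_price * math.exp(growth)

prices = [simulate_terminal_price() for _ in range(n_paths)]
mean_price = fmean(prices)
variance_price = pvariance(prices)

print(f"mean = {mean_price:.2f}, variance = {variance_price:.2f}")
```

With enough paths, the simulated mean settles near the value implied by the assumed drift, and the spread of `prices` approximates the distribution of outcomes.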

55
Q

Use of Monte Carlo Simulation

A
  • Value complex securities.
  • Simulate the profits/losses from a trading strategy.
  • Calculate estimates of value at risk (VaR) to determine the riskiness of a portfolio of assets and liabilities.
  • Simulate pension fund assets and liabilities over time to examine the variability of the difference between the two.
  • Value portfolios of assets that have abnormal returns distributions.
56
Q

Binomial random variable

A

A random variable that counts the number of ‘successes’ in a given number of trials, where each trial has only two possible outcomes.

57
Q

Sampling error

A

The difference between a sample statistic and its corresponding population parameter

sampling error of the mean = sample mean – population mean = x̄ – µ

58
Q

The standard error of the sample mean (when the population standard deviation is known)

A

standard deviation of the distribution of the sample means

σx̄ = σ / √n, where σ is the population standard deviation and n is the sample size
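
A small sketch of the formula, using a hypothetical population standard deviation of 20 and three sample sizes.

```python
import math

def standard_error(population_sd, n):
    """Standard error of the sample mean when the population s.d. is known."""
    return population_sd / math.sqrt(n)

population_sd = 20.0  # hypothetical
errors = {n: standard_error(population_sd, n) for n in (25, 100, 400)}

for n, se in errors.items():
    print(f"n = {n:>3}: standard error = {se}")
```

Quadrupling the sample size halves the standard error.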

59
Q

Effect on the standard error of the sample mean if the sample size (n) increases?

A

The standard error decreases as n increases (n ↑, σx̄ ↓)

60
Q

Desirable characteristics of an estimator (sample statistic)

A

Unbiased
Efficient
Consistent

61
Q

Simple random sampling

A

Selecting a sample where each item in the population has the same probability of being chosen

62
Q

Stratified random sampling

A

randomly selecting samples proportionally from sub-groups. Sub-groups are formed based on one or more defining characteristics

63
Q

Cluster sampling

A

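Similar to stratified random sampling, but the subgroups (clusters) are not necessarily based on a characteristic of the data; each cluster is assumed to be representative of the population.

One-stage: clusters are selected at random and all observations in the selected clusters form the sample
Two-stage: clusters are selected at random and a random sample is then drawn from each selected cluster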

64
Q

Central limit theorem

A

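For a population with mean µ and finite variance σ^2, the sampling distribution of the sample mean approaches a normal distribution with mean µ and variance σ^2 / n as the sample size becomes large (n ≥ 30 is usually considered sufficient), whatever the distribution of the population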
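
The central limit theorem can be illustrated by simulation. This sketch assumes an exponential population (mean 1, variance 1), chosen because it is clearly not normal; the sample counts and seed are arbitrary.

```python
import math
import random
from statistics import fmean, pvariance

rng = random.Random(0)   # fixed seed so the run is repeatable
sample_size = 30
n_samples = 5_000

def sample_mean():
    """Mean of one sample from a right-skewed exponential population (mean 1, variance 1)."""
    return fmean([rng.expovariate(1.0) for _ in range(sample_size)])

sample_means = [sample_mean() for _ in range(n_samples)]
mean_of_means = fmean(sample_means)
variance_of_means = pvariance(sample_means)

# Share of sample means within 1.96 standard errors of the population mean
standard_error = math.sqrt(1.0 / sample_size)
share_within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in sample_means) / n_samples

print(f"mean of sample means:     {mean_of_means:.4f}")
print(f"variance of sample means: {variance_of_means:.4f}")
print(f"share within 1.96 s.e.:   {share_within:.4f}")
```

The sample means centre on the population mean, their variance is close to σ^2 / n, and roughly 95% of them fall within 1.96 standard errors, as a normal sampling distribution would suggest.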

65
Q

Confidence interval

A

A range of values within which the population mean is expected to lie with a given probability

66
Q

Reliability factor for 90% confidence interval

A

1.645

67
Q

Reliability factor for 95% confidence interval

A

1.96

68
Q

Reliability factor for 99% confidence interval

A

2.575

69
Q

Confidence interval for a single item selected from the population

A

population mean (µ) +/- reliability factor * σ

70
Q

Confidence interval for a point estimate (values used to estimate population parameters) selected from a sample

A

Point estimate (e.g. the sample mean) +/- reliability factor * standard error
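
A worked sketch of this interval for a sample mean, assuming a known population standard deviation so that a z reliability factor applies. The sample figures are hypothetical.

```python
import math
from statistics import NormalDist

sample_mean = 80.0      # point estimate (hypothetical)
population_sd = 15.0    # assumed known (hypothetical)
n = 36
confidence = 0.95

# Two-sided reliability factor for the chosen confidence level
reliability_factor = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
standard_error = population_sd / math.sqrt(n)

lower = sample_mean - reliability_factor * standard_error
upper = sample_mean + reliability_factor * standard_error

print(f"reliability factor: {reliability_factor:.3f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```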

71
Q

Confidence interval for a sample mean

A

population mean (µ) +/- reliability factor * standard error

72
Q

Which test statistic should be used for a normal distribution with a known variance?

A

z-statistic

73
Q

Which test statistic should be used for a normal distribution with an unknown variance?

A

t-statistic

74
Q

Which test statistic should be used for a non-normal distribution with a known variance?

A

z-statistic

NB not available with a small sample n<30

75
Q

Which test statistic should be used for a non-normal distribution with an unknown variance?

A

t-statistic

NB not available with a small sample n<30

76
Q

Jackknife method of estimating standard error of the sample mean

A

Calculate the s.d. of multiple sample means (each sample with one observation removed from the sample).

  • Computationally simple
  • Used when population is small
  • Removes bias from statistical estimates
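
A sketch of the jackknife for the sample mean. The data are made up, and the scaling factor (n - 1) / n is the standard jackknife formula rather than something stated on the card.

```python
import math
from statistics import fmean, stdev

data = [4.1, 5.3, 2.8, 6.0, 4.7, 5.5, 3.9, 4.4]  # hypothetical sample
n = len(data)

# One sample mean per observation left out
leave_one_out_means = [fmean(data[:i] + data[i + 1:]) for i in range(n)]
center = fmean(leave_one_out_means)

# Standard jackknife scaling: sqrt((n - 1) / n * sum of squared deviations)
jackknife_se = math.sqrt((n - 1) / n * sum((m - center) ** 2 for m in leave_one_out_means))
analytic_se = stdev(data) / math.sqrt(n)

print(f"jackknife standard error: {jackknife_se:.6f}")
print(f"s / sqrt(n):              {analytic_se:.6f}")
```

For the sample mean the jackknife estimate coincides with s / √n; the method is more useful for statistics that have no simple standard error formula.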
77
Q

Bootstrap method of estimating standard error of the sample mean

A

Calculate the s.d. of multiple sample means, where each sample is drawn at random, with replacement, from the full data set.
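
A sketch of the bootstrap for the sample mean, assuming resampling with replacement from the observed data. The data, the number of resamples, and the seed are arbitrary choices.

```python
import math
import random
from statistics import fmean, stdev

data = [4.1, 5.3, 2.8, 6.0, 4.7, 5.5, 3.9, 4.4, 5.1, 3.6, 4.9, 5.8]  # hypothetical sample
n = len(data)
n_resamples = 2_000
rng = random.Random(1)  # fixed seed so the run is repeatable

# Each resample has the same size as the data and is drawn with replacement
resample_means = [fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

bootstrap_se = stdev(resample_means)
analytic_se = stdev(data) / math.sqrt(n)

print(f"bootstrap standard error: {bootstrap_se:.4f}")
print(f"s / sqrt(n):              {analytic_se:.4f}")
```

The bootstrap figure should land close to s / √n, with some simulation noise and a slight downward tilt in small samples.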

78
Q

Two issues of the idea that larger samples increase accuracy of understanding the population

A
  • May contain wrong observations (from other populations)
  • Additional cost

79
Q

Data snooping

A

Repeatedly using the same sample of observations to search for patterns until one appears to work; this leads to ‘data snooping bias’

80
Q

Sample selection bias

A

When certain observations are systematically excluded from the analysis (usually due to lack of available data)

81
Q

Survivorship bias

A

Only including active/live data. e.g. only including active funds in an analysis of fund performance.

82
Q

Time-period bias

A

Using data within a time period that is either too long or too short

83
Q

Look-ahead bias

A

When a study tests a relationship with data that was not available on the test date.

84
Q

Stratified random sampling is most often used to preserve the distribution of risk factors when creating a portfolio to track an index of:

A

Corporate bonds

The risk factors (the ‘strata’) can be more easily identified, and they form the basis of the sample

85
Q

If random variable Y follows a lognormal distribution then the natural log of Y must be:

A

normally distributed.

86
Q

Steps involved in hypothesis testing:

A
  • State the hypothesis
  • Select the test statistic
  • Specify the level of significance
  • State the decision rule for the hypothesis
  • Collect the sample and calculate the statistics
  • Make a decision on the hypothesis
  • Make a decision based on the test results
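
The steps above can be sketched as a two-tailed z-test. The hypothesized mean, population standard deviation, and sample results are all hypothetical, and a known-variance normal population is assumed so that the z-statistic applies.

```python
import math
from statistics import NormalDist

# 1. Hypothesis: H0: mu = 100 versus Ha: mu != 100 (two-tailed)
hypothesized_mean = 100.0
# 2. Test statistic: z, assuming a normal population with known s.d.
population_sd = 10.0
# 3. Level of significance
alpha = 0.05
# 4. Decision rule: reject H0 if |z| exceeds the critical value
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
# 5. Collect the sample and calculate the statistics (made-up sample results)
sample_mean, n = 103.0, 64
standard_error = population_sd / math.sqrt(n)
z = (sample_mean - hypothesized_mean) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
# 6. Decision on the hypothesis
reject_null = abs(z) > critical_value
# 7. Any decision based on the result (e.g. an investment decision) follows from here

print(f"z = {z:.2f}, critical value = {critical_value:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```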
87
Q

Null hypothesis (Ho)

A
  • Always includes the ‘=’ sign
  • Two-tailed test: Ho states that the parameter equals the hypothesized value
  • The hypothesis the researcher wants to reject
88
Q

Alternative hypothesis (Ha)

A
  • What is concluded if the null hypothesis is rejected
89
Q

General decision rule for a two-tailed test:

A

Reject Ho (null hypothesis) if:
test statistic > upper critical value, or
test statistic < lower critical value

(in one of the outer tails)

90
Q

test statistic equation

A

(sample statistic - hypothesized value) / SE of sample statistic

91
Q

Type I error

A

Rejecting the null hypothesis when it is true

92
Q

Type II error

A

Failing to reject null hypothesis when it is false

determined by sample size and choice of significance level

93
Q

Probability of making a Type I error

wrongly rejecting null hypothesis

A

The significance level (α)

94
Q

Probability of correctly rejecting null hypothesis?

A

The power of the test

1 - the prob. of making a type 2 error

95
Q

What is the decision rule for rejecting or failing to reject the null hypothesis based on?

A

the distribution of the test statistic

96
Q

Statistical significance

A

refers to the use of a sample to carry out a statistical test meant to reveal any significant deviation from the stated null hypothesis.

97
Q

Economic significance

A

The degree to which a statistically significant result is economically viable

98
Q

p-value

A

Probability of obtaining a test statistic that would lead to a rejection of the null hypothesis (assuming the null hypothesis is true)

The smallest level of significance where the null can be rejected

99
Q

When is it appropriate to use a z-test as the appropriate hypothesis test of the population mean?

A

Normal distribution and known variance

100
Q

When is it appropriate to use a t-test as the appropriate hypothesis test of the population mean?

A

Unknown variance

101
Q

Critical z-values for 10% level of significance

A

Two-tailed test: +/-1.65

One-tailed test: +1.28 or -1.28

102
Q

Critical z-values for 5% level of significance

A

Two-tailed test: +/-1.96

One-tailed test: +1.65 or -1.65

103
Q

Critical z-values for 1% level of significance

A

Two-tailed test: +/-2.58

One-tailed test: +2.33 or -2.33
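
The critical z-values for the 10%, 5%, and 1% levels can be recovered from the inverse of the standard normal cdf. This is a sketch using `statistics.NormalDist`; the dictionary names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()
two_tailed = {}
one_tailed = {}

for alpha in (0.10, 0.05, 0.01):
    # Two-tailed: alpha is split equally between both tails
    two_tailed[alpha] = standard_normal.inv_cdf(1 - alpha / 2)
    # One-tailed: all of alpha sits in one tail
    one_tailed[alpha] = standard_normal.inv_cdf(1 - alpha)
    print(f"alpha = {alpha:.2f}: two-tailed +/-{two_tailed[alpha]:.3f}, "
          f"one-tailed {one_tailed[alpha]:.3f}")
```

The value 1.645 is conventionally quoted as 1.65 on the cards.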

104
Q

Difference in means test

A

Two populations that are independent and normally distributed

105
Q

Paired comparisons test

A

Two populations that are dependent on each other and normally distributed

106
Q

How to test for the variance of a normally distributed population

A

The chi-squared test

107
Q

How to test whether the variances of two normal populations are equal

A

The F -test

108
Q

Parametric tests

A

based on assumptions about population distribution and parameters (e.g. mean = 3, variance = 100)

109
Q

Non-parametric tests

A

Based on minimal or no assumptions about the population; they test things other than parameter values (e.g. rank correlation tests, runs tests)

110
Q

How to test whether two characteristics in a sample of data are independent of each other?

A

The χ² (chi-square) test

111
Q

The appropriate test statistic for a test of the equality of variances for two normally distributed random variables, based on two independent random samples, is:

A

the F-test.

112
Q

The appropriate test statistic to test the hypothesis that the variance of a normally distributed population is equal to 13 is:

A

the χ² test.

A test of the population variance is a chi-square test.

113
Q

The test statistic for a Spearman rank correlation test for a sample size greater than 30 follows:

A

a t-distribution.

The test statistic for the Spearman rank correlation test follows a t-distribution.

114
Q

Assumptions of Linear Regression

A
  • Linear relationship between the dependent and independent variables
  • Variance of the residual term is constant (homoskedasticity)
  • Residual terms are independently and normally distributed
115
Q

Coefficient of Determination (R^2)

A

= SSR / SST
measures the percentage of total variation in Y variable explained by the variation in X

For simple regression, R^2 = (correlation between X and Y)^2
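
Both statements about R^2 can be checked on a small data set. The x and y values are made up, and the regression is fitted by ordinary least squares written out by hand so that nothing beyond the standard library is needed.

```python
from statistics import fmean

x = [1.0, 2.0, 3.0, 4.0, 5.0]     # hypothetical independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.1]    # hypothetical dependent variable

x_bar, y_bar = fmean(x), fmean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares (SST)

# Ordinary least squares slope and intercept
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

explained = sum((fi - y_bar) ** 2 for fi in fitted)  # regression (explained) sum of squares
r_squared = explained / s_yy
correlation = s_xy / (s_xx * s_yy) ** 0.5

print(f"R^2 = {r_squared:.4f}, correlation^2 = {correlation ** 2:.4f}")
```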

116
Q

Factorial function

A

The factorial function, denoted n!, tells how many different ways n items can be arranged where all the items are included.

117
Q

Coefficient of Variation

A

σ/µ

118
Q

For a test of the equality of two variances

A

F-statistic.

119
Q

unbiased estimator

A

the expected value equals the parameter it is intended to estimate.

120
Q

A consistent estimator

A

the probability of estimates close to the value of the population parameter increases as sample size increases.