QM LM3 Statistical measures of asset returns Flashcards
What is the arithmetic mean?
- Describes a representative possible outcome
- sensitive to outliers (extreme values)
- AKA the simple mean
What is the median?
- Middle value for an odd no. of observations when ranked
- Average of the middle 2 observations for an even no. when ranked
- Not sensitive to outliers (extreme values)
What is the mode?
- Most frequently occurring value in a dataset
- a dataset can have one or two or more modes (unimodal, bimodal, trimodal…)
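A minimal sketch of the three measures of central tendency above, using Python's standard `statistics` module. The `returns` values are made up for illustration, with one deliberate outlier to show how the mean reacts and the median does not.

```python
import statistics

# Hypothetical returns in percent; 40.0 is a deliberate outlier.
returns = [2.0, 3.0, 3.0, 4.0, 5.0, 40.0]

mean = statistics.mean(returns)        # pulled upwards by the outlier
median = statistics.median(returns)    # average of the two middle values (even n)
modes = statistics.multimode(returns)  # list of every most-frequent value

print(mean, median, modes)  # 9.5 3.5 [3.0]
```

The mean (9.5) sits above five of the six observations, while the median (3.5) barely notices the outlier.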
What are 2 ways of dealing with outliers?
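- Delete them: trimmed mean (drop a stated percentage of the lowest and highest values, then average the rest)
- Replace: winsorized mean (replace the lowest and highest values with the values at stated percentiles, then average)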
You can also just leave them, if the values are legitimate and correct. Elimination requires judgement.
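The two approaches can be sketched as below. The helper names `trimmed_mean` and `winsorized_mean`, the count-based cutoff `k` (rather than a percentage) and the sample data are all assumptions made for illustration.

```python
import statistics

def trimmed_mean(values, k):
    """Mean after deleting the k smallest and k largest observations."""
    ordered = sorted(values)
    return statistics.mean(ordered[k:len(ordered) - k])

def winsorized_mean(values, k):
    """Mean after replacing the k smallest/largest observations
    with the nearest value that is kept."""
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return statistics.mean(clipped)

# Hypothetical returns in percent with one extreme value in each tail.
returns = [-30.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 9.0, 50.0]

plain = statistics.mean(returns)
trimmed = trimmed_mean(returns, k=1)        # drops -30.0 and 50.0
winsorized = winsorized_mean(returns, k=1)  # -30.0 becomes 1.0, 50.0 becomes 9.0

print(plain, trimmed, winsorized)
```

Trimming shrinks the sample, whereas winsorizing keeps the sample size and only caps the extremes.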
What are box and whisker plots used for?
- Rank performance of portfolios and investment managers in terms of percentile/quartile in which they fall
- In investment research we can compare the bottom and top return deciles; a hedge fund might go long one and short the other
What is the upper fence?
Q3 + (1.5 x IQR). Values above the upper fence may be outliers.
What is the lower fence?
Q1 - (1.5 x IQR). Values below the lower fence may be outliers.
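A rough sketch of the fence calculation, taking the fences as Q1 - 1.5 x IQR and Q3 + 1.5 x IQR. It assumes quartiles come from `statistics.quantiles` with its default "exclusive" method; other software may use a different quartile convention and give slightly different fences. The data are hypothetical.

```python
import statistics

# Hypothetical observations; 30.0 is a suspected outlier.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 30.0]

# quantiles(n=4) returns the three quartile cut points [Q1, Q2, Q3].
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is flagged for review, not automatically deleted.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence, outliers)  # -6.0 18.0 [30.0]
```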
What is dispersion?
- variability around the central tendency
- a measure of risk or uncertainty
- includes range (though no info about shape of distribution), mean absolute deviation, sample variance, sample standard deviation etc
What is target downside deviation?
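Dispersion of observations below a target, rather than around the mean
You take squared deviations only for the X_i at or below some minimum acceptable level B (the target), sum them, divide by n - 1 and take the square root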
Also called target semideviation by the reading
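A minimal sketch of the calculation, assuming the usual form in which only observations at or below the target B contribute, but the divisor is still n - 1 for the full sample. The function name, the returns and the target of 0.0 are illustrative assumptions.

```python
import math

def target_downside_deviation(values, target):
    """Square root of the sum of squared shortfalls below `target`,
    divided by (n - 1), where n counts ALL observations."""
    n = len(values)
    downside_sq = sum((x - target) ** 2 for x in values if x <= target)
    return math.sqrt(downside_sq / (n - 1))

# Hypothetical returns in percent and a target (B) of 0%.
returns = [5.0, -2.0, 3.0, -4.0, 8.0]
tdd = target_downside_deviation(returns, target=0.0)

# Only -2.0 and -4.0 count: sqrt((4 + 16) / 4) = sqrt(5)
print(round(tdd, 4))  # 2.2361
```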
What is the coefficient of variation?
A measure of relative dispersion
This gives you an idea of level of risk per unit of return
CV = s / X-bar (sample standard deviation over sample mean)
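As an illustration of why relative dispersion matters, the sketch below compares two made-up return series with the same standard deviation but different means. The function name and data are assumptions.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Two hypothetical return series (percent) with identical standard deviation (2.0).
fund_a = [4.0, 6.0, 8.0]     # mean 6.0
fund_b = [10.0, 12.0, 14.0]  # mean 12.0

cv_a = coefficient_of_variation(fund_a)  # 2 / 6
cv_b = coefficient_of_variation(fund_b)  # 2 / 12

print(round(cv_a, 4), round(cv_b, 4))  # 0.3333 0.1667
```

Fund B carries half the risk per unit of return even though both series have the same absolute dispersion.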
What parameters do you need to describe a normal distribution?
ND is completely described by mu (the mean) and sigma squared (the variance)
This is an idealised, well understood distribution
Why do we start from the normal distribution?
It is easier to start from an idealised, well-understood distribution and describe deviations from it than to start from one that is not well understood. So even though real return distributions are usually not normal, the normal distribution is a good place to begin the analysis.
What is skew?
Positive skew is where the mean is greater than the median, which is greater than the mode. The top of the curve (the mode) will be far to the left, with the median and then the mean further right, pulled by a long right tail.
Negative skew is the opposite: the mean is less than the median, which is less than the mode. The top of the curve is far to the right, with a long left tail.
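A small sketch that checks the sign of skewness on made-up data. It assumes the common approximation skewness ≈ (1/n) x sum of cubed deviations / s^3, with s the sample standard deviation; the function name and data are illustrative.

```python
import statistics

def sample_skewness(values):
    """Approximate sample skewness: (1/n) * sum((x - mean)^3) / s^3."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return sum((x - mean) ** 3 for x in values) / (n * s ** 3)

# Hypothetical data: one large value creates a long right tail.
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]
# Mirror image: long left tail.
left_tailed = [-x for x in right_tailed]

skew_right = sample_skewness(right_tailed)  # positive
skew_left = sample_skewness(left_tailed)    # negative, same magnitude

# Positive skew: mean (3.0) > median (1.0)
print(statistics.mean(right_tailed), statistics.median(right_tailed))
# Negative skew: mean (-3.0) < median (-1.0)
print(statistics.mean(left_tailed), statistics.median(left_tailed))
```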
What kind of portfolios will have positive skew?
Positive skew: lots of long options. A few will pay off very big and form the long right tail
Negative skew: lots of short options. A few will lead to heavy losses and form the long left tail.
What do leptokurtic and platykurtic mean?
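Normal distribution is mesokurtic: kurtosis = 3 (excess kurtosis = 0)
Leptokurtic means overweight in the tails (fatter tails than the normal distribution), k > 3
Platykurtic means underweight in the tails (thinner tails than the normal distribution), k < 3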
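A hedged sketch of excess kurtosis on two made-up datasets, assuming the approximation kurtosis ≈ (1/n) x sum of fourth-power deviations / s^4 and excess kurtosis = kurtosis - 3. Small samples like these are only meant to show the sign.

```python
import statistics

def excess_kurtosis(values):
    """Approximate sample kurtosis minus 3 (the normal benchmark)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    kurtosis = sum((x - mean) ** 4 for x in values) / (n * s ** 4)
    return kurtosis - 3.0

# Hypothetical data: mostly zeros with two extreme values (heavy tails).
fat_tailed = [-10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0]
# Evenly spread values with no extremes (thin tails).
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

ek_fat = excess_kurtosis(fat_tailed)  # > 0, leptokurtic
ek_flat = excess_kurtosis(flat)       # < 0, platykurtic

print(round(ek_fat, 3), round(ek_flat, 3))
```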
What is a scatter plot useful for?
- Identifying outliers
- Visualising the joint variation in 2 numerical variables
- A scatter plot matrix can be used to assess pairwise association among many variables
What is covariance?
- The joint variability of 2 random variables
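- Expressed in the product of the two variables' units (squared units for two return series), so its size is hard to interpret
- Sum of the products of each pair of deviations from the means, over n - 1
- When s_xy > 0 the variables tend to move in the same direction; when s_xy < 0 they tend to move in opposite directions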
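The sample covariance described above can be written out by hand as a short sketch. The function name and the two series are assumptions for illustration.

```python
import statistics

def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, over n - 1."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Hypothetical paired observations: y rises whenever x rises.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

s_xy = sample_covariance(x, y)
print(s_xy)  # 5.0, positive: the series move together
```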
What is correlation?
- Measures the linear association between 2 variables
- Covariance of X and Y over (SD of X * SD of Y)
- 0 means no linear relationship
- negative 1 is perfect negative correlation (perfect hedge, greatest diversification benefit), positive 1 is perfect positive correlation (perfect replication, no diversification benefit)
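A minimal sketch of the correlation formula on the card, built from the sample covariance and the two sample standard deviations. The function name and data are illustrative assumptions.

```python
import statistics

def sample_correlation(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical series.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
same_direction = [2.0 * v for v in x]  # moves exactly with x
opposite = [-v for v in x]             # moves exactly against x

r_pos = sample_correlation(x, same_direction)  # approximately +1
r_neg = sample_correlation(x, opposite)        # approximately -1

print(round(r_pos, 6), round(r_neg, 6))  # 1.0 -1.0
```

Unlike covariance, the result has no units and always lies between -1 and +1.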
What are the limitations of correlation?
- Measures linear association only, i.e. it doesn’t pick up nonlinear relationships
- Unreliable when outliers are present
- Correlation does not imply causation, so you have to be careful of wrongful causal inference (spurious relationships)
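To illustrate the first limitation, the sketch below uses made-up data in which y is fully determined by x (y = x squared), yet the correlation comes out at zero because the relationship is not linear. The function name is an assumption.

```python
import statistics

def sample_correlation(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))

# y depends perfectly on x, but through a symmetric, nonlinear rule.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v ** 2 for v in x]  # [4.0, 1.0, 0.0, 1.0, 4.0]

r = sample_correlation(x, y)
print(r)  # zero correlation despite a perfect (nonlinear) dependence
```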
Why is visual inspection of scatter plots important?
You can’t get all the understanding of the dataset just from the metrics. You can have the same metrics for very different distributions and relationships!