Statistics Flashcards
Measurement Scales
“N” Nominal
“O” Ordinal
“I” Interval
“R” Ratio
Population Mean Formula
μ = (Σ X_i) / N, summing over all N members of the population
Sample Mean Formula
X-bar = (Σ X_i) / n, summing over the n observations in the sample
Weighted Mean Formula
X-bar_w = Σ (w_i × X_i), where the weights w_i sum to 1
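A minimal sketch of the weighted mean as a portfolio return, assuming weights that sum to 1. The function name `weighted_mean` and the asset returns and weights are hypothetical, chosen only for illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * X_i, with weights that sum to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical portfolio: 50% stocks, 40% bonds, 10% cash
asset_returns = [0.12, 0.07, 0.03]
portfolio_weights = [0.50, 0.40, 0.10]

portfolio_return = weighted_mean(asset_returns, portfolio_weights)
print(f"{portfolio_return:.2%}")  # about 9.10%
```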
Geometric Mean Formula
Used when calculating investment returns over multiple periods (TWM) or when measuring compound growth rates.
Harmonic ≤ Geometric ≤ Arithmetic (equal only when all observations are identical)
G = (X_1 × X_2 × ... × X_n)^(1/n), for X_i ≥ 0
Geometric Mean Return Formula
1 + R_G = [(1 + R_1) × (1 + R_2) × ... × (1 + R_n)]^(1/n)
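A short sketch of the geometric mean return as a compound annual growth rate. The helper name `geometric_mean_return` and the three annual returns are assumptions made for the example. `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean_return(returns):
    """Compound the periodic returns, then take the n-th root and subtract 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Hypothetical annual returns: +10%, -5%, +20%
annual_returns = [0.10, -0.05, 0.20]

r_g = geometric_mean_return(annual_returns)
arithmetic = sum(annual_returns) / len(annual_returns)

print(f"geometric:  {r_g:.4f}")         # roughly 0.078
print(f"arithmetic: {arithmetic:.4f}")  # roughly 0.083
```

The geometric figure comes out below the arithmetic one because the returns vary from year to year.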
Harmonic Mean Formula
Used to compute average cost of shares purchased over time.
Harmonic ≤ Geometric ≤ Arithmetic (equal only when all observations are identical)
X-bar_H = n / Σ (1 / X_i), for X_i > 0
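A sketch of why the harmonic mean gives the average cost per share when the same dollar amount is invested each period. The prices and the dollar amount are hypothetical, and `harmonic_mean` is an illustrative helper name.

```python
def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of reciprocals (values must be > 0)."""
    return len(values) / sum(1 / x for x in values)


# Hypothetical: invest $1,000 in each of three months at these share prices
prices = [8.0, 9.0, 10.0]
dollars_per_purchase = 1000.0

shares_bought = sum(dollars_per_purchase / p for p in prices)
average_cost = dollars_per_purchase * len(prices) / shares_bought

print(round(average_cost, 4))           # matches the harmonic mean of the prices
print(round(harmonic_mean(prices), 4))  # below the arithmetic mean of 9.0
```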
Position of a percentile in an array with n entries
L_y = (n + 1) × y / 100, where y is the percentile and the entries are sorted in ascending order
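A sketch of locating a percentile with the position formula (n + 1) × y / 100. Linear interpolation between neighboring entries for a fractional position is an assumption here, as are the function names and the data.

```python
def percentile_position(n, y):
    """Position L_y = (n + 1) * y / 100 in an ascending array of n entries."""
    return (n + 1) * y / 100


def percentile_value(data, y):
    """Value at the y-th percentile; fractional positions are interpolated
    linearly between the two neighboring entries (an assumption)."""
    ordered = sorted(data)
    position = percentile_position(len(ordered), y)
    whole = int(position)          # 1-based index of the lower neighbor
    fraction = position - whole
    if whole < 1:
        return ordered[0]
    if whole >= len(ordered):
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


# Hypothetical data set with n = 11 entries
data = [8, 10, 12, 13, 15, 17, 17, 18, 19, 23, 24]

third_quartile = percentile_value(data, 75)   # position 9 -> 9th entry
print(percentile_position(len(data), 75), third_quartile)
```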
Mean Absolute Deviation Formula (MAD)
MAD = (Σ |X_i - X-bar|) / n
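A minimal worked example of MAD, assuming five hypothetical annual returns in percent. The function name is illustrative.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Hypothetical annual returns, in percent (mean = 22)
returns_pct = [30.0, 12.0, 25.0, 20.0, 23.0]

mad = mean_absolute_deviation(returns_pct)
print(mad)  # deviations 8, 10, 3, 2, 1 average to 4.8
```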
Population Variance Formula
σ^2 = Σ (X_i - μ)^2 / N
Population Standard Deviation Formula
σ = square root of the population variance = sqrt[Σ (X_i - μ)^2 / N]
Sample Variance Formula
s^2 = Σ (X_i - X-bar)^2 / (n - 1)
Sample Standard Deviation Formula
s = square root of the sample variance = sqrt[Σ (X_i - X-bar)^2 / (n - 1)]
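A sketch comparing the population and sample versions using Python's standard `statistics` module. The data are hypothetical. The only difference between the two is the divisor: N for the population, n - 1 for the sample.

```python
import statistics

# Hypothetical annual returns, in percent (mean = 22, squared deviations sum to 178)
returns_pct = [30.0, 12.0, 25.0, 20.0, 23.0]

pop_var = statistics.pvariance(returns_pct)    # divides by N = 5
pop_sd = statistics.pstdev(returns_pct)
sample_var = statistics.variance(returns_pct)  # divides by n - 1 = 4
sample_sd = statistics.stdev(returns_pct)

print(pop_var, sample_var)  # about 35.6 and 44.5
print(round(pop_sd, 3), round(sample_sd, 3))
```

The sample figures are larger because dividing by n - 1 corrects for the tendency of a sample to understate the population's dispersion.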
Coefficient of Variation Formula
CV = s / X-bar (standard deviation divided by the mean)
Measures risk (variability) per unit of expected return (mean). Higher CV is riskier.
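A small sketch comparing two investments by CV, assuming CV is the standard deviation divided by the mean return. The asset figures and the function name are hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """Risk per unit of mean return: standard deviation / mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean


# Hypothetical assets: (standard deviation of returns, mean return)
cv_a = coefficient_of_variation(0.05, 0.10)
cv_b = coefficient_of_variation(0.12, 0.15)

# Asset B has the higher mean return but also more risk per unit of return
print(round(cv_a, 2), round(cv_b, 2))
```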
Sharpe Ratio Formula
Sharpe ratio = [R(p) - R(f)] / S(p)
R(p) = portfolio return; R(f) = risk-free return; S(p) = standard deviation of portfolio returns
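A minimal sketch of the Sharpe ratio as excess return per unit of total risk. The parameter names mirror R(p), R(f), and S(p); the input values are hypothetical.

```python
def sharpe_ratio(portfolio_return, risk_free_return, portfolio_std_dev):
    """Excess return over the risk-free rate, per unit of standard deviation."""
    if portfolio_std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (portfolio_return - risk_free_return) / portfolio_std_dev


# Hypothetical portfolio: 12% return, 3% risk-free rate, 18% standard deviation
sharpe = sharpe_ratio(0.12, 0.03, 0.18)
print(round(sharpe, 2))  # 0.09 / 0.18 is about 0.5
```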
Chebyshev’s Inequality
For any distribution with finite variance, the proportion of the observations within k standard deviations of the arithmetic mean is at least 1-1/k^2 for all k>1
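A quick sketch that tabulates the Chebyshev lower bound 1 - 1/k^2 for a few values of k. The function name is illustrative, and the bound is only defined here for k > 1, as in the card above.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound applies only for k > 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
```

For k = 2 the bound is 75%, which is much looser than the roughly 95% that holds for a normal distribution, because Chebyshev's inequality makes no assumption about the distribution's shape.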
Dispersion
Measures the variability around the central tendency (mean). Addresses risk.
Skewness
the extent to which a distribution is not symmetrical.
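Left Skewed Distribution
Negatively skewed; a long tail on the left side caused by a few extreme low values; generally mean < median < mode.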
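Right Skewed Distribution
Positively skewed; a long tail on the right side caused by a few extreme high values; generally mode < median < mean.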
Kurtosis
Statistical measure that tells us when a distribution is more or less peaked than a normal distribution. Kurtosis = 3 for normal distributions.
Leptokurtic distribution
Mnemonic: "Lung Measured Pulmonary Function Test" = Leptokurtic, More Peaked, Fatter Tails
A distribution that is more peaked than a normal distribution. Kurtosis > 3 (excess kurtosis > 0)

Platykurtic distribution
A distribution that is less peaked than a normal distribution. Kurtosis < 3 (excess kurtosis < 0)

Mesokurtic
A distribution with the same kurtosis as the normal distribution. Kurtosis = 3 (excess kurtosis = 0)
Descriptive v. Inferential statistics
descriptive- summarizes the important characteristics of large data sets; inferential- procedures used to make forecasts, estimates, and judgments about a population on the basis of a smaller set (sample)
population
set of all possible members of a stated group; example- a cross-section of the returns of all of the stocks traded on the NYSE
sample
subset of the population of interest
nominal scales
least accurate level of measurement; observations are counted or classified with no order; example- assigning the number 1 to a municipal bond fund, the number 2 to a corporate bond fund, and so on
ordinal scales
every observation is assigned to one of several categories, which are then ordered with respect to a specified characteristic;
example- the ranking of 1,000 small cap growth stocks by performance may be done by assigning the number 1 to the 100 best performing stocks, the number 2 to the next 100 best performing stocks, and so on
interval scale
provide relative ranking, like ordinal, but differences between scale values are equal (like temperature); WEAKNESS: unlike ratio scales, there is no true zero point as the origin, so ratios are not meaningful; 30 degrees F is not 3x as hot as 10 degrees F
Ratio scales
provide ranking and equal diff b/t scale values and have a true zero point as the origin so $4 is 2x as much as $2; think NOIR - nominal, ordinal, interval, ratio
parameter
characteristic of a population such as the mean return or the SD of returns
sample statistic
used to measure a characteristic of a sample
frequency distribution
summarizes statistical data by assigning it to specified groups, or intervals
intervals
aka classes
how to construct a frequency distribution
- Define the intervals to which data measurements (observations) will be assigned. Make sure all are mutually exclusive.
- Tally the observations
- Count them and find the interval with the greatest frequency called the modal interval
Example- annual returns on a stock
modal interval
interval with the greatest absolute frequency
relative frequency
percentage of total observations falling within each interval
cumulative absolute frequency or cumulative relative frequency
all the frequencies added up in order; relative means percentage-wise
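The construction steps and the frequency terms above can be sketched as follows. The returns, the interval boundaries, and the convention that each interval includes its lower bound but excludes its upper bound are all assumptions made for the example.

```python
from itertools import accumulate

# Hypothetical annual returns on a stock, in percent
returns_pct = [-12.5, -3.1, 0.4, 2.2, 5.6, 7.9, 8.3, 11.0, 14.7, 21.4]

# Step 1: define mutually exclusive intervals (lower <= r < upper)
intervals = [(-20, -10), (-10, 0), (0, 10), (10, 20), (20, 30)]

# Steps 2 and 3: tally and count the observations in each interval
absolute = [sum(1 for r in returns_pct if low <= r < high)
            for low, high in intervals]

total = len(returns_pct)
relative = [count / total for count in absolute]
cumulative_absolute = list(accumulate(absolute))
cumulative_relative = [c / total for c in cumulative_absolute]

# The modal interval is the one with the greatest absolute frequency
modal_interval = intervals[absolute.index(max(absolute))]

for row in zip(intervals, absolute, relative,
               cumulative_absolute, cumulative_relative):
    print(row)
print("modal interval:", modal_interval)
```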
histogram
graphical representation of the absolute frequency distribution; bar chart of continuous data classified into a frequency distribution
frequency polygon
traces the shape of the histogram with a line, like a line graph; the midpoint of each interval is plotted against its absolute frequency and the points are connected with straight lines
measures of central tendency
identify the center, or average, of a data set, which can be used to represent the typical, or expected, value in the data set
population mean
mean of all observed values in the population; it’s unique so that means there’s only one mean
sample mean
mean of all the values in a sample of a population; used to make inferences about the population
unimodal, bimodal, trimodal
unimodal- one value appears most frequently; bimodal- two values appear most frequently; trimodal- three values appear most frequently
harmonic mean
used to compute the average cost of shares purchased over time, for example when the same dollar amount is invested each period
mean absolute deviation MAD
average of the absolute values of the deviations of individual observations from the arithmetic mean; a dispersion measure like SD, but it uses absolute deviations instead of squared deviations
biased estimator
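an estimator whose expected value differs from the parameter it estimates; example- a sample variance computed with n instead of n-1 in the denominator systematically underestimates the population variance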
Chebyshev’s inequality
minimum percentage of any distribution that will lie within k SDs of the mean; 1-1/k^2, for k>1
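Limitations of the Sharpe ratio
1. if 2 portfolios have negative Sharpe ratios, the higher one isn't necessarily superior because increasing risk moves a negative Sharpe ratio closer to zero
2. SD is an appropriate risk measure only when returns are roughly symmetric; with asymmetric (non-normal) return distributions, the Sharpe ratio can misstate risk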
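A brief illustration of the first limitation, using two hypothetical portfolios with the same negative excess return. The numbers and the helper name are assumptions chosen for the example.

```python
def sharpe_ratio(portfolio_return, risk_free_return, portfolio_std_dev):
    """Excess return over the risk-free rate, per unit of standard deviation."""
    return (portfolio_return - risk_free_return) / portfolio_std_dev


# Both hypothetical portfolios earn 1% against a 3% risk-free rate
safer = sharpe_ratio(0.01, 0.03, 0.10)    # lower risk
riskier = sharpe_ratio(0.01, 0.03, 0.20)  # double the risk, same return

# The riskier portfolio has the "higher" (less negative) Sharpe ratio
print(round(safer, 2), round(riskier, 2))
```

Doubling the standard deviation halves the magnitude of the negative ratio, so ranking by Sharpe ratio alone would favor the portfolio with more risk and no extra return.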
kurtosis
measure of how much a distribution is more or less peaked than a normal distribution
Leptokurtic v. platykurtic v. mesokurtic
lepto-more peaked or sharper! v. platy - less peaked or flatter! and meso- the same!!
how are kurtosis and skewness useful and why?
they are critical in a risk management setting because when securities returns are modeled using an assumed normal distribution, the predictions from the models will not take into account the potential for extremely large, negative outcomes
which is generally riskier? higher or lower skewness? kurtosis?
in general, greater positive kurtosis and more negative skew in returns distributions indicate increased risk
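Sample skewness
measured using deviations raised to the third power; for large n, S_K is approximately [Σ (X_i - X-bar)^3 / n] / s^3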
when is skewness significant?
when the absolute value of the sample skewness (S_K) is in excess of 0.5
how to find excess kurtosis?
excess kurtosis = sample kurtosis - 3
sample kurtosis
measured using deviations raised to the fourth power
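A sketch of sample skewness and excess kurtosis, assuming the large-sample approximations that average the third and fourth powers of the deviations and divide by s^3 and s^4, where s is the sample standard deviation. The function names and data are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Approximate sample skewness: mean cubed deviation divided by s^3."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    return sum((x - mean) ** 3 for x in values) / n / s ** 3


def sample_kurtosis(values):
    """Approximate sample kurtosis: mean fourth-power deviation divided by s^4."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    return sum((x - mean) ** 4 for x in values) / n / s ** 4


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 2.0, 2.0, 3.0, 12.0]   # one extreme high value

excess_kurtosis = sample_kurtosis(right_skewed) - 3

print(sample_skewness(symmetric))      # zero for symmetric data
print(sample_skewness(right_skewed))   # positive for a long right tail
print(excess_kurtosis)
```

With only five observations these approximations are rough; they are meant for large samples.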
Coefficient of Variation
Population Variance
The population variance is equal to the sum of the squared differences between each population member and the population mean divided by the number of items in the population
Alternatively
It is equal to the average of the squared observations less the squared mean.
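A quick check that the two descriptions of population variance agree, using hypothetical data. The variable names are illustrative.

```python
# Hypothetical population of five values (mean = 22)
population = [30.0, 12.0, 25.0, 20.0, 23.0]
n = len(population)
mean = sum(population) / n

# Definition: average squared difference from the population mean
direct = sum((x - mean) ** 2 for x in population) / n

# Alternative: average of the squared observations less the squared mean
shortcut = sum(x ** 2 for x in population) / n - mean ** 2

print(direct, shortcut)  # both about 35.6
```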
Mean Deviation
The mean deviation is the average of the absolute values of the deviations from the mean.
Characteristics of Skewness
Characteristics of Mean
Relationship between Sharpe Ratio and Risk
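The higher the Sharpe ratio, the greater the excess return earned per unit of risk (better risk-adjusted performance); for a given excess return, higher risk means a lower Sharpe ratio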
Relative Dispersion
Coefficient of Variation
Mnemonic: "Rishab Dev's CV is high" (R D = Relative Dispersion, measured by CV)
Survey questionnaire where a choice has to be selected
Mode and median are the best measures of central tendency
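Normal Distribution and Standard Deviation
For a normal distribution, about 68% of observations fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.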
Which measures of central tendency are not affected by extreme high or low values?
Mode and Median
Relationship between Arithmetic Mean and Geometric Mean
The arithmetic mean and the geometric mean are equal when volatility in the rate of return is zero.
For nonzero volatility the arithmetic mean exceeds the geometric mean, and the difference grows as volatility increases.
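A sketch of this relationship using three hypothetical two-period return series that share the same 5% arithmetic mean but differ in volatility. The helper names are illustrative.

```python
import math


def arithmetic_mean(returns):
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Same 5% arithmetic mean, increasing volatility
series = [
    [0.05, 0.05],    # zero volatility
    [0.25, -0.15],   # moderate volatility
    [0.45, -0.35],   # high volatility
]

gaps = [arithmetic_mean(s) - geometric_mean_return(s) for s in series]

for s, gap in zip(series, gaps):
    print(s, f"arithmetic - geometric = {gap:.4f}")
```

The gap is essentially zero for the constant series and widens as the swings get larger.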
What is the disadvantage of the range?
It uses only 2 values, the largest and the smallest, so it ignores how the rest of the observations are distributed