Reading 8 - Statistical Concepts and Market Returns Flashcards
Population
A population is defined as all members of a specified group. A sample is a subset of a population.
Parameter
A parameter is any descriptive measure of a population. A sample statistic (statistic, for short) is a quantity computed from or used to describe a sample.
Four major scales for data measurements
Data measurements are taken using one of four major scales: nominal, ordinal, interval, or ratio. Nominal scales categorize data but do not rank them. Ordinal scales sort data into categories that are ordered with respect to some characteristic. Interval scales provide not only ranking but also assurance that the differences between scale values are equal. Ratio scales have all the characteristics of interval scales as well as a true zero point as the origin. The scale on which data are measured determines the type of analysis that can be performed on the data.
Frequency distribution
A frequency distribution is a tabular display of data summarized into a relatively small number of intervals. Frequency distributions permit us to evaluate how data are distributed.
The relative frequency of observations in an interval
The relative frequency of observations in an interval is the number of observations in the interval divided by the total number of observations. The cumulative relative frequency cumulates (adds up) the relative frequencies as we move from the first interval to the last, thus giving the fraction of the observations that are less than the upper limit of each interval.
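As a minimal sketch of the two definitions above, the following Python computes relative and cumulative relative frequencies from interval counts. The function name `relative_frequencies` and the counts are hypothetical, chosen only for illustration.

```python
from itertools import accumulate


def relative_frequencies(counts):
    """Return (relative, cumulative relative) frequencies for interval counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one observation is required")
    # Relative frequency: observations in the interval / total observations.
    relative = [c / total for c in counts]
    # Cumulative relative frequency: running sum from the first interval on.
    cumulative = list(accumulate(relative))
    return relative, cumulative


# Hypothetical counts of observations in four consecutive intervals.
counts = [2, 5, 8, 5]
relative, cumulative = relative_frequencies(counts)
for r, c in zip(relative, cumulative):
    print(f"relative={r:.2f}  cumulative={c:.2f}")
```

The last cumulative value is 1 (up to floating-point rounding), since every observation lies below the upper limit of the final interval.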
Histogram
A histogram is a bar chart of data that have been grouped into a frequency distribution. A frequency polygon is a graph of frequency distributions obtained by drawing straight lines joining successive points representing the class frequencies.
List of sample statistics and purpose
Sample statistics such as measures of central tendency, measures of dispersion, skewness, and kurtosis help with investment analysis, particularly in making probabilistic statements about returns.
List measures of central tendency and purpose
Measures of central tendency specify where data are centered and include the (arithmetic) mean, median, and mode (most frequently occurring value). The mean is the sum of the observations divided by the number of observations. The median is the value of the middle item (or the mean of the values of the two middle items) when the items in a set are sorted into ascending or descending order. The mean is the most frequently used measure of central tendency. The median is not influenced by extreme values and is most useful in the case of skewed distributions. The mode is the only measure of central tendency that can be used with nominal data.
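A short illustration using Python's standard `statistics` module may help here. The return figures are made up; the last one is deliberately extreme to show how it pulls the mean but not the median.

```python
import statistics

# Hypothetical periodic returns; 0.15 is an intentionally extreme value.
returns = [0.02, -0.01, 0.03, 0.02, 0.15]

mean_return = statistics.mean(returns)      # sum of observations / count
median_return = statistics.median(returns)  # middle item after sorting
mode_return = statistics.mode(returns)      # most frequently occurring value

print(mean_return, median_return, mode_return)
```

In this sample the mean (about 0.042) sits well above the median (0.02), the pattern described above for skewed data.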
A portfolio’s return
A portfolio’s return is a weighted mean return computed from the returns on the individual assets, where the weight applied to each asset’s return is the fraction of the portfolio invested in that asset.
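The weighted mean can be sketched as follows. The helper name `portfolio_return`, the weights, and the asset returns are illustrative assumptions; the check that weights sum to 1 assumes a fully invested portfolio.

```python
def portfolio_return(weights, returns):
    """Weighted mean return: sum of weight_i * return_i over all assets."""
    if len(weights) != len(returns):
        raise ValueError("weights and returns must have the same length")
    # Assumption: the portfolio is fully invested, so weights sum to 1.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * r for w, r in zip(weights, returns))


# Hypothetical three-asset portfolio.
weights = [0.5, 0.3, 0.2]
asset_returns = [0.10, 0.04, -0.02]
result = portfolio_return(weights, asset_returns)
print(f"{result:.3f}")
```

Here the calculation is 0.5(0.10) + 0.3(0.04) + 0.2(-0.02) = 0.058.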
Geometric mean
The geometric mean, G, of a set of observations X1, X2, …, Xn is G = (X1 × X2 × … × Xn)^(1/n), with Xi ≥ 0 for i = 1, 2, …, n. The geometric mean is especially important in reporting compound growth rates for time series data.
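A minimal sketch of the formula follows. The second helper, `geometric_mean_return`, is not stated in the card: it assumes the common convention of applying the geometric mean to growth factors (1 + R) and subtracting 1 to obtain a compound growth rate. All names and data are hypothetical.

```python
import math


def geometric_mean(values):
    """G = (X1 * X2 * ... * Xn)^(1/n), defined here for non-negative values."""
    if not values:
        raise ValueError("at least one observation is required")
    if any(x < 0 for x in values):
        raise ValueError("all observations must be non-negative")
    return math.prod(values) ** (1 / len(values))


def geometric_mean_return(returns):
    """Assumed convention: geometric mean of (1 + R) factors, minus 1."""
    return geometric_mean([1 + r for r in returns]) - 1


# Hypothetical returns: +100% followed by -50% leaves wealth unchanged.
two_period = [1.00, -0.50]
compound_rate = geometric_mean_return(two_period)
arithmetic_rate = sum(two_period) / len(two_period)
print(compound_rate, arithmetic_rate)
```

In this example the arithmetic mean return is 25% while the compound rate is 0%, which matches the fact that the investment ends where it started.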
Quantiles and types
Quantiles such as the median, quartiles, quintiles, deciles, and percentiles are location parameters that divide a distribution into halves, quarters, fifths, tenths, and hundredths, respectively.
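As a rough illustration, the standard library's `statistics.quantiles` returns the cut points that divide sorted data into n equal-probability groups. The data are made up. The default "exclusive" interpolation method used here is one of several conventions, and others can give slightly different values.

```python
import statistics

# Hypothetical sample of seven observations.
data = [10, 20, 30, 40, 50, 60, 70]

# n=4 yields the three quartile cut points; n=5 would give quintiles,
# n=10 deciles, and n=100 percentiles.
quartiles = statistics.quantiles(data, n=4)
median = statistics.median(data)

print(quartiles, median)
```

The second quartile coincides with the median, as expected.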
Dispersion and types
Dispersion measures such as the variance, standard deviation, and mean absolute deviation (MAD) describe the variability of outcomes around the arithmetic mean.
Range
Range is defined as the maximum value minus the minimum value. Range is of limited usefulness because it uses information from only two observations.
MAD
MAD for a sample is MAD = SUM(ABS(Xi-Xmean))/n where Xmean is the sample mean and n is the number of observations in the sample.
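The formula translates directly into code. This is a sketch with a hypothetical function name and sample values.

```python
def mean_absolute_deviation(xs):
    """MAD = sum(|Xi - Xmean|) / n for a sample xs."""
    n = len(xs)
    if n == 0:
        raise ValueError("at least one observation is required")
    x_mean = sum(xs) / n
    return sum(abs(x - x_mean) for x in xs) / n


# Hypothetical sample: mean is 5, absolute deviations are 3, 1, 1, 3.
sample = [2, 4, 6, 8]
mad = mean_absolute_deviation(sample)
print(mad)
```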
Variance & Standard deviation
The variance is the average of the squared deviations around the mean, and the standard deviation is the positive square root of variance. In computing sample variance (s^2) and sample standard deviation (s), the average squared deviation is computed using a divisor equal to the sample size minus 1.
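A minimal sketch of the sample versions, written out to make the n - 1 divisor explicit. The function names and data are hypothetical. Python's `statistics.variance` and `statistics.stdev` use the same divisor and are included only as a cross-check.

```python
import math
import statistics


def sample_variance(xs):
    """s^2 = sum((Xi - Xmean)^2) / (n - 1)."""
    n = len(xs)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_mean = sum(xs) / n
    return sum((x - x_mean) ** 2 for x in xs) / (n - 1)


def sample_std_dev(xs):
    """s = positive square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


# Hypothetical sample: mean is 5, squared deviations sum to 20.
sample = [2.0, 4.0, 6.0, 8.0]
s2 = sample_variance(sample)   # 20 / 3
s = sample_std_dev(sample)

# Cross-check against the standard library's sample statistics.
print(s2, statistics.variance(sample))
print(s, statistics.stdev(sample))
```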