Week 2 Flashcards
What is the arithmetic mean?
- The arithmetic mean of a set of data is the sum of the data values divided by the number of observations.
What is the median?
The middle observation of a set of observations that are arranged in increasing (or decreasing) order
If the sample size is an even number, the median is the average of the two middle observations
What is the mode?
The most frequently occurring value
1 mode = unimodal
2 modes = bimodal
3+ modes = multimodal
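A minimal sketch of the three measures above, using Python's standard statistics module. The data values are made up for illustration only.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # sum of the values / number of observations
median = statistics.median(data)    # even sample size: average of the two middle values
modes = statistics.multimode(data)  # list of every most frequently occurring value

print(mean, median, modes)

# A data set with two equally frequent values is bimodal.
bimodal = statistics.multimode([1, 1, 2, 2, 3])
print(bimodal)
```

multimode returns a list, so its length tells you whether the data is unimodal, bimodal or multimodal.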
Which of these can best describe categorical data?
Categorical data is best described by the mode. The median can also be used when the categories have a natural order (ordinal data). The mean does not apply.
However, the mode may not represent the true center of the numerical data. For this reason, the mode is used less frequently than either the mean or the median in business applications
What data is best described by the mean?
- Numerical data
- However, the type of data is not the only factor to consider: outliers pull the mean toward them, so the median can be a better summary when outliers are present
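A small illustration of why outliers matter when choosing between the mean and the median. The salary figures are hypothetical.

```python
import statistics

# Hypothetical salaries in thousands; the second list adds one extreme value.
salaries = [30, 32, 35, 38, 40]
salaries_with_outlier = salaries + [400]

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)

mean_after = statistics.mean(salaries_with_outlier)
median_after = statistics.median(salaries_with_outlier)

# The mean shifts a lot, the median only slightly.
print(mean_before, median_before)
print(mean_after, median_after)
```

In this example the single outlier moves the mean far more than it moves the median.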
What is skewness?
Skewness is the degree of asymmetry observed in a probability distribution
When is something skewed?
When data points on a bell curve are not distributed symmetrically to the left and right sides of the median, the bell curve is skewed
What are the types of skewness?
- Distributions can be positively (right-) skewed or negatively (left-) skewed
- A normal distribution exhibits zero skewness
What do the different types of skewness mean?
- Negative or left-skewed refers to a longer or fatter tail on the left side of the distribution, while positive, or right-skewed refers to a longer or fatter tail on the right
What happens with the mean and median when the data is positively skewed?
The mean of positively skewed data will be greater than the median
What about negatively skewed data?
The mean of the negatively skewed data will be less than the median
When the distribution is right-skewed, what do we know about the mean?
- A right-skewed (positively skewed) distribution has a tail that is more pronounced on the right side than on the left
- Since the distribution is positively skewed, its skewness value is positive
- As such, most of the values end up on the left of the mean
- This means that some of the most extreme values are on the right side
What about the left?
- Negative or left-skewed means the tail is more pronounced on the left rather than the right
- Most values are found on the right side of the mean in negative skewness
How can you measure skewness?
Two common methods of measuring skewness are Pearson’s first and second coefficients of skewness
What is Pearson’s first coefficient of skewness?
Subtracts the mode from the mean and divides the difference by the standard deviation
What is Pearson’s second coefficient of skewness?
Subtracts the median from the mean, multiplies the difference by 3 and divides the product by the standard deviation
When would you use the 2 different coefficients?
- Pearson’s first coefficient is used if the data exhibits a strong mode
- Pearson’s second coefficient may be preferable if the data has a weak mode or multiple modes
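Both coefficients can be sketched in a few lines. The function names and sample data are hypothetical, and the sketch assumes the sample standard deviation (statistics.stdev); a course may use the population version instead.

```python
import statistics

def pearson_first(data):
    """(mean - mode) / standard deviation. Assumes one clear mode."""
    return (statistics.mean(data) - statistics.mode(data)) / statistics.stdev(data)

def pearson_second(data):
    """3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.stdev(data)

# Hypothetical right-skewed data: most values are small, one is large.
right_skewed = [1, 2, 2, 2, 3, 4, 14]
# Mirror image of the same data, which is left-skewed.
left_skewed = [-x for x in right_skewed]

print(pearson_first(right_skewed), pearson_second(right_skewed))
print(pearson_first(left_skewed), pearson_second(left_skewed))
```

The sign of each coefficient gives the direction of the skew: positive for the right-skewed list, negative for its mirror image.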
What does skewness tell investors?
- Investors value skewness because it highlights extremes, which are important for short- and medium-term decisions.
- Unlike standard deviation, skewness doesn’t assume a normal distribution, making it better for predicting returns.
- Skewness risk arises when models underestimate the chance of extreme outcomes in skewed data.
What is the geometric mean?
‘The nth root of the product of n numbers’
- Unlike the arithmetic mean, which adds values and divides by the number of values, the geometric mean multiplies them and then takes the nth root
Why is the geometric mean useful?
It is very useful for calculating portfolio performance, because it takes into account compound interest
The calculation is based solely on the return figures and provides a direct, “apples-to-apples” comparison when evaluating two investment options across multiple time periods.
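A minimal sketch of the geometric mean applied to investment returns. The yearly returns are hypothetical. Each return is first converted to a growth factor (1 + r), since the geometric mean needs positive inputs.

```python
import statistics

# Hypothetical yearly returns: +10%, -20%, +30%.
returns = [0.10, -0.20, 0.30]

# Convert each return to a growth factor so every value is positive.
growth_factors = [1 + r for r in returns]

# nth root of the product of the n growth factors, minus 1,
# gives the average compound return per period.
geometric_return = statistics.geometric_mean(growth_factors) - 1

# The arithmetic mean ignores compounding.
arithmetic_return = statistics.mean(returns)

print(geometric_return, arithmetic_return)
```

Compounding the geometric return over the three periods reproduces the actual total growth (1.10 * 0.80 * 1.30), which the arithmetic mean does not do.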
What are percentiles and quartiles?
These are measures that indicate the location, or position, of a value relative to the entire set of data
Why are percentiles and quartiles used?
They are generally used to describe large data sets, for example surveys covering a nation
How do you find percentiles?
Data must be arranged in order from the smallest to the largest values
The ‘P’th percentile is a value such that approximately P% of the observations are at or below that number
Percentile formula?
Rank = P / 100 * (N + 1)
P = desired percentile
N = Number of data points in your set
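The rank formula above can be sketched as follows. The function name and data are hypothetical. The formula gives a position in the ordered data, and when that position is not a whole number this sketch interpolates between the two neighbouring values, which is an assumption since the card does not say how to handle it.

```python
def percentile(data, p):
    """Value at the P-th percentile using Rank = P / 100 * (N + 1)."""
    ordered = sorted(data)            # data must be in ascending order
    n = len(ordered)
    rank = p / 100 * (n + 1)          # 1-based position in the ordered data
    rank = min(max(rank, 1), n)       # assumption: clamp ranks that fall outside 1..N
    lower = int(rank)                 # whole part of the rank
    fraction = rank - lower           # fractional part of the rank
    if lower == n:
        return ordered[-1]
    # Assumption: linear interpolation between the two neighbouring values.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

# Hypothetical data set with N = 9 observations.
scores = [15, 20, 35, 40, 50, 55, 60, 70, 80]
print(percentile(scores, 25), percentile(scores, 50), percentile(scores, 75))
```

For the 25th percentile the rank is 25 / 100 * (9 + 1) = 2.5, so the result falls halfway between the 2nd and 3rd ordered values.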
What are quartiles?
- Descriptive measures that separate large data sets into four quarters
How are quartiles created?
The first quartile, Q1, separates approximately the smallest 25% of the data from the rest
Q2 (the median) separates the smallest 50%, and Q3 separates the smallest 75% of the data
What is the five-number summary?
- A simple way to describe the distribution of a data set
- Consists of:
- Minimum, Q1, Median (Q2), Q3, Maximum
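A minimal sketch of a five-number summary, assuming Python's statistics.quantiles with its "exclusive" method, which places quartiles using an (N + 1)-based position. Other tools may use slightly different quartile conventions. The function name and data are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # n=4 splits the data into four quarters and returns the three cut points.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return min(data), q1, q2, q3, max(data)

# Hypothetical data set.
scores = [15, 20, 35, 40, 50, 55, 60, 70, 80]
summary = five_number_summary(scores)
print(summary)
```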
What is the range?
Difference between largest and smallest observations
Why is the range important?
- It shows data spread
- Quick summary of variability
- Can hint at outliers, since it depends only on the two most extreme values
What is the interquartile range?
- The IQR shows the spread of the middle 50% of the data set: IQR = Q3 - Q1
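A short comparison of the range and the interquartile range (Q3 minus Q1), assuming quartiles from Python's statistics.quantiles with the "exclusive" method. The helper names and data are hypothetical.

```python
import statistics

def data_range(data):
    """Largest observation minus smallest observation."""
    return max(data) - min(data)

def iqr(data):
    """Q3 minus Q1: the spread of the middle 50% of the data."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return q3 - q1

# Hypothetical data; the second list replaces the largest value with an extreme one.
scores = [15, 20, 35, 40, 50, 55, 60, 70, 80]
scores_with_outlier = [15, 20, 35, 40, 50, 55, 60, 70, 800]

print(data_range(scores), iqr(scores))
print(data_range(scores_with_outlier), iqr(scores_with_outlier))
```

In this example the extreme value changes the range dramatically while the IQR stays the same, because the IQR ignores the top and bottom quarters of the data.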