Organizing, Visualizing, and Describing Data Flashcards
Continuous data
- can take on any numerical value in a specified range of values
- ex. future value
Discrete data
- can take on only a limited (countable) number of values
- ex. number of periods in a year: monthly = 12, quarterly = 4, etc.
Numerical data (2)
aka quantitative data
- continuous
- discrete
Categorical data (2)
aka qualitative data
- describe a quality or characteristic of a group of observations
- nominal data
- ordinal data
Nominal data
- grouping names
- cannot be organized in a logical order
- ex. classifying stocks into different sectors, such as energy, information tech, etc
Ordinal data
- can be organized in logical order or ranked
- ex. ranking mutual funds by performance, from best to worst: there is an order, but the ranks don't show the magnitude of the differences between funds
Time-series data
- observations of one subject taken at specific, equally spaced intervals of time
- ex. quarterly returns of Apple, 2019-2020
Cross-sectional data
- observations of multiple subjects taken at a single point in time
- ex. 2019 Q1 quarterly returns of a group of similar stocks
Panel data
- presented as a table
- groups observations through time on one or more variables for multiple subjects
- ex. quarterly returns for MSFT, Oracle, and HP from 2019 to 2020
One-Dimensional array
- one row of data
- represents a single variable, ex. the closing price of a stock on each day
Two-dimensional array
- consists of columns and rows to hold multiple variables and multiple observations
- ex. a firm’s quarterly revenue, EPS, and DPS for the past two years
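A minimal sketch of the two array shapes, using plain Python lists. All figures and names (`closing_prices`, `rows`, `columns`) are hypothetical and only illustrate the layout.

```python
# Hypothetical values, for illustration only.

# One-dimensional array: one variable, several observations.
closing_prices = [101.2, 102.5, 99.8, 100.4]

# Two-dimensional array: each row is one observation (a quarter),
# each column is one variable.
columns = ["quarter", "revenue", "eps", "dps"]
rows = [
    ["2019 Q1", 250.0, 1.10, 0.25],
    ["2019 Q2", 265.0, 1.15, 0.25],
    ["2019 Q3", 240.0, 1.05, 0.25],
]

# Pulling one column out of the table gives a one-dimensional array again.
eps_index = columns.index("eps")
eps_series = [row[eps_index] for row in rows]
print(eps_series)
```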
Tree-map
- graphical tool to display categorical data
Arithmetic Mean
- simple mean
- the center of gravity of a data set
- sensitive to extreme values (outliers)
- appropriate for forecasting single period returns and expected returns
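A small sketch of why the arithmetic mean is sensitive to outliers. The return figures are hypothetical, and `arithmetic_mean` is an illustrative helper name.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

# Hypothetical single-period returns.
returns = [0.02, 0.03, 0.01, 0.04]
# The same data with one extreme value added.
with_outlier = returns + [0.50]

print(arithmetic_mean(returns))       # about 0.025
print(arithmetic_mean(with_outlier))  # about 0.12, pulled up by one observation
```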
Sample mean
- arithmetic mean of a sample
- x̄ (x-bar): sample mean
- μ (mu): population mean
Winsorized mean
- a way of dealing with outliers
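- a 95% winsorized mean replaces the bottom 2.5% of values with the 2.5th percentile value and the top 2.5% with the 97.5th percentile value (a trimmed mean would drop those values instead)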
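A rough sketch of winsorizing, assuming a simple count-based rule: the `k` lowest values are replaced by the next-lowest value and the `k` highest by the next-highest, where `k` is the tail fraction times the sample size, rounded down. Textbook percentile conventions can differ slightly; the function name and data are hypothetical.

```python
def winsorized_mean(values, tail=0.025):
    """Mean after replacing the extreme `tail` fraction in each tail.

    Assumption: k = int(n * tail) observations are replaced per tail.
    """
    data = sorted(values)
    n = len(data)
    k = int(n * tail)  # number of observations to replace in each tail
    if k > 0:
        data[:k] = [data[k]] * k        # lowest k -> next-lowest value
        data[-k:] = [data[-k - 1]] * k  # highest k -> next-highest value
    return sum(data) / n

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # hypothetical data with an outlier
print(sum(sample) / len(sample))           # plain mean: 14.5
print(winsorized_mean(sample, tail=0.10))  # 80% winsorized mean: 5.5
```

With only ten observations, a 2.5% tail contains no whole observation, so the example uses a 10% tail to make the effect visible.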
Median of even number of observations
- n = 4
- 3, 9, 10, 20: take the 2nd and 3rd values (the two middle values), add them, and divide by 2
- (9 + 10) / 2 = 9.5
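The even and odd cases can be sketched as follows; `median` is an illustrative helper name, not a reference implementation.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2

print(median([3, 9, 10, 20]))  # 9.5
print(median([3, 9, 10]))      # 9
```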
Geometric Mean
- used to calculate the average return of an investment
- represents the growth rate of an investment
- represents the compound rate of return of an investment
- appropriate to measure past performance over multiple periods
= [(1 + r1)(1 + r2)...(1 + rn)]^(1/n) - 1
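A short sketch of the geometric mean return formula. The returns are hypothetical, and `geometric_mean_return` is an illustrative name.

```python
import math

def geometric_mean_return(returns):
    """Compound rate of return: [(1 + r1)(1 + r2)...(1 + rn)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical: +10% in one period, then -10% in the next.
two_periods = [0.10, -0.10]
print(geometric_mean_return(two_periods))  # slightly negative: 1.1 * 0.9 = 0.99
```

The arithmetic mean of these two returns is zero, yet the investment ends below where it started, which is why the compound rate is the one used for past multi-period performance.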
Harmonic mean
- used to find average purchase price for equal periodic investments
= n / (sum of 1/xi)
- ex. three equal purchases at $10, $15, and $20: 3 / [(1/$10) + (1/$15) + (1/$20)] = $13.85
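A sketch that reproduces the $13.85 figure and cross-checks it against the average cost per share when the same dollar amount is invested each period. The $300 per period is a hypothetical amount; any equal amount gives the same result.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals."""
    return len(values) / sum(1 / x for x in values)

prices = [10, 15, 20]
print(round(harmonic_mean(prices), 2))  # 13.85

# Cross-check: invest the same (hypothetical) amount at each price.
invested = 300
shares = sum(invested / p for p in prices)  # 30 + 20 + 15 shares
avg_cost = invested * len(prices) / shares  # total spent / total shares
print(round(avg_cost, 2))                   # 13.85
```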
Relationship of Geometric mean to the arithmetic mean
- the geometric mean is always less than or equal to the arithmetic mean; the two are equal only when all observations are the same
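A quick numerical check of this relationship on hypothetical returns. The helper names are illustrative, and the equality case is compared with a tolerance because of floating-point rounding.

```python
import math

def arithmetic_return(returns):
    return sum(returns) / len(returns)

def geometric_return(returns):
    return math.prod(1 + r for r in returns) ** (1 / len(returns)) - 1

varied = [0.10, -0.05, 0.20]   # hypothetical returns that differ
constant = [0.05, 0.05, 0.05]  # hypothetical returns that are all equal

# With variation, the geometric mean is strictly below the arithmetic mean.
print(geometric_return(varied) < arithmetic_return(varied))

# With identical observations, the two means coincide (up to rounding).
print(math.isclose(geometric_return(constant), arithmetic_return(constant)))
```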
Quantiles:
quartiles, quintiles, deciles, percentiles
- divide the data into 4 quarters, 5 fifths, 10 tenths, and 100 hundredths, respectively
Formula for the position of a percentile in a data set
- arrange data in ascending order (low to high)
- Ly = (n + 1) * (y / 100), where n is the number of observations and y is the percentile
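A sketch of the position formula Ly = (n + 1) * (y / 100). It assumes linear interpolation between the two neighbouring observations when Ly is not a whole number, and it clamps positions that fall outside the data to the first or last value; both choices are assumptions made here. The data are hypothetical.

```python
def percentile(values, y):
    """Value at the y-th percentile using the position Ly = (n + 1) * y / 100."""
    data = sorted(values)           # ascending order, low to high
    n = len(data)
    pos = (n + 1) * y / 100         # 1-based position Ly
    lower = int(pos)                # whole-number part of the position
    frac = pos - lower              # fractional part, used for interpolation
    if lower < 1:                   # assumption: clamp to the smallest value
        return data[0]
    if lower >= n:                  # assumption: clamp to the largest value
        return data[-1]
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])

data = [3, 9, 10, 20, 25, 30, 41]   # hypothetical, n = 7
print(percentile(data, 25))         # position 2
print(percentile(data, 50))         # position 4
print(percentile([3, 9, 10, 20], 50))  # position 2.5, halfway between 9 and 10
```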
When to use each mean:
a. Arithmetic mean
b. Geometric mean
c. Weighted mean
d. Harmonic mean
e. Trimmed mean
f. Winsorized mean
a. with single period or cross-sectional data
b. with time-series data
c. when different observations have different weights
d. find avg purchase price for equal periodic investments
e. when data has extreme outliers
f. when the data has extreme outliers
Interquartile range:
- the difference between the third and first quartiles
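A sketch of the interquartile range using the standard library. `statistics.quantiles` with `method="exclusive"` places quartiles at positions based on n + 1, with interpolation; the data are hypothetical.

```python
import statistics

data = [3, 9, 10, 20, 25, 30, 41]  # hypothetical observations

# n=4 returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q3, iqr)
```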
List the Measures of Dispersion
- range
- Mean Absolute Deviation
- Variance (population, sample)
- Standard Deviation (population, sample)
Range formula
= max value - min value
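The measures of dispersion listed above can be sketched together. The data are hypothetical and the variable names are illustrative; the sample versions divide by n - 1 and the population versions by n.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
n = len(data)
mean = sum(data) / n

# Range: max value - min value.
data_range = max(data) - min(data)

# Mean absolute deviation: average absolute distance from the mean.
mad = sum(abs(x - mean) for x in data) / n

# Variance: average squared distance from the mean.
squared_devs = sum((x - mean) ** 2 for x in data)
population_variance = squared_devs / n
sample_variance = squared_devs / (n - 1)

# Standard deviation: square root of the variance.
population_std = math.sqrt(population_variance)
sample_std = math.sqrt(sample_variance)

print(data_range, mad, population_variance, population_std)
print(sample_variance, sample_std)
```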