Week 1 Flashcards
Descriptive statistics
Summarize data using measures (e.g. of location and dispersion) that describe a distribution.
Inferential statistics
Use probability theory to analyze phenomena subject to randomness (e.g. hypothesis testing).
Parameter definition
A measurable characteristic of a population
Statistic definition
A measurable characteristic of a sample
Objective of Statistics
Estimate parameters using the data. Given a sample, we want to infer the parameters that characterize the population or the law of motion of a process (in general, the data-generating process, DGP).
Note about INPUT and OUTPUT
Always bear in mind that the INPUT is the observed data and the OUTPUT is the estimates of the unknown parameters.
Law of Large Numbers
A theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
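A minimal simulation sketch of the law, assuming a fair six-sided die (expected value 3.5). The function name `sample_mean_of_die` and the fixed seed are illustrative choices, not part of the course material.

```python
import random

def sample_mean_of_die(n_trials, seed=0):
    """Average of n_trials simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rolls = [rng.randint(1, 6) for _ in range(n_trials)]
    return sum(rolls) / n_trials

# The expected value of a fair die is (1 + 2 + ... + 6) / 6 = 3.5.
# The sample mean should sit closer to 3.5 as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die(n))
```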
Central Limit Theorem
Establishes that, in most situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (a bell curve) even if the original variables themselves are not normally distributed.
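A rough simulation sketch of the theorem, assuming sums of Uniform(0, 1) draws (which are not normal themselves). The helper `standardized_sums` and the sample sizes are hypothetical choices for illustration.

```python
import random
import statistics

def standardized_sums(n_terms, n_sums, seed=0):
    """Draw n_sums sums of n_terms Uniform(0, 1) variables and standardize each."""
    rng = random.Random(seed)
    mu = 0.5                    # mean of Uniform(0, 1)
    sigma = (1 / 12) ** 0.5     # standard deviation of Uniform(0, 1)
    out = []
    for _ in range(n_sums):
        s = sum(rng.random() for _ in range(n_terms))
        # Normalize: subtract the mean of the sum, divide by its standard deviation.
        out.append((s - n_terms * mu) / (sigma * n_terms ** 0.5))
    return out

z = standardized_sums(n_terms=30, n_sums=5_000)

# For a standard normal, about 68% of values fall within one unit of zero.
share_within_one = sum(abs(v) <= 1 for v in z) / len(z)
print(statistics.mean(z), statistics.pstdev(z), share_within_one)
```

If the normalized sums behave like a standard normal, the printed mean should be near 0, the standard deviation near 1, and the share near 0.68.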
Absolute frequency
The absolute frequency is simply the total number of observations or trials within a given range.
Relative frequency
The absolute frequency of a modality divided by the total number of observations.
Frequency distribution
The association of modalities and their frequencies. Frequency is how often something occurs.
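A small sketch of absolute frequencies, relative frequencies, and the resulting frequency distribution. The observations in `data` are made up for illustration.

```python
from collections import Counter

data = ["A", "B", "A", "C", "A", "B"]  # made-up observations

absolute = Counter(data)  # modality -> absolute frequency (count)
n = len(data)
relative = {m: c / n for m, c in absolute.items()}  # modality -> relative frequency

# The frequency distribution pairs each modality with its frequencies.
for modality in sorted(absolute):
    print(modality, absolute[modality], round(relative[modality], 3))
```

Relative frequencies always sum to 1, which is a quick sanity check on any frequency table.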
Histogram definition
A diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval.
Calculating the density of the modality
Divide the relative frequency of each modality (class) by the width of that class.
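A short sketch of the density calculation for a histogram with unequal class widths. The class intervals and counts below are hypothetical.

```python
# Hypothetical class intervals (lower, upper) and their absolute frequencies.
classes = [(0, 10), (10, 20), (20, 50)]
counts = [5, 10, 5]

n = sum(counts)
densities = []
for (lower, upper), count in zip(classes, counts):
    rel_freq = count / n       # relative frequency of the class
    width = upper - lower      # class width
    densities.append(rel_freq / width)

# Each rectangle's area is density * width = relative frequency,
# so the areas of all rectangles sum to 1.
total_area = sum(d * (u - l) for d, (l, u) in zip(densities, classes))
print(densities, total_area)
```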
Sample Statistics: Measures of Location
- Arithmetic mean
- Median
- Percentile
- Quartile
Arithmetic mean
The average of a set of numerical values, calculated by adding them together and dividing by the number of terms in the set.
Median
The middle number in a sorted list of numbers. With an even number of values, it is commonly taken as the average of the two middle numbers.
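A minimal sketch of the arithmetic mean and the median on made-up data. For an even number of values this sketch averages the two middle values, which is a common convention and the one used by Python's `statistics.median`.

```python
import statistics

values = [7, 3, 9, 1, 5, 11]  # made-up data

# Arithmetic mean: sum of the values divided by how many there are.
mean = sum(values) / len(values)

def median(xs):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    s = sorted(xs)
    mid = len(s) // 2
    if len(s) % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

print(mean, median(values), statistics.median(values))
```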
Percentile
Values that divide the ordered distribution into 100 equal parts. Computation:
1) Arrange the sample in ascending order
2) Determine the index i of the position of the pth percentile in the ordered sample
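The two steps above can be sketched as follows. The rule for the index i is not given here, so this sketch assumes the nearest-rank convention, i = ceil(p/100 * n); other conventions (e.g. interpolating between neighbours) give slightly different values. The function name `percentile_nearest_rank` is hypothetical.

```python
import math

def percentile_nearest_rank(sample, p):
    """p-th percentile (0 < p <= 100) using the nearest-rank rule (an assumed convention)."""
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(sample)                  # step 1: ascending order
    i = math.ceil(p * len(ordered) / 100)     # step 2: 1-based position of the percentile
    return ordered[i - 1]

data = [15, 20, 35, 40, 50]  # made-up sample

# Quartiles are the 25th, 50th and 75th percentiles.
quartiles = [percentile_nearest_rank(data, p) for p in (25, 50, 75)]
print(quartiles)
```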

Sample Statistics: Measures of Variability
- Variance
- Standard deviation
Variance
The average of the squared differences from the Mean.
- First, find the Mean
- Then calculate each observation's difference from the Mean
- To get the Variance, square each difference and average the results
- For a population, divide the sum of squares by N
- For a sample, divide the sum of squares by N-1
Standard Deviation
The Standard Deviation is the square root of the Variance.
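A minimal sketch of the variance and standard deviation, showing the population (divide by N) and sample (divide by N-1) versions side by side. The data and the helper `variance` are made up for illustration.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

def variance(xs, sample=False):
    """Average squared difference from the mean; divisor is N-1 when sample=True, else N."""
    n = len(xs)
    mean = sum(xs) / n
    squared_diffs = [(x - mean) ** 2 for x in xs]
    divisor = n - 1 if sample else n
    return sum(squared_diffs) / divisor

pop_var = variance(values)                  # population variance
samp_var = variance(values, sample=True)    # sample variance
pop_std = math.sqrt(pop_var)                # standard deviation = square root of variance

print(pop_var, samp_var, pop_std)
```

The standard library offers the same quantities as `statistics.pvariance` (population) and `statistics.variance` (sample).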
Shape of distributions
- Skewness: measure of asymmetry (third standardized moment of the data)
- Kurtosis: measure of “tailedness” (fourth standardized moment of the data)
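A rough sketch of both shape measures as standardized moments, assuming the population (divide by n) versions without any small-sample correction. The helper `standardized_moment` and the two data sets are hypothetical.

```python
def standardized_moment(xs, k):
    """k-th standardized moment: mean of ((x - mean) / std) ** k, using the population std."""
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return sum(((x - mean) / std) ** k for x in xs) / n

symmetric = [1, 2, 3, 4, 5]          # made-up symmetric data
right_tailed = [1, 1, 2, 2, 3, 10]   # made-up data with a long right tail

skew_sym = standardized_moment(symmetric, 3)       # close to 0 for symmetric data
skew_right = standardized_moment(right_tailed, 3)  # positive for a long right tail
kurt_sym = standardized_moment(symmetric, 4)       # fourth standardized moment

print(skew_sym, skew_right, kurt_sym)
```

For reference, a normal distribution has skewness 0 and kurtosis 3, so values are often read relative to those benchmarks.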
Skewness
Data can be “skewed”, meaning it tends to have a long tail on one side or the other.

Kurtosis
