Week 2 Flashcards
Define Descriptive / Summary Statistics (5)
- A quantitative description of the main features of the data
- A useful summary
- Before actual analysis
- What are the players of the game?
- What are the types of variables? (discrete, continuous, categorical, dummy)
- Get a feel for your data (are there any problems?)
What to pay attention to in summary statistics? (4)
- Min value
- Max value
- Negative values
- Range
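These checks can be sketched in a few lines of Python. The income figures are hypothetical and chosen only to show how a suspicious negative value stands out in a quick summary.

```python
# Hypothetical sample: monthly incomes with one suspicious negative entry.
incomes = [2100, 2500, 1800, -50, 3200, 2700]

minimum = min(incomes)                      # smallest value
maximum = max(incomes)                      # largest value
value_range = maximum - minimum             # range = max - min
negatives = [x for x in incomes if x < 0]   # values that may signal a data problem

print("min:", minimum)
print("max:", maximum)
print("range:", value_range)
print("negative values:", negatives)
```

A negative income is not necessarily wrong, but it is the kind of value worth checking before the actual analysis.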
What are the measures of central tendency? (3)
Mean, median, mode
How do extreme values affect the mean, median, and mode?
- Mean: Influenced by extreme values
- Median: Relatively unaffected by extreme values
- Mode: Rarely affected by extreme values (unless the same extreme value occurs often enough to become the most frequent value)
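A minimal sketch of this effect, using Python's built-in statistics module on a small made-up data set with and without one extreme value:

```python
import statistics

# Made-up data for illustration; 100 is the extreme value.
data = [2, 3, 3, 4, 5]
with_outlier = data + [100]

mean_before = statistics.mean(data)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)
mode_before = statistics.mode(data)
mode_after = statistics.mode(with_outlier)

print("mean:  ", mean_before, "->", mean_after)      # moves a lot
print("median:", median_before, "->", median_after)  # moves slightly
print("mode:  ", mode_before, "->", mode_after)      # unchanged
```

In this example the mean jumps from 3.4 to 19.5, the median only shifts from 3 to 3.5, and the mode stays at 3.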
Which measure of central tendency do you use for the Nominal Variable?
Mode:
- The numbers in nominal variables only refer to the category
- Calculating the mean would be pointless
Which measure of central tendency do you use for the Ordinal Variable?
Median:
- A median split divides the data into two groups (a dichotomy); further splits create more categories
Dichotomy
A division into two groups that are represented as different or opposed.
Using the quartiles (Q1, the median Q2, and Q3) of an ordinal variable splits the data into 4 categories.
Example:
x < Q1 = Small, Q1 < x < Median (Q2) = Small-Medium, Median (Q2) < x < Q3 = Medium-Large, x > Q3 = Large
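The four-way split can be sketched as follows. The scores are hypothetical, the helper name categorize is made up for illustration, and placing values that fall exactly on a cut point into the higher category is an assumption (the example above does not say where they go).

```python
import statistics

# Hypothetical ordinal scores from 1 to 11.
scores = list(range(1, 12))

# statistics.quantiles with n=4 returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

def categorize(x, q1, q2, q3):
    """Assign x to one of four categories based on the quartile cut points.

    Assumption: a value equal to a cut point goes to the higher category.
    """
    if x < q1:
        return "Small"
    if x < q2:
        return "Small-Medium"
    if x < q3:
        return "Medium-Large"
    return "Large"

for x in (1, 4, 7, 10):
    print(x, "->", categorize(x, q1, q2, q3))
```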
Which measure of central tendency do you use for the Interval (scale) or Ratio Variables?
Mean or Median:
The skewness of the distribution indicates which measure of central tendency to use.
Not skewed –> Mean
Skewed –> Median
Define Skewness
- Describes the shape of the distribution –> Symmetry
- Deviation from the normal bell-shape –> (a)symmetry of a distribution
- Skewness = 0 –> Symmetric, Skewness ≠ 0 –> Asymmetric, Skewed
What is a distribution called when its skewness value falls outside the -1 to +1 range?
Substantially skewed
What kind of skew is a distribution with a longer right tail?
Positively skewed
What is a negatively skewed distribution?
A distribution which has a longer tail to the left
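A rough sketch of how skewness can be computed and read. It assumes the moment-based population formula (third central moment divided by the variance to the power 1.5); statistical software often applies a sample-size correction, so its values may differ slightly. The function name and the data are made up for illustration.

```python
def skewness(values):
    """Population skewness: m3 / m2**1.5 (an assumed, uncorrected formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]       # longer right tail
left_tailed = [-x for x in right_tailed]       # mirror image: longer left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_tailed), ("left", left_tailed), ("symmetric", symmetric)]:
    s = skewness(data)
    label = "substantially skewed" if abs(s) > 1 else "not substantially skewed"
    print(name, round(s, 2), label)
```

The right-tailed data give a positive value, the mirrored data a negative value, and the symmetric data give 0.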
Kurtosis
- Kurtosis describes the degree to which values are found at the tails of the distribution (compared to a normal distribution)
- Can also be seen as how pointy a distribution is (peakedness or flatness)
- It is important to mention whether a distribution has heavy (leptokurtic) or light (platykurtic) tails.
Leptokurtic (kurtosis)
- This is where more values are found in the tails than in a normal distribution (heavy-tailed), and the distribution is more pointy
- Kurtosis > 3
- Think of “lep”tokurtic as “leap” –> Tends to be more pointy (leaping upwards)
Platykurtic (kurtosis)
- This is where fewer values are found in the tails than in a normal distribution (light-tailed), and the distribution is more rounded and flatter.
- Kurtosis < 3
- Think of “plat”ykurtic as “platform” –> The distribution is more rounded and flatter.
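A minimal sketch of the comparison against 3, assuming the moment-based population formula (fourth central moment divided by the squared variance), for which a normal distribution has kurtosis 3. Some software reports excess kurtosis (this value minus 3) instead, so check which convention is in use. The function name and both data sets are made up for illustration.

```python
def kurtosis(values):
    """Population kurtosis: m4 / m2**2 (normal distribution = 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                         # evenly spread values, light tails
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]     # tall centre with two far-out values

for name, data in [("flat", flat), ("peaked", peaked)]:
    k = kurtosis(data)
    label = "leptokurtic" if k > 3 else "platykurtic" if k < 3 else "mesokurtic"
    print(name, round(k, 2), label)
```

Here the flat data come out at 1.7 (platykurtic) and the peaked data at 5.0 (leptokurtic).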