Assessing normality and QQ plots - Flashcards
1
Q
Model distribution - 3
A
- Theoretical probability distribution
- Describes unknown true population distribution
- Examples: t-distribution, chi-square distribution, exponential distribution
3
Q
Assessing normality - 1 + 3
A
- When to assume normal distribution as model distribution?
- Shape of histogram: if it deviates too much from a bell shape, a normal model is unlikely
- QQ plot: points follow approximately a straight line
- If the QQ plot follows the straight line y = a + bx, the intercept a estimates the mean and the slope b estimates the sd
4
Q
Sample size in assessing normality - 1
A
- Small samples vary more from sample to sample, so histograms and QQ plots are more reliable for large n
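A small simulation can make the effect of sample size concrete. The sketch below is only an illustration in Python (standard library only): it draws repeated samples from N(0, 1), computes the least-squares slope of each normal QQ plot, and compares how much that slope varies for n = 10 and n = 500. The function names, the seed, the number of repeats, and the choice of the slope as the summary are all assumptions made for this example.

```python
import random
from statistics import NormalDist, fmean, pstdev

def qq_slope(data):
    """Least-squares slope of the normal QQ plot of data."""
    ordered = sorted(data)
    n = len(ordered)
    # Theoretical quantiles at the positions (2i - 1) / (2n)
    z = [NormalDist().inv_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]
    z_bar, x_bar = fmean(z), fmean(ordered)
    num = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, ordered))
    den = sum((zi - z_bar) ** 2 for zi in z)
    return num / den

def slope_spread(n, repeats=200, seed=1):
    """Standard deviation of the QQ slope over repeated N(0, 1) samples of size n."""
    rng = random.Random(seed)
    slopes = [qq_slope([rng.gauss(0.0, 1.0) for _ in range(n)])
              for _ in range(repeats)]
    return pstdev(slopes)

small, large = slope_spread(10), slope_spread(500)
print(small, large)  # the spread for n = 10 should be clearly larger
```

All samples here come from the same normal distribution, so any difference in spread is due to sample size alone.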
5
Q
QQ plot - 4
A
- Quantile plot
- Sorts data in ascending order
- Plots data against quantiles calculated from theoretical distribution
- z_{α_i} is the value with area α_i = (2i - 1) / (2n) to its left, for i = 1, ..., n
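The construction above can be sketched in a few lines. This is a minimal Python illustration, assuming the standard normal as the theoretical distribution and using `statistics.NormalDist` from the standard library; the function name `normal_qq_points` and the sample values are made up for the example.

```python
from statistics import NormalDist

def normal_qq_points(data):
    """Return (theoretical quantile, sample quantile) pairs for a normal QQ plot."""
    ordered = sorted(data)                   # sample quantiles: data in ascending order
    n = len(ordered)
    std_normal = NormalDist(0.0, 1.0)
    points = []
    for i, x in enumerate(ordered, start=1):
        alpha_i = (2 * i - 1) / (2 * n)      # area to the left of the i-th quantile
        z = std_normal.inv_cdf(alpha_i)      # theoretical quantile for that area
        points.append((z, x))
    return points

sample = [4.1, 5.3, 2.8, 6.0, 4.7]           # made-up illustrative values
for z, x in normal_qq_points(sample):
    print(f"{z:7.3f}  {x:5.1f}")
```

Plotting the first column on the x axis and the second on the y axis gives the QQ plot.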
6
Q
Location-scale family - 4
A
- Family of probability distributions
- Each member of family can be obtained by changes in location or scale
- Changes are linear transformations Y = a + bX (a: location, b > 0: scale)
- Normal distributions form a location-scale family
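For the normal family this can be checked numerically: every quantile of N(mu, sigma^2) is mu + sigma times the matching standard normal quantile. The following Python sketch uses `statistics.NormalDist`; the values of `mu`, `sigma`, and `probs` are arbitrary choices for illustration.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.5                 # illustrative location a and scale b
std_normal = NormalDist(0.0, 1.0)
shifted = NormalDist(mu, sigma)

probs = [0.05, 0.25, 0.50, 0.75, 0.95]
pairs = [(std_normal.inv_cdf(p), shifted.inv_cdf(p)) for p in probs]

# Largest deviation from the linear relation y = mu + sigma * x
max_gap = max(abs(y - (mu + sigma * x)) for x, y in pairs)
print(max_gap)                        # should be zero up to rounding error
```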
7
Q
Empirical assessment QQ plot - 3
A
- X axis: Theoretical quantiles
- Y axis: Sample quantiles of the data set
- Used to assess whether a distribution can serve as the model distribution
8
Q
Theoretical QQ plot - 4
A
- X axis: Theoretical quantiles of probability distribution
- Y axis: Theoretical quantiles of another probability distribution
- Used to compare shape of two probability distributions
- Verifies whether they belong to same location-scale family
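As an illustration of this comparison, the Python sketch below puts standard normal quantiles on the x axis and, on the y axis, first the quantiles of N(5, 2^2) and then those of an exponential distribution with rate 1. Constant slopes between consecutive points indicate a straight line, hence the same location-scale family. The two comparison distributions and the helper `slopes` are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def slopes(xs, ys):
    """Slopes between consecutive QQ points; constant slopes mean a straight line."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

probs = [i / 10 for i in range(1, 10)]                    # 0.1, 0.2, ..., 0.9
z = [NormalDist(0, 1).inv_cdf(p) for p in probs]          # x axis: standard normal
normal_q = [NormalDist(5, 2).inv_cdf(p) for p in probs]   # y axis, case 1: N(5, 2^2)
expo_q = [-math.log(1 - p) for p in probs]                # y axis, case 2: Exp(1)

normal_slopes = slopes(z, normal_q)
expo_slopes = slopes(z, expo_q)
print(max(normal_slopes) - min(normal_slopes))   # essentially zero: straight line
print(max(expo_slopes) - min(expo_slopes))       # clearly positive: curved plot
```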
9
Q
Empirical comparison QQ plot - 4
A
- X axis: Sample quantiles of data set
- Y axis: Sample quantiles of another data set
- Compare shapes of data distributions
- Assesses whether they could originate from model distributions belonging to the same location-scale family
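A minimal Python sketch of this kind of plot, assuming both data sets have the same size so that sorted values can be paired directly (unequal sizes would need interpolated quantiles, which is not shown). The function name and the data values are made up.

```python
def comparison_qq_points(data_x, data_y):
    """Pair the sorted values of two equally sized data sets."""
    if len(data_x) != len(data_y):
        raise ValueError("this sketch assumes equal sample sizes")
    return list(zip(sorted(data_x), sorted(data_y)))

first = [2.1, 3.4, 1.9, 4.0, 2.8]              # made-up values
second = [10 + 2 * v for v in first][::-1]     # same shape, shifted and rescaled
points = comparison_qq_points(first, second)

# Because second is a linear transformation of first, all points lie on y = 10 + 2x
for x, y in points:
    print(x, y)
```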
10
Q
How to interpret QQ plots - 1 + 4
A
- By drawing an imaginary straight line through the middle of the QQ plot
- If left part of plot is below straight line -> left tail of sample is heavier than left tail of standard normal distribution (snd)
- If left part of plot is above straight line -> left tail of snd is heavier than left tail of sample
- If right part of plot is above straight line -> right tail of sample is heavier than right tail of snd
- If right part of plot is below straight line -> right tail of snd is heavier than right tail of sample
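These rules can be tried out on distributions with known tails. In the Python sketch below, the "imaginary line" is taken to be the line through the first and third quartile points, which is one common choice and an assumption here. The Laplace distribution stands in for a heavy-tailed sample and the uniform distribution on (0, 1) for a light-tailed one; all function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)

def laplace_quantile(p):
    """Quantile of the standard Laplace distribution (heavier tails than normal)."""
    return math.log(2 * p) if p < 0.5 else -math.log(2 * (1 - p))

def describe_tails(sample_quantile, n=50):
    """Compare the ends of a normal QQ plot with a line through the quartiles."""
    # Reference line through the first and third quartile points
    x1, x3 = STD_NORMAL.inv_cdf(0.25), STD_NORMAL.inv_cdf(0.75)
    y1, y3 = sample_quantile(0.25), sample_quantile(0.75)
    slope = (y3 - y1) / (x3 - x1)
    intercept = y1 - slope * x1
    # Leftmost and rightmost plotting positions, 1/(2n) and (2n - 1)/(2n)
    p_left, p_right = 1 / (2 * n), (2 * n - 1) / (2 * n)
    left_gap = sample_quantile(p_left) - (intercept + slope * STD_NORMAL.inv_cdf(p_left))
    right_gap = sample_quantile(p_right) - (intercept + slope * STD_NORMAL.inv_cdf(p_right))
    # Below the line on the left, or above it on the right, means a heavier sample tail
    left = "sample heavier" if left_gap < 0 else "normal heavier"
    right = "sample heavier" if right_gap > 0 else "normal heavier"
    return left, right

print(describe_tails(laplace_quantile))   # heavy-tailed example
print(describe_tails(lambda p: p))        # uniform(0, 1): light-tailed example
```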
11
Q
How to assess normality of data with QQ plot - 4
A
- Make a normal QQ plot with qqnorm()
- If points follow approximately a straight line y = a + bx (b > 0), then N(a, b^2) is a reasonable model distribution
- If points don’t follow a straight line, the data most likely do not come from a normal distribution
- Points that don’t follow a straight line probably come from a different location-scale family, with lighter or heavier tails than the normal
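To read a and b off a normal QQ plot numerically, one option is a least-squares line through the QQ points. The Python sketch below is a rough analogue of what one would do after qqnorm() in R, not a reproduction of it. The least-squares fit is an assumption of this example (the cards do not say how the line is chosen), and the data are exact quantiles of N(50, 4^2) so that the points lie on a line by construction.

```python
from statistics import NormalDist, fmean

def fit_qq_line(data):
    """Least-squares line y = a + b*x through the normal QQ points of data."""
    ordered = sorted(data)
    n = len(ordered)
    # Theoretical standard normal quantiles at (2i - 1) / (2n)
    z = [NormalDist().inv_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]
    z_bar, x_bar = fmean(z), fmean(ordered)
    num = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, ordered))
    den = sum((zi - z_bar) ** 2 for zi in z)
    b = num / den              # slope: estimate of the sd
    a = x_bar - b * z_bar      # intercept: estimate of the mean
    return a, b

# Illustrative data: exact quantiles of N(50, 4^2)
n = 25
data = [NormalDist(50, 4).inv_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]
a, b = fit_qq_line(data)
print(round(a, 3), round(b, 3))   # a close to 50, b close to 4
```

With real data the points scatter around the line, so a and b are only estimates and N(a, b^2) is a candidate model rather than a certainty.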