Topic 4: Normal Model Flashcards
What is the normal curve?
It is a probability distribution symmetric about the mean, showing that data around the mean are more frequent in occurrence than data far from the mean
It is also known as the ‘bell curve’
Discovered by De Moivre
It is the graph of a probability density function (PDF), f(x). The PDF describes how probability is spread over all values of a continuous variable X: the area under the curve between two values gives the chance that X falls between them
What are the two types of normal curves?
The standard normal curve (z)
The general normal curve (x)
What is the standard normal curve?
It has a mean of 0 and SD of 1
What is the general normal curve?
It can have any mean and SD
Which scenario allows us to use the normal curve as an approximation to the area under the histogram?
When the normal curve seems to fit the histogram
What is the normal curve notation?
X ~ N(μ, σ^2), where μ is the mean and σ^2 is the variance (so σ is the SD)
What are the 2 special properties of a normal curve?
1) All normal curves satisfy the ‘68%-95%-99.7%’ rule
2) Any general normal can be rescaled into the standard normal through the use of standard units (z score)
What does the z score measure?
How many SDs a point is above or below the mean
What is the 68%-95%-99.7% rule?
The area within 1 SD of the mean (in both directions) is about 68% of the total
The area within 2 SDs of the mean (in both directions) is about 95% of the total
The area within 3 SDs of the mean (in both directions) is about 99.7% of the total
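The rule can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `area_within` is made up for illustration.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma^2) curve within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # Standard normal: roughly 0.6827, 0.9545 and 0.9973
    print(k, round(area_within(k), 4))

# The same areas come out for any mean and SD, e.g. mean 50 and SD 10
print(round(area_within(2, mu=50, sigma=10), 4))
```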
How is standard unit (z score) calculated?
(data point - mean) / SD
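A minimal sketch of the calculation, with hypothetical numbers (marks with mean 70 and SD 10). It also illustrates the rescaling property: the area below a point on the general curve matches the area below its z score on the standard curve.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standard units: how many SDs x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical example: a mark of 85 when the mean is 70 and the SD is 10
z = z_score(85, 70, 10)
print(z)  # 1.5, i.e. 1.5 SDs above the mean

# Area to the left of 85 on the general curve N(70, 10^2) ...
general_area = NormalDist(70, 10).cdf(85)
# ... agrees with the area to the left of z on the standard curve N(0, 1)
standard_area = NormalDist(0, 1).cdf(z)
print(abs(general_area - standard_area) < 1e-9)  # True
```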
What are the limitations of using statistical models?
All models are approximations, and thus they are all wrong. However, some models are useful, so we have to ask ‘which model is best for this particular application?’
How do we know when to use the normal curve?
Does the histogram look normal?
Do the proportions look right?
Does the quantile-quantile (QQ) plot look like a straight line?
Shapiro-Wilk test
What does it mean by ‘does the histogram look normal?’
Check whether the histogram looks bell shaped, with no outliers or long tails. If so, we can use the normal curve as an approximation
What does it mean by ‘do the proportions look right?’
Check whether about 68% of the data lie within 1 SD of the mean, about 95% within 2 SDs, and about 99.7% within 3 SDs. If so, we can use the normal curve as an approximation
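One way to run this check is to count the data directly. The sketch below uses simulated data (10,000 hypothetical heights, seeded for repeatability) in place of a real dataset; `proportion_within` is an illustrative helper, not a library function.

```python
import random
from statistics import mean, stdev

def proportion_within(data, k):
    """Fraction of values lying within k sample SDs of the sample mean."""
    m, s = mean(data), stdev(data)
    inside = [x for x in data if abs(x - m) <= k * s]
    return len(inside) / len(data)

# Hypothetical data: simulated heights in cm (assumed mean 170, SD 8)
random.seed(1)
heights = [random.gauss(170, 8) for _ in range(10_000)]

for k, expected in ((1, 0.68), (2, 0.95), (3, 0.997)):
    observed = proportion_within(heights, k)
    print(f"within {k} SD: {observed:.3f} (normal curve: about {expected})")
```

If the observed proportions are far from 68%, 95% and 99.7%, the normal curve is probably a poor fit for the data.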
What does it mean by ‘does the quantile-quantile (QQ) plot look like a straight line?’
A QQ plot is a probability plot, a graphical method for comparing two probability distributions by plotting their quantiles against each other. A point on the plot corresponds to one of the quantiles of the second distribution plotted against the same quantile of the first distribution. To check normality, the quantiles of the data are plotted against the quantiles of a normal curve.
If it looks like a straight diagonal line, then we can use the normal curve as an approximation
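The points of a normal QQ plot can be computed by hand, as in the rough sketch below. The plotting positions (i - 0.5) / n are one common convention and are an assumption here, as are the helper names and the simulated samples. The correlation of the points is used only as an informal measure of straightness, not as a formal test.

```python
import random
from statistics import NormalDist, mean

def qq_points(data):
    """Pair each sorted data value with the matching standard normal quantile."""
    n = len(data)
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(data)))

def straightness(points):
    """Pearson correlation of the QQ points; values near 1 suggest a straight line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in points)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Simulated samples: one normal, one right-skewed (exponential)
random.seed(1)
normal_sample = [random.gauss(50, 5) for _ in range(500)]
skewed_sample = [random.expovariate(1.0) for _ in range(500)]

print(round(straightness(qq_points(normal_sample)), 3))  # close to 1
print(round(straightness(qq_points(skewed_sample)), 3))  # noticeably lower
```

Plotting the pairs from `qq_points` as a scatter plot gives the QQ plot itself. The skewed sample bends away from the diagonal, which is why its correlation is lower.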