Normal distribution and standardising data Flashcards
What is the standard deviation?
Measures roughly how far, on average, the values deviate from the mean
Represented in brackets after the mean
What does a lower standard deviation mean?
The mean is more reliable as there is less spread in the data
When should we change the data?
If there is an error in the results that were entered, we can go back to the original data collected and correct it
Or you may need to transform the data for accurate comparison
What is scaling data?
Multiplying all of the values in a data set by a constant
What is standardising data?
Transforming data into a common, consistent form
What happens when we add a constant to, or subtract a constant from, each value in a data set?
Changes the mean by the same amount added or subtracted
The SD remains the same
What happens when we multiply or divide to scale?
The mean is multiplied or divided by the same constant
The SD is also multiplied or divided by the same constant
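The two cards above (adding a constant and scaling by a constant) can be checked with a short Python sketch. The data values and the constants 10 and 3 are made up for illustration, and the population SD (`statistics.pstdev`) is assumed here; the same behaviour holds for the sample SD.

```python
import statistics

# Hypothetical data set, chosen only so the numbers come out cleanly.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(data)    # 5.0
sd = statistics.pstdev(data)    # 2.0 (population SD, an assumption)

# Adding a constant shifts the mean by that constant; the SD is unchanged.
shifted = [x + 10 for x in data]
shifted_mean = statistics.mean(shifted)
shifted_sd = statistics.pstdev(shifted)

# Multiplying by a constant multiplies both the mean and the SD by it.
scaled = [x * 3 for x in data]
scaled_mean = statistics.mean(scaled)
scaled_sd = statistics.pstdev(scaled)

print(mean, sd)
print(shifted_mean, shifted_sd)
print(scaled_mean, scaled_sd)
```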
What are Z scores?
They measure the number of SDs an observation is from the mean
Positive Z score means the observation is above the mean
Negative Z score means observation is below the mean
A Z-score of 0 means the observation equals the mean
How do we calculate a Z-score?
Z-score = (observation - mean) / SD
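As a minimal sketch, the formula on this card can be written as a small Python function. The function name `z_score` and the example numbers (mean 5, SD 2) are illustrative assumptions, not part of the original cards.

```python
def z_score(observation, mean, sd):
    """Number of SDs that an observation lies above (+) or below (-) the mean."""
    return (observation - mean) / sd


# Hypothetical values: a data set with mean 5 and SD 2.
above = z_score(9, 5, 2)     # observation above the mean: positive Z-score
at_mean = z_score(5, 5, 2)   # observation equal to the mean: Z-score of 0
below = z_score(3, 5, 2)     # observation below the mean: negative Z-score

print(above, at_mean, below)
```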
What are the rules of Z-scores?
The mean of all the Z-scores is always 0
The SD of all the Z-scores is always 1
These rules only hold for the whole data set from which the mean and SD were calculated
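These rules can be verified numerically. The sketch below standardises a made-up data set using its own mean and population SD (`statistics.pstdev`, an assumption; the cards do not say which SD formula is used) and then checks the mean and SD of the resulting Z-scores.

```python
import statistics

# Hypothetical data set, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # population SD (assumed)

# Standardise every value using the mean and SD of this same data set.
z_scores = [(x - mean) / sd for x in data]

z_mean = statistics.mean(z_scores)   # expected to be 0
z_sd = statistics.pstdev(z_scores)   # expected to be 1

print(z_scores)
print(z_mean, z_sd)
```

If the Z-scores were computed with a mean and SD taken from a different data set, these two checks would generally fail.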
What are the properties of the normal distribution (Gaussian distribution)?
The curve is symmetrical about the mean
The mean is equal to the median
Most observations are closer to the mean
Few observations lie far from the mean on either side of it
The tails of the curve get closer to the X-axis but never touch it
What does the entire area under the normal curve equal?
1
What are the percentages under each section of a normal curve?
Between -1SD and the mean is 34.1%
Between the mean and +1SD is 34.1%
So altogether 68.2% lies within 1SD of the mean
Between -1SD and -2SD is 13.6%
Between +1SD and +2SD is 13.6%
Beyond -2SD is 2.3%
Beyond +2SD is 2.3%
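The percentages on this card can be reproduced, to rounding, with Python's standard-library `statistics.NormalDist`. This is only a sketch of one way to check them; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
std_normal = NormalDist(mu=0, sigma=1)

# Area between the mean and +1SD (by symmetry, the same as -1SD to the mean).
mean_to_1sd = std_normal.cdf(1) - std_normal.cdf(0)

# Area between +1SD and +2SD.
one_to_2sd = std_normal.cdf(2) - std_normal.cdf(1)

# Area beyond +2SD.
beyond_2sd = 1 - std_normal.cdf(2)

# Both halves of the curve together should add up to the whole area, 1.
total = 2 * (mean_to_1sd + one_to_2sd + beyond_2sd)

for label, area in [("mean to 1SD", mean_to_1sd),
                    ("1SD to 2SD", one_to_2sd),
                    ("beyond 2SD", beyond_2sd)]:
    print(f"{label}: {area * 100:.1f}%")
```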
What if we don’t have a whole number Z score?
We use normal tables
Each cell shows the proportion of the area under the entire curve that lies between the mean and a positive Z-score
1st column gives the first decimal place
Top row gives 2nd decimal place
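A table of this kind can be imitated in code. The sketch below assumes the layout described on this card (each cell holds the area between the mean and a positive Z-score, rounded to four decimal places); the helper names `table_position` and `table_value` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1


def table_position(z):
    """Split a positive Z-score into its table row (first decimal place)
    and table column (second decimal place)."""
    hundredths = round(z * 100)
    row = (hundredths // 10) / 10     # e.g. 0.3 for Z = 0.39
    column = (hundredths % 10) / 100  # e.g. 0.09 for Z = 0.39
    return row, column


def table_value(z):
    """Area between the mean and a positive Z-score, as a table cell would show it."""
    return round(std_normal.cdf(z) - 0.5, 4)


print(table_position(0.39), table_value(0.39))
print(table_position(1.96), table_value(1.96))
```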
How do we answer a question such as "What proportion of 26-month-old girls in our sample have weight-for-age Z-scores greater than 0.39?"
We draw a normal curve
Use the Normal table to get the proportion associated with the Z score
Then subtract this from 0.5, as 0.5 is the total area under the half of the curve above the mean
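The steps on this card can be sketched in Python, using `statistics.NormalDist` in place of a printed normal table. The result applies only if the Z-scores are assumed to follow a normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1
z = 0.39

# Step 1: the "table" value, i.e. the area between the mean and Z = 0.39.
mean_to_z = std_normal.cdf(z) - 0.5

# Step 2: subtract it from 0.5, the total area above the mean.
proportion_above = 0.5 - mean_to_z

print(round(mean_to_z, 4))         # area between the mean and Z
print(round(proportion_above, 4))  # proportion with Z-scores greater than 0.39
```

Under these assumptions, roughly 35% of the girls would have a Z-score above 0.39.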
How do we answer a question such as "Health experts say that the top 15% of girls are likely to be overweight. What does this equate to in terms of weight?"
We draw a normal curve
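The top 15% leaves 50% - 15% = 35% (0.35) of the area between the mean and the cut-off, so we use the normal table in reverse: find 0.35 in the body of the table and read off the Z-score (about 1.04)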
We then rearrange the Z-score formula to calculate the observation: observation = mean + (Z-score x SD)
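The reverse lookup and the rearranged formula can be sketched in Python. The mean weight of 12.0 kg and SD of 1.5 kg below are hypothetical placeholders, not values from the cards; substitute the mean and SD of the actual sample.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Hypothetical sample statistics, for illustration only.
mean_weight = 12.0  # kg (assumed)
sd_weight = 1.5     # kg (assumed)

# Top 15% means 85% of the area lies below the cut-off,
# i.e. 0.35 of the area lies between the mean and the cut-off.
z_cutoff = std_normal.inv_cdf(0.85)

# Rearranged Z-score formula: observation = mean + Z * SD.
weight_cutoff = mean_weight + z_cutoff * sd_weight

print(round(z_cutoff, 2))       # Z-score for the top 15%
print(round(weight_cutoff, 2))  # weight cut-off in kg under the assumed mean and SD
```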