week 2 (central tendency, variance and z scores) Flashcards
mode
the most common score in a data set. The mode is unaffected by extreme scores and always reflects a score which actually occurred.
median
the score corresponding to the point below which 50% of the scores fall, when the data are arranged in numerical order. Unaffected by extreme scores. If there is an even number of scores, the median is the average of the middle 2 scores. e.g. for the data 4, 8, 10, 12 the median is 9.
mean
the average score in a data set (sum all X’s and then divide by N). Most common measure for describing central tendency, but it is affected by extreme scores. Easier to work with and manipulate statistically. The sample mean is usually a better estimate of the population mean than the mode or median.
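A minimal sketch of the three measures, using Python's standard `statistics` module. The first list is the example from the median card; the second list is hypothetical data, invented here only to show how one extreme score moves the mean while the mode and median stay put.

```python
import statistics

# The even-sized example from the median card.
scores = [4, 8, 10, 12]
median_even = statistics.median(scores)  # average of the two middle scores (8 and 10)
mean_even = statistics.mean(scores)      # sum of all scores divided by N

# Hypothetical data with one extreme score (100).
with_outlier = [4, 8, 8, 10, 12, 100]
mode_value = statistics.mode(with_outlier)      # most common score
median_value = statistics.median(with_outlier)  # middle of the ordered scores
mean_value = statistics.mean(with_outlier)      # pulled upward by the extreme score

print(mode_value, median_value, mean_value)
```

In the second list the mean lands well above most of the scores, which is the sensitivity to extreme scores described on the mean card.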
trimmed means
mean where some of the extreme data scores have been discarded.
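As an illustration, a trimmed mean could be sketched like this. The function name `trimmed_mean`, the `proportion` parameter, and the data are all hypothetical; how many scores to discard is a choice the card does not specify.

```python
def trimmed_mean(scores, proportion=0.1):
    """Mean after discarding `proportion` of the scores from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and less than 0.5")
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)   # number of scores dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical scores with one extreme value (50).
data = [2, 4, 5, 5, 6, 6, 7, 7, 8, 50]
plain = sum(data) / len(data)        # ordinary mean, affected by the 50
trimmed = trimmed_mean(data, 0.1)    # drops the lowest and highest score

print(plain, trimmed)
```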
variability
tells us how far the scores vary from the mean.
range
the distance between the minimum and maximum scores in a data set; often reported by giving the minimum and maximum scores themselves
inter-quartile range
the extreme scores (upper 25% and lower 25%) are removed, and the minimum and maximum of the remaining middle 50% are reported; the distance between them is the inter-quartile range
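A rough sketch of both measures on hypothetical data. It uses `statistics.quantiles` from the Python standard library to find the quartile cut points; textbooks and software use slightly different conventions for locating quartiles, so the exact values may differ from a hand calculation.

```python
import statistics

# Hypothetical scores, already in numerical order.
data = [2, 4, 5, 7, 8, 10, 11, 13]

data_range = max(data) - min(data)   # distance between the extremes

# Three cut points that split the ordered data into four equal parts.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                        # spread of the middle 50% of scores

print(min(data), max(data), data_range)
print(q1, q3, iqr)
```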
average deviation
calculate each score's deviation from the mean, sum the deviations, then divide by N to give the average deviation BUT the deviations from the mean always sum to zero, so this always equals zero, THEREFORE use the MEAN ABSOLUTE DEVIATION (m.a.d.): take the absolute deviations (no negatives), sum them, then divide by N.
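The cancelling-out can be seen in a short sketch, reusing the scores 4, 8, 10, 12 from the median card. The variable names are illustrative only.

```python
scores = [4, 8, 10, 12]
n = len(scores)
mean = sum(scores) / n

# Signed deviations: negatives and positives cancel out.
deviations = [x - mean for x in scores]
average_deviation = sum(deviations) / n

# Absolute deviations: drop the signs before summing.
mean_abs_dev = sum(abs(d) for d in deviations) / n

print(deviations)
print(average_deviation, mean_abs_dev)
```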
variance ($S^2$)
$S^2 = SS / (N - 1)$
SS = sum of the squared deviations
$SS = \sum (X - \bar{X})^2$
Note that dividing by (N-1) rather than N is a statistical correction: the variance calculated from a sample tends to underestimate the variance of the population.
S = standard deviation
$S = \sqrt{S^2}$
This is the most common statistic for variability. It provides a measure of the average deviation from the mean.
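A step-by-step sketch of the variance and standard deviation, again on the scores 4, 8, 10, 12. The final lines compare the hand calculation with `statistics.variance` and `statistics.stdev`, which also divide by N-1.

```python
import math
import statistics

scores = [4, 8, 10, 12]
n = len(scores)
mean = sum(scores) / n

ss = sum((x - mean) ** 2 for x in scores)  # SS: sum of the squared deviations
variance = ss / (n - 1)                    # sample variance, with the N-1 correction
sd = math.sqrt(variance)                   # standard deviation

print(ss, variance, sd)

# Cross-check against the standard library.
print(statistics.variance(scores), statistics.stdev(scores))
```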
Normal distribution
The standard normal distribution has a mean of 0 and a standard deviation of 1. Any score in a normally distributed data set can be converted to the standard normal distribution by using the formula for Z scores. This converts each score into the number of standard deviation units it lies from the mean. e.g. a z-score of +2 indicates the score is 2 standard deviations above the mean, and a z-score of -1.5 indicates it is 1.5 standard deviations below the mean.
Z = deviation from mean / standard deviation
$Z = (X - \bar{X}) / S$
Note that if the data are skewed, and not normally distributed, z scores cannot be used to read proportions from the normal curve.
Note that when an attribute is normally distributed, there will always be 34.13% of the population between the mean and +1 standard deviation, 13.59% of the population between +1 standard deviation and +2 standard deviations, and 2.28% scoring more than 2 standard deviations above the mean.
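A small sketch of the z-score formula and the percentages above. The attribute with mean 100 and standard deviation 15 is a hypothetical example, and `z_score` is an illustrative helper; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical normally distributed attribute: mean 100, standard deviation 15.
z_high = z_score(130, 100, 15)    # 2 standard deviations above the mean
z_low = z_score(77.5, 100, 15)    # 1.5 standard deviations below the mean

# Proportions of the standard normal distribution, as percentages.
std_normal = NormalDist(0, 1)
pct_mean_to_1 = 100 * (std_normal.cdf(1) - std_normal.cdf(0))
pct_1_to_2 = 100 * (std_normal.cdf(2) - std_normal.cdf(1))
pct_above_2 = 100 * (1 - std_normal.cdf(2))

print(z_high, z_low)
print(round(pct_mean_to_1, 2), round(pct_1_to_2, 2), round(pct_above_2, 2))
```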
Z table
When you look up a Z score in the Z table, you can see the larger portion and the smaller portion of the distribution. The smaller portion is the same for a negative z of the same magnitude. For a positive z, the smaller portion tells us the probability of obtaining the said score or higher, by chance.
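In place of a printed table, the two portions can be computed directly. This is only a sketch: `z_table_row` is a hypothetical helper, and z = 1.5 is an arbitrary example value.

```python
from statistics import NormalDist

def z_table_row(z):
    """Return (larger portion, smaller portion) of the standard normal curve at z."""
    below = NormalDist().cdf(z)   # proportion of scores below z
    above = 1 - below             # proportion of scores above z
    return max(below, above), min(below, above)

larger, smaller = z_table_row(1.5)
larger_neg, smaller_neg = z_table_row(-1.5)   # same magnitude, negative z

print(round(larger, 4), round(smaller, 4))
print(round(larger_neg, 4), round(smaller_neg, 4))
```

For z = 1.5 the smaller portion is the area above z; for z = -1.5 it is the area below z, and the two areas match because the curve is symmetric.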
Confidence limits around Z
To be 95% certain that a score will lie between 2 values, remove 2.5% of scores from each end of the distribution.
- Consult the Z table. Find the smaller portion of 0.025 (this is 2.5%); here see that Z = 1.96.
- Calculate the scores that lie 1.96 standard deviations below and above the mean.
$X = \bar{X} \pm 1.96S$. This gives the two values between which you can be 95% confident that a randomly chosen score from a normal distribution will fall.
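The two steps above can be sketched as follows. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration; `NormalDist().inv_cdf` stands in for the Z table lookup.

```python
from statistics import NormalDist

# Hypothetical normally distributed attribute.
mean, sd = 100, 15

# Step 1: find the z that leaves a smaller portion of 0.025 in the upper tail.
z_crit = NormalDist().inv_cdf(1 - 0.025)   # approximately 1.96

# Step 2: scores that lie z_crit standard deviations below and above the mean.
lower = mean - z_crit * sd
upper = mean + z_crit * sd

# Check: proportion of the distribution that falls between the two limits.
dist = NormalDist(mean, sd)
coverage = dist.cdf(upper) - dist.cdf(lower)

print(round(z_crit, 2), round(lower, 1), round(upper, 1), round(coverage, 2))
```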