interpreting numerical summaries Flashcards
mean
the arithmetic average of a dataset (sum of the values divided by the number of values); a ‘typical’ value
median
the midpoint of the ordered data, separating the lower 50% from the upper 50%
- > odd number of values - the middle value
- > even number of values - the average of the 2 middle values, (A+B)/2
mode
the most common value(s) in a dataset
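a minimal sketch of the three measures of centre above, using python's standard `statistics` module. the numbers are made up for illustration and are not from the flashcards.

```python
import statistics

# hypothetical example data (not from the flashcards)
odd_data = [3, 1, 4, 1, 5]         # 5 values
even_data = [3, 1, 4, 1, 5, 9]     # 6 values

# mean: sum of the values divided by how many there are
mean_odd = statistics.mean(odd_data)         # 14 / 5 = 2.8

# median: the data is ordered first, then the middle is taken
median_odd = statistics.median(odd_data)     # 1, 1, 3, 4, 5 -> 3
median_even = statistics.median(even_data)   # 1, 1, 3, 4, 5, 9 -> (3 + 4) / 2 = 3.5

# mode: the most common value(s); multimode returns a list because
# a dataset can have more than one mode
modes = statistics.multimode(odd_data)            # [1]
two_modes = statistics.multimode([1, 2, 2, 3, 3]) # [2, 3]

print(mean_odd, median_odd, median_even, modes, two_modes)
```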
symmetry
degree to which the distribution looks like a mirror image when split down the centre
modality
the number of prominent (important) peaks in the distribution
skewness
the degree to which one tail of a distribution is spread farther than the other
- > if the data is spread out more towards the right, it is referred to as right-skewed (the mean is typically greater than the median)
- > if the data is spread out more towards the left, it is referred to as left-skewed (the mean is typically less than the median)
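a small sketch of how a long tail pulls the mean away from the median. the datasets are hypothetical, chosen only so that one value sits far out in the tail.

```python
import statistics

# hypothetical right-skewed data: most values are small, one is far to the right
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
mean_r = statistics.mean(right_skewed)       # 38 / 8 = 4.75
median_r = statistics.median(right_skewed)   # middle two values are 3 and 3 -> 3

# mirror image of the same data: the long tail now points to the left
left_skewed = [21 - x for x in right_skewed]
mean_l = statistics.mean(left_skewed)        # 16.25
median_l = statistics.median(left_skewed)    # 18

print(mean_r > median_r)   # True: the right tail pulls the mean up
print(mean_l < median_l)   # True: the left tail pulls the mean down
```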
measure of spread (example data)
jennifer’s data is inconsistent (widely spread)
harry’s data is consistent (tightly clustered)
both have similar means, so the mean alone does not show the difference
range
difference between maximum and minimum values
problem with range
the maximum or minimum value can be an outlier, far from the bulk of the data
so the range can give a misleading picture of the spread
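a quick sketch of this problem: adding one extreme value changes the range a lot, even though the rest of the data is unchanged. `value_range` is a hypothetical helper name and the numbers are made up.

```python
def value_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

# hypothetical data, tightly clustered between 10 and 13
data = [10, 12, 11, 13, 12]
# the same data with a single extreme value added
data_with_outlier = data + [50]

range_plain = value_range(data)                 # 13 - 10 = 3
range_outlier = value_range(data_with_outlier)  # 50 - 10 = 40

print(range_plain, range_outlier)
```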
variance and standard deviation
the ‘average’ distance between values and the mean
1) calculate the mean
2) take the difference between each value and the mean
3) square each difference
4) add up the squared differences, then divide by n-1; this gives the variance (in squared units)
5) take the square root of the variance; this gives the standard deviation
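the five steps above can be sketched directly in python. the data is hypothetical; the result is compared against `statistics.variance` and `statistics.stdev`, which also divide by n-1.

```python
import math
import statistics

# hypothetical example data
data = [2, 4, 4, 6, 9]
n = len(data)

# 1) calculate the mean
mean = sum(data) / n                           # 25 / 5 = 5.0

# 2) difference between each value and the mean
differences = [x - mean for x in data]         # -3, -1, -1, 1, 4

# 3) square each difference
squared = [d ** 2 for d in differences]        # 9, 1, 1, 1, 16

# 4) add up the squared differences and divide by n - 1 -> variance
variance = sum(squared) / (n - 1)              # 28 / 4 = 7.0

# 5) square root of the variance -> standard deviation
std_dev = math.sqrt(variance)

# cross-check against the standard library
print(math.isclose(variance, statistics.variance(data)))
print(math.isclose(std_dev, statistics.stdev(data)))
```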
percentiles and quartiles
percentile: a value below which a particular percentage of the distribution lies
quartiles: values that divide the ordered distribution into 4 equal-sized groups
Q1: 25th percentile
Q2: 50th percentile (median)
Q3: 75th percentile
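a rough sketch of the percentile idea: count what percentage of the data lies below a given value. `percent_below` is a hypothetical helper, and the scores are made up. software packages use several slightly different percentile conventions, so this is only one simple reading of the definition.

```python
def percent_below(values, x):
    """Percentage of the values that are strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# hypothetical data: the whole numbers 1 to 20
scores = list(range(1, 21))

p_6 = percent_below(scores, 6)     # 5 of 20 values are below 6  -> 25.0
p_11 = percent_below(scores, 11)   # 10 of 20 values are below 11 -> 50.0

print(p_6, p_11)
```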
finding quartiles
split the ordered data in half at the median
find the median of the lower half and the median of the upper half
the median of the lower half is Q1; the median of the upper half is Q3
for an odd number of values, include the median in both halves
for an even number of values, split the distribution into equal-sized halves
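a minimal sketch of this median-of-halves method, following the rule on this card of including the median in both halves when the number of values is odd. `quartiles` is a hypothetical helper and the data is made up. other textbooks and software may use a different convention and give slightly different quartiles.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        # odd number of values: the median belongs to both halves
        lower = ordered[:half + 1]
        upper = ordered[half:]
    else:
        # even number of values: two equal-sized halves
        lower = ordered[:half]
        upper = ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# odd example: halves are [1, 3, 5, 7] and [7, 9, 11, 13]
odd_result = quartiles([1, 3, 5, 7, 9, 11, 13])   # (4.0, 7, 10.0)

# even example: halves are [1, 3, 5] and [7, 9, 11]
even_result = quartiles([1, 3, 5, 7, 9, 11])      # (3, 6.0, 9)

print(odd_result, even_result)
```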
five-number summary
- > minimum
- > Q1
- > median (centre of the distribution)
- > Q3
- > maximum
interquartile range: IQR = Q3 - Q1, the range of the middle 50% of the data
interquartile range (IQR)
Q3 - Q1 = interquartile range (IQR)
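a short sketch that puts the five-number summary and the IQR together. `five_number_summary` is a hypothetical helper, the data is made up, and the quartiles use the median-of-halves method (median included in both halves for an odd number of values), which is only one of several conventions.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    # for an odd count, the median is included in both halves
    lower = ordered[:half + 1] if n % 2 == 1 else ordered[:half]
    upper = ordered[half:]
    return (ordered[0],
            statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper),
            ordered[-1])

# hypothetical example data
data = [11, 1, 7, 3, 9, 5]
minimum, q1, median, q3, maximum = five_number_summary(data)

# IQR: the range of the middle 50% of the data
iqr = q3 - q1

print(minimum, q1, median, q3, maximum)   # 1 3 6.0 9 11
print(iqr)                                # 6
```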