AFM 112 - Chp 3 Flashcards
Define a frequency table
a table listing the number of observations in each class/category
define a relative frequency table
proportion (or percentage) of observations that fall in each class/category (frequency / total number of observations)
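A minimal sketch of both tables in Python, to make the frequency / total calculation concrete. The `observations` list and the function names are made up for illustration and are not from the course material.

```python
from collections import Counter

# Hypothetical categorical observations (made up for illustration).
observations = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]


def frequency_table(data):
    """Count how many observations fall in each category."""
    return dict(Counter(data))


def relative_frequency_table(data):
    """Divide each category's frequency by the total number of observations."""
    total = len(data)
    return {category: count / total for category, count in Counter(data).items()}


freq = frequency_table(observations)
rel_freq = relative_frequency_table(observations)
print(freq)      # counts per category
print(rel_freq)  # shares per category, which add up to 1
```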
when to use a bar graph vs pie chart
bar graph - uses a bar to represent each class/category; the height of each bar equals the frequency or relative frequency of that class/category (more specific: individual values are easy to read and compare)
pie chart - each class/category is represented by a slice, and the size of each slice is proportional to the relative frequency of that class/category (more general: shows each category's share of the whole)
What do we use to summarize + describe quantitative data
measures of central tendency + measures of variability (dispersion)
define measures of central tendency
capture the tendency of the data to cluster around some central values
define measure of variability
captures the spread of data
what’s the use of a histogram in graphing?
to capture the shape of the distribution
what shapes are histogram distributions described as? define them
- symmetric - looks the same on both sides of the centre
- skewed left - majority of the data is on the right, with a tail of a few observations stretching to the left
- skewed right - majority of the data is on the left, with a tail of a few observations stretching to the right
- bimodal - 2 spikes (peaks)
- multimodal - multiple spikes (peaks)
how do we use mean + median to predict the shape of distribution
- if mean = median, distribution = likely symmetric
- if mean > median, distribution = likely skewed to the right. if the difference is large, it indicates there are outliers at the upper end (right side)
- if mean < median, distribution = likely skewed to the left. if the difference is large, it indicates there are outliers at the lower end (left side)
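The comparison above can be sketched in Python. The function name `likely_shape`, the `tolerance` parameter, and the sample lists are hypothetical, and this is only a rule of thumb, not a formal test of skewness.

```python
from statistics import mean, median


def likely_shape(data, tolerance=0.0):
    """Guess the shape of a distribution by comparing its mean and median.

    `tolerance` is a hypothetical cutoff for treating the two as equal.
    """
    m, med = mean(data), median(data)
    if abs(m - med) <= tolerance:
        return "likely symmetric"
    if m > med:
        return "likely skewed right"
    return "likely skewed left"


# Made-up samples: the second has a high outlier, the third a low outlier.
print(likely_shape([1, 2, 3, 4, 5]))
print(likely_shape([1, 2, 3, 4, 50]))
print(likely_shape([-40, 6, 7, 8, 9]))
```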
define the interquartile range
distance between the first and third quartiles (Q3 - Q1)
what’s the spreadsheet formula for first and third quartile?
1st quartile: =PERCENTILE(E1:E39, 0.25)
3rd quartile: =PERCENTILE(E1:E39, 0.75)
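For comparison, a rough Python equivalent of the quartile and interquartile range calculation. It assumes the spreadsheet function interpolates inclusively between data points, which is what `method="inclusive"` does in `statistics.quantiles`; the `values` list is made up and stands in for cells E1:E39.

```python
from statistics import quantiles

# Hypothetical data standing in for the spreadsheet range E1:E39.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points.
# method="inclusive" is an assumption about how the spreadsheet interpolates.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

iqr = q3 - q1  # interquartile range = Q3 - Q1
print(q1, q3, iqr)
```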
define variance
average squared deviation from the mean
define standard deviation
square root of the variance - the higher the value, the higher the variability
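A small Python sketch of both measures. The `data` list is made up. Note the assumption: "average" here divides by n (population variance), while the sample version divides by n - 1, so both are shown.

```python
from statistics import pstdev, pvariance, variance

# Hypothetical observations (made up for illustration); the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: sum of squared deviations from the mean, divided by n.
pop_var = pvariance(data)

# Standard deviation: square root of the variance.
pop_sd = pstdev(data)

# Sample variance divides by n - 1 instead of n, so it is slightly larger.
sample_var = variance(data)

print(pop_var, pop_sd, sample_var)
```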
what 3 things can we predict if the shape of the distribution is bell-shaped and symmetric? (the empirical rule)
- about 68% of the observations will fall within 1 standard deviation of the mean (range = x̄ - 1s to x̄ + 1s)
- about 95% of the observations will fall within 2 standard deviations of the mean (range = x̄ - 2s to x̄ + 2s)
- about 99.7% of the observations will fall within 3 standard deviations of the mean (range = x̄ - 3s to x̄ + 3s)
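The three ranges can be worked out with a short Python sketch. The function name `empirical_rule_ranges` and the example mean of 100 with a standard deviation of 15 are made up for illustration.

```python
def empirical_rule_ranges(mean, sd):
    """Return the (lower, upper) range for 1, 2 and 3 standard deviations.

    The keys are the approximate share of observations in each range,
    assuming a bell-shaped, symmetric distribution.
    """
    shares = {1: "68%", 2: "95%", 3: "99.7%"}
    return {shares[k]: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


# Hypothetical example: mean = 100, standard deviation = 15.
ranges = empirical_rule_ranges(100, 15)
for share, (low, high) in ranges.items():
    print(f"about {share} of observations fall between {low} and {high}")
```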
What’s the importance of data understanding?
to know the structure of the data + what data is captured in each variable