Unit 1 - Exploring Data Flashcards
What are 2 measures of center?
Mean and median
What are 3 measures of spread?
Standard Deviation, Interquartile Range, Range
What is the difference between quantitative and categorical data?
Quantitative data = number data
Categorical data = word data
What graphs are appropriate for categorical data?
Pie charts and bar charts
What graphs are appropriate for quantitative data?
Histograms, box plots, stem plots, dot plots.
Which graphs retain individual observations?
Stem plots and dot plots
About how many bars should your histogram have?
About 5 or 6
When describing a distribution, what 3 things should you include?
Shape, center, spread
A distribution that has approximately the same frequency for each data value is ___________.
Uniform
A distribution that has a greater frequency of large values is __________________.
Left Skew
A distribution that has a greater frequency of small values is _________________.
Right Skew
The mean is pulled in the direction of the _______.
Skew and outliers
In a left skewed distribution, the median is ___________ than the mean.
greater than
In a unimodal and symmetrical distribution, the mean is _________________ the median.
about the same as
Define statistic
a number that describes a sample (like sample mean, sample median, sample min, etc)
Define parameter
a number that describes a population (like population mean, population range, etc)
What is a 5 number summary?
Min, Q1, Median, Q3, Max
What is the formula for calculating the upper boundary for outliers?
Q3 + 1.5(IQR)
What is the formula for calculating the lower boundary for outliers?
Q1 - 1.5(IQR)
For what kind of distributions do we prefer to use the median and IQR?
Skewed distributions
What measure of center and spread do we use if a distribution is approximately symmetrical?
Mean and Standard Deviation
What does it mean if a statistic is resistant?
The statistic is not easily influenced or changed by skew and outliers
What does it mean if a statistic is non-resistant?
The statistic is easily influenced and changed by skew and outliers
Name 2 resistant statistics.
Median, IQR
Name 2 non-resistant statistics
Mean, Standard Deviation
The median of a data set of the price of school lunch items is $2.35. Interpret this number.
The average price of school lunch items is $2.35.
The standard deviation of a data set of the price of school lunch items is $0.48. Interpret this number.
0.48 is the average deviation of school lunch prices from the mean school lunch price
Which graphs are most appropriate for large data sets, or data with a large range of numbers?
Histograms and box plots
What is the symbol for sample mean?
X-bar
What is the symbol for population mean?
Mew
What is the symbol for sample standard deviation?
s
What is the symbol for population standard deviation?
sigma
If you add or subtract the same value to every number in a data set, what happens to the measures of location (mean, median, min, max, percentiles)?
They change.
If you add or subtract the same value to every number in a data set, what happens to the measures of spread (range, IQR, S.D.)?
The stay the same.
If you multiply or divide the numbers in a data set, which summary statistics will change.
All of them: mean, median, range, IQR, S.D. etc.