Ch 1 Flashcards
Define Statistics
the science of data
What is data analysis?
the process of organizing, displaying, summarizing, and asking questions about data
Individuals
objects described by a set of data
Variable
any characteristic of an individual
Categorical (qualitative) variable
places an individual into one of several groups or categories (doesn’t make sense to find an average)
quantitative variable
takes numerical values for which it makes sense to find an average
Would grade level be categorical or quantitative?
categorical
Do variables change or stay the same?
they change: a variable can take different values for different individuals
Distribution
graph (or table), tells us what values a variable takes and how often it takes those values
Inference
drawing a conclusion that goes beyond the data at hand (for example, about a whole population from a sample)
Is a population the whole group or only part of it?
the whole: the entire group of individuals we want information about
Most common graphs for categorical data?
pie chart and bar graph
What does a two-way table describe?
describes two categorical variables, organizing counts according to a row variable and a column variable
Example of a two-way table?
counts of people classified by gender and by their opinion about their chance of getting rich
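Marginal distribution
the distribution of one variable alone in a two-way table, taken from the row or column totals (usually in the margins)
Conditional distribution
the percents for one variable among only those individuals who have a specific value of the other variable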
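To see how the two kinds of distribution differ, here is a minimal sketch in Python. The counts and the category names (`yes`/`no`, `female`/`male`) are invented for illustration only and do not come from any real survey.

```python
# Hypothetical two-way table: rows = opinion, columns = gender.
# All counts are made up for illustration.
table = {
    "yes": {"female": 30, "male": 20},
    "no": {"female": 10, "male": 40},
}

# Marginal distribution of opinion: each row total divided by the grand total.
grand_total = sum(sum(row.values()) for row in table.values())
marginal_opinion = {
    opinion: sum(row.values()) / grand_total for opinion, row in table.items()
}

# Conditional distribution of opinion among females only:
# each female count divided by the female column total.
female_total = sum(row["female"] for row in table.values())
opinion_given_female = {
    opinion: row["female"] / female_total for opinion, row in table.items()
}

print(marginal_opinion)
print(opinion_given_female)
```

The marginal distribution uses everyone in the table, while the conditional distribution looks only at one column (here, females).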
What to use when describing graphs?
SOCS: shape, outliers, center, and spread
Examples of shape
symmetric, skewed
Examples of center
mean, median (middle)
Examples of spread
standard deviation, IQR, range
What does sigma (Σ) mean?
add up; the sum of
the mean is also what?
x̄ (x-bar), the average
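Combining the sigma card with the mean card, the mean of n observations can be written as a formula. The labels x_1 through x_n for the individual observations are notation assumed here, not taken from the cards.

```latex
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}
```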
The mean is also known as the what?
balance point
the ___ is sensitive to extreme values and is not resistant to extremes and/or outliers
mean (the same is true of the standard deviation)
the median also means what?
the middle number
when a roughly symmetric distribution is presented, the median and mean are ____
close together
when an exactly symmetric distribution is presented, the median and mean are ____
exactly the same
when a skewed distribution is presented, the median and mean are ____
far apart: the mean is pulled farther out into the long tail than the median
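A quick way to check the three cards above is to compare the mean and median on small data sets. This is a rough sketch with made-up numbers, using Python's standard `statistics` module.

```python
from statistics import mean, median

# Made-up data sets for illustration.
symmetric = [1, 2, 3, 4, 5]        # exactly symmetric around 3
right_skewed = [1, 2, 3, 4, 100]   # one large value creates a long right tail

# Symmetric data: mean and median agree.
print(mean(symmetric), median(symmetric))

# Skewed data: the mean is pulled toward the tail, the median stays put.
print(mean(right_skewed), median(right_skewed))
```

Swapping the 5 for 100 moves the mean from 3 to 22 but leaves the median at 3, which is why the median is called resistant and the mean is not.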
How to find the range?
maximum value - minimum value
what are quartiles?
the “medians” of each half of the ordered data: Q1 is the median of the lower half, Q3 is the median of the upper half
how to find the interquartile range?
Q3-Q1
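The range, quartiles, and IQR can be sketched in a few lines of Python. The helper name `quartiles` and the data are hypothetical. The sketch assumes the "medians of each half" method, leaving the overall median out of both halves when the count is odd; calculators and software sometimes use other conventions and can give slightly different quartiles.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    excluded from both halves. Needs at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # values below the median position
    upper = ordered[-half:]    # values above the median position
    return median(lower), median(upper)

data = [2, 4, 4, 5, 7, 8, 10, 12, 15]  # made-up values

data_range = max(data) - min(data)  # highest minus lowest
q1, q3 = quartiles(data)
iqr = q3 - q1                       # spread of the middle half of the data

print(data_range, q1, q3, iqr)
```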
standard deviation
typical distance from the mean
variance
standard deviation squared
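The link between the two cards above can be checked by hand. This sketch uses made-up data and assumes the sample version of the formula (dividing by n - 1), which is the usual choice in an intro course but is not stated on the cards.

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
n = len(data)
x_bar = sum(data) / n            # the mean

# Sample variance: sum of squared distances from the mean, divided by n - 1
# (the n - 1 divisor is an assumption; the cards do not specify it).
variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(x_bar, variance, std_dev)
```

Squaring `std_dev` gives back `variance`, up to floating-point rounding.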
Five-number summary
minimum, Q1, median, Q3, maximum (usually displayed with a boxplot)
How to use the 1.5 * IQR rule for outliers?
find the IQR (Q3 - Q1)
multiply it by 1.5
Q3 + (1.5 * IQR) = upper cutoff; any value above it is an outlier
Q1 - (1.5 * IQR) = lower cutoff; any value below it is an outlier
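The 1.5 * IQR rule can be sketched as two small functions. The names `outlier_fences` and `find_outliers` and the data are hypothetical, and the quartiles are assumed to have been found already.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs given by the 1.5 * IQR rule."""
    step = 1.5 * (q3 - q1)          # 1.5 times the IQR
    return q1 - step, q3 + step

def find_outliers(values, q1, q3):
    """Return the values that fall outside the cutoffs."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in values if x < lower or x > upper]

data = [2, 4, 4, 5, 7, 8, 10, 12, 40]  # made-up values
q1, q3 = 4, 11                         # quartiles assumed already computed for this data

print(outlier_fences(q1, q3))
print(find_outliers(data, q1, q3))
```

Here the IQR is 7, so the cutoffs sit 10.5 below Q1 and 10.5 above Q3, and only the value 40 lands outside them.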