AP Stat Vocab Flashcards
individuals
Objects described by a set of data. Individuals may
be people, animals, or things.
variable
Any characteristic of an individual. A variable can take
different values for different individuals.
categorical variable
Variable that places an individual into one
of several groups or categories.
quantitative variable
Variable that takes numerical values for
which it makes sense to find an average.
discrete variables
Variable that takes a fixed set of possible values with
gaps between them. The probability distribution of a discrete random variable gives its possible values and their probabilities.
continuous
Variable that takes all values in an
interval of numbers.
univariate data
Data involving only one variable.
bivariate data
Data involving two variables, used to look at the relationship between them.
population
In a statistical study, the entire group of individuals
we want information about.
sample
Subset of individuals in the population from which we
actually collect data.
census
Study that attempts to collect data from every individual
in the population.
distribution
Tells what values a variable takes and how often it
takes these values.
inference
Drawing conclusions that go beyond the data at hand.
frequency table
Table that displays the count (frequency) of
observations in each category or class.
relative frequency table
Table that shows the percents (relative
frequencies) of observations in each category or class.
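A minimal Python sketch of both tables, assuming a small made-up list of categorical observations (the `colors` data is hypothetical):

```python
from collections import Counter

# Hypothetical categorical data: favorite color reported by 10 individuals.
colors = ["red", "blue", "blue", "green", "red",
          "blue", "red", "blue", "green", "blue"]

# Frequency table: count of observations in each category.
freq = Counter(colors)

# Relative frequency table: each count as a percent of all observations.
total = sum(freq.values())
rel_freq = {category: 100 * count / total for category, count in freq.items()}

for category in sorted(freq):
    print(f"{category:<6} {freq[category]:>3} {rel_freq[category]:>6.1f}%")
```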
roundoff error
Difference between the calculated approximation
of a number and its exact mathematical value.
pie chart
Chart that shows the distribution of a categorical variable
as a “pie” whose slices are sized by the counts or percents for the categories.
bar graph
Graph used to display the distribution of a categorical
variable or to compare the sizes of different quantities. The horizontal axis of a bar graph identifies the categories or quantities being compared.
two-way table
Table of counts that organizes data about two
categorical variables.
marginal distribution
The distribution of one of the categorical variables in a two-way table of counts among all individuals described by the table.
conditional distribution
Distribution that describes the values of one variable among individuals who have a specific value of another variable.
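A rough Python sketch of a marginal and a conditional distribution computed from a two-way table. The table, its row labels (class year), and its column labels (has a job) are all hypothetical:

```python
# Hypothetical two-way table of counts: rows = class year, columns = has a job.
table = {
    "junior": {"yes": 30, "no": 20},
    "senior": {"yes": 10, "no": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of "has a job": column totals over all individuals.
marginal_job = {
    answer: sum(row[answer] for row in table.values()) / grand_total
    for answer in ("yes", "no")
}

# Conditional distribution of "has a job" among juniors only.
junior_total = sum(table["junior"].values())
job_given_junior = {
    answer: count / junior_total for answer, count in table["junior"].items()
}

print(marginal_job)
print(job_given_junior)
```

In this made-up table the conditional distribution for juniors differs from the marginal distribution, which is the kind of difference that suggests an association between the two variables.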
segmented bar graph
Graph used to compare the distribution of
a categorical variable in each of several groups. For each group, there is a single bar with “segments” that correspond to the different values of the categorical variable.
side-by-side bar graph
Graph used to compare the distribution
of a categorical variable in each of several groups. For each value of the categorical variable, there is a bar corresponding to each group.
association
Knowing the value of one variable helps predict the
value of the other.
Simpson’s paradox
Phenomenon in probability and statistics in which a trend that appears in several different groups of data disappears or reverses when these groups are combined.
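A small Python sketch of the reversal, using made-up counts chosen only to show the effect (the treatments, groups, and numbers are all hypothetical):

```python
# Hypothetical counts: (successes, trials) for treatments A and B in two groups.
data = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}


def success_rate(successes, trials):
    """Proportion of trials that were successes."""
    return successes / trials


# Within each group, treatment A has the higher success rate.
within = {
    group: {t: success_rate(*counts) for t, counts in results.items()}
    for group, results in data.items()
}

# Combining the groups reverses the comparison: B now looks better.
combined = {}
for treatment in ("A", "B"):
    successes = sum(data[group][treatment][0] for group in data)
    trials = sum(data[group][treatment][1] for group in data)
    combined[treatment] = success_rate(successes, trials)

print(within)
print(combined)
```

The reversal happens here because most of A's trials fall in the group with low success rates, while most of B's fall in the group with high success rates.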
dotplot
Simple graph that shows each data value as a dot above
its location on a number line.
shape
Overall form of a distribution's graph, such as symmetric, skewed left, or skewed right.
mode
Value or class in a statistical distribution having the greatest frequency.
center
Value that describes the middle or typical value of a distribution,
commonly measured by the mean or the median.
spread
Extent to which a distribution is stretched or squeezed; how much the values vary.
range
The maximum value minus the minimum value for a set
of quantitative data.
outlier
Individual value that falls outside the overall pattern of a
distribution.
symmetric
A graph in which the right and left sides are approximately
mirror images of each other.
skewed right
A graph in which the right side (containing the half of the observations with larger values) is much longer than the left side.
skewed left
A graph in which the left side (containing the half of the observations with smaller values) is much longer than the right side.
unimodal
A graph of quantitative data with a single peak.
bimodal
A graph of quantitative data with two clear peaks.
multimodal
A graph of quantitative data with more than two
clear peaks.
stemplot
Simple graphical display for fairly small data sets that gives a quick picture of the shape of a distribution while including the actual numerical values in the graph.
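A minimal Python sketch of a text stemplot, assuming non-negative whole-number data where the stem is the tens part and the leaf is the ones digit (the `stemplot` helper and the `scores` data are hypothetical):

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot lines; stem = value // 10, leaf = value % 10."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include stems with no leaves so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {leaf_text}")
    return lines


scores = [12, 15, 21, 23, 23, 40]  # hypothetical data
for line in stemplot(scores):
    print(line)
```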
splitting stems
Method for spreading out a stemplot that has too few stems.
back-to-back stemplots
Plot used to compare the distribution of a quantitative variable for two groups.
histogram
Graph that displays the distribution of a quantitative variable. The horizontal axis is marked in the units of measurement for the variable. The vertical axis contains the scale of counts or percents.
mean
Arithmetic average: the sum of the observations divided by the number of observations.
median
The midpoint of a distribution; the number such that
about half the observations are smaller and about half are larger.
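A short Python sketch of the mean and median computed by hand. The helper names and the data are hypothetical, and the median of an even-sized data set is taken here as the average of the two middle values:

```python
def mean_of(values):
    """Arithmetic average: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def median_of(values):
    """Midpoint of the sorted values (average of the two middle ones if n is even)."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


data = [13, 2, 7, 3, 5]  # hypothetical data
print(mean_of(data), median_of(data))
```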
interquartile range (IQR)
Distance between the first quartile (Q1) and the third quartile (Q3): IQR = Q3 - Q1
five-number summary
Smallest observation, first quartile, median, third quartile, and largest observation, written in order from smallest to largest.
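A rough Python sketch of the five-number summary, IQR, and range. It assumes one common classroom convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the overall median when the number of observations is odd. Other software may use a different quartile rule and give slightly different values. The helper name and data are hypothetical:

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # Skip the middle value itself when n is odd.
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], median(lower), median(s), median(upper), s[-1])


data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1
data_range = maximum - minimum
print((minimum, q1, med, q3, maximum), iqr, data_range)
```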
boxplot
Graph of the five-number summary. The box spans the quartiles and shows the spread of the central half of the distribution. The median is marked within the box. Lines extend from the box to the smallest and largest observations that are not outliers. Outliers are marked with a special symbol such as an asterisk (*).
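The definition above does not say how outliers are identified. A widely used convention, assumed here, flags any value more than 1.5 × IQR below Q1 or above Q3. The Python sketch below applies that assumed rule to made-up data, with quartiles taken as the medians of the lower and upper halves:

```python
from statistics import median


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (an assumed convention)."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 else s[half:]
    return median(lower), median(upper)


data = [1, 3, 4, 6, 7, 8, 10, 12, 40]  # hypothetical data
q1, q3 = quartiles(data)
iqr = q3 - q1

# Assumed 1.5 x IQR rule for flagging outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

# The boxplot's lines end at the most extreme values that are not outliers.
inside = [x for x in data if low_fence <= x <= high_fence]
line_ends = (min(inside), max(inside))
print(outliers, line_ends)
```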
standard deviation
Statistic that measures the typical distance
of the values in a distribution from the mean. It is calculated by finding an “average” of the squared distances and then taking the square root.
variance
“Average” squared deviation of the observations in a
data set from their mean.
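A minimal Python sketch of both calculations on made-up data. It assumes the “average” in these definitions divides by n - 1 (the sample versions), which matches Python's `statistics.variance` and `statistics.stdev`:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

n = len(data)
mean = sum(data) / n

# Squared deviation of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Assumption: sample variance, so the "average" divides by n - 1.
variance = sum(squared_deviations) / (n - 1)

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)
print(statistics.variance(data), statistics.stdev(data))
```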