AP Stats Unit 2 Vocab Flashcards
Sample Mean
(Sum of all the items in the sample) / (Number of items in the sample)
Fractiles
numbers that partition, or divide, an ordered data set into equal parts
Frequency Distributions
a table that shows classes or intervals of data entries with a count of the number of entries in each class
Relative Frequency
the portion or percentage of the data that falls into a given class
(Class Frequency) / (Sample Size) = f/n
Cumulative Frequency
the sum of the frequency of a class and the frequencies of all the previous classes
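The frequency, relative frequency, and cumulative frequency cards can be tied together with a short Python sketch. The class frequencies below are made up for illustration and are not from the flashcards.

```python
# Hypothetical class frequencies for five classes (illustration only).
frequencies = [3, 7, 10, 6, 4]
n = sum(frequencies)  # sample size: total number of data entries

# Relative frequency of each class: f / n
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of the class frequencies
cumulative = []
running_total = 0
for f in frequencies:
    running_total += f
    cumulative.append(running_total)

print(relative)
print(cumulative)
```

The relative frequencies add up to 1, and the last cumulative frequency equals the sample size.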
Univariate Data Set
a data set consisting of observations on a single variable
Can be categorical (qualitative) if each observation is a categorical response
Can be numerical (quantitative) if each observation is a number
Multivariate Data Set
a data set in which each observation records a category or value for each of two or more attributes (variables)
Continuous Data
a numerical variable whose possible values form an entire interval on a number line; it has an uncountable number of possible outcomes
Pie Graph
A circle that is divided into sectors that represent categories
The area of each sector is proportional to the frequency of each category
Stem-and-Leaf Plot
Each number is separated into a stem and leaf
Paired data sets
when each entry in one data set corresponds to one entry in a second data set
Weighted Mean
The mean of a data set whose entries have varying weights
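One common way to compute a weighted mean is to multiply each entry by its weight, add the products, and divide by the sum of the weights. A minimal Python sketch, using hypothetical scores and weights:

```python
# Hypothetical scores and their weights (illustration only).
scores = [86, 96, 82]
weights = [0.5, 0.3, 0.2]

# Weighted mean = sum(x * w) / sum(w)
weighted_mean = sum(x * w for x, w in zip(scores, weights)) / sum(weights)

print(round(weighted_mean, 1))
```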
Symmetric
a vertical line can be drawn through the middle of a graph creating mirror images
Mean = Median = Mode (when the distribution also has a single peak)
Uniform (rectangular)
all entries, or classes, in the distribution have equal or approximately equal frequencies (symmetric)
Mean = Median (there is no single mode, since all entries occur about equally often)
Population Mean
(Sum of all items) / (Number of items)
Quartiles
the three values (Q1, Q2, Q3) that divide an ordered data set into four approximately equal parts
Interquartile Range (IQR)
a measure of variation that gives the range of the middle 50% of the data
it is the difference between the third and first quartiles
Fences
cutoff values used to determine the presence of outliers; commonly Q1 - 1.5(IQR) for the lower fence and Q3 + 1.5(IQR) for the upper fence
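The quartile, IQR, and fence cards can be sketched together in Python. This sketch assumes two conventions that the flashcards do not spell out: quartiles are found as medians of the lower and upper halves (leaving out the middle entry when the count is odd), and fences sit 1.5 IQRs beyond Q1 and Q3. The data values and the helper name `quartiles` are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle entry when n is odd so both halves have equal size.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [2, 4, 5, 7, 8, 9, 10, 12, 30]  # hypothetical data set
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

# Assumed 1.5 * IQR rule for the fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)
print(lower_fence, upper_fence, outliers)
```

Other quartile conventions exist (calculators and software may differ slightly), so the exact Q1 and Q3 can vary with the method used.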
Frequency (f)
the number of data entries in the class
Upper Class Limit
the greatest number that can belong to the class
Lower Class Limit
the least number that can belong to the class
Class Width (Class Interval)
the distance between lower (or upper) limits of consecutive classes
(Largest data value - smallest data value)/(Desired number of classes)
ALWAYS ROUND UP!!!!!
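A small Python sketch of the class width calculation, assuming whole-number data and a hypothetical minimum, maximum, and number of classes:

```python
import math

# Hypothetical values (illustration only).
data_min, data_max = 7, 88
num_classes = 7

# Range divided by the number of classes, rounded UP to a whole number.
# Note: math.ceil leaves an exact whole-number quotient unchanged.
class_width = math.ceil((data_max - data_min) / num_classes)

# Lower limits start at the minimum and step by the class width.
lower_limits = [data_min + i * class_width for i in range(num_classes)]
# For whole-number data, each upper limit is one less than the next lower limit.
upper_limits = [low + class_width - 1 for low in lower_limits]

print(class_width)
print(list(zip(lower_limits, upper_limits)))
```

Rounding up is what guarantees that the last class still reaches the largest data value.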
Range
the difference between the max and min data entries
the data must be quantitative
Midpoint
the sum of the lower and upper limits of the class divided by two (sometimes called the class mark)
(Lower class limit + Upper class limit)/2
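As a quick check of the midpoint formula, here is a Python sketch with hypothetical class limits:

```python
# Hypothetical (lower limit, upper limit) pairs for three classes.
classes = [(7, 18), (19, 30), (31, 42)]

# Midpoint = (lower class limit + upper class limit) / 2
midpoints = [(low + high) / 2 for low, high in classes]

print(midpoints)
```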
Frequency Histogram
A histogram that displays the data of a frequency distribution
Bivariate Data Set
a data set consisting of observations for two variables
Can be written as ordered pairs
Discrete Data
a numerical variable whose possible values correspond to isolated points on a number line; it has a finite or countable number of possible outcomes
Bar Graph
display for categorical data where the horizontal axis is categorical and the vertical axis is frequency
Comparative Bar Chart
constructed by using the same horizontal and vertical axis for the bar charts of two or more groups
Segmented Bar Chart
The bar is divided into segments, with different segments representing different categories
Stem
the entry’s left-most digit(s)
Leaf
the entry’s right-most digit (Leaves should always be single digits)
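The stem and leaf cards can be illustrated with a short Python sketch. It assumes hypothetical two-digit whole numbers, so the stem is the tens digit and the leaf is the ones digit.

```python
from collections import defaultdict

# Hypothetical two-digit data entries (illustration only).
data = [12, 15, 21, 23, 23, 30, 34, 47]

plot = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)  # stem = tens digit, leaf = ones digit
    plot[stem].append(leaf)

# Print one row per stem, with its leaves in increasing order.
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```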
Outlier
an unusually small or large data value
Gaps
spacing in the data set caused by one or more outliers
Dot Plot
each data entry is plotted as a point (dot) above a horizontal axis
Scatter Plot
Ordered pairs are graphed as points in a coordinate plane
Used to show the relationship between two quantitative variables
Time Series
A data set that is composed of quantitative entries taken at regular intervals over a period of time
Time Series Chart
A line chart of a time series data set, with time on the horizontal axis and the quantitative entries on the vertical axis
Median
The value that lies in the middle of the data when the data set is ordered
If the data set has an odd number of entries, the median is the middle number
If the data set has an even number of entries, the median is the mean of the two middle entries
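The odd and even cases of the median can be written out directly. This is a minimal Python sketch; the sample lists are hypothetical.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd count: the middle entry
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of middle two

print(median([3, 1, 7]))     # odd number of entries
print(median([3, 1, 7, 5]))  # even number of entries
```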
Measure of Central Tendency
a value that represents a typical, or central, entry of a data set
Most commonly used are the mean, median and the mode
Mode
The data entry that occurs with the greatest frequency
Mean
the sum of the data entries divided by the number of entries
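The mean and mode cards can be checked with Python's standard `statistics` module. The data sets below are hypothetical; `multimode` (Python 3.8+) returns every entry tied for the greatest frequency.

```python
from statistics import multimode

data = [2, 3, 3, 5, 7]  # hypothetical data set

# Mean: sum of the entries divided by the number of entries
mean_value = sum(data) / len(data)

# Mode(s): the entry or entries that occur most often
modes = multimode(data)

# A data set with two entries tied for the greatest frequency
two_modes = multimode([1, 1, 2, 2, 3])

print(mean_value, modes, two_modes)
```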
Unimodal
a histogram with a single peak (one mode)
Multimodal
a histogram with more than two peaks
Skewed Left (negatively)
the tail extends to the left
Mean < Median < Mode
Bimodal
when two entries occur with the same greatest frequency
a histogram with two peaks (two modes)
Skewed Right (positively)
the tail extends to the right
Mode < Median < Mean
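The orderings on the skewed-left and skewed-right cards are typical patterns rather than guarantees. A Python sketch with two small hypothetical data sets that do follow them:

```python
from statistics import mean, median, mode

# Hypothetical data sets (illustration only).
right_skewed = [1, 1, 1, 2, 3, 4, 10]    # long tail to the right
left_skewed = [1, 7, 8, 9, 10, 10, 10]   # long tail to the left

# Skewed right: Mode < Median < Mean
print(mode(right_skewed), median(right_skewed), round(mean(right_skewed), 2))

# Skewed left: Mean < Median < Mode
print(round(mean(left_skewed), 2), median(left_skewed), mode(left_skewed))
```

The mean is pulled toward the tail because it uses every value, while the median depends only on position in the ordered data.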
Sum of Squares
the sum of the squares of the deviations
Deviation
The difference between the entry and the mean of the data set
Population Variance
the mean of the squares of the deviations
Population Standard Deviation
the square root of the population variance
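The deviation, sum of squares, population variance, and population standard deviation cards build on each other. A Python sketch of that chain, treating a small hypothetical data set as an entire population:

```python
import math

# Hypothetical population (illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(data)

mu = sum(data) / N                           # population mean
deviations = [x - mu for x in data]          # entry minus mean
sum_of_squares = sum(d ** 2 for d in deviations)
variance = sum_of_squares / N                # mean of the squared deviations
std_dev = math.sqrt(variance)                # square root of the variance

print(mu, sum_of_squares, variance, std_dev)
```

The deviations always add up to zero, which is why they are squared before being averaged.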
The Empirical Rule (68-95-99.7 Rule)
For the data with a symmetric, bell shaped distribution, the standard deviation has the following characteristics:
About 68% of the data lie within one standard deviation of the mean
About 95% of the data lie within two standard deviations of the mean
About 99.7% of the data lie within three standard deviations of the mean
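The three percentages can be checked against an exact normal (bell-shaped) model using `statistics.NormalDist` (Python 3.8+). The mean of 100 and standard deviation of 15 used for the intervals are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion of a normal distribution within k standard deviations of the mean
within = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

# Hypothetical mean and standard deviation (illustration only).
mean, sd = 100, 15
intervals = {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

for k in (1, 2, 3):
    print(k, intervals[k], f"{within[k]:.1%}")
```

Real data that are only roughly bell shaped will match these percentages approximately, which is why the rule says "about".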