Descriptive Statistics Flashcards
Box plot
A graph that summarizes a data set with five values and shows at a glance where the middle 50% of the data lies. To draw a box plot, calculate the minimum, Q1, median, Q3, and maximum.
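As a minimal sketch, the five values behind a box plot could be computed as follows. The function name `five_number_summary` and the sample data are illustrative assumptions. Quartiles are taken as the medians of the lower and upper halves of the ordered data, which is one of several conventions in use, and the input is assumed to hold at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return min(data), median(lower), median(data), median(upper), max(data)

# Hypothetical sample data
summary = five_number_summary([1, 3, 4, 6, 7, 9, 12, 15])
print(summary)
```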
First Quartile
The value that is the median of the lower half of the ordered data set. Also called Q1 or the 25th percentile.
Frequency
The number of times a value of the data occurs.
Frequency Polygon
A line graph of grouped data: the frequency of each interval is plotted as a point and the points are joined by line segments. Useful for displaying large data sets in which values repeat or fall into ranges.
Frequency Table
A data representation in which grouped data is displayed along with the corresponding frequencies.
Histogram
A graphical representation of the distribution of data, with data values (grouped into intervals) on the x-axis and frequency on the y-axis. Used for large, continuous, quantitative data sets.
Time Series Graph
Helpful when viewing large amounts of data for one variable over time. Time on the x-axis, variable values on the y-axis.
Interquartile Range (IQR)
The range of the middle 50% of the data values, found by subtracting Q1 from Q3 (IQR = Q3 - Q1). Used to detect potential outliers: any value above Q3 + 1.5 × IQR or below Q1 - 1.5 × IQR.
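The outlier rule can be sketched in a few lines of Python. The name `iqr_fences` and the data are hypothetical, and the quartiles are again assumed to be the medians of the lower and upper halves of the ordered data; other quartile conventions can give slightly different fences.

```python
from statistics import median

def iqr_fences(values):
    """Return the (lower, upper) fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    # For an odd count, the middle value is left out of both halves.
    q3 = median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data with one unusually large value
data = [2, 4, 5, 6, 7, 8, 9, 30]
low, high = iqr_fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)
```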
Interval
Also called a class interval; represents a range of data and is used when displaying large datasets.
Mean
A number measuring central tendency, also called the ‘average’. For a sample: x̄ = (sum of values)/n; for a population: μ = (sum of values)/N.
Median
The middle value in ordered data; half the values are below and half above. Preferred when data has outliers.
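A small, hypothetical data set illustrates why the median is often preferred when outliers are present. This is only a sketch using Python's standard `statistics` module; the numbers are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data: four typical values and one extreme value
data = [10, 12, 13, 15, 100]

data_mean = mean(data)      # pulled upward by the extreme value
data_median = median(data)  # stays close to the bulk of the data
print(data_mean, data_median)
```

Here the single value 100 moves the mean well above most of the data, while the median is unaffected by how extreme that value is.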
Midpoint
The middle of an interval in a frequency table, calculated as (lower boundary + upper boundary)/2.
Mode
The value that appears most frequently in a data set.
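A data set can have more than one mode. As a brief sketch, Python's `statistics.multimode` (available in Python 3.8 and later) returns every value tied for the highest frequency; the data below is hypothetical.

```python
from statistics import multimode

# Hypothetical data in which 2 and 3 each occur twice
data = [1, 2, 2, 3, 3, 4]

modes = multimode(data)  # all values tied for most frequent
print(modes)
```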
Outlier
An observation that does not fit the rest of the data. Can be detected using the IQR rule.
Paired Data Set
Two data sets of the same size in which each point in one is matched with exactly one point in the other.
Percentile
Divides ordered data into hundredths. For example, the 50th percentile is the median.
Quartiles
Values dividing the data into quarters. Q1 = 25th, Q2 = 50th (median), Q3 = 75th percentile.
Relative Frequency
The ratio of the number of times a value occurs to the total number of outcomes.
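Frequencies and relative frequencies can be tallied together. The following is a minimal sketch using `collections.Counter`; the function name `relative_frequencies` and the sample values are assumptions made for illustration.

```python
from collections import Counter

def relative_frequencies(values):
    """Map each distinct value to (its count) / (total number of values)."""
    counts = Counter(values)  # frequency of each value
    total = len(values)
    return {value: count / total for value, count in counts.items()}

# Hypothetical data
rel = relative_frequencies(["a", "b", "a", "c", "a", "b", "a", "c"])
print(rel)
```

The relative frequencies of all values always sum to 1.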
Skewed
Describes asymmetry in data. Skewed left: the lower values are more spread out (longer tail on the left). Skewed right: the higher values are more spread out (longer tail on the right).
Standard Deviation
Measures the spread of data around the mean; the square root of the variance. Sample: s. Population: σ. Used to describe how far a value lies from the mean, for example in numbers of standard deviations.
Variance
The mean of the squared deviations from the mean; the square of the standard deviation. Population variance: (sum of squared deviations)/N. Sample variance: (sum of squared deviations)/(n - 1).
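The two variance formulas differ only in the divisor. The sketch below writes both out by hand; the function names and data are hypothetical, and the standard deviation is taken as the square root of the corresponding variance.

```python
import math

def sum_squared_deviations(values):
    """Sum of (x - mean)^2 over all values."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def population_variance(values):
    return sum_squared_deviations(values) / len(values)

def sample_variance(values):
    # Divides by n - 1, so at least two values are needed.
    return sum_squared_deviations(values) / (len(values) - 1)

# Hypothetical data with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = math.sqrt(population_variance(data))  # population standard deviation
s = math.sqrt(sample_variance(data))          # sample standard deviation
print(population_variance(data), sigma, sample_variance(data), s)
```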
Skewness & Center
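In skewed distributions, the mean, median, and mode generally differ: the mean is pulled toward the longer tail, so it is typically greater than the median when skewed right and less than the median when skewed left. In symmetric data with a single peak, mean ≈ median ≈ mode.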
Approximating the Mean
When data is grouped, estimate the mean by multiplying each interval's midpoint by its frequency, summing those products, and dividing by the total frequency.
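A minimal sketch of this estimate, assuming each interval is given as a hypothetical (lower boundary, upper boundary, frequency) tuple; the function name `grouped_mean` and the table values are illustrative only.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) intervals."""
    total_frequency = sum(freq for _, _, freq in classes)
    # Each interval contributes midpoint * frequency.
    weighted_sum = sum(((lower + upper) / 2) * freq
                       for lower, upper, freq in classes)
    return weighted_sum / total_frequency

# Hypothetical frequency table: midpoints are 5, 15, and 25
table = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
estimate = grouped_mean(table)  # (5*2 + 15*5 + 25*3) / 10
print(estimate)
```

The result is an approximation because every value in an interval is treated as if it sat at the midpoint.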