Unit 2: Univariate Data Analysis - Grade 9 Flashcards
Mean
average
- add up all the values and divide by the number of values
Median
- order data from least to greatest
- find middle observation (if there are two, find average of them)
- position of the median: (n + 1)/2
Mode
most frequent observation
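A minimal sketch of the three averages above, using Python's standard `statistics` module. The list `data` is made-up sample data, not taken from the unit.

```python
from statistics import mean, median, mode

data = [3, 7, 7, 2, 9, 4]  # hypothetical sample data

data_mean = mean(data)      # (3 + 7 + 7 + 2 + 9 + 4) / 6
data_median = median(data)  # ordered: 2, 3, 4, 7, 7, 9 -> average of 4 and 7
data_mode = mode(data)      # 7 appears most often

print(data_mean, data_median, data_mode)
```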
If the number of observations is odd
location of median: (n + 1)/2
- n = number of observations
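The location rule can be sketched as follows. The function names `median_position` and `median_from_position` are illustrative, not part of any library. When the position is a whole number (odd n), the median is the value at that position; otherwise it is the average of the two neighbouring values.

```python
def median_position(n):
    """Return the 1-based position of the median among n ordered observations."""
    return (n + 1) / 2

def median_from_position(values):
    """Find the median by locating position (n + 1)/2 in the ordered data."""
    data = sorted(values)
    position = median_position(len(data))
    if position.is_integer():
        # Odd n: the median is the single middle observation.
        return data[int(position) - 1]
    # Even n: average the two observations on either side of the position.
    lower = data[int(position) - 1]
    upper = data[int(position)]
    return (lower + upper) / 2

print(median_position(7))                           # position for 7 observations
print(median_from_position([5, 1, 9, 3, 7, 2, 8]))  # odd number of observations
print(median_from_position([1, 2, 3, 4]))           # even number of observations
```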
Data can be
Qualitative: no numerical value
- ex: eye color, hair color, nationality, etc…
Quantitative: numerical value
- ex: height, weight, length, capacity, volume, etc…
Quantitative data can be:
Discrete = can be counted (how many?)
- ex: number of cars on a highway
- class intervals have no overlapping boundary values, so there are gaps between them. Both inequalities are ≤, for example 35 ≤ x ≤ 40, 41 ≤ x ≤ 45
Continuous = cannot be counted; it is measured
- ex: weight, height, speed, etc…
- class intervals share boundary values and are written as continuous intervals, with one < sign and one ≤ sign, for example 35 < x ≤ 40, 40 < x ≤ 45
Stem-and-leaf diagram
- the stem represents the leading digit(s) of each data point (for example, the tens)
- the leaves represent the final digit(s) of each data point
- the key tells you how to read the values
Remember to add a key
- ex: Key: 3 | 9 represents 39 surveys
Stem-and-leaf diagram meaning
a visual representation of ordered raw data, which can then be analyzed.
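A rough sketch of building a stem-and-leaf diagram, assuming whole-number data where the stem is the tens and the leaf is the units digit. The function name `stem_and_leaf` and the `surveys` values are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by stem (tens) with their leaves (units digit) in order."""
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    return dict(rows)

surveys = [39, 31, 42, 45, 45, 50]  # hypothetical data
diagram = stem_and_leaf(surveys)

for stem, leaves in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
print("Key: 3 | 9 represents 39 surveys")
```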
Any set of data has
a five-number (five-point) summary
- minimum value
- Q1: lower quartile (first quartile)
- median (second quartile)
- Q3: upper quartile (third quartile)
- maximum value
Lower quartile (Q1)
the median of the values below the median (Q2)
Interquartile range
Q3 - Q1
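A small sketch of the five-number summary and the interquartile range, assuming the "median of each half" method for the quartiles (when n is odd, the median itself is left out of both halves). Other textbooks and software use slightly different quartile conventions, so results may differ. The function name `five_number_summary` and the data are illustrative.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]            # values below the median position
    upper_half = data[half + (n % 2):]  # values above the median position
    return (min(data), median(lower_half), median(data),
            median(upper_half), max(data))

data = [2, 4, 5, 7, 8, 10, 12]  # hypothetical data
minimum, q1, q2, q3, maximum = five_number_summary(data)
iqr = q3 - q1                   # interquartile range

print(minimum, q1, q2, q3, maximum, iqr)
```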
Box and whisker diagram
Displays the five-number summary
- each of the four sections (the two whiskers and the two parts of the box) contains about 25% of the data
Outliers
a member of a data set which does not fit with the general pattern of the rest of the data.
To determine if a data value is an outlier
1) find Q1 and Q3
2) find IQR = Q3-Q1
3) find the boundaries Q1 - 1.5(IQR) and Q3 + 1.5(IQR)
an outlier is any value less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR).
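The steps above can be sketched as below, using the usual boundaries Q1 - 1.5(IQR) and Q3 + 1.5(IQR). The function names and data are illustrative, and the quartiles are passed in directly so the example does not depend on any particular quartile method.

```python
def outlier_boundaries(q1, q3):
    """Return the lower and upper boundaries for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values that fall outside the two boundaries."""
    low, high = outlier_boundaries(q1, q3)
    return [v for v in values if v < low or v > high]

data = [2, 4, 5, 7, 8, 10, 30]  # hypothetical data
q1, q3 = 4, 10                  # quartiles assumed already found for this data

low, high = outlier_boundaries(q1, q3)
outliers = find_outliers(data, q1, q3)
print(low, high, outliers)
```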
Frequency tables
show how the observations are distributed across the possible values of a variable
- consist of columns for the quantitative data, the frequency and the cumulative frequency
Class width
the difference between the maximum and the minimum possible values in a class interval.
Range from a grouped frequency table
(upper bound of the highest class interval) - (lower bound of the lowest class interval)
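As a quick illustration of class width and of the range from a grouped table, assuming continuous class intervals stored as (lower bound, upper bound) pairs. The intervals here are made up.

```python
# Hypothetical class intervals: 35 < x <= 40, 40 < x <= 45, 45 < x <= 50
classes = [(35, 40), (40, 45), (45, 50)]

# Class width: upper bound minus lower bound of each interval.
class_widths = [upper - lower for lower, upper in classes]

# Range: upper bound of the highest class minus lower bound of the lowest class.
estimated_range = classes[-1][1] - classes[0][0]

print(class_widths, estimated_range)
```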
Modal class
is the class containing the most data. It has the highest frequency.
Mean for class interval
To find the mean, use the midpoint of each class interval to represent the data values in that class interval.
- ex: (1.20 + 1.30)/2 = 1.25
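A minimal sketch of estimating the mean from a grouped frequency table: each midpoint is multiplied by its frequency, and the total is divided by the number of observations. The class intervals and frequencies are hypothetical.

```python
# Hypothetical grouped data: ((lower bound, upper bound), frequency)
classes = [((1.20, 1.30), 4), ((1.30, 1.40), 6), ((1.40, 1.50), 10)]

total_frequency = sum(freq for _, freq in classes)

# Use the midpoint of each class interval to stand in for its data values.
weighted_sum = sum(((lower + upper) / 2) * freq
                   for (lower, upper), freq in classes)

estimated_mean = weighted_sum / total_frequency
print(round(estimated_mean, 2))
```

The result is only an estimate, because the actual values inside each class interval are unknown.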
Standard deviation
a measure of how far the data values typically deviate from the mean
- symbol: sigma σ
Variance
the average of the squared deviations from the mean (a measure of the spread between numbers in a data set); the standard deviation is its square root.
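A short sketch of variance and standard deviation worked out step by step, assuming the population form (dividing by n, matching the symbol σ). The data are made up so the numbers come out evenly.

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean = sum(data) / len(data)

# Deviation of each value from the mean, squared.
squared_deviations = [(x - mean) ** 2 for x in data]

# Variance: the average of the squared deviations.
variance = sum(squared_deviations) / len(data)

# Standard deviation: the square root of the variance.
std_dev = sqrt(variance)

print(mean, variance, std_dev)
```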
Cumulative frequency graph
plot the cumulative frequencies from the table, then read off Q1, Q2 and Q3 from the graph.
- x-axis: quantitative data
- y-axis: cumulative frequency
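A rough sketch of preparing the points for a cumulative frequency graph, assuming each cumulative frequency is plotted against the upper boundary of its class. The boundaries and frequencies are hypothetical. The quartiles are then read from the graph at one quarter, one half and three quarters of the total frequency on the y-axis.

```python
from itertools import accumulate

upper_boundaries = [40, 45, 50, 55]  # hypothetical upper class boundaries (x-axis)
frequencies = [3, 7, 6, 4]           # hypothetical frequencies

# Running total of the frequencies (y-axis).
cumulative = list(accumulate(frequencies))

# Points to plot: (upper boundary, cumulative frequency).
points = list(zip(upper_boundaries, cumulative))

# Heights on the y-axis at which Q1, Q2 and Q3 are read off the curve.
total = cumulative[-1]
quartile_levels = {"Q1": total / 4, "Q2": total / 2, "Q3": 3 * total / 4}

print(points)
print(quartile_levels)
```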