AP Stat Ch 2 Flashcards
Strategy for exploring data:
Always plot data: make a graph (histogram or stemplot)
Look for overall pattern (shape, center, spread) and for outliers
Calculate a numerical summary to briefly describe center and spread
Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve. The curve is a mathematical model for the distribution. It is an idealized description that gives an overall pattern of the data but ignores minor irregularities as well as any outliers.
Empirical Rule
In a normal distribution, about 68% of the data is within 1 standard deviation of the mean, about 95% is within 2 standard deviations, and about 99.7% is within 3 standard deviations.
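Quick check of the rule: a minimal sketch using Python's standard-library `statistics.NormalDist`. The helper name `within_k_sd` is made up for this card, and the rule itself is a rounded approximation of these values.

```python
from statistics import NormalDist

def within_k_sd(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

# Print the percent of the distribution within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 1))
```

The result does not depend on the mean or standard deviation chosen; any normal distribution gives the same three proportions.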
How value of standard deviation affects bell curve
The larger the standard deviation, the wider and flatter the curve.
A bell curve with s = 1 is much taller and narrower than one with s = 3.
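To see the "taller and narrower" claim in numbers, here is a small sketch comparing the peak heights of two normal curves with the same mean. The mean of 0 is an arbitrary choice for illustration.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)  # s = 1
wide = NormalDist(mu=0, sigma=3)    # s = 3

# A normal curve is tallest at its mean.
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

print(peak_narrow, peak_wide, peak_narrow / peak_wide)
```

The area under each curve is 1, so tripling the standard deviation spreads that area out and cuts the peak height to one third.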
Percentile
One way to describe performance, or location in a distribution, is to use percentiles.
The pth percentile of a distribution is the value with p percent of the observations less than it.
Example of percentile:
Jenny’s score is 22nd from the bottom out of 25 scores. What percentile is her score?
There are 21 values less than Jenny’s: 21/25 = 0.84, so she is at the 84th percentile.
(N - 1) / total, where N is the score's position counting up from the lowest.
Another example of percentile: Katie got the highest score in the class on the exam. There are 25 people in the class.
24 scores less than Katie.
24/25 = 0.96, so she is at the 96th percentile.
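Both percentile examples can be reproduced with a short function. This is a sketch using the "percent of observations less than it" definition from these cards; the function name and the list of 25 scores are made up for illustration, since the cards do not give the actual scores.

```python
def percentile_of(score, scores):
    """Percent of observations strictly less than `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical class of 25 distinct scores: 61, 62, ..., 85
scores = list(range(61, 86))

jenny = sorted(scores)[21]  # 21 scores lie below this one
katie = max(scores)         # highest score, so 24 scores lie below it

print(percentile_of(jenny, scores))  # 84.0
print(percentile_of(katie, scores))  # 96.0
```

With this definition the top score in a class of 25 is at the 96th percentile, not the 100th, because a score is never counted as less than itself.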
Relative cumulative frequency
Instead of wanting to know which percent of the data falls into a particular class, we often want to know which percent falls below a certain value. To make this possible, we will compute the relative cumulative frequency for each class, which is the sum of the relative frequency of that class and all the classes below it.
Add up the relative frequencies of the class and all the classes below it. This sum is the percent of the data that falls at or below the top of that class.
Example of relative cumulative frequency:
Say that 2 presidents were inaugurated at ages 40-44, 7 at ages 45-49, and 13 at ages 50-54, …
44 total presidents
What’s the RCF in the 40-44, 45-49, 50-54 groups?
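40-44: 2/44 presidents = 4.5%, so rel f = 4.5%. It is the lowest class, so RCF = 4.5% too.
45-49: 7/44 presidents, so rel f = 15.9%. RCF = (2 + 7)/44 = 20.5%. (Adding the rounded values 4.5 + 15.9 gives 20.4; working from the counts avoids the rounding error.) 20.5% of the presidents were 49 or younger when inaugurated.
50-54: 13/44 presidents, so rel f = 29.5%. RCF = 20.5 + 29.5 = 50%. 50% of the presidents were 54 or younger when inaugurated.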
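The same table can be built in a few lines. This sketch uses only the three classes given on the card (the rest of the table is elided there) and the stated total of 44; the variable names are made up for illustration.

```python
from itertools import accumulate

total = 44  # total presidents, from the card
# Only the three classes given on the card; the remaining classes are not listed there.
counts = {"40-44": 2, "45-49": 7, "50-54": 13}

# Relative frequency: class count as a percent of the total.
rel_freq = {k: 100 * v / total for k, v in counts.items()}

# Relative cumulative frequency: running count as a percent of the total.
cum_counts = list(accumulate(counts.values()))
rcf = {k: 100 * c / total for k, c in zip(counts, cum_counts)}

for k in counts:
    print(f"{k}: rel f = {rel_freq[k]:.1f}%, RCF = {rcf[k]:.1f}%")
```

Accumulating the counts first and dividing once keeps rounding error out of the running total.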
Ogive
The graph of a cumulative relative frequency distribution is called an ogive.
How to graph an ogive
Label and scale your axes and title your graph.
The y-axis is RCF and the x-axis is the variable (for the presidents, age at inauguration).
Begin the ogive at a height of 0% at the left endpoint of the lowest class interval.
Plot a point for the RCF of each class interval at the left endpoint of the next class interval. For example, 4.5% of presidents were in the 40-44 class, so plot a point at (45, 4.5%) to show that 4.5% of presidents were inaugurated before they were 45 years old.
The last point should be at a height of 100%.
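The plotting rule can be sketched as a function that returns the points to draw. The name `ogive_points` is made up, and only the three classes given in the presidents example are used, so this partial list stops at 50% rather than 100%.

```python
def ogive_points(left_endpoints, width, counts, total):
    """Return (x, RCF%) points for an ogive.

    Starts at 0% at the lowest left endpoint, then puts each class's RCF
    at the left endpoint of the next class.
    """
    points = [(left_endpoints[0], 0.0)]
    running = 0
    for left, count in zip(left_endpoints, counts):
        running += count
        points.append((left + width, 100 * running / total))
    return points

# Classes 40-44, 45-49, 50-54 with counts 2, 7, 13 out of 44 presidents.
pts = ogive_points([40, 45, 50], 5, [2, 7, 13], 44)
for x, y in pts:
    print(x, round(y, 1))
```

Connecting these points with line segments gives the ogive; with the full table of classes the last point would sit at 100%.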
Standardizing
Another way to describe position is to tell how many standard deviations above or below the mean it is. Converting scores from original values to standard deviation units is known as standardizing.
Why do we standardize?
To allow us to compare values from different approximately normal distributions.
Every set of data has a different set of values. For example, heights of people might range from 18 inches to 8 feet, and weights can range from one pound to 500 pounds. Those different scales make it difficult to compare data, so we standardize, putting every distribution on a common scale with a mean of zero and a standard deviation of one.
Standardized score / z-score
A z-score tells us how many standard deviations an observation is from the mean, and in which direction.
How to calculate z score
z = (x - x̄) / s
Value minus mean, divided by standard deviation.
Example of z score:
Jenny got an 86 on her stat test. The class mean is 80 and s = 6.07.
She got an 82 on her physics test. The class mean is 76 and s = 4.
On which test did she perform better relative to her class?
z(stat) = (86 - 80) / 6.07 = 0.99
z(phys) = (82 - 76) / 4 = 1.5
Her z-score is higher for physics, so she did better relative to her class on the physics test.
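The comparison can be checked with a one-line function. A minimal sketch; the function and variable names are made up for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_stat = z_score(86, 80, 6.07)  # stat test
z_phys = z_score(82, 76, 4)     # physics test

print(round(z_stat, 2), round(z_phys, 2))  # 0.99 1.5
```

Both scores are 6 points above the class mean, but the physics scores are less spread out, so the same 6 points is worth more standard deviations there.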