2. Describing Data with Tables and Graphs Flashcards
Frequency Distribution
A collection of observations produced by sorting observations into classes and showing their frequency (f) of occurrence in each class.
Frequency Distribution for Ungrouped Data
A frequency distribution produced whenever observations are sorted into classes of single values.
Students in a theater arts appreciation class rated the classic film The Wizard of Oz on a 10-point scale, ranging from 1 (poor) to 10 (excellent), as follows:
3 7 2 7 8 3 1 4 10 3 2 5 3 5 8 9 7 6 3 7 8 9 7 3 6
Since the number of possible values is relatively small (only 10), it’s appropriate to construct a frequency distribution for ungrouped data. Do this.
RATING  f
10      1
9       2
8       3
7       5
6       2
5       2
4       1
3       6
2       2
1       1
Total   25
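A minimal Python sketch of the same tally, assuming the 25 ratings are held in a list (the names `ratings` and `table` are illustrative, not from the source). It counts each value and lists every class from 10 down to 1, so a class with zero frequency would still appear.

```python
from collections import Counter

# Ratings of The Wizard of Oz, copied from the card above.
ratings = [3, 7, 2, 7, 8, 3, 1, 4, 10, 3, 2, 5, 3,
           5, 8, 9, 7, 6, 3, 7, 8, 9, 7, 3, 6]

counts = Counter(ratings)

# One class per possible value, listed from highest to lowest.
table = [(value, counts.get(value, 0)) for value in range(10, 0, -1)]

print("RATING  f")
for value, f in table:
    print(f"{value:>6}  {f}")
print(f" Total  {sum(f for _, f in table)}")
```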
Frequency Distribution for Grouped Data
A frequency distribution produced whenever observations are sorted into classes of more than one value.
Unit of Measurement
The smallest possible difference between scores.
What are the 3 essential guidelines for frequency distributions?
- Each observation should be included in one, and only one, class.
- List all classes, even those with zero frequencies.
- All classes should have equal intervals.
What are the 4 optional guidelines for frequency distributions?
- All classes should have both an upper boundary and a lower boundary.
- Select the class interval from convenient numbers, such as 1, 2, 3, … 10, particularly 5 and 10 or multiples of 5 and 10.
- The lower boundary of each class interval should be a multiple of the class interval.
- Aim for a total of approximately 10 classes.
Real Limits
Located at the midpoint of the gap between adjacent tabled boundaries.
The IQ scores for a group of 35 high school dropouts are as follows:
91 85 84 79 80 87 96 75 86 104 95 71 105 90 77 123 80 100 93 108 98 69 99 95 90 110 109 94 100 103 112 90 90 98 89
a) Construct a frequency distribution for grouped data.
b) Specify the real limits for the lowest class interval in this frequency distribution.
a) Calculate the class width:
(123-69)/10 = 54/10 = 5.4
Round off to a convenient number, such as 5.
IQ       f
120-124  1
115-119  0
110-114  2
105-109  3
100-104  4
95-99    6
90-94    7
85-89    4
80-84    3
75-79    3
70-74    1
65-69    1
TOTAL    35
b) 64.5-69.5
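The answer above can be reproduced with a short Python sketch. It assumes a class width of 5 and a unit of measurement of 1 (whole IQ points); the names `WIDTH`, `UNIT`, and `lower_boundary` are illustrative, not from the source. Each score is assigned to the class whose lower boundary is a multiple of the width, and the real limits are placed half a unit beyond the tabled boundaries.

```python
from collections import Counter

iq_scores = [91, 85, 84, 79, 80, 87, 96, 75, 86, 104, 95, 71, 105, 90, 77,
             123, 80, 100, 93, 108, 98, 69, 99, 95, 90, 110, 109, 94, 100,
             103, 112, 90, 90, 98, 89]

WIDTH = 5  # class interval chosen in part a)
UNIT = 1   # assumed unit of measurement: whole IQ points


def lower_boundary(score, width=WIDTH):
    """Lower tabled boundary of the class containing score (a multiple of width)."""
    return (score // width) * width


counts = Counter(lower_boundary(s) for s in iq_scores)
lowest = lower_boundary(min(iq_scores))
highest = lower_boundary(max(iq_scores))

# List every class from highest to lowest, keeping zero-frequency classes.
table = [(low, low + WIDTH - UNIT, counts.get(low, 0))
         for low in range(highest, lowest - 1, -WIDTH)]

for low, high, f in table:
    print(f"{low}-{high}  {f}")

# Real limits sit half a unit below and above the tabled boundaries.
low, high, _ = table[-1]
real_limits = (low - UNIT / 2, high + UNIT / 2)
print(real_limits)  # (64.5, 69.5)
```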
What are some possible poor features of the following frequency distribution?
Estimated weekly TV viewing time (hrs) for 250 sixth graders
VIEWING TIME  f
35-above      2
30-34         5
25-30         29
20-22         60
15-19         60
10-14         34
5-9           31
0-4           29
TOTAL         250
- Not all observations can be assigned to one and only one class (because of the gap between 20-22 and 25-30 and the overlap between 25-30 and 30-34).
- Not all classes are equal in width (25-30 versus 30-34).
- Not all classes have both boundaries (35-above).
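These checks can be automated. The sketch below is one possible way to do it, assuming whole-hour units and representing each class as a (lower, upper) pair with `None` for a missing upper boundary; `check_classes` is a hypothetical helper, not something from the source. It reports every width it finds, so it also picks up the narrow 20-22 class.

```python
# Classes from the table above, lowest first; None marks a missing upper boundary.
classes = [(0, 4), (5, 9), (10, 14), (15, 19),
           (20, 22), (25, 30), (30, 34), (35, None)]
UNIT = 1  # assumed unit of measurement (whole hours)


def check_classes(classes, unit=UNIT):
    """Return a list of problems found in a set of class boundaries."""
    problems = []

    # Every class should have both boundaries.
    for lower, upper in classes:
        if upper is None:
            problems.append(f"{lower}-above has no upper boundary")

    # All closed classes should span the same number of units.
    widths = {upper - lower + unit for lower, upper in classes if upper is not None}
    if len(widths) > 1:
        problems.append(f"unequal class widths: {sorted(widths)}")

    # Adjacent classes should meet exactly: no gap and no overlap.
    for (lo1, hi1), (lo2, _) in zip(classes, classes[1:]):
        if hi1 is None:
            continue
        if lo2 > hi1 + unit:
            problems.append(f"gap between {lo1}-{hi1} and the class starting at {lo2}")
        elif lo2 <= hi1:
            problems.append(f"overlap between {lo1}-{hi1} and the class starting at {lo2}")
    return problems


problems = check_classes(classes)
for problem in problems:
    print(problem)
```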
What are the 9 steps for constructing frequency distributions?
- Find the range, that is, the difference between the largest and smallest observations.
- Find the class interval required to span the range by dividing the range by the desired number of classes (ordinarily 10).
- Round off to the nearest convenient interval.
- Determine where the lowest class should begin (ordinarily, this number should be a multiple of the class interval).
- Determine where the lowest class should end by adding the class interval to the lower boundary then subtracting one unit of measurement.
- Working upward, list as many equivalent classes as are required to include the largest observation.
- Indicate with a tally the class in which each observation falls.
- Replace the tally count for each class with a number - the frequency (f) - and show the total of all frequencies.
- Supply headings for both columns and a title for the table.
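The first six steps can be sketched in Python as follows. This is only an illustration: `plan_classes` is a hypothetical helper, and the `convenient` tuple is an assumed list of convenient intervals rather than a rule from the source. The example call uses the smallest (69) and largest (123) IQ scores from the earlier exercise, since only the extremes matter when planning the classes.

```python
def plan_classes(observations, desired_classes=10, unit=1,
                 convenient=(1, 2, 3, 4, 5, 10, 20, 25, 50, 100)):
    """Choose a class interval and list the class boundaries, lowest class first."""
    largest, smallest = max(observations), min(observations)
    data_range = largest - smallest                      # step 1: find the range
    raw_interval = data_range / desired_classes          # step 2: range / classes
    # Step 3: round off to the nearest convenient interval (assumed candidates).
    interval = min(convenient, key=lambda c: abs(c - raw_interval))
    # Step 4: start the lowest class at a multiple of the interval.
    lower = (smallest // interval) * interval
    classes = []
    while True:
        upper = lower + interval - unit                  # step 5: end of the class
        classes.append((lower, upper))
        if upper >= largest:                             # step 6: stop once covered
            break
        lower += interval
    return interval, classes


interval, classes = plan_classes([69, 123])
print(interval)      # 5
print(classes[0])    # (65, 69)
print(classes[-1])   # (120, 124)
```

Tallying the observations into these classes (steps 7 and 8) and labelling the table (step 9) follow from here.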
Outlier
A very extreme score.
Identify any outliers in each of the following sets of data collected from nine college students.
SUMMER INCOME AGE FAMILY SIZE GPA
$6,450 20 2 2.30
$4,820 19 4 4.00
$5,650 61 3 3.56
$1,720 32 6 2.89
$600 19 18 2.15
$0 22 2 3.01
$3,482 23 6 3.09
$25,700 27 3 3.50
$8,548 21 4 3.20
Outliers are:
- a summer income of $25,700;
- an age of 61;
- and a family size of 18.
No outliers for GPA.
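The card defines an outlier only informally, as a very extreme score. As an illustration, the sketch below swaps in a common convention that is not from the source: flag any value more than 1.5 times the interquartile range beyond the quartiles. The helper name `iqr_outliers` and the multiplier `k=1.5` are assumptions. On this data set the convention happens to flag the same values as the answer above.

```python
from statistics import quantiles

data = {
    "summer income": [6450, 4820, 5650, 1720, 600, 0, 3482, 25700, 8548],
    "age": [20, 19, 61, 32, 19, 22, 23, 27, 21],
    "family size": [2, 4, 3, 6, 18, 2, 6, 3, 4],
    "GPA": [2.30, 4.00, 3.56, 2.89, 2.15, 3.01, 3.09, 3.50, 3.20],
}


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3 (an assumed convention)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


flagged = {name: iqr_outliers(values) for name, values in data.items()}
for name, values in flagged.items():
    print(name, values)
```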
Relative Frequency Distribution
A frequency distribution showing the frequency of each class as a fraction of the total frequency for the entire distribution.
How do you convert a frequency distribution into a relative frequency distribution?
You divide the frequency for each class by the total frequency for the entire distribution.
GRE scores for a group of graduate school applicants are distributed as follows:
GRE      f
725-749  1
700-724  3
675-699  14
650-674  30
625-649  34
600-624  42
575-599  30
550-574  27
525-549  13
500-524  4
475-499  2
TOTAL    200
Convert to a relative frequency distribution. When calculating proportions, round numbers to two digits to the right of the decimal point.
GRE      relative f
725-749  0.01
700-724  0.02
675-699  0.07
650-674  0.15
625-649  0.17
600-624  0.21
575-599  0.15
550-574  0.14
525-549  0.07
500-524  0.02
475-499  0.01
TOTAL    1.02
(The total exceeds 1.00 because of rounding.)
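A short Python sketch of this conversion, with illustrative names (`gre`, `proportion`). It uses the `decimal` module with half-up rounding because the built-in `round` works on binary floats and can round a value such as 0.015 down rather than up.

```python
from decimal import Decimal, ROUND_HALF_UP

# (class, frequency) pairs, highest class first.
gre = [("725-749", 1), ("700-724", 3), ("675-699", 14), ("650-674", 30),
       ("625-649", 34), ("600-624", 42), ("575-599", 30), ("550-574", 27),
       ("525-549", 13), ("500-524", 4), ("475-499", 2)]

total = sum(f for _, f in gre)


def proportion(f, total):
    """Frequency as a proportion of the total, rounded half up to two decimals."""
    exact = Decimal(f) / Decimal(total)
    return exact.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


relative = [(label, proportion(f, total)) for label, f in gre]

for label, p in relative:
    print(label, p)
print("TOTAL", sum(p for _, p in relative))  # 1.02, not 1.00, because of rounding
```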
Cumulative Frequency Distribution
A frequency distribution showing the total number of observations in each class and all lower-ranked classes.
How do you convert a frequency distribution into a cumulative frequency distribution?
Add to the frequency of each class the sum of the frequencies of all classes ranked below it.
GRE      f
725-749  1
700-724  3
675-699  14
650-674  30
625-649  34
600-624  42
575-599  30
550-574  27
525-549  13
500-524  4
475-499  2
TOTAL    200
a) Convert this distribution of GRE scores to a cumulative frequency distribution.
b) Convert the distribution of GRE scores obtained in a) to a cumulative percent frequency distribution.
a)
GRE      cumulative f
725-749  200
700-724  199
675-699  196
650-674  182
625-649  152
600-624  118
575-599  76
550-574  46
525-549  19
500-524  6
475-499  2
b)
GRE      cumulative %
725-749  100
700-724  100
675-699  98
650-674  91
625-649  76
600-624  59
575-599  38
550-574  23
525-549  10
500-524  3
475-499  1
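Both conversions amount to a running total taken from the lowest class upward. A minimal Python sketch, with illustrative names and an assumed half-up rounding of the percents to whole numbers (done in integer arithmetic to avoid float rounding surprises):

```python
from itertools import accumulate

# (class, frequency) pairs, LOWEST class first, so each running total
# covers the class itself plus all lower-ranked classes.
gre = [("475-499", 2), ("500-524", 4), ("525-549", 13), ("550-574", 27),
       ("575-599", 30), ("600-624", 42), ("625-649", 34), ("650-674", 30),
       ("675-699", 14), ("700-724", 3), ("725-749", 1)]

cumulative = list(accumulate(f for _, f in gre))
total = cumulative[-1]


def percent_half_up(part, whole):
    """Whole-number percent of part in whole, with .5 rounded up."""
    return (200 * part + whole) // (2 * whole)


cumulative_pct = [percent_half_up(c, total) for c in cumulative]

# Print highest class first, matching the layout of the tables above.
for (label, _), c, pct in reversed(list(zip(gre, cumulative, cumulative_pct))):
    print(f"{label}  {c:>3}  {pct:>3}")
```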
Percentile Rank of an Observation
Percentage of scores in the entire distribution with equal or smaller values than that score.
Find the approximate percentile rank of any weight in the class 200-209.
Weight f Cumulative f Cumulative %
200-209 2 49 92
The approximate percentile rank for weights between 200 and 209 lbs is 92 (because 92 is the cumulative percent for this interval).
Movie ratings reflect ordinal measurement because they can be ordered from most to least restrictive: NC-17, R, PG-13, PG, and G. The ratings of some films shown recently in San Francisco are as follows:
PG PG PG PG-13 G
G PG-13 R PG PG
R PG R PG R
NC-17 NC-17 PG G PG-13
a) Construct a frequency distribution.
b) Convert to relative frequencies, expressed as percentages.
c) Construct a cumulative frequency distribution.
d) Find the approximate percentile rank for those films with a PG rating.
Rating  f   %    Cumulative f
NC-17   2   10   20
R       4   20   18
PG-13   3   15   14
PG      8   40   11
G       3   15   3
TOTAL   20  100
Percentile rank for films with a PG rating is 55 (from 11/20 multiplied by 100).
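The percentile rank in part d) can be checked with a small Python sketch. The ordering list `ORDER` (least to most restrictive) and the helper `percentile_rank` are illustrative names, not from the source; the function counts the films rated at or below the given rating and expresses that count as a percent of all films.

```python
from collections import Counter

ORDER = ["G", "PG", "PG-13", "R", "NC-17"]  # least to most restrictive

films = ["PG", "PG", "PG", "PG-13", "G",
         "G", "PG-13", "R", "PG", "PG",
         "R", "PG", "R", "PG", "R",
         "NC-17", "NC-17", "PG", "G", "PG-13"]

counts = Counter(films)


def percentile_rank(rating):
    """Percent of films rated at or below `rating` in ORDER."""
    at_or_below = sum(counts[r] for r in ORDER[:ORDER.index(rating) + 1])
    return 100 * at_or_below / len(films)


print(percentile_rank("PG"))  # 55.0
```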
Histogram
A bar-type graph for quantitative data. The common boundaries between adjacent bars emphasize the continuity of the data, as with continuous variables.
Frequency Polygon
A line graph for quantitative data that also emphasizes the continuity of continuous variables.
Stem and Leaf Display
A device for sorting quantitative data on the basis of leading and trailing digits.
Construct a stem and leaf display for the following IQ scores obtained from a group of four-year-old children.
120 98 118 117 99 111
126 85 88 124 104 113
108 141 123 137 78 96
102 132 109 106 143
7  | 8
8  | 5 8
9  | 8 9 6
10 | 8 2 9 6 4
11 | 8 7 1 3
12 | 0 6 3 4
13 | 2 7
14 | 1 3
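A minimal Python sketch of the same display, assuming the tens (and hundreds) digits form the stem and the units digit forms the leaf. The helper name `stem_and_leaf` is illustrative. Unlike the hand-built answer, this version sorts the leaves within each stem.

```python
from collections import defaultdict

iq_scores = [120, 98, 118, 117, 99, 111,
             126, 85, 88, 124, 104, 113,
             108, 141, 123, 137, 78, 96,
             102, 132, 109, 106, 143]


def stem_and_leaf(scores):
    """Map each stem (leading digits) to its sorted list of leaves (units digit)."""
    leaves = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        leaves[stem].append(leaf)
    # List every stem between the smallest and largest, even if it has no leaves.
    return {stem: leaves.get(stem, [])
            for stem in range(min(leaves), max(leaves) + 1)}


display = stem_and_leaf(iq_scores)
for stem, stem_leaves in display.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in stem_leaves)}")
```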
Positively Skewed Distribution
A distribution that includes a few extreme observations in the positive direction (to the right of the majority of observations).
Negatively Skewed Distribution
A distribution that includes a few extreme observations in the negative direction (to the left of the majority of observations).
Describe the probable shape of the following distribution:
female beauty contestants’ scores on a masculinity test, with a higher score indicating a greater degree of masculinity
Positively Skewed
Describe the probable shape of the following distribution:
scores on a standardized IQ test for a group of people selected from the general population
Normal
Describe the probable shape of the following distribution:
test scores for a group of high school students on a very difficult college-level math exam
Positively Skewed
Describe the probable shape of the following distribution:
reading achievement scores for a third-grade class consisting of about equal numbers of regular students and learning-challenged students
Bimodal
Describe the probable shape of the following distribution:
scores of students at the Eastman School of Music on a test of music aptitude (designed for use with the general population)
Negatively Skewed
Bar Graph
A bar-type graph for qualitative data. Gaps between adjacent bars emphasize the discontinuous nature of the data.
What are the 7 steps for constructing graphs?
- Decide on the appropriate type of graph.
- Draw the horizontal axis, then the vertical axis.
- Identify the string of class intervals that eventually will be superimposed on the horizontal axis.
- Superimpose the string of class intervals (with gaps for bar graphs) along the entire length of the horizontal axis.
- Along the entire length of the vertical axis, superimpose a progression of convenient numbers.
- Construct bars (or dots and lines) to reflect the frequency of observations within each class interval.
- Supply labels for both axes and a title (or even an explanatory sentence) for the graph.