Week 2 Chapter 2 Flashcards
frequency distribution
All of these:
1) is an organized tabulation showing the number of individuals located in each category on the scale of measurement.
2) allows the researcher to see “at a glance” the entire set of scores. It shows whether the scores are generally high or low, whether they are concentrated in one area or spread out across the entire scale, and generally provides an organized picture of the data.
3) allows you to see the location of any individual score relative to all of the other scores in the set.
A frequency distribution can be structured either as a
table or as a graph, but in either case, the distribution presents the same two elements:
- The set of categories that make up the original measurement scale
- A record of the frequency, or number of individuals in each category
a frequency distribution presents
a picture of how the individual scores are distributed on the measurement scale—hence the name frequency distribution
It is customary to list categories from highest to lowest, but this is an
arbitrary arrangement. Many computer programs list categories from lowest to highest.
It is customary to use an X as the column heading for the scores and an
f as the column heading for the frequencies.
With an ordinal, interval, or ratio scale, the categories are listed
in order (usually highest to lowest). For a nominal scale, the categories can be listed in any order.
Notice that the X values in a frequency distribution table represent the scale of measurement, not the
actual set of scores. For example, the X column lists the value 10 only one time, but the frequency column indicates that there are actually two values of X = 10. Also, the X column can list a value for which the frequency column indicates that no one actually had that score.
the frequencies can be used to find the total number of scores in the distribution. By adding up the frequencies, you obtain
the total number of individuals
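The idea of building the table and recovering N from the frequencies can be sketched in Python. This is a minimal illustration, assuming whole-number scores; the function name frequency_table and the sample scores are made up for the example and do not come from the textbook.

```python
from collections import Counter

def frequency_table(scores):
    """Return (X, f) rows from highest to lowest X.

    Every value on the scale between the lowest and highest score is
    listed, so a value that nobody obtained appears with f = 0.
    """
    counts = Counter(scores)
    high, low = max(scores), min(scores)
    return [(x, counts[x]) for x in range(high, low - 1, -1)]

# Hypothetical data: 10 whole-number quiz scores.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]
table = frequency_table(scores)

# Adding up the f column gives the total number of individuals, N.
N = sum(f for _, f in table)

for x, f in table:
    print(f"{x:>3} {f:>3}")
print("N =", N)
```

Note that X = 5 is listed with f = 0: the X column shows the scale of measurement, not the raw scores.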
Proportion measures the fraction of the
total group that is associated with each score. Because proportions describe the frequency (f) in relation to the total number (N), they often are called relative frequencies.
Although proportions can be expressed as fractions, they more commonly appear as
decimals
To compute the percentage associated with each score, you first
find the proportion (p) and then multiply by 100
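As a rough sketch, the proportion p = f/N and the percentage p × 100 can be added as extra columns to a frequency table. The function name and the sample scores below are assumptions made for illustration.

```python
from collections import Counter

def proportions_and_percentages(scores):
    """Return rows of (X, f, p, percent), listed from highest X to lowest."""
    n = len(scores)                      # N, the total number of scores
    counts = Counter(scores)
    rows = []
    for x in sorted(counts, reverse=True):
        f = counts[x]
        p = f / n                        # proportion (relative frequency)
        percent = 100 * f / n            # same as p * 100
        rows.append((x, f, p, percent))
    return rows

# Hypothetical data: N = 10 scores.
rows = proportions_and_percentages([1, 2, 2, 3, 3, 3, 3, 4, 4, 5])
for x, f, p, percent in rows:
    print(f"X={x}  f={f}  p={p:.2f}  %={percent:.0f}%")
```

The proportions in the p column should add up to 1 and the percentages to 100, apart from rounding.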
When the scores are whole numbers, the total number of rows for a regular table can be obtained by
finding the difference between the highest and the lowest scores and adding 1
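A one-line check of this rule, written as a small sketch; the function name and the sample scores are hypothetical.

```python
def regular_table_rows(scores):
    """Rows needed for a regular table of whole-number scores: high - low + 1."""
    return max(scores) - min(scores) + 1

# Hypothetical scores ranging from 4 to 10.
rows_needed = regular_table_rows([8, 9, 8, 7, 10, 9, 6, 4, 9, 8])
print(rows_needed)  # 10 - 4 + 1 = 7
```

When this count is much larger than about 10 to 20 rows, a grouped table is usually the easier one to read.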
Several guidelines help you construct a grouped frequency distribution table. Note that these are
simply guidelines, rather than absolute requirements, but they do help produce a simple, well-organized, and easily understood table.
Guideline 1
The grouped frequency distribution table should have about 10 class intervals. If a table has many more than 10 intervals, it becomes cumbersome and defeats the purpose of a frequency distribution table. On the other hand, if you have too few intervals, you begin to lose information about the distribution of the scores. At the extreme, with only one interval, the table would not tell you anything about how the scores are distributed. Remember that the purpose of a frequency distribution is to help a researcher see the data. With too few or too many intervals, the table will not provide a clear picture. You should note that 10 intervals is a general guide. If you are constructing a table on a blackboard, for example, you probably want only 5 or 6 intervals. If the table is to be printed in a scientific report, you may want 12 or 15 intervals. In each case, your goal is to present a table that is relatively easy to see and understand.
Guideline 2
The width of each interval should be a relatively simple number. For example, 2, 5, 10, or 20 would be a good choice for the interval width. Notice that it is easy to count by 5s or 10s. These numbers are easy to understand and make it possible for someone to see quickly how you have divided the range of scores into class intervals.
Guideline 3
The bottom score in each class interval should be a multiple of the width. If you are using a width of 10 points, for example, the intervals should start with 10, 20, 30, 40, and so on. Again, this makes it easier for someone to understand how the table has been constructed.
Guideline 4
All intervals should be the same width. They should cover the range of scores completely with no gaps and no overlaps, so that any particular score belongs in exactly one interval.
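The four guidelines can be tried out with a short Python sketch. It assumes whole-number scores and a width chosen by the reader; the function name grouped_frequency_table and the data are invented for the example.

```python
def grouped_frequency_table(scores, width):
    """Return (low, high, f) rows from the highest interval to the lowest.

    Assumes whole-number scores. Every interval has the same width
    (Guideline 4) and starts at a multiple of the width (Guideline 3).
    """
    bottom = (min(scores) // width) * width   # bottom score of lowest interval
    top = (max(scores) // width) * width      # bottom score of highest interval
    rows = []
    for low in range(top, bottom - 1, -width):
        high = low + width - 1                # e.g. 50-54 when width = 5
        f = sum(1 for x in scores if low <= x <= high)
        rows.append((low, high, f))
    return rows

# Hypothetical exam scores ranging from 53 to 94.
scores = [82, 75, 88, 93, 53, 84, 87, 58, 72, 94, 69, 84, 61, 91, 64]
table = grouped_frequency_table(scores, width=5)   # simple width (Guideline 2)

for low, high, f in table:
    print(f"{low}-{high}: {f}")
print("intervals:", len(table))                    # about 10 (Guideline 1)
```

Because the intervals do not overlap and leave no gaps, each score is counted exactly once, so the frequencies still add up to N.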
you should note that after the scores have been placed in a grouped table, you lose information about
the specific value for any individual score. In general, the wider the class intervals are, the more information is lost.
when a continuous variable is measured, the resulting measurements correspond to
intervals on the number line rather than single points. If you are measuring time in seconds, for example, a score of X = 8 actually represents an interval bounded by the real limits 7.5 seconds and 8.5 seconds. Thus, a frequency distribution table showing a frequency of f = 3 individuals all assigned a score of X = 8 does not mean that all three individuals had exactly the same measurement. Instead, you should realize that the three measurements are simply located in the same interval between 7.5 and 8.5.
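A small sketch of the real-limits idea, assuming scores are recorded to the nearest whole unit (the unit parameter and the function name are assumptions added for illustration):

```python
def real_limits(score, unit=1):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return (score - half, score + half)

lower, upper = real_limits(8)   # time measured to the nearest second
print(lower, upper)             # 7.5 8.5
```

Any measurement falling between the two limits is recorded as the same score, which is why f = 3 at X = 8 only says that three measurements fell in that interval.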
To construct a histogram,
you first list the scores (measurement categories), equally spaced along the X-axis. Then you draw a bar above each X value so that:
a. the height of the bar corresponds to the frequency for that category.
b. for continuous variables, the width of the bar extends to the real limits of the category. For discrete variables, each bar extends exactly half the distance to the adjacent category on each side.
For both continuous and discrete variables, each bar in a histogram extends to the midpoint between adjacent categories. As a result, adjacent bars touch and there are no spaces or gaps between bars.
To construct a polygon,
you begin by listing the scores (measurement categories), equally spaced along the X-axis. Then,
a) a dot is centered above each score so that the vertical position of the dot corresponds to the frequency for the category.
b) a continuous line is drawn from dot to dot to connect the series of dots.
c) the graph is completed by drawing a line down to the X-axis (zero frequency) at each end of the range of scores. The final lines are usually drawn so that they reach the X-axis at a point that is one category below the lowest score on the left side and one category above the highest score on the right side.
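Steps a) through c) amount to building a list of (X, f) points with a zero-frequency point added at each end. The sketch below assumes equally spaced numerical categories; the function name and the sample table are hypothetical.

```python
def polygon_points(table):
    """Return the points to connect, left to right, for a frequency polygon.

    `table` is a list of (X, f) pairs for equally spaced categories.
    """
    rows = sorted(table)                          # lowest X first
    xs = [x for x, _ in rows]
    step = xs[1] - xs[0] if len(xs) > 1 else 1    # distance between categories
    left_end = (xs[0] - step, 0)                  # one category below the lowest score
    right_end = (xs[-1] + step, 0)                # one category above the highest score
    return [left_end] + rows + [right_end]

# Hypothetical frequency table listed from highest X to lowest.
points = polygon_points([(3, 3), (2, 4), (1, 2)])
print(points)
```

Passing these points to any line-plotting tool, in order, draws the polygon with both ends resting on the X-axis.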
When the scores are measured on a nominal or ordinal scale (usually non-numerical values), the frequency distribution can be displayed in a
bar graph
bar graph
is essentially the same as a histogram, except that spaces are left between adjacent bars. For a nominal scale, the space between bars emphasizes that the scale consists of separate, distinct categories. For ordinal scales, separate bars are used because you cannot assume that the categories are all the same size.
To construct a bar graph,
list the categories of measurement along the X-axis and then draw a bar above each category so that the height of the bar corresponds to the frequency for the category.
Although you usually cannot find the absolute frequency for each score in a population, you very often can obtain
relative frequencies. For example, no one knows the exact number of male and female human beings living in the United States because the exact numbers keep changing. However, based on past census data and general trends, we can estimate that the two numbers are very close, with women slightly outnumbering men. You can represent these relative frequencies in a bar graph by making the bar above Female slightly taller than the bar above Male. Notice that the graph does not show the absolute number of people. Instead, it shows the relative number of females and males.
Although it is still possible to construct graphs showing frequency distributions for extremely large populations, the graphs usually involve two special features:
relative frequencies and smooth curves.
When a population consists of numerical scores from an interval or a ratio scale, it is customary to draw the distribution with
a smooth curve instead of the jagged, step-wise shapes that occur with histograms and polygons.
The smooth curve indicates that you are not connecting a series of dots (real frequencies) but instead are
showing the relative changes that occur from one score to the next.
One commonly occurring population distribution is the
normal curve
The word normal refers to a
specific shape that can be precisely defined by an equation.
Less precisely, we can describe a normal distribution as being
symmetrical, with the greatest frequency in the middle and relatively smaller frequencies as you move toward either extreme.
Whenever the term distribution appears, you should conjure up an image of a
frequency distribution graph
There are three characteristics that completely describe any distribution:
shape, central tendency, and variability
central tendency
measures where the center of the distribution is located
variability
measures the degree to which the scores are spread over a wide range or are clustered together
the shape of a distribution is defined by
an equation that prescribes the exact relationship between each X and Y value on the graph.
Nearly all distributions can be classified as being either
symmetrical or skewed.
In a symmetrical distribution, it is possible to draw
a vertical line through the middle so that one side of the distribution is a mirror image of the other
In a skewed distribution, the scores tend to
pile up toward one end of the scale and taper off gradually at the other end
tail of the distribution
The section where the scores taper off toward one end of a distribution
positively skewed
A skewed distribution with the tail on the right-hand side is positively skewed because the tail points toward the positive (above-zero) end of the X-axis.
negatively skewed
If the tail points to the left, the distribution is negatively skewed
Not all distributions are perfectly symmetrical or obviously skewed in one direction. Therefore, it is common to
modify these descriptions of shape with phrases like “roughly symmetrical” or “tends to be positively skewed.” The goal is to provide a general idea of the appearance of the distribution.
Which of the following measuring scales are displayed by frequency distribution polygons?
Either interval or ratio scales
A group of quiz scores is shown in a histogram. If the bars in the histogram gradually decrease in height from left to right, what can you conclude about the set of quiz scores?
There are more low scores than there are high scores.
Instead of showing the actual number of individuals in each category, a population frequency distribution graph usually shows
relative frequency
One descriptive technique is to place the data in a frequency distribution table or graph that shows
exactly how many individuals (or scores) are located in each category on the scale of measurement.
A frequency distribution graph lists scores on the
horizontal axis and frequencies on the vertical axis.
For interval or ratio scales, you should use
a histogram or a polygon
Bar graphs are similar to histograms except
that gaps are left between adjacent bars.
When constructing or working with a grouped frequency distribution table, a common mistake is to calculate the interval width by using the highest and lowest values that define each interval. For example, some students are tricked into thinking that an interval identified as 20–24 is only 4 points wide. To determine the correct interval width, you can
a) Count the individual scores in the interval. For this example, the scores are 20, 21, 22, 23, and 24 for a total of 5 values. Thus, the interval width is 5 points.
b) Use the real limits to determine the real width of the interval. For example, an interval identified as 20–24 has a lower real limit of 19.5 and an upper real limit of 24.5 (halfway to the next score). Using the real limits, the interval width is 24.5 - 19.5 = 5 points.
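Both methods can be checked with a short sketch, assuming whole-number scores; the two function names are made up for this example.

```python
def width_by_counting(low, high):
    """Method a): count the individual whole-number scores in the interval."""
    return len(range(low, high + 1))

def width_by_real_limits(low, high, unit=1):
    """Method b): upper real limit minus lower real limit."""
    return (high + unit / 2) - (low - unit / 2)

counted = width_by_counting(20, 24)        # 20, 21, 22, 23, 24
by_limits = width_by_real_limits(20, 24)   # 24.5 - 19.5
naive = 24 - 20                            # the common mistake: gives 4

print(counted, by_limits, naive)
```

The two correct methods agree on a width of 5, while simply subtracting the apparent limits is off by one.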
What information can you obtain about the scores in a regular frequency distribution table that is not available from a grouped table?
The exact frequency for each category on the scale of measurement