Week 2 Chapter 3 Flashcards
mean
sum of the scores divided by the number of scores
population mean
formula wherein all scores in the population are added, and then divided by N
sample mean
formula wherein all scores in the sample are added, and then divided by n (lowercase n signifies that the scores are a subset of the population)
weighted mean
overall mean for a combined group, found by adding the sums of the scores (ΣX) from each set and dividing by the total number of scores
central tendency
All of these:
1) statistical measure to determine a single score that defines the center of a distribution
2) the concept of an average or representative score
median
midpoint in a list of scores listed in order from smallest to largest. More specifically, the median is the point on the measurement scale below which 50% of the scores in the distribution are located. The median as the midpoint of a distribution means that the scores are being divided into two equal-sized groups. We are not locating the midpoint between the highest and lowest X values.
mode
score or category that has the greatest frequency in a frequency distribution. Its common usage means “the customary fashion” or “a popular style.”
bimodal
distribution with two scores with greatest frequency
multimodal
a distribution with more than two scores with greatest frequency
major mode
taller peak when two scores with greatest frequency have unequal frequencies
minor mode
shorter peak when two scores with greatest frequency have unequal frequencies
line graph
diagram used when values on horizontal axis are measured on an interval or ratio scale
In addition to describing an entire distribution, measures of central tendency are also useful for making comparisons between
groups of individuals or between sets of data.
number crunching
take a distribution consisting of many scores and crunch them down to a single value that describes them all
Unfortunately, there is no single, standard procedure for determining central tendency. The problem is that no single measure
produces a central, representative value in every situation.
If the two samples are combined, what is the mean for the total group?
To calculate the overall mean, we need two values:
- the overall sum of the scores for the combined group (ΣX), and
- the total number of scores in the combined group (n).
Note that the overall mean is not halfway between the original two sample means. Because the samples are
not the same size, one makes a larger contribution to the total group and therefore carries more weight in determining the overall mean. For this reason, the overall mean we have calculated is called the weighted mean.
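A minimal Python sketch of the weighted-mean calculation described above. The sample sizes and means (n = 12 with a mean of 6, and n = 8 with a mean of 7) are illustrative values assumed here, not taken from the flashcards.

```python
# Illustrative (assumed) values for two samples of unequal size.
n1, mean1 = 12, 6.0
n2, mean2 = 8, 7.0

# Recover each sample's sum of scores: sum = n * mean.
sum1 = n1 * mean1
sum2 = n2 * mean2

# Weighted mean = combined sum of scores / combined number of scores.
weighted_mean = (sum1 + sum2) / (n1 + n2)

# Halfway point between the two sample means, for comparison only.
midpoint = (mean1 + mean2) / 2

print(weighted_mean)  # 6.4
print(midpoint)       # 6.5
```

The larger sample (n = 12) pulls the overall mean toward its own mean of 6, so the result falls below the halfway point of 6.5.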
changing a single score in the sample has produced a new mean. You should recognize that changing any score also changes
the value of ΣX (the sum of the scores), and thus always changes the value of the mean.
Adding a new score to a distribution, or removing an existing score, will usually change the mean. The exception is
when the new score (or the removed score) is exactly equal to the mean.
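A short Python sketch of this rule, using a hypothetical four-score sample (the values are assumptions chosen so that the mean is a whole number):

```python
def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

scores = [4, 5, 6, 9]                  # hypothetical sample
original = mean(scores)                # 24 / 4

added_at_mean = mean(scores + [6])     # new score equals the mean
added_above = mean(scores + [11])      # new score is above the mean
removed_high = mean([4, 5, 6])         # the score 9 has been removed

print(original)       # 6.0
print(added_at_mean)  # 6.0 (unchanged)
print(added_above)    # 7.0 (pulled upward)
print(removed_high)   # 5.0 (pulled downward)
```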
It is easy to visualize the effect of adding or removing a score if you remember that the mean is defined as the
balance point for the distribution.
Adding a score (or removing a score) has the same effect on the mean
whether the original set of scores is a sample or a population.
If a constant value is added to every score in a distribution, the same constant
will be added to the mean. Similarly, if you subtract a constant from every score, the same constant will be subtracted from the mean.
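As a quick check of this property, here is a small Python sketch. The scores and the constant are hypothetical values picked for illustration.

```python
scores = [4, 5, 6, 9]     # hypothetical scores
constant = 3              # hypothetical constant added to every score

shifted = [x + constant for x in scores]

mean_before = sum(scores) / len(scores)
mean_after = sum(shifted) / len(shifted)

print(mean_before)               # 6.0
print(mean_after)                # 9.0
print(mean_after - mean_before)  # 3.0, the same constant
```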
If every score in a distribution is multiplied by (or divided by) a constant value, the mean will
change in the same way.
Multiplying (or dividing) each score by a constant value is a common method for changing the
unit of measurement. Although the numerical values for the individual scores and the sample mean have changed, the actual measurements are not changed.
To change a set of measurements from minutes to seconds, for example, you multiply by 60; to change from inches to feet, you
divide by 12
One common task for researchers is converting measurements into
metric units to conform to international standards.
1 inch equals
2.54 centimeters.
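The unit conversion can be sketched in Python as follows. The heights are made-up values; only the factor of 2.54 centimeters per inch comes from the card above.

```python
CM_PER_INCH = 2.54

heights_in = [60, 64, 68, 72]                       # hypothetical heights in inches
heights_cm = [h * CM_PER_INCH for h in heights_in]  # same heights in centimeters

mean_in = sum(heights_in) / len(heights_in)
mean_cm = sum(heights_cm) / len(heights_cm)

# The mean changes by the same factor as every individual score.
print(mean_in)
print(round(mean_cm, 2))
print(round(mean_in * CM_PER_INCH, 2))
```

Multiplying the mean in inches by 2.54 gives the same result as converting every score first and then averaging.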
A sample has a mean of . If one person with a score of is removed from the sample, what effect will it have on the sample mean?
The sample mean will increase
the definition and the computations for the median are identical for
a sample and for a population.
To find the median,
list the scores in order from smallest to largest. Begin with the smallest score and count the scores as you move up the list. The median is the first point you reach that is greater than 50% of the scores in the distribution. The median can be equal to a score in the list or it can be a point between two scores. Notice that the median is not algebraically defined (there is no equation for computing the median), which means that there is a degree of subjectivity in determining the exact value.
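A minimal Python sketch of this counting procedure. It assumes the common convention that, with an even number of scores, the median is taken as the point halfway between the two middle scores. The function name and the example scores are hypothetical.

```python
def simple_median(scores):
    """Median by ordering the scores and locating the middle position."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the middle score itself is the median.
        return ordered[middle]
    # Even n: take the point halfway between the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(simple_median([3, 5, 8, 10, 11]))    # 8
print(simple_median([1, 1, 4, 5, 7, 8]))   # 4.5
```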
the concept of a balance point focuses on
distances rather than scores
it is possible to have a distribution in which the vast majority of the scores
are located on one side of the mean
the mean does not necessarily divide the scores into
two equal groups. In this example, 5 out of the 6 scores have values less than the mean.
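The Python sketch below illustrates both ideas with an assumed set of six scores. The data are not taken from the original example; they were chosen so that 5 of the 6 scores fall below the mean.

```python
scores = [2, 2, 2, 3, 3, 12]    # assumed scores, chosen for illustration

mean = sum(scores) / len(scores)

# Signed distance of each score from the mean.
distances = [x - mean for x in scores]

below = sum(1 for x in scores if x < mean)
above = sum(1 for x in scores if x > mean)

print(mean)            # 4.0
print(below, above)    # 5 1
print(sum(distances))  # 0.0
```

The counts on each side are unequal, yet the distances below the mean (2 + 2 + 2 + 1 + 1 = 8) exactly balance the single distance above it (8).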
The median defines the middle of the distribution in terms of
scores. The median is located so that half of the scores are on one side and half are on the other side.
the mean and the median are both methods for defining and measuring
central tendency. It is important to point out that although they both define the middle of the distribution, they use different definitions of the term middle.
To find the precise median,
we calculate the precise median for a continuous variable as follows:
1) Determine the real limits of the interval that contains the precise midpoint
2) Count the number of scores below the identified interval.
3) Find the number of additional scores needed to reach exactly 50%
4) Calculate a fraction = (number of additional scores needed) / (total number of scores in the interval)
5) Add the fraction to the lower real limit of the interval
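The five steps above can be sketched in Python. This is an illustration under one stated assumption: the scores are whole numbers measuring a continuous variable, so each score X occupies the interval with real limits X - 0.5 and X + 0.5. The function name and the example scores are hypothetical.

```python
from collections import Counter

def precise_median(scores):
    """Interpolated median, assuming whole-number scores with real limits X +/- 0.5."""
    if not scores:
        raise ValueError("at least one score is required")
    half = len(scores) / 2            # number of scores needed to reach 50%
    counts = Counter(scores)
    below = 0                         # step 2: scores below the current interval
    for value in sorted(counts):
        in_interval = counts[value]
        if below + in_interval >= half:
            # Step 1: this interval contains the 50% point.
            needed = half - below             # step 3
            fraction = needed / in_interval   # step 4
            return (value - 0.5) + fraction   # step 5: add to lower real limit
        below += in_interval

print(precise_median([1, 2, 3, 4, 4, 4, 4, 6]))  # 3.75
```

In the example, three scores lie below the interval 3.5 to 4.5, one more score is needed to reach 4 of 8, and the interval holds four scores, so the median is 3.5 + 1/4.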
As with the median, there are no symbols or special notation used to identify the mode or
to differentiate between a sample mode and a population mode. In addition, the definition of the mode is the same for a population and for a sample distribution.
The mode is a useful measure of central tendency because
it can be used to determine the typical or most frequent value for any scale of measurement, including a nominal scale
The mode is a score or category,
not a frequency. For this example, the mode is Luigi’s, not f = 42
The mode also can be useful because it is the only measure of central tendency that corresponds to an actual score in the data; by definition, the mode is the most
frequently occurring score. The mean and the median, on the other hand, are both calculated values and often produce an answer that does not equal any score in the distribution. For example, in Figure 3.5 we presented a distribution with a mean of 4 and a median of 2.5. Note that none of the scores is equal to 4 and none of the scores is equal to 2.5. However, the mode for this distribution is X = 2, and there are three individuals who actually have scores of X = 2.
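A small Python check of this point, using the standard-library statistics module. The scores below are an assumption: they were chosen to match the values quoted above (mean of 4, median of 2.5, three scores of 2), and the actual data behind Figure 3.5 are not reproduced here.

```python
import statistics

scores = [2, 2, 2, 3, 3, 12]    # assumed data consistent with the quoted values

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

print(mean, mean in scores)      # 4 False
print(median, median in scores)  # 2.5 False
print(mode, mode in scores)      # 2 True
```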
In a frequency distribution graph, the greatest frequency will appear as the tallest part of the figure. To find the mode,
you simply identify the score located directly beneath the highest point in the distribution.
Although a distribution will have only one mean and only one median, it is possible to have
more than one mode.
In a frequency distribution graph, the different modes will correspond to distinct, equally high peaks. A distribution with two modes is
said to be bimodal, and a distribution with more than two modes is called multimodal. Occasionally, a distribution with several equally high points is said to have no mode.
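As an illustration, Python's statistics.multimode (available from Python 3.8) returns every score that ties for the greatest frequency. The three data sets are hypothetical.

```python
from statistics import multimode

one_mode = [1, 2, 2, 2, 3, 4]                  # hypothetical unimodal scores
two_modes = [3, 5, 5, 5, 6, 8, 8, 8, 9]        # hypothetical bimodal scores
three_modes = [1, 1, 2, 4, 4, 5, 7, 7]         # hypothetical multimodal scores

print(multimode(one_mode))     # [2]
print(multimode(two_modes))    # [5, 8]
print(multimode(three_modes))  # [1, 4, 7]
```

Note that this function only reports ties for the single highest frequency, so it would not flag a major and minor mode of unequal height.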
Because the mean, the median, and the mode are all trying to measure the same thing, it is reasonable to
expect that these three values should be related. In fact, there are some consistent and predictable relationships among the three measures of central tendency. Specifically, there are situations in which all three measures will have exactly the same value. On the other hand, there are situations in which the three measures are guaranteed to be different. In part, the relationships among the mean, median, and mode are determined by the shape of the distribution.
For a symmetrical distribution, the right-hand side of the graph is a mirror image of the left-hand side. If a distribution is perfectly symmetrical, the median
is exactly at the center because exactly half of the area in the graph will be on either side of the center. The mean also is exactly at the center of a perfectly symmetrical distribution because each score on the left side of the distribution is balanced by a corresponding score (the mirror image) on the right side. As a result, the mean (the balance point) is located at the center of the distribution. Thus, for a perfectly symmetrical distribution, the mean and the median are the same. If a distribution is roughly symmetrical, but not perfect, the mean and median will be close together in the center of the distribution.
If a symmetrical distribution has only one mode, it will also be in the center of the distribution. Thus, for a perfectly symmetrical distribution with one mode, all three measures of central tendency, the mean, the median, and the mode,
have the same value. For a roughly symmetrical distribution, the three measures are clustered together in the center of the distribution.
a bimodal distribution that is symmetrical
will have the mean and median together in the center with the modes on each side.
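Both symmetrical cases can be checked with a short Python sketch. The two data sets are hypothetical and were built to be mirror images around a center of 3.

```python
from statistics import mean, median, multimode

symmetric_one_mode = [1, 2, 2, 3, 3, 3, 4, 4, 5]    # hypothetical, single peak at 3
symmetric_two_modes = [1, 2, 2, 2, 3, 4, 4, 4, 5]   # hypothetical, peaks at 2 and 4

for scores in (symmetric_one_mode, symmetric_two_modes):
    print(mean(scores), median(scores), multimode(scores))
# 3 3 [3]
# 3 3 [2, 4]
```

With one mode, all three measures coincide at 3. With two modes, the mean and median stay at the center while the modes sit on either side.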
A rectangular distribution
has no mode because all X values occur with the same frequency. Still, the mean and the median are in the center of the distribution.
The positions of the mean, median, and mode are not
as consistently predictable in distributions of discrete variables
In skewed distributions, especially distributions for continuous variables, there is a strong tendency for the mean, median, and mode to be
located in predictably different positions.
a positively skewed distribution with the peak (highest frequency) on the left-hand side. This is the position of the mode. However, it should be clear that the vertical line drawn at the mode does not divide the distribution into two equal parts.
To have exactly 50% of the distribution on each side, the median must be located to the right of the mode. Finally, the mean is typically located to the right of the median because it is influenced most by the extreme scores in the tail and is displaced farthest to the right toward the tail of the distribution.
Negatively skewed distributions are lopsided in the opposite direction, with the scores piling up on the right-hand side and the tail tapering off to the left.
For a distribution with negative skew, the mode is on the right-hand side (with the peak), while the mean is displaced toward the left by the extreme scores in the tail. As before, the median is usually located between the mean and the mode. Therefore, in a negatively skewed distribution, the most probable order for the three measures of central tendency from smallest value to largest value (left to right), is the mean, the median, and the mode.
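The typical ordering in skewed distributions can be illustrated in Python. The positively skewed scores are hypothetical, and the negatively skewed set is simply their mirror image (10 minus each score). This shows the most probable order, not a guarantee for every data set.

```python
from statistics import mean, median, mode

positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9]         # hypothetical, tail to the right
negative_skew = [10 - x for x in positive_skew]      # mirror image, tail to the left

pos = (mode(positive_skew), median(positive_skew), mean(positive_skew))
neg = (mean(negative_skew), median(negative_skew), mode(negative_skew))

# Positive skew: mode < median < mean (left to right).
print(pos[0], pos[1], round(pos[2], 2))   # 2 3 3.44

# Negative skew: mean < median < mode (left to right).
print(round(neg[0], 2), neg[1], neg[2])   # 6.56 7 8
```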
whenever the scores are numerical values (interval or ratio scale) the mean is usually
the preferred measure of central tendency. Because the mean uses every score in the distribution, it typically produces a good representative value. Remember that the goal of central tendency is to find the single value that best represents the entire distribution. Besides being a good representative, the mean has the added advantage of being closely related to variance and standard deviation, the most common measures of variability. This relationship makes the mean a valuable measure for purposes of inferential statistics. For these reasons, and others, the mean generally is considered to be the best of the three measures of central tendency. But there are specific situations in which it is impossible to compute a mean or in which the mean is not particularly representative. It is in these situations that the mode and the median are used.
extreme scores
scores that are very different in value from most of the others
when a distribution is skewed or has a few extreme scores
then the mean may not be a good representative of the majority of the distribution. The problem comes from the fact that the extreme values can have a large influence and cause the mean to be displaced. In this situation, the fact that the mean uses all of the scores equally can be a disadvantage.
The breaks in the X-axis are the conventional way of notifying the reader that
some values have been omitted.
When to Use the Median
All of these:
1) Extreme Scores or Skewed Distributions
2) Undetermined Values
3) Open-Ended Distributions
4) Ordinal Scale
The median is not easily affected by extreme scores.
the median commonly is used when reporting the average value for a skewed distribution. For example, the distribution of personal incomes is very skewed, with a small segment of the population earning incomes that are astronomical. These extreme values distort the mean, so that it is not very representative of the salaries that most of us earn. As in the previous example, the median is the preferred measure of central tendency when extreme scores exist.
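A brief Python sketch of how one extreme score affects the two measures. The incomes (in thousands) are invented for illustration.

```python
from statistics import mean, median

typical_incomes = [30, 35, 40, 45, 50]        # hypothetical incomes, in thousands
with_extreme = typical_incomes + [1000]       # one astronomically high income added

print(mean(typical_incomes), median(typical_incomes))  # 40 40
print(mean(with_extreme), median(with_extreme))        # 200 42.5
```

The single extreme income moves the mean from 40 to 200, a value larger than five of the six incomes, while the median shifts only from 40 to 42.5.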
Many researchers believe that it is not appropriate to use the mean to describe central tendency for ordinal data. When scores are measured on an ordinal scale,
the median is always appropriate and is usually the preferred measure of central tendency. You should recall that ordinal measurements allow you to determine direction (greater than or less than) but do not allow you to determine distance. The median is compatible with this type of measurement because it is defined by direction: half of the scores are above the median and half are below the median. The mean, on the other hand, defines central tendency in terms of distance. Because the mean is defined in terms of distances, and because ordinal scales do not measure distance, it is not appropriate to compute a mean for scores from an ordinal scale.
When to Use the Mode
All of these:
1) Nominal Scales
2) Discrete Variables
3) Describing Shape
Because nominal scales do not measure quantity (distance or direction), it is impossible
to compute a mean or a median for data from a nominal scale. When the scores are numerical values from an interval or ratio scale, the mode is usually not the preferred measure of central tendency.
Bar graphs are used to present means (or medians) when the groups or treatments shown on the horizontal axis are measured on a
nominal or an ordinal scale.
When constructing graphs of any type
you should recall the basic rules:
1) The height of a graph should be approximately two-thirds to three-quarters of its length.
2) Normally, you start numbering both the X-axis and the Y-axis with zero at the point where the two axes intersect. However, when a value of zero is part of the data, it is common to move the zero point away from the intersection so that the graph does not overlap the axes.
Although it is possible to construct graphs that distort the results of a study
researchers have an ethical responsibility to present an honest and accurate report of their research results.