Describing Data Flashcards
Describing Data
Frequency Distributions
Measures of Central Tendency
Measures of Variability
Skewness
Kurtosis
set of test scores arrayed for recording or study.
distribution
straightforward, unmodified accounting of performance that is usually numerical.
reflect a simple tally, as in number of items responded to correctly on an achievement test.
raw score
the number of times each score occurred might be listed in tabular or graphic form
Table 3–2
frequency distribution
individual scores have been used and the data have not been grouped.
simple frequency distribution
frequency distribution used to summarize data
Table 3–3,
grouped frequency distribution
test-score intervals, also called ____, replace the actual test scores
class intervals
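A minimal sketch of the two kinds of frequency distribution described in the cards above, assuming integer test scores. The function names, the sample scores, and the choice of interval width and starting point are illustrative assumptions, not values from Table 3–2 or Table 3–3.

```python
from collections import Counter

def simple_frequency_distribution(scores):
    """Return {score: frequency}, highest score first, with no grouping."""
    counts = Counter(scores)
    return dict(sorted(counts.items(), reverse=True))

def grouped_frequency_distribution(scores, width, lowest):
    """Return {(lower, upper): frequency} for class intervals of equal width.

    `lowest` is the lower limit of the first class interval. Both `lowest`
    and `width` are the analyst's choices, not values fixed by the data.
    """
    counts = Counter()
    for score in scores:
        lower = lowest + ((score - lowest) // width) * width
        counts[(lower, lower + width - 1)] += 1
    return dict(sorted(counts.items(), reverse=True))

# Hypothetical integer test scores, used only for illustration.
scores = [78, 67, 69, 63, 85, 72, 92, 67, 94, 62]
simple = simple_frequency_distribution(scores)
grouped = grouped_frequency_distribution(scores, width=5, lowest=60)
```

In the grouped version each score is replaced by its class interval, so the individual scores can no longer be recovered from the table.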
diagram or chart composed of lines, points, bars, or other symbols
describe and illustrate data
graph
Three kinds of graphs used to illustrate frequency distributions
histogram,
bar graph,
frequency polygon
graph with vertical lines drawn at the true limits of each test score (or class interval), forming a series of contiguous rectangles.
test scores along the graph’s horizontal axis (also referred to as the abscissa or X-axis)
frequency of occurrence along the graph’s vertical axis (also referred to as the ordinate or Y-axis).
histogram
numbers indicative of frequency on the Y-axis
reference to some categorization (yes/no/maybe, male/female) on the X-axis.
rectangular bars typically are not contiguous.
bar graph
expressed by a continuous line connecting the points where test scores or class intervals (on the X-axis) meet frequencies (on the Y-axis).
frequency polygon
Measures of Central Tendency
arithmetic mean
median
mode
the statistic that indicates the average or midmost score between the extreme scores in a distribution.
measure of central tendency
most commonly used measure of central tendency
average
takes into account the actual numerical value of every score
arithmetic mean
standard statistical shorthand called
“summation notation”
summation meaning “the sum of”
the symbol used to signify “sum”
Greek uppercase letter sigma, Σ,
X represents
a test score
expression Σ X means
add all the test scores
denoted by the symbol $\bar{X}$ (pronounced “X bar”)
equal to the sum of the observations (test scores) divided by the number of observations.
the appropriate measure of central tendency for interval or ratio data when the distributions are believed to be approximately normal.
arithmetic mean
the formula for the arithmetic mean
$\bar{X} = \frac{\sum X}{n}$
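The formula on the card above can be sketched in a few lines. The function name and the sample scores are hypothetical and chosen only for illustration.

```python
def arithmetic_mean(scores):
    """Sum of the observations divided by the number of observations."""
    n = len(scores)
    if n == 0:
        raise ValueError("the mean of an empty distribution is undefined")
    return sum(scores) / n

# Hypothetical test scores: the sum is 30 and n is 5.
scores = [2, 4, 6, 8, 10]
mean = arithmetic_mean(scores)
```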
the formula for the arithmetic mean computed from a frequency distribution
$\bar{X} = \frac{\sum (fX)}{n}$
$\sum (fX)$ means
multiply the frequency of each score by its corresponding score and then sum
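As a sketch of the frequency-distribution version of the mean, assuming the distribution is stored as a mapping from score to frequency. The function name and the sample frequencies are hypothetical.

```python
def mean_from_frequencies(freq):
    """Mean from a {score: frequency} table: sum of f*X divided by n."""
    n = sum(freq.values())                            # total number of scores
    if n == 0:
        raise ValueError("no scores in the distribution")
    sum_fx = sum(f * x for x, f in freq.items())      # the sum of fX
    return sum_fx / n

# Hypothetical simple frequency distribution: score -> frequency.
freq = {90: 1, 80: 3, 70: 4, 60: 2}
mean = mean_from_frequencies(freq)                    # 730 / 10
```

The result matches what the raw-score formula would give if every score were listed individually, because each score X contributes to the sum once per occurrence.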
calculation of the mean from a grouped frequency distribution
Table 3–4
middle score in a distribution
found by ordering the scores in a list by magnitude, in either ascending or descending order
median
If the total number of scores ordered is an odd number…
median will be the score that is exactly in the middle
When the total number of scores ordered is an even number…
median can be calculated by determining the arithmetic mean of the two middle scores
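The odd and even cases above can be sketched as follows. The function name and the sample scores are hypothetical.

```python
def median(scores):
    """Middle score of the ordered list; mean of the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty distribution is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the exact middle score
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: mean of the two middle scores

odd_case = median([66, 65, 61, 59, 53])            # five scores
even_case = median([66, 65, 61, 59, 53, 52])       # six scores
```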
most frequently occurring score in a distribution of scores
tends not to be a very commonly used measure of central tendency
the modal score is not calculated
mode
there are two scores (51 and 66) that occur with the highest frequency (of two).
bimodal distribution
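A short sketch of how the mode is obtained by counting rather than by calculation. The function name and the score list are hypothetical; the list is built so that 51 and 66 each occur twice, as in the bimodal example above.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs with the highest frequency."""
    counts = Counter(scores)
    if not counts:
        raise ValueError("an empty distribution has no mode")
    highest = max(counts.values())
    return sorted(score for score, f in counts.items() if f == highest)

# Hypothetical scores in which 51 and 66 each occur twice.
scores = [43, 51, 51, 60, 66, 66, 72]
result = modes(scores)
is_bimodal = len(result) == 2
```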
is not calculated in a true sense; it is a nominal statistic and cannot legitimately be used in further calculations
mode
the statistic that takes into account the order of scores and is itself ordinal in nature.
median
interval-level statistic is generally the most stable and useful measure of central tendency.
mean
Measures of Variability
range
interquartile and semi-interquartile ranges
average deviation
standard deviation
indication of how scores in a distribution are scattered or dispersed
Variability
Statistics that describe the amount of variation in a distribution
measures of variability
provides a quick but gross description of the spread of scores.
Figure 3–3
range
distribution of test scores (or any data) can be divided into four parts
interquartile and semi-interquartile ranges
the dividing points between the four quarters in the distribution
Figure 3–5
quartiles
quartiles are labeled as…
Q1, Q2, and Q3.
refers to a specific point
quartile
refers to an interval
quarter
Q2 is the same as
median
Q1 and Q3 are the…
quarter-points in a distribution of scores
measure of variability equal to the difference between Q3 and Q1.
an ordinal statistic
interquartile range
is equal to the interquartile range divided by 2.
semi-interquartile range
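A sketch of the range, the quartiles, and the two quartile-based measures, using Python's standard `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the choice of `method="inclusive"` here is an assumption, as are the function name and the sample scores.

```python
import statistics

def spread_summary(scores):
    """Range, quartile points, interquartile range, and semi-interquartile range."""
    ordered = sorted(scores)
    # Assumption: the "inclusive" interpolation rule; other rules can give
    # slightly different quartile points for the same data.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "range": ordered[-1] - ordered[0],   # highest score minus lowest score
        "Q1": q1,
        "Q2": q2,                            # Q2 is the same as the median
        "Q3": q3,
        "interquartile_range": iqr,
        "semi_interquartile_range": iqr / 2,
    }

# Hypothetical test scores, used only for illustration.
scores = [2, 4, 4, 5, 7, 8, 9, 11, 13]
summary = spread_summary(scores)
```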
in a perfectly symmetrical distribution
Q1 and Q3 will be exactly the same distance from the median.
If these distances are unequal, then there is a lack of symmetry
skewness
$AD = \frac{\sum |x|}{n}$
average deviation
$AD = \frac{\sum |x|}{n}$
lowercase italic x in the formula signifies
score’s deviation from the mean
$AD = \frac{\sum |x|}{n}$
value of x obtained by
subtracting the mean from the score (X − mean = x)
$AD = \frac{\sum |x|}{n}$
bars on each side of x indicate
absolute value of the deviation score (ignoring the positive or negative sign and treating all deviation scores as positive)
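The average deviation formula above can be sketched step by step: compute each deviation score x = X − mean, take its absolute value, sum, and divide by n. The function name and the sample scores are hypothetical.

```python
def average_deviation(scores):
    """AD: the mean of the absolute deviation scores |X - mean|."""
    n = len(scores)
    if n == 0:
        raise ValueError("the average deviation of an empty distribution is undefined")
    mean = sum(scores) / n
    deviations = [X - mean for X in scores]     # lowercase x in the formula
    return sum(abs(x) for x in deviations) / n

# Hypothetical scores with mean 90; the deviations are -5, 10, 0, 5, -10.
scores = [85, 100, 90, 95, 80]
ad = average_deviation(scores)                  # (5 + 10 + 0 + 5 + 10) / 5
```

Without the absolute value the deviation scores would always sum to zero, which is why the sign is ignored.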
a measure of variability equal to the square root of the average squared deviations about the mean.
standard deviation
is equal to the arithmetic mean of the squares of the differences between the scores in a distribution and their mean.
variance
$s^2 = \frac{\sum x^2}{n}$
calculate the variance ($s^2$) using deviation scores
$s^2 = \frac{\sum x^2}{n}$
variance is calculated by
squaring and summing all the deviation scores and then dividing by the total number of scores.
$s^2 = \frac{\sum X^2}{n} - \bar{X}^2$
Table 3–1
calculate the summation of the raw scores squared, divide
by the number of scores, and then subtract the mean squared.
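A sketch comparing the two routes to the variance described in the cards above: the deviation-score formula and the raw-score formula. The function names and the sample scores are hypothetical and are not the data from Table 3–1.

```python
import math

def variance_from_deviations(scores):
    """Square and sum the deviation scores, then divide by n."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((X - mean) ** 2 for X in scores) / n

def variance_from_raw_scores(scores):
    """Sum the squared raw scores, divide by n, then subtract the squared mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(X ** 2 for X in scores) / n - mean ** 2

# Hypothetical scores with mean 90.
scores = [85, 100, 90, 95, 80]
by_deviations = variance_from_deviations(scores)   # 250 / 5
by_raw_scores = variance_from_raw_scores(scores)   # 40750 / 5 - 90 ** 2
same_result = math.isclose(by_deviations, by_raw_scores)
```

The two formulas are algebraically equivalent. With floating-point numbers the raw-score form can lose precision when the scores are large relative to their spread, so the comparison uses `math.isclose` rather than exact equality.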
$\sqrt{\frac{\sum (X - M)^2}{n}}$
$\bar{X}$ represents a sample mean
M a population mean
formula for the population standard deviation
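A sketch of the population standard deviation formula, checked against `statistics.pstdev` from the Python standard library. The function name and the sample scores are hypothetical, and the scores are treated as the whole population.

```python
import math
import statistics

def population_standard_deviation(scores):
    """Square root of the mean squared deviation about the population mean M."""
    n = len(scores)
    if n == 0:
        raise ValueError("the standard deviation of an empty distribution is undefined")
    M = sum(scores) / n
    return math.sqrt(sum((X - M) ** 2 for X in scores) / n)

# Hypothetical scores treated as an entire population.
scores = [85, 100, 90, 95, 80]
sd = population_standard_deviation(scores)          # square root of 50
agrees_with_stdlib = math.isclose(sd, statistics.pstdev(scores))
```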
nature and extent to which symmetry is absent.
an indication of how the measurements in a distribution are distributed
skewness
when relatively few of the scores fall at the high end of the distribution.
positive skew
when relatively few of the scores fall at the low end of the distribution.
examination results may indicate that the test was too easy.
Figure 3–3
negative skew
the distance Q3 − Q2 will be greater than the distance Q2 − Q1.
positively skewed distribution
the distance Q3 − Q2 will be less than the distance Q2 − Q1.
negatively skewed distribution
distances from Q1 and Q3 to the median are the same
symmetrical
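The quartile comparisons in the three cards above amount to a simple decision rule, sketched here. The function name and the quartile values are hypothetical, and the rule describes symmetry only as far as the quartiles can reveal it.

```python
def skew_from_quartiles(q1, q2, q3):
    """Classify skew by comparing the distances Q3 - Q2 and Q2 - Q1."""
    upper = q3 - q2    # distance from the median up to Q3
    lower = q2 - q1    # distance from Q1 up to the median
    if upper > lower:
        return "positive skew"
    if upper < lower:
        return "negative skew"
    return "symmetrical"

# Hypothetical quartile points, used only for illustration.
positive = skew_from_quartiles(40, 50, 70)
negative = skew_from_quartiles(30, 50, 60)
balanced = skew_from_quartiles(40, 50, 60)
```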
refers to the steepness of a distribution in its center
kurtosis
describe the peakedness/flatness of three general types of curves
Figure 3–6
platy-, lepto-, or meso-
Distributions are relatively flat
platykurtic
Distributions are relatively peaked
leptokurtic
Distributions are somewhere in the middle
mesokurtic
Distributions that have high kurtosis are characterized by
a high peak and “fatter” tails compared to a normal distribution.
lower kurtosis values indicate
a distribution with a rounded peak and thinner tails.
normal bell-shaped curve (graph A from Figure 3–3) would have a
kurtosis value of 3
when kurtosis is reported as excess kurtosis (kurtosis minus 3), a normal distribution would have
excess kurtosis of 0
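The two values quoted for the normal curve (3 and 0) differ only by convention. A sketch, assuming the common moment-based population definition of kurtosis (the fourth central moment divided by the squared variance), which the cards themselves do not state. The function names and the sample scores are hypothetical.

```python
def kurtosis(scores):
    """Moment-based kurtosis: fourth central moment over squared variance.

    Assumption: the population (divide-by-n) form; a normal curve scores 3.
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((X - mean) ** 2 for X in scores) / n   # variance
    m4 = sum((X - mean) ** 4 for X in scores) / n   # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all scores are equal")
    return m4 / m2 ** 2

def excess_kurtosis(scores):
    """Kurtosis minus 3, so that a normal curve scores 0."""
    return kurtosis(scores) - 3

# Hypothetical evenly spread scores: flatter than a normal curve.
scores = [1, 2, 3, 4, 5]
k = kurtosis(scores)            # 6.8 / 2 ** 2
excess = excess_kurtosis(scores)
```

A value below 3 (negative excess kurtosis), as in this flat example, corresponds to the platykurtic case. Values above 3 correspond to the leptokurtic case.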