Exam 3 Flashcards
Descriptive Statistics, Frequency Distributions and Frequency Distribution Tables, Measures of Central Tendency, and Measures of Variability.
Frequency Distribution
An organized tabulation of the number of individuals in each category on the scale of measurement.
Absolute Frequency (f)
The number of participants that fall in each category.
Relative Frequency (rf)
The proportion of participants that fall in each category.
- rf = f/N, where N = total number of scores
Percent
rf*100
Cumulative Frequency (cf)
The number of people that score AT or LOWER than a given score.
Cumulative Relative Frequency (crf)
- Proportion of people that scored AT or BELOW a given score.
- Found by accumulating the rf column from the lowest score upward (equivalently, cf/N).
Cumulative Percent
crf*100
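The columns defined above (f, rf, percent, cf, crf, cumulative percent) can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `frequency_table`, the dictionary keys, and the example scores are all made up for this sketch. It computes crf as cf/N, which matches accumulating the rf column but avoids floating-point drift.

```python
from collections import Counter

def frequency_table(scores):
    """Build one row per distinct score, lowest score first (hypothetical helper)."""
    counts = Counter(scores)
    n_total = len(scores)              # N = total number of scores
    rows = []
    cf = 0
    for score in sorted(counts):
        f = counts[score]              # absolute frequency
        rf = f / n_total               # relative frequency = f/N
        cf += f                        # cumulative frequency: scores AT or BELOW
        crf = cf / n_total             # cumulative relative frequency
        rows.append({
            "score": score,
            "f": f,
            "rf": rf,
            "percent": rf * 100,
            "cf": cf,
            "crf": crf,
            "cum_percent": crf * 100,
        })
    return rows

example_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # made-up data
for row in frequency_table(example_scores):
    print(row)
```

The last row should always show cf equal to N and crf equal to 1, which is a quick check on a hand-built table.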
Real Limits and Frequency Distributions
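Each score's real limits fall 0.5 unit below and 0.5 unit above it, so the distribution runs from the lower real limit of the lowest score to the upper real limit of the highest score.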
Frequency Graphs
- Grouped or Ungrouped (interval) Scores
- Used when you have discrete variables
Frequency Histogram
No spaces or gaps between bars.
-Used when the data consist of numerical scores that have been measured on an interval or ratio scale.
Frequency Polygon
Dots connected by a continuous line that begins and ends on the x-axis.
-Used when the data consist of numerical scores from an interval or ratio scale.
Bar Graph
Much like a histogram, except there are spaces between the bars.
-Used with nominal and ordinal scales.
Symmetrical Distribution
It is possible to draw a line down the middle so that one side of the distribution is a mirror image of the other.
Skewed Distribution
The scores tend to pile up toward one end of the scale and taper off gradually at the other end.
Positively Skewed Distribution
When the scores pile up on the left side of the distribution. The tail points toward the positive end of the x-axis (right).
Negatively Skewed Distribution
When the scores pile up on the right side of the distribution. The tail points toward the negative end of the x-axis (left).
Line Plot
Same as polygon, but left and rightmost points are NOT closed at the abscissa.
Stem and Leaf
Retains the raw data in the display.
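A small Python sketch can show why a stem-and-leaf display keeps the raw data: each score is split into a stem (tens) and a leaf (ones), so every original value can be read back. The function name `stem_and_leaf` and the example scores are hypothetical, and the sketch assumes non-negative whole-number scores.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (tens digit) to its sorted leaves (ones digits).

    Assumes non-negative integer scores; hypothetical helper for illustration.
    """
    stems = defaultdict(list)
    for x in sorted(scores):
        stems[x // 10].append(x % 10)
    return dict(sorted(stems.items()))

display = stem_and_leaf([23, 35, 21, 38, 30])   # made-up data
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

# Every raw score can be rebuilt from the display: stem * 10 + leaf.
rebuilt = [stem * 10 + leaf for stem, leaves in display.items() for leaf in leaves]
```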
Tukey’s Tallies
Does not maintain raw data.
Heuristic for Grouping Scores
1) Determine the number of groups
- 5-15
2) Determine the size (width) of the interval (2, 3, or some multiple of 5)
- Compute the range: highest score - lowest score
- Divide the range by the number of groups
- Round to the nearest commonly used interval size
3) Determine the beginning lowest interval (should be a multiple of width)
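The three steps above can be sketched in Python. This is one possible reading of the heuristic, not the only one: the function name `choose_intervals`, the list `COMMON_WIDTHS`, and the example scores are assumptions made for illustration, and the sketch assumes whole-number scores.

```python
# Assumed list of "commonly used" widths: 2, 3, or a multiple of 5.
COMMON_WIDTHS = [2, 3, 5, 10, 15, 20, 25, 50]

def choose_intervals(scores, n_groups=10):
    """Pick an interval width and list the class intervals (hypothetical helper)."""
    low, high = min(scores), max(scores)
    raw_width = (high - low) / n_groups            # step 2: range / number of groups
    # Round to the nearest commonly used width.
    width = min(COMMON_WIDTHS, key=lambda w: abs(w - raw_width))
    start = (low // width) * width                 # step 3: multiple of width at or below low
    intervals = []
    lower = start
    while lower <= high:
        intervals.append((lower, lower + width - 1))   # apparent limits for integer scores
        lower += width
    return width, intervals

width, intervals = choose_intervals([53, 94, 70, 88], n_groups=10)   # made-up data
print(width, intervals)
```

Because the width is rounded, the number of intervals produced can differ from the requested `n_groups`; it is worth checking that it still falls in the 5-15 range.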
Measures of Central Tendency
Mode, Median, Mean
Mode
Most commonly occurring score.
Advantages: score that actually occurs in the data set, unaffected by extreme scores, represents most common observation
Disadvantage: may not be representative of entire distribution of scores
Median
Divides the distribution into equal halves.
(N+1)/2=Median Location
Advantages: takes into account all the data in the distribution, unaffected by extreme scores (use when distributions are skewed), usable when the data set has missing values or an open-ended distribution, appropriate for an ordinal scale
Disadvantages: value may not exist in the data, does not enter readily into equations and more difficult to work with, treats all scores alike; differences in magnitude not taken into account
Mean
Arithmetic average of scores.
Advantages: representative of every score in the distribution, closely related to variance and standard deviation
Disadvantages: affected by extreme scores or “outliers” when you have only a limited number of scores in your distribution, value may not exist in data
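The three measures can be compared with a short Python sketch. The function names and the example scores are made up for illustration; the median uses the (N+1)/2 location rule from the Median card, averaging the two middle scores when the location falls between positions.

```python
from collections import Counter

def mode(scores):
    """Return every score tied for the highest frequency."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(s for s, f in counts.items() if f == top)

def median(scores):
    """Locate the median with (N + 1) / 2 on the ordered scores."""
    ordered = sorted(scores)
    location = (len(ordered) + 1) / 2      # 1-based position
    if location.is_integer():
        return ordered[int(location) - 1]
    # Location falls between two positions: average the neighbors.
    return (ordered[int(location) - 1] + ordered[int(location)]) / 2

def mean(scores):
    """Arithmetic average of the scores."""
    return sum(scores) / len(scores)

base = [2, 3, 3, 4, 5]            # made-up data
with_outlier = base + [40]        # same data plus one extreme score

print(mode(base), median(base), mean(base))
print(mode(with_outlier), median(with_outlier), mean(with_outlier))
```

Adding the single extreme score moves the mean far more than the median, while the mode does not move at all, which is the contrast the advantage and disadvantage lists describe.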
Central Tendency
An attempt to find a single score that represents the center of a distribution. Strives to find a number representative of the whole distribution.
Variability
A quantitative measurement of the differences between scores and the degree to which they are different. Are they spread out or clustered?
Biased Variance
An average value of the statistic that either over or underestimates the corresponding population parameter.
Unbiased Variance
An average value of the statistic that is equal to the population parameter.
Sum of Squares
The sum of the squared deviations of each score from the mean; shows how much the scores vary around the mean.
Advantage: takes into account ALL the scores in a distribution.
Disadvantage: size of SS depends on the amount of variability and is influenced by N.
Population Variance
Average of the squared deviations from the mean (Mean Squared Distance): SS/N.
-Takes into account N.
Population Standard Deviation
Square root of the population variance. Represents the standard (typical) distance of scores from the mean.
Sample Variance
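Sum of squared deviations from the sample mean divided by n - 1: SS/(n - 1).
-Takes into account n through the degrees of freedom (n - 1), which corrects the bias of dividing by n.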
Standard Deviation
Square root of the variance. Represents the standard (typical) distance of scores from the mean. For a sample, use the square root of the sample variance.
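The variability measures above build on one another, which a short Python sketch can make concrete. The function names and the example scores are made up for illustration; the only difference between the population and sample versions is the divisor (N versus n - 1).

```python
import math

def sum_of_squares(scores):
    """SS: sum of squared deviations from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def population_variance(scores):
    """SS / N: mean squared distance from the mean."""
    return sum_of_squares(scores) / len(scores)

def sample_variance(scores):
    """SS / (n - 1): divides by degrees of freedom instead of n."""
    return sum_of_squares(scores) / (len(scores) - 1)

def population_sd(scores):
    return math.sqrt(population_variance(scores))

def sample_sd(scores):
    return math.sqrt(sample_variance(scores))

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up data with mean 5
print(sum_of_squares(data))        # SS
print(population_variance(data), population_sd(data))
print(sample_variance(data), sample_sd(data))
```

For the same scores the sample variance is always larger than the population variance, because the same SS is divided by a smaller number.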