Descriptive Statistics Flashcards
Frequency Distributions (three types)
- Simple (ungrouped) frequency distribution
- Grouped frequency distribution
- Cumulative frequency distribution
Simple (ungrouped) frequency distribution
Response options in the left-hand column (e.g., ethnicity), frequencies in the middle column, and percents in the right-hand column
Grouped frequency distribution
Left-hand column groups the responses into intervals so the data can be summarized more compactly
Cumulative frequency distribution
Includes a cumulative percent column: the running percent of scores up to and including each row
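As a rough illustration of the three types, the sketch below builds each table from one list of scores. The scores and the interval width of 5 are made up for the example, not taken from the cards.

```python
from collections import Counter

# Hypothetical quiz scores, invented for illustration.
scores = [3, 5, 5, 7, 8, 8, 8, 9, 12, 14]
n = len(scores)

# Simple (ungrouped): one row per distinct score.
simple = sorted(Counter(scores).items())

# Grouped: scores binned into intervals of width 5 (0-4, 5-9, 10-14).
width = 5
grouped = sorted(Counter((s // width) * width for s in scores).items())

# Cumulative: running total of the frequencies, expressed as a percent.
cumulative = []
running = 0
for low, freq in grouped:
    running += freq
    cumulative.append((low, freq, 100 * running / n))

for low, freq, cum_pct in cumulative:
    print(f"{low}-{low + width - 1}: f={freq}, cumulative %={cum_pct:.0f}")
```

The last row of a cumulative distribution always reaches 100 percent, which is a quick check that no score was dropped.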
Stem and Leaf Display
- Like a grouped frequency distribution without loss of information
- Stem: The intervals on the left
- Leaf: Digits on the right; each leaf is the final digit of one score, so the number of leaves shows the frequency and the digits show the actual scores
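A minimal sketch of a stem-and-leaf display, assuming two-digit scores so that the tens digit is the stem and the ones digit is the leaf. The scores and the helper name `stem_and_leaf` are hypothetical.

```python
from collections import defaultdict

# Hypothetical two-digit scores.
scores = [12, 15, 21, 23, 23, 28, 30, 34, 41]

def stem_and_leaf(values):
    """Map each tens digit (stem) to the sorted list of ones digits (leaves)."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf(scores)
for stem, leaves in sorted(display.items()):
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Every original score can be rebuilt from its stem and leaf, which is why no information is lost.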
Histograms
- Vertical columns indicating frequency
- Baseline (or horizontal) axis corresponds with observed scores
- Vertical axis labeled with frequencies
- A bar graph is the same as a histogram except that it presents qualitative data (e.g., gender, ethnicity)
Normal Distribution
Most of the scores are clustered near the middle of the continuum of observed scores
- Resembles a bell-shaped curve
Skewed Distribution
Most of the scores are clustered on one end of the continuum
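- Positively skewed: Scores cluster at the lower end of the continuum, with the tail pointing toward the higher end (skewness statistic greater than zero)
- Negatively skewed: Scores cluster at the higher end of the continuum, with the tail pointing toward the lower end (skewness statistic less than zero)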
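The sign of the skewness statistic can be checked with a small sketch. It uses the simple moment-based formula (third central moment divided by the 1.5 power of the second). Statistical packages often apply a bias correction, so their values may differ slightly. The data are invented.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data: most scores at the low end, one score far to the right.
low_cluster = [1, 1, 2, 2, 2, 3, 3, 9]
# Mirror image: most scores at the high end, one score far to the left.
high_cluster = [10 - x for x in low_cluster]

print(skewness(low_cluster) > 0)   # positively skewed
print(skewness(high_cluster) < 0)  # negatively skewed
```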
Kurtosis
Measure of the degree of peakedness and tail weight of a distribution, relative to the normal curve
- Leptokurtosis: Distribution is more peaked than normal, with heavier tails (kurtosis statistic greater than zero)
- Platykurtosis: Distribution is flatter than normal, with lighter tails (kurtosis statistic less than zero)
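The "higher than zero" and "lower than zero" convention corresponds to excess kurtosis, where 3 is subtracted so that a normal curve scores about zero. The sketch below uses the plain moment-based formula without bias correction. The two data sets are invented.

```python
def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal curve scores about zero."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: scores piled at the center plus a few extreme values.
peaked = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]
# Hypothetical data: scores spread evenly across the continuum.
flat = list(range(1, 11))

print(excess_kurtosis(peaked) > 0)  # leptokurtic
print(excess_kurtosis(flat) < 0)    # platykurtic
```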
Multimodal Shapes
Scores tend to congregate around more than one point
- Bimodal: Scores are clustered in two places
- Trimodal: Scores are clustered in three places
Mode
Most frequently occurring score
Median
Midpoint of the ordered scores: half of the scores fall above it and half below
Mean
Arithmetic average: the sum of the scores divided by the number of scores
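The three measures of central tendency can be compared on one hypothetical set of scores using Python's standard `statistics` module:

```python
import statistics

# Hypothetical scores, already sorted for readability.
scores = [2, 3, 3, 5, 7, 10]

mode_value = statistics.mode(scores)      # most frequent score
median_value = statistics.median(scores)  # midpoint; averages the two middle scores when n is even
mean_value = statistics.mean(scores)      # sum of the scores divided by n

print(mode_value, median_value, mean_value)
```

With an even number of scores, the median falls halfway between the two middle scores (here, between 3 and 5).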
Degree of dispersion of scores
How similar or dissimilar the scores are to one another (variability)
Homogeneous scores
Similar to one another, with little or no variability
Heterogeneous scores
Dissimilar and have high variability
Range
Difference between the highest and lowest scores
Interquartile range
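Spread of the middle 50% of the scores: the distance between the upper and lower quartiles
- Upper quartile: The point that separates the top 25% of scores from the rest (75th percentile)
- Lower quartile: The point that separates the bottom 25% of scores from the rest (25th percentile)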
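A short sketch of the range and interquartile range on made-up scores. Quartile conventions differ between textbooks and software. This example relies on the default ("exclusive") method of `statistics.quantiles`, available in Python 3.8 and later, so other tools may give slightly different quartiles.

```python
import statistics

# Hypothetical scores.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# quantiles(n=4) returns the three cut points: lower quartile, median, upper quartile.
lower_q, median, upper_q = statistics.quantiles(scores, n=4)
iqr = upper_q - lower_q

print(score_range, lower_q, upper_q, iqr)
```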
Box-and-whisker plot
Summarizes the degree of variability with a picture
- “Box” indicates the middle 50% of scores
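- “Whiskers” extend out from the box; depending on the convention, they reach the highest and lowest scores, the most extreme scores within 1.5 times the height of the rectangle, or the 5th and 95th percentiles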
- Line in the middle corresponds with median
- Helps identify outliers
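One way to see how a box-and-whisker plot flags outliers is to compute its pieces by hand. This sketch assumes the 1.5 times convention for the whiskers and the default quartile method of `statistics.quantiles`. The scores and variable names are hypothetical.

```python
import statistics

# Hypothetical scores; 30 sits far from the rest.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]

# Box: lower quartile to upper quartile, with a line at the median.
q1, median, q3 = statistics.quantiles(scores, n=4)
box_height = q3 - q1

# Fences: 1.5 times the box height beyond each edge of the box.
lower_fence = q1 - 1.5 * box_height
upper_fence = q3 + 1.5 * box_height

# Whiskers stop at the most extreme scores still inside the fences.
inside = [s for s in scores if lower_fence <= s <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Anything beyond the fences is drawn as a separate point.
outliers = [s for s in scores if s < lower_fence or s > upper_fence]

print(q1, median, q3, whisker_low, whisker_high, outliers)
```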
Outliers
- Scores that lie far away from the rest of the scores in the data set
- Why do they occur? Possible causes include sabotage, misunderstandings, and extreme thinking
What should you do with outliers?
- Conduct analyses with and without the outliers and compare the results
- Some outliers are of interest (e.g., they can call attention to a poorly worded question)
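Running an analysis with and without a suspected outlier might look like the sketch below. The scores are invented, and the value 40 is assumed to have been flagged by hand.

```python
import statistics

# Hypothetical scores; 40 is the suspected outlier.
scores = [4, 5, 5, 6, 7, 40]
trimmed = [s for s in scores if s != 40]

mean_with = statistics.mean(scores)
mean_without = statistics.mean(trimmed)
median_with = statistics.median(scores)
median_without = statistics.median(trimmed)

print(f"mean:   {mean_with:.2f} with, {mean_without:.2f} without")
print(f"median: {median_with:.2f} with, {median_without:.2f} without")
```

In this example the mean moves much more than the median, so reporting both versions shows how much a conclusion depends on a single score.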
Standard Deviation (SD)
- Found by first figuring out how much each score deviates from the mean
* The deviations are then put into a formula: square them, average the squares, and take the square root
Variance (S^2)
Standard deviation squared
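The deviation steps can be sketched as follows. The cards do not say whether the population formula (divide by n) or the sample formula (divide by n - 1) is intended, so both are shown. The scores are hypothetical.

```python
import math
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

mean = sum(scores) / n
deviations = [x - mean for x in scores]  # how far each score is from the mean
squared = [d ** 2 for d in deviations]   # squaring removes the signs

variance = sum(squared) / n              # population variance: divide by n
sd = math.sqrt(variance)                 # SD is the square root of the variance

sample_variance = sum(squared) / (n - 1)  # sample variance: divide by n - 1

print(variance, sd, round(sample_variance, 3))
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` (population) and `statistics.variance` and `statistics.stdev` (sample) perform the same calculations.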
Standard Scores
- Describe relative position
* Derived by transforming a raw score so that it indicates distance from the mean in standard deviation units
T & Z-scores
Both indicate how many standard deviations a raw score is above or below the group mean
Z-scores
The mean corresponds to a z-score of 0, so a z-score of 2.0 indicates a raw score two standard deviations above the mean
T-Scores
Take the z-score, multiply it by 10, and then add 50 (so the mean becomes 50 and the standard deviation becomes 10)
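Both conversions fit in a few lines. This sketch assumes the population standard deviation of a made-up group of scores. The function names `z_score` and `t_score` are hypothetical.

```python
import statistics

# Hypothetical group of scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
group_mean = statistics.mean(scores)
group_sd = statistics.pstdev(scores)  # population SD, assumed for this example

def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z-score: multiply by 10, then add 50."""
    return 10 * z + 50

z = z_score(9, group_mean, group_sd)
t = t_score(z)
print(z, t)
```

A raw score equal to the group mean gives a z-score of 0 and a T-score of 50.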