Descriptive Statistics Flashcards
Collapsing data
Take a large data set and condense it into what you need.
Central Tendency Measures
- Mean
- Median
- Mode
Variability
Spread in the data
Distribution =
total set of scores (n)
Frequency distribution
- Rank-ordered list of scores showing the number of times each value occurred (its frequency)
Lets you examine the distribution of scores
Ideal frequency distribution shape is a
bell curve
Cumulative Percentage
% of people who fall at or below a score
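A minimal sketch of a frequency distribution with cumulative percentages. The scores are invented for illustration, `frequency_table` is a hypothetical helper name, and cumulative percentage is taken here as the share of scores at or below each value, which is an assumption about the convention in use.

```python
from collections import Counter

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]

def frequency_table(values):
    """Return rows of (score, frequency, cumulative %), lowest score first."""
    counts = Counter(values)
    n = len(values)
    rows = []
    running = 0  # number of scores at or below the current value
    for score in sorted(counts):
        running += counts[score]
        rows.append((score, counts[score], 100 * running / n))
    return rows

table = frequency_table(scores)
for score, freq, cum_pct in table:
    print(f"score {score:>2}  freq {freq}  cumulative {cum_pct:5.1f}%")
```

Reading down the last column shows where the scores pile up: a large jump between two rows means many people share that score.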
What are 4 ways to describe the shape of a distribution?
- Symmetrical
- Uniform
- Normal
- Skewed (doesn’t follow normal curve)
Positive vs Negative Skewed Bell Curve
- Negative skew: tail extends toward the negative (low-score, left) side
- Positive skew: tail extends toward the positive (high-score, right) side
Mode
- Most frequent number
- Not useful for continuous data
- bimodal = 2 modes; multimodal = more than 2 modes
Median
- Middle Score
Advantages:
* Unaffected by extreme scores
* Average position in the distribution, not amount
* Useful for skewed data with extreme scores
Mean
- Average
Sum of scores/n
Most appropriate measure of central tendency for each level of measurement
- Interval/Ratio = Mean
- Ordinal = median or mode
- Nominal = Mode
- Mean is the most stable measure but is largely affected by skew
- Median/Mode are less affected by skew
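A short sketch comparing the three measures on one data set. The scores are invented, and 40 is included deliberately as an extreme score to show how it pulls the mean but not the median or mode.

```python
import statistics

# Hypothetical scores, invented for illustration; 40 is a deliberate extreme score.
scores = [3, 4, 4, 5, 6, 7, 40]

mean = statistics.mean(scores)         # sum of scores / n
median = statistics.median(scores)     # middle score of the sorted data
modes = statistics.multimode(scores)   # list of the most frequent value(s)

# The same data without the extreme score, for comparison.
trimmed = [s for s in scores if s != 40]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(f"with 40:    mean={mean:.2f}  median={median}  mode={modes}")
print(f"without 40: mean={mean_trimmed:.2f}  median={median_trimmed}")
```

Dropping the one extreme score roughly halves the mean, while the median moves by only half a point.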
Measures of variability
- Variability = dispersion of scores
Measures of variability:
* Range (max - min)
* Percentiles and quartiles
* Variance (spread)
* Standard Deviation
* Coefficient of variation
Range
- Range = maximum - minimum
- Least useful
- Greatly affected by outliers
- Hard to compare different sample sizes
Percentile definition
- relative position within a distribution based on 100 equal portions
Quartiles
- Distribution split into 4 equal parts
- Often use quartiles to divide samples into subgroups
- Ex: compare those below the 1st quartile (“lax”) vs. those above the 3rd quartile (“tight”) as two different groups.
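A sketch of the range and quartile cards, using invented scores. `statistics.quantiles` with its default "exclusive" method is only one of several quartile conventions, so other software may return slightly different cut points. The "lax" and "tight" group names follow the example above.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [12, 15, 17, 18, 20, 21, 23, 24, 26, 29, 35]

score_range = max(scores) - min(scores)  # range = maximum - minimum

# Three cut points that split the distribution into 4 equal parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Subgroups from the extremes of the distribution.
low_group = [s for s in scores if s < q1]    # below the 1st quartile ("lax")
high_group = [s for s in scores if s > q3]   # above the 3rd quartile ("tight")

print(f"range={score_range}  Q1={q1}  median={q2}  Q3={q3}")
print(f"low group: {low_group}  high group: {high_group}")
```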
Variance
- Index that reflects the spread in scores
- The bigger the number, the greater the spread
- Expressed in squared units, not the original units of measurement
Standard Deviation
- Square root of the variance; brings variability back to the original units
- Often seen as either: 38.6±3.2° or 38.6° (sd 3.2°)
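A minimal sketch of variance, standard deviation, and coefficient of variation. The temperature readings are invented, and the sample formulas (n - 1 denominator) are an assumption, since the cards do not say whether sample or population formulas are intended.

```python
import statistics

# Hypothetical temperature readings in degrees, invented for illustration.
temps = [35.0, 37.0, 38.0, 39.0, 41.0]

mean = statistics.mean(temps)
variance = statistics.variance(temps)  # sample variance, in squared degrees
sd = statistics.stdev(temps)           # square root of the variance, in degrees
cv = 100 * sd / mean                   # coefficient of variation, as % of the mean

print(f"{mean:.1f} ± {sd:.1f}°  (variance {variance:.1f}, CV {cv:.1f}%)")
```

Because the coefficient of variation is unit-free, it can be used to compare spread across measures recorded on different scales.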
Most biological, psychological and social phenomena fall into a normal curve. What things do not?
income, socioeconomic class, politics
Proportions of the Normal Curve
- Bell Curve
- Scores cluster around the mean
- Mean, median, mode are the same
- Frequency decreases as you move away from the mean
SD and Percentages
Within ±1 SD of the mean = 68%
Within ±2 SD of the mean = 95%
Within ±3 SD of the mean = 99.7%
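These percentages can be checked against the standard normal curve. A small sketch using Python's `statistics.NormalDist`; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {100 * share_within(k):.1f}%")
```

The exact values are about 68.3%, 95.4%, and 99.7%, which the cards round for memorization.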
Outliers
- Outlier = an observation whose value is distant from the values of the majority of observations
- Influence skewness and pull the mean in the direction of the outlier
Box and Whisker Plot
- Box spans the interquartile range: 25th percentile to 75th percentile, with the median marked inside
- Outliers are plotted as individual points beyond the whiskers
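A sketch of the numbers behind a box and whisker plot, using invented scores. The cards do not state a cutoff for outliers; the 1.5 × IQR fences used here are a common plotting convention and are an assumption, not something taken from the cards.

```python
import statistics

# Hypothetical scores, invented for illustration; 60 is a deliberate extreme score.
scores = [12, 15, 17, 18, 20, 21, 23, 24, 26, 29, 60]

q1, median, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # interquartile range: the width of the box

# Assumed convention: points more than 1.5 * IQR beyond the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [s for s in scores if s < lower_fence or s > upper_fence]

print(f"Q1={q1}  median={median}  Q3={q3}  IQR={iqr}")
print(f"fences: {lower_fence} to {upper_fence}  outliers: {outliers}")
```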
Effect Size
- Calculated to determine the “meaningful” change
- Cohen’s d is often used to examine effect size
Cohen’s d
ES = Change in Score/Average Standard Deviation
Effect Size Interpretation
- Greater than 0.8 = large = meaningful
- Less than 0.8 = not meaningful by this cutoff (Cohen’s conventional benchmarks: 0.2 small, 0.5 medium, 0.8 large)
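A minimal sketch of the effect size formula as the card states it: change in mean score divided by the average of the two standard deviations. The pre and post scores are invented and `cohens_d` is a hypothetical helper name. Many texts divide by a pooled standard deviation instead, which gives the same result when the two groups have equal size and equal spread.

```python
import statistics

def cohens_d(before, after):
    """Effect size = change in mean score / average standard deviation."""
    change = statistics.mean(after) - statistics.mean(before)
    average_sd = (statistics.stdev(before) + statistics.stdev(after)) / 2
    return change / average_sd

# Hypothetical pre- and post-intervention scores, invented for illustration.
pre = [10, 12, 14, 16, 18]
post = [14, 16, 18, 20, 22]

d = cohens_d(pre, post)
is_large = d > 0.8  # the cutoff used on the interpretation card

print(f"d = {d:.2f}  large effect: {is_large}")
```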