Exam 2: Summary Statistics Flashcards
Levels of Measurement
Nominal, Ordinal, Interval, Ratio
* Need to know in order to best present and analyze your data
Nominal
- Ideally exhaustive and mutually exclusive categories
- e.g., Was the sentence reported correctly or incorrectly (or was there no response)?
- 2 categories: Binomial aka dichotomous
- More than 2: Multinomial
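Nominal responses like these are usually summarized by counting how often each category occurred. A minimal sketch, assuming made-up trial responses (the list `responses` and its category labels are hypothetical, not from the course data):

```python
from collections import Counter

# Hypothetical data: one nominal category per trial.
responses = ["correct", "incorrect", "correct", "no response",
             "correct", "incorrect", "correct", "correct"]

counts = Counter(responses)   # frequency of each category
n_total = len(responses)

# Proportion (0 to 1) and percentage (0 to 100) for each category.
proportions = {cat: n / n_total for cat, n in counts.items()}
percentages = {cat: 100 * p for cat, p in proportions.items()}

for cat in counts:
    print(f"{cat}: n = {counts[cat]}, {percentages[cat]:.1f}%")
```

Because the categories are exhaustive and mutually exclusive, the proportions sum to 1.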
Ordinal
ordered categories
Interval
• Equal distances between scores
• Calculate differences but not proportions
(can be discrete or continuous)
Ratio
• Interval + a true zero
(can be discrete or continuous)
• Percent correct
• e.g., 50% is twice as accurate as 25% (note that this measure is bounded)
Levels of measurement applied to hearing measurement
Nominal: hearing loss or normal hearing
Ordinal: degree of hearing loss, mild to profound
Descriptive Statistics
• Frequencies, Percentages, Proportions
- How often a phenomenon occurred
- Primary summary measure for nominal measures
- But can be used for other data types
- Across participants or behaviors
Mean, Median, and Mode
• Measures of Central Tendency
- Mean (Interval, Ratio; “M”)
- Median (Ordinal, Interval, Ratio; “Mdn”)
- Mode (All types)
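The three measures can be compared on one small data set. A minimal sketch using Python's standard `statistics` module; the scores and the category labels are hypothetical:

```python
import statistics

# Hypothetical interval/ratio scores with one extreme high value.
scores = [2, 3, 3, 4, 5, 5, 5, 7, 20]

m = statistics.mean(scores)      # M: arithmetic average
mdn = statistics.median(scores)  # Mdn: middle score after sorting
mode = statistics.mode(scores)   # most frequent score

print(f"M = {m}, Mdn = {mdn}, Mode = {mode}")

# The mode is the only one of the three that also works for nominal data.
hearing_status = ["normal", "loss", "normal", "normal", "loss"]
print(statistics.mode(hearing_status))
```

Here the single extreme score (20) pulls the mean above the median, while the median and mode stay at the typical score.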
Measures of Variability
- Min, Max
- Range = Max – Min
- Interquartile range = score at 75th percentile – score at 25th percentile; Relevant if your data have extreme highs or lows
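The range and interquartile range can be sketched as follows. The data are hypothetical, and the percentile method is an assumption: `statistics.quantiles` with `method="inclusive"` is one of several conventions, and other software may give slightly different quartiles.

```python
import statistics

def range_and_iqr(scores):
    """Return (range, interquartile range) for a list of scores."""
    score_range = max(scores) - min(scores)
    # n=4 returns the three quartile cut points: 25th, 50th, 75th percentile.
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return score_range, q3 - q1

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]            # hypothetical
scores_with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

print(range_and_iqr(scores))
print(range_and_iqr(scores_with_extreme))
```

Replacing the highest score with an extreme value changes the range a great deal but leaves the interquartile range untouched, which is why the IQR is useful when data have extreme highs or lows.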
• Standard Deviation: Dispersion of scores around the mean; Colloquially: On average, how much do observations differ from mean?
SD = sqrt(variance)
• Standard Error of the Mean (SE): How far is the sample mean likely to be from the population mean?
SE = SD / sqrt(n)
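A worked sketch of the standard deviation and the standard error of the mean. The scores are hypothetical, and the sample variance (n - 1 in the denominator) is an assumption; the cards do not say which denominator is intended.

```python
import math
import statistics

scores = [4, 8, 6, 5, 3, 7, 9, 6]   # hypothetical sample

n = len(scores)
m = statistics.mean(scores)
variance = statistics.variance(scores)  # sample variance, n - 1 denominator
sd = math.sqrt(variance)                # SD = sqrt(variance)
se = sd / math.sqrt(n)                  # SE = SD / sqrt(n)

print(f"M = {m}, variance = {variance}, SD = {sd:.2f}, SE = {se:.2f}")
```

Note that SE shrinks as n grows even when SD stays the same: larger samples give a more precise estimate of the population mean.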
Tables
• Rule of thumb: Data that require ≤2 columns or rows for a table should just be presented in the text
• Arrange/group information logically
• Use APA format
Pie chart
Use when the categories add up to 100% (parts of a whole)
Shapes of Distributions
- Important for knowing:
- The best way to summarize your data
- The type of statistical test you should perform
- Some tests assume a particular distribution (e.g., “normally distributed”)
Normal Distribution
• “Gaussian curve”
• Largest number of observations at the center
• Symmetric (as many below as above)
• Fewer observations toward the extreme values; about 2/3 (roughly 68%) of observations fall within 1 SD of the mean
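The "2/3 within 1 SD" figure can be checked directly from the normal curve. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `proportion_within` is made up for illustration:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = proportion_within(1)
within_2_sd = proportion_within(2)

print(f"within 1 SD: {within_1_sd:.3f}")   # about 0.683
print(f"within 2 SD: {within_2_sd:.3f}")   # about 0.954
```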
Skewed Distribution
• Not symmetric
• More extreme scores in one direction
• Mean most affected by skew
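• The more skewed the distribution, the larger the difference between the mean and the median
• Negative skew: more extreme scores on the left (low end); positive skew: more extreme scores on the right (high end)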
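How extreme scores in one direction move the mean but not the median can be sketched with three small hypothetical data sets:

```python
import statistics

symmetric = [1, 2, 3, 4, 5]
high_tail = [1, 2, 3, 4, 50]      # one extreme HIGH score
low_tail = [1, 47, 48, 49, 50]    # one extreme LOW score

for name, data in [("symmetric", symmetric),
                   ("high tail", high_tail),
                   ("low tail", low_tail)]:
    m = statistics.mean(data)
    mdn = statistics.median(data)
    # A large gap between M and Mdn is a quick sign of skew.
    print(f"{name}: M = {m}, Mdn = {mdn}, M - Mdn = {m - mdn}")
```

The mean is pulled toward the extreme scores, so it lands above the median when the extremes are high and below the median when they are low.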
Other Distributions
- Distributions can have:
- Different centers
- Different variabilities
- Different kurtosis (peakedness and shape of tails)
Standardized Scores
• Accounts for both average and variability of the score
• z-score = (score – M) / SD
~Resulting z-scores have M = 0 and SD = 1
~How many SD above or below the mean is a given score?
• Straightforward way to relate a value to a normal distribution and to other z-scores
• z-scores carry directional information: positive = above the mean, negative = below the mean
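A minimal sketch of the z-score formula applied to a hypothetical set of scores. Using the sample SD (`statistics.stdev`) is an assumption; whichever SD is used to standardize is the one under which the z-scores come out with SD = 1.

```python
import statistics

scores = [4, 8, 6, 5, 3, 7, 9, 6]   # hypothetical raw scores

m = statistics.mean(scores)
sd = statistics.stdev(scores)        # sample SD (assumption)

# z-score = (score - M) / SD
z_scores = [(x - m) / sd for x in scores]

for x, z in zip(scores, z_scores):
    direction = "above" if z > 0 else "below" if z < 0 else "at"
    print(f"score {x}: z = {z:+.2f} ({direction} the mean)")

print(statistics.mean(z_scores), statistics.stdev(z_scores))
```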
T-scores and the t distribution
• A T-score is a z-score that is shifted and scaled such that:
- M = 50
- SD = 10
• The t-distribution (Student's t) is a separate concept. Typically used with:
- Small sample size
- Unknown population standard deviation
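The shift and scale to M = 50 and SD = 10 can be sketched as below. Scores rescaled this way are commonly called T-scores; the function name `z_to_t` and the example z-scores are made up for illustration.

```python
def z_to_t(z):
    """Rescale a z-score so that the mean maps to 50 and 1 SD maps to 10 points."""
    return 50 + 10 * z

z_scores = [-1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5]   # hypothetical
t_scores = [z_to_t(z) for z in z_scores]

print(t_scores)
```

A score exactly at the mean (z = 0) becomes 50, and every 10 points away from 50 is one standard deviation.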
Outliers
- Extremely deviant values
- Not necessarily inaccurate!
• How to identify?
- Review your experiment notes
- Plot your raw data
- Set a priori criteria based on your measure and the literature
• Trimming of reaction times is common
• e.g., more than 2.5 SD slower than the mean, and/or <200 ms
• Box and whisker plots
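The reaction-time trimming example above can be sketched as follows. The cutoffs (2.5 SD above the mean, 200 ms floor) come from the example; everything else is an assumption, including the data, the function name `trim_reaction_times`, and the choice to compute the mean and SD once over all trials before trimming.

```python
import statistics

def trim_reaction_times(rts_ms, sd_cutoff=2.5, floor_ms=200):
    """Split reaction times into (kept, removed) using a priori criteria."""
    m = statistics.mean(rts_ms)
    sd = statistics.stdev(rts_ms)
    upper_ms = m + sd_cutoff * sd   # slower than this counts as an outlier
    kept, removed = [], []
    for rt in rts_ms:
        if floor_ms <= rt <= upper_ms:
            kept.append(rt)
        else:
            removed.append(rt)      # too fast (< floor) or too slow (> upper)
    return kept, removed

# Hypothetical reaction times in ms: one anticipation, one very slow trial.
rts = [150, 480, 510, 495, 530, 470, 505, 520, 490, 515,
       500, 485, 525, 475, 510, 495, 505, 490, 500, 3000]

kept, removed = trim_reaction_times(rts)
print(f"kept {len(kept)} trials, removed {removed}")
```

The criteria are fixed as function arguments before looking at the results, which is the point of setting them a priori.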
• How to deal with them?
• NEVER EVER remove data without minimally describing:
- how and why you did so
- what the impact was on the results
- how much data you removed, and whether removal was equally distributed across conditions
• Problems can arise either way: from interpreting data that still contain real outliers, or from interpreting data after removing “outliers” that were valid observations
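Reporting how much data was removed, and whether removal was spread evenly across conditions, can be sketched like this. The trial records, condition names, and the function name `removal_summary` are all hypothetical:

```python
from collections import Counter

def removal_summary(trials):
    """Return {condition: (n_removed, n_total, percent_removed)}."""
    total = Counter(cond for cond, _rt, _kept in trials)
    removed = Counter(cond for cond, _rt, kept in trials if not kept)
    return {cond: (removed[cond], total[cond], 100 * removed[cond] / total[cond])
            for cond in total}

# Hypothetical records: (condition, reaction time in ms, kept after trimming?)
trials = [
    ("quiet", 480, True), ("quiet", 150, False),
    ("quiet", 510, True), ("quiet", 495, True),
    ("noise", 620, True), ("noise", 3000, False),
    ("noise", 2800, False), ("noise", 640, True),
]

summary = removal_summary(trials)
for cond, (n_removed, n_total, pct) in summary.items():
    print(f"{cond}: removed {n_removed} of {n_total} trials ({pct:.1f}%)")
```

In this made-up example twice as many trials are removed in one condition as in the other, which is exactly the kind of imbalance that should be reported alongside the results.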