PSY201: Chapter 3 - Central Tendency Flashcards
Central Tendency
Goal: Identify the single value that best represents entire set of data
Central Tendency
allows researchers to summarize/condense large set of data into single value
descriptive statistic - allows researchers to describe/present set of data in very simplified, concise form
possible to compare 2/more sets of data by comparing avg score (central tendency) for one set vs another set
Measure of Central tendency: The Mean, the Median, and the Mode
determined by objective + well‐defined statistical procedure so others will understand exactly how avg value was obtained + can duplicate process
No single procedure always produces a good representative value
Mean
Most commonly used measure of central tendency.
Computation requires numerical scores measured on interval/ratio scale
sum of entire set of scores, then dividing by # of scores
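A minimal sketch of that computation (M = ΣX / n), using a small made-up set of scores; the variable names are illustrative only:

```python
# Hypothetical sample of four scores (not from the course material).
scores = [3, 7, 4, 6]

total = sum(scores)   # ΣX: sum of the entire set of scores
n = len(scores)       # number of scores
mean = total / n      # M = ΣX / n

print(total, n, mean)
```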
Mean
Widespread use
sample mean better estimate of pop’s mean
But, influenced by extreme scores
Mean
balance point of distribution - sum of distances below the mean = sum of distances above the mean
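The balance-point idea can be checked numerically. The sketch below uses made-up scores and adds up the distances on each side of the mean:

```python
# Hypothetical scores chosen so the arithmetic is easy to follow.
scores = [1, 2, 6, 6, 10]
mean = sum(scores) / len(scores)

# Total distance of the scores below the mean and of the scores above it.
below = sum(mean - x for x in scores if x < mean)
above = sum(x - mean for x in scores if x > mean)

print(mean, below, above)
```

Both totals match, which is what "balance point" means here.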
Changing the Mean
changing any score will change value of mean
Modifying distribution by discarding scores/adding new scores will usually change value of the mean
Changing the Mean
1) how # of scores affected
2) how sum of scores affected
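A short sketch of tracking those two quantities, using made-up scores and an illustrative helper named `mean`: each modification changes ΣX, n, or both, and the new mean follows from them.

```python
def mean(values):
    """Mean as sum of scores divided by number of scores."""
    return sum(values) / len(values)

scores = [4, 6, 8, 10]            # hypothetical data: ΣX = 28, n = 4

original = mean(scores)           # 28 / 4
added = mean(scores + [12])       # new score: ΣX = 40, n = 5
removed = mean(scores[1:])        # discard the 4: ΣX = 24, n = 3
changed = mean([4, 6, 8, 2])      # change 10 to 2: ΣX = 20, n unchanged

print(original, added, removed, changed)
```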
Changing the Mean
constant value added to every score ⇒ same constant added to mean
every score multiplied by constant value ⇒ mean multiplied by same constant value
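Both rules can be verified on a small made-up data set; the constants 3 and 2 are arbitrary choices for illustration:

```python
scores = [2, 4, 9]                       # hypothetical scores
mean = sum(scores) / len(scores)

plus_three = [x + 3 for x in scores]     # add a constant to every score
times_two = [x * 2 for x in scores]      # multiply every score by a constant

mean_plus = sum(plus_three) / len(plus_three)
mean_times = sum(times_two) / len(times_two)

print(mean, mean_plus, mean_times)
```

The second mean equals the original plus 3, and the third equals the original times 2.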
When the Mean Won’t Work
distribution contains few extreme scores/is very skewed ⇒ mean pulled toward the extremes (displaced toward tail), so mean will not provide a “central” value
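As a rough illustration with made-up numbers, replacing one score with an extreme value moves the mean a long way while the median stays put. The sketch uses Python's standard `statistics` module:

```python
import statistics

typical = [10, 11, 12, 13, 14]        # hypothetical scores, no extremes
with_extreme = [10, 11, 12, 13, 100]  # same set with one extreme score

mean_typical = statistics.mean(typical)
mean_extreme = statistics.mean(with_extreme)      # pulled toward the tail
median_extreme = statistics.median(with_extreme)  # still the middle score

print(mean_typical, mean_extreme, median_extreme)
```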
When the Mean Won’t Work
nominal scale - impossible to compute mean
ordinal scale (ranks) - inappropriate to compute a mean
Weighted Mean
May need to find overall mean for more than one group
ΣX(overall sum)/N(total n)
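A minimal sketch of the overall (weighted) mean for two groups, assuming only each group's n and M are known; the group sizes and means are made up:

```python
# Hypothetical groups: sample size and sample mean for each.
n1, m1 = 12, 6.0
n2, m2 = 8, 7.0

# Recover each group's sum of scores: ΣX = n * M.
sum1 = n1 * m1
sum2 = n2 * m2

# Overall mean = overall sum / total n.
overall_mean = (sum1 + sum2) / (n1 + n2)

# Simply averaging the two means ignores the different group sizes.
unweighted = (m1 + m2) / 2

print(overall_mean, unweighted)
```

The overall mean sits closer to the mean of the larger group.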
Characteristics of the Mean
Changing a score
Adding/subtracting a score
Multiplying/dividing every score by a constant
Mean: Advantages
Calculated from all the data
Can be manipulated using an equation
Related to variance and standard deviation
Can estimate population mean
Mean: Disadvantages
Influenced by extreme scores
Value may not exist in data
The Median
scores listed in order from smallest to largest - midpoint
divides scores so 50% of scores have values equal to/less than median
requires scores that can be placed in rank order + measured on ordinal, interval/ratio scale
Relatively unaffected by extreme scores
The Median
- With an odd number of scores, list the values in order, and the median is the middle score in the list.
- With an even number of scores, list the values in order, and the median is half-way between the middle two scores.
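The two cases can be sketched as a small function; `median_of` is an illustrative name and the data are made up:

```python
def median_of(scores):
    """Median by listing the scores in order and taking the midpoint."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: the middle score in the list.
        return ordered[mid]
    # Even number of scores: halfway between the middle two scores.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median_of([11, 3, 8, 5, 10])      # ordered: 3, 5, 8, 10, 11
even_case = median_of([8, 1, 5, 1, 7, 4])    # ordered: 1, 1, 4, 5, 7, 8

print(odd_case, even_case)
```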
The Median
for a continuous variable, possible to find median by first placing scores in a frequency distribution histogram, with each score represented by a box in the graph
draw vertical line through distribution so exactly half the boxes are on each side of the line; median defined by location of the line
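One way to sketch the "line through the boxes" idea in code is linear interpolation inside the interval that holds the middle score. This assumes each score occupies an interval one unit wide, centred on its value; the function name `interpolated_median` and the data are illustrative, not taken from the course material.

```python
def interpolated_median(scores, width=1.0):
    """Median of a continuous variable by interpolating within tied scores."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = ordered[n // 2]            # score whose interval holds the median
    lower_limit = middle - width / 2    # lower edge of that interval
    below = sum(1 for x in ordered if x < middle)    # boxes left of the interval
    tied = sum(1 for x in ordered if x == middle)    # boxes stacked in the interval
    # Take just enough of the stacked boxes to reach n / 2 on the left side.
    return lower_limit + width * (n / 2 - below) / tied

scores = [1, 2, 3, 4, 4, 4, 4, 6]   # hypothetical scores, n = 8
precise_median = interpolated_median(scores)
print(precise_median)
```

Here three boxes lie below 3.5, so one of the four boxes stacked at 4 is needed to reach half of the eight, which places the line a quarter of the way into the interval from 3.5 to 4.5.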
The Median: Advantage
Relatively unaffected by extreme scores
tends to stay in the “centre” of distribution even when few extreme scores or when distribution is very skewed
The Median: Disadvantages
Ignores most of the data.
Value may not have occurred in the data
Difficult to work with.
Not stable between samples
The Mode
most frequently occurring category/score in distribution.
category/score at peak of distribution
Can be determined for data measured on any scale of measurement: nominal, ordinal, interval/ratio
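Because the mode only needs counts, it works even for nominal categories. A minimal sketch with made-up hair-colour data, using `collections.Counter` from the standard library:

```python
from collections import Counter

# Hypothetical nominal data: categories only, no numerical meaning.
colours = ["brown", "black", "brown", "blonde", "red", "brown", "black"]

counts = Counter(colours)
# most_common(1) returns a list holding the (category, frequency) pair
# with the highest count.
mode, frequency = counts.most_common(1)[0]

print(mode, frequency)
```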
The Mode: Advantage
Unaffected by extreme scores
Score actually occurred
Represents largest # of scores
Only one of the 3 that can be used for nominal data.
Can be used as supplemental measure reported along with mean/median
The Mode: Disadvantage
Based on only a few data points
Depends on how data grouped
Not representative of entire data set
Bimodal Distributions
Possible for distribution to have more than one mode - bimodal
mode often used to describe a peak that is not really the highest point: major mode at highest peak + minor mode at secondary peak in a diff location
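A rough sketch of both situations, with made-up data. `statistics.multimode` returns every score tied for the highest frequency; the peak search afterwards is a simple illustration that treats a peak as a score more frequent than both of its neighbours (it ignores the ends of the distribution and flat plateaus).

```python
import statistics

# Two equally high peaks: both scores are modes.
tied_scores = [2, 3, 3, 3, 4, 5, 6, 7, 7, 7, 8]
modes = statistics.multimode(tied_scores)

# Hypothetical frequency table with peaks of unequal height.
freq = {1: 1, 2: 3, 3: 6, 4: 3, 5: 1, 6: 2, 7: 4, 8: 2, 9: 1}
values = sorted(freq)
peaks = [
    x for i, x in enumerate(values)
    if 0 < i < len(values) - 1
    and freq[x] > freq[values[i - 1]]
    and freq[x] > freq[values[i + 1]]
]
major_mode = max(peaks, key=freq.get)               # highest peak
minor_modes = [x for x in peaks if x != major_mode]  # secondary peaks

print(modes, major_mode, minor_modes)
```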
Which measure to use?
mode ⇒ data categorical + values can fit into only one class (hair colour, political affiliation)
median ⇒ have extreme scores (income in dollars)
mean ⇒ data has no extreme scores + not categorical (numerical scores on a test)
Which measure to use?
Mean most precise measure, then median, + lastly mode
Use the most precise measure if possible
When not to use…
mean ⇒ don’t have right data scale (nominal/ordinal), distribution not unimodal/is skewed (watch out for outliers).
When not to use…
median ⇒ data is nominal/distribution not unimodal
mode ⇒ None
Central Tendency and the Shape of the Distribution
3 measures often systematically related to each other
symmetrical distribution - mean + median will always be equal.
Central Tendency and the Shape of the Distribution
symmetrical distribution - one mode ⇒ mode, mean + median have same value
skewed distribution ⇒ mode at peak on one side + mean usually displaced toward tail on other side
median usually located betw mean + mode
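A small check of that ordering on a made-up, positively skewed set of scores (tail on the right); this pattern is typical rather than guaranteed for every skewed data set:

```python
import statistics

# Hypothetical positively skewed scores: most are low, a few trail off high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9]

mode = statistics.mode(scores)      # peak of the distribution
median = statistics.median(scores)  # middle score
mean = statistics.mean(scores)      # displaced toward the tail

print(mode, median, mean)
```

Here the mode is lowest, the mean is highest, and the median falls between them.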
Reporting Central Tendency in Research Reports
sample mean = M
no standardized notation for reporting median/mode
when several means obtained for diff groups/treatment conditions ⇒ common to present all the means in a single graph
Reporting Central Tendency in Research Reports
diff groups/treatment conditions - horizontal axis + means are displayed by bar/point above each of groups
height = mean for each group.
Similar graphs also used to show several medians in 1 display
Central Tendency
Fails to give whole story
Need measure to indicate degree to which individual observations clustered about/deviate from centre
centre may represent majority of scores, or scores may be distributed over a wide range of values
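A quick illustration with made-up data: two sets of scores share the same mean, yet one is tightly clustered and the other is widely spread. The range (largest minus smallest score) is used here only as a simple stand-in for such a measure.

```python
# Hypothetical data sets with identical centres but different spread.
clustered = [48, 49, 50, 51, 52]
spread_out = [10, 30, 50, 70, 90]

mean_clustered = sum(clustered) / len(clustered)
mean_spread = sum(spread_out) / len(spread_out)

range_clustered = max(clustered) - min(clustered)
range_spread = max(spread_out) - min(spread_out)

print(mean_clustered, mean_spread, range_clustered, range_spread)
```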