PSY201: Chapter 4 - Variability Flashcards
Variability
distribution only partially described through a measure of central tendency
describe distributions in terms of central tendency + variability
Variability
describe how much scores differ from that average
Variability
to obtain a measure of how spread out the scores are in a distribution
usually accompanies measure of central tendency as basic descriptive statistics for a set of scores
Central Tendency
describes central point of the distribution
variability describes how scores are scattered around that central point
Central Tendency and Variability
2 primary values used to describe distribution of scores
Variability
distributions differ from each other in terms of how much scores deviate from the mean
Variability
shows how well an individual score represents the entire distribution
how much error to expect - important for making conclusions from small samples using inferential statistics.
Variability
both descriptive measure + important component of most inferential statistics
descriptive statistic - measures degree to which scores are spread out/clustered together in a distribution
Variability
inferential statistics - measure of how accurately any individual score/sample represents the entire population.
Variability
pop variability small ⇒ scores clustered close together + individual score/sample will provide good representation of entire set
variability large ⇒ scores widely spread, easy for 1 or 2 extreme scores to give distorted picture of general pop
Measuring Variability
Range: Diff betw highest + lowest score
Interquartile range: Distance betw the 25th percentile (Q1) + the 75th percentile (Q3)
Standard Deviation/Variance: variance = avg squared distance from mean; standard deviation = square root of variance - most important variability measure
variability determined by measuring distance
Range
distance from largest-smallest score in distribution
defined in terms of distance - interval/ratio scale measurements of continuous variable
Range
take diff betw upper real limit of largest X + lower real limit of smallest X value
Range = URL for Xmax − LRL for Xmin
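A minimal sketch of the real-limits calculation, assuming scores are recorded to the nearest whole unit so each real limit sits half a unit from the score. The function name `range_with_real_limits` and the `unit` parameter are illustrative, not from the chapter.

```python
def range_with_real_limits(scores, unit=1.0):
    """Range = upper real limit of Xmax - lower real limit of Xmin.

    Assumes scores are recorded to the nearest `unit`, so the real
    limits lie half a unit above and below each recorded score.
    """
    upper_real_limit = max(scores) + unit / 2
    lower_real_limit = min(scores) - unit / 2
    return upper_real_limit - lower_real_limit


scores = [3, 7, 12, 8, 5, 10]  # made-up scores for illustration
result = range_with_real_limits(scores)  # 12.5 - 2.5
```

For whole-number scores this works out to Xmax − Xmin + 1.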
Range
simple way to describe spread of scores
completely dependent on max + min scores
Outliers can have huge influence on this measure of dispersion
range considered to be least important measure of variability
Interquartile Range
avoid being influenced by extreme, potentially unrepresentative, scores
distance covered by middle 50% of distribution
25th, 50th + 75th %iles are “quartiles” because they cut the sample into four equal parts.
= Q3 – Q1
Interquartile Range
figure out how many scores represent 1⁄4 of the data
refer to range of middle 2 quarters
Interquartile range = Q3 - Q1
Interquartile Range
semi-interquartile range: half of the interquartile range
measures distance from middle of the distribution to boundaries that define the middle 50% = (Q3−Q1)/2
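A short sketch of the interquartile and semi-interquartile range using Python's standard library. The data are made up, and `method="inclusive"` is only one of several quartile conventions, so a textbook hand calculation may give slightly different cut points.

```python
import statistics

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up scores for illustration

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
# The "inclusive" method is an assumption; other conventions exist.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1        # distance covered by the middle 50% of scores
semi_iqr = iqr / 2   # half of the interquartile range
```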
Interquartile Range
more stable than the range - not influenced by outliers
disadvantage - using only the middle 50% of scores leaves out much of the data, so it doesn't give a complete pic of variability
considered to be a crude reduction of the data
Standard Deviation and Variance for a Population
better measure - considers distance of each score from the mean
want to measure standard/typical distance from the mean
Standard Deviation and Variance for a Population
Deviation score = X − μ
Standard Deviation and Variance for a Population
sign (+/−) - direction of score from mean (above/below); number - distance from mean
Standard Deviation and Variance for a Population
Because deviation scores built about mean - must sum to zero
Sum of deviations = Σ(X − μ) = 0
Standard Deviation and Variance for a Population
avg deviation of scores around mean always zero
meaningless measure for variability
Standard Deviation and Variance for a Population
Square each deviation before summing - the sum of squares (SS) takes into account the magnitude but not the direction of each difference from the mean
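The sketch below illustrates why the plain sum of deviations is useless as a variability measure and what squaring changes. The five scores are made up and treated as a complete population.

```python
scores = [1, 9, 5, 8, 7]          # made-up population, N = 5
mu = sum(scores) / len(scores)    # population mean

# Signed distances from the mean: positives and negatives cancel out.
deviations = [x - mu for x in scores]
sum_of_deviations = sum(deviations)

# Squaring removes the sign, so the distances no longer cancel.
squared_deviations = [d ** 2 for d in deviations]
ss = sum(squared_deviations)      # sum of squares (SS)
```

Here the deviations are −5, 3, −1, 2, 1, which sum to 0, while the squared deviations sum to SS = 40.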
Population variance
mean of the squared deviations
avg of squared distances from the mean (sum of squares divided by N)
Standard Deviation
we correct for squaring the deviations by taking the square root of variance
=√variance
Standard Deviation
square root of the avg squared deviation
avg distance from the mean.
Standard Deviation
we cannot compute for nominal/ordinal scales
sum of squared deviations
SS = Σ(X − μ)^2
Find each deviation score
Square each deviation score + Add squared deviations
Variance
σ^2 = SS/N
mean squared deviation
Standard Deviation
σ = square root of mean squared deviations
= √(SS/N)
estimate of average deviation
σ = √(Σ(X − μ)^2 / N)
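A minimal worked example of the population formulas, using made-up scores treated as an entire population. The standard-library functions `statistics.pvariance` and `statistics.pstdev` are used only as a cross-check on the hand calculation.

```python
import math
import statistics

scores = [1, 9, 5, 8, 7]   # made-up population
N = len(scores)
mu = sum(scores) / N       # population mean

ss = sum((x - mu) ** 2 for x in scores)  # SS = sum of squared deviations
variance = ss / N                        # population variance = SS / N
sd = math.sqrt(variance)                 # population SD = sqrt(variance)

# Cross-check against the standard library's population functions.
lib_variance = statistics.pvariance(scores)
lib_sd = statistics.pstdev(scores)
```

With these scores SS = 40, so the variance is 40/5 = 8 and the standard deviation is √8, about 2.83.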
Sample Variance and Standard Deviation
need to estimate population parameters from sample
sample variability tends to be smaller than pop variability ⇒ uncorrected sample variance gives a biased estimate
it underestimates pop variance
Sample Variance and Standard Deviation
SS = Σ(X − M)^2
Computational Formula: SS = ΣX^2 − (ΣX)^2/n
s^2 = Σ(X − M)^2/(n − 1) = SS/(n − 1)
s = √(Σ(X − M)^2/(n − 1))
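The sketch below checks, on made-up sample data, that the definitional and computational formulas for SS agree, then applies the n − 1 divisor. Variable names are illustrative.

```python
import math

sample = [1, 6, 4, 3, 8, 7, 6]   # made-up sample, n = 7
n = len(sample)
M = sum(sample) / n              # sample mean

# Definitional formula: sum of squared deviations from M.
ss_definitional = sum((x - M) ** 2 for x in sample)

# Computational formula: sum of X^2 minus (sum of X)^2 / n.
ss_computational = sum(x ** 2 for x in sample) - sum(sample) ** 2 / n

s2 = ss_definitional / (n - 1)   # sample variance, divides by n - 1
s = math.sqrt(s2)                # sample standard deviation
```

Both formulas give SS = 36 here (211 − 35²/7 = 211 − 175), so s² = 36/6 = 6 and s is about 2.45.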
Summary of Computing Standard Deviations
Compute deviation (distance from the mean) for each score.
Square each deviation.
Compute mean of the squared deviations.
sum the squared deviations (SS) + divide by N
Summary of Computing Standard Deviations
for a sample: divide the sum of the squared deviations (SS) by n − 1, rather than N.
n − 1 = df: with this correction, sample variance will provide an unbiased estimate of the pop variance
take square root of variance to obtain the standard deviation.
Degrees of freedom (df) and bias
represents # of scores in sample that are independent + free to vary
We estimate pop mean with sample mean, M ⇒ once M is fixed, only n − 1 scores are free to vary, so df = n − 1
Sample Variance and Standard Deviation
apply corrections to formulas using sample data ⇒ unbiased estimates of pop variance
avg of all possible sample variances from a pop will produce an accurate estimate of pop variance
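One way to see the "unbiased" claim is to enumerate every possible sample from a tiny population. The sketch below assumes a made-up population of three equally likely scores and samples of n = 2 drawn with replacement; the helper name `variance` and its `divisor` parameter are illustrative.

```python
from itertools import product

population = [0, 3, 9]   # made-up population
mu = sum(population) / len(population)
pop_variance = sum((x - mu) ** 2 for x in population) / len(population)


def variance(sample, divisor):
    """SS around the sample mean, divided by the given divisor."""
    m = sum(sample) / len(sample)
    ss = sum((x - m) ** 2 for x in sample)
    return ss / divisor


n = 2
# All 9 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

# Average sample variance using n - 1 (corrected) and n (uncorrected).
mean_corrected = sum(variance(s, n - 1) for s in samples) / len(samples)
mean_uncorrected = sum(variance(s, n) for s in samples) / len(samples)
```

In this example the population variance is 14. Averaging the nine corrected sample variances also gives 14, while the uncorrected version averages 7, an underestimate.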
More About Variance and Standard Deviation
in a normal distribution, about 68% of scores lie within one standard deviation of the mean
about 95% within two standard deviations
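These percentages can be checked for a normal distribution with the standard library's `statistics.NormalDist`. This is a sketch that assumes the scores are normally distributed; for other shapes the proportions can differ.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)   # standard normal distribution

# Proportion of scores within 1 and within 2 standard deviations of the mean.
within_one = z.cdf(1) - z.cdf(-1)
within_two = z.cdf(2) - z.cdf(-2)
```

The results are roughly 0.68 and 0.95.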
Transformations of Scale
add constant to each score - value of mean changes
each score still same distance away from mean
Transformations of Scale
adding constant will move each score so entire distribution is shifted to a new location
centre of the distribution (the mean) changes, but standard deviation remains the same
Transformations of Scale
Multiplying each score by a constant will multiply the distance betw scores by that constant
standard deviation is a measure of distance, so it will also be multiplied by the same constant
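Both transformation rules can be checked on a small set of made-up scores. The constants 5 and 2 are arbitrary choices for illustration.

```python
import math
import statistics

scores = [1, 9, 5, 8, 7]              # made-up scores
shifted = [x + 5 for x in scores]     # add a constant to every score
scaled = [x * 2 for x in scores]      # multiply every score by a constant

sd_original = statistics.pstdev(scores)
sd_shifted = statistics.pstdev(shifted)
sd_scaled = statistics.pstdev(scaled)

# Adding a constant moves the mean but leaves the SD unchanged.
mean_moved = statistics.mean(shifted) - statistics.mean(scores)
sd_same_after_shift = math.isclose(sd_shifted, sd_original)

# Multiplying by a constant multiplies the SD by that constant.
sd_doubled_after_scale = math.isclose(sd_scaled, 2 * sd_original)
```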