week 4 -- SD as a ruler & z-scores Flashcards
Mean (equation)
mean y = sum of y / n
ȳ = Σy / n
descriptive vs. inferential
descriptive statistics summarize our sample
inferential statistics are statements about the population
→ to make inferences about a population, I have to construct a model of that population
Statistic
item of numerical info about the SAMPLE
Parameter
item of numerical info about the MODEL (i.e., the POPULATION)
Estimator
a statistic used to estimate a parameter (e.g., sample mean)
Error
NON-SYSTEMATIC difference between estimator and parameter
Bias
SYSTEMATIC difference between estimator and parameter
Standard deviation – a measure to quantify spread of a sample (or population)
Allows us to answer: “How remarkable is a single observed value?”
algebraically = square root of variance
s = √( Σ(y − ȳ)² / n )
or with Bessel’s correction: s = √( Σ(y − ȳ)² / (n − 1) )
shows how close a data point is to the mean of the sample, BUT observations in a sample are, taken together, closer to their own mean than to the population mean. SO the uncorrected SD is a biased estimator that tends to underestimate the population SD (OK as a purely descriptive statistic)
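A minimal sketch of why the uncorrected formula is biased, assuming a made-up population of five values and samples of size 2 drawn with replacement (the names population and samples are illustrative, not from the course). It works with the variance (the square of the SD) and averages it over every possible sample, using statistics.pvariance (divide by n) and statistics.variance (divide by n - 1):

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3, 4, 5]        # hypothetical tiny population
sigma2 = pvariance(population)      # true population variance (divide by N)

# Every possible ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_uncorrected = mean(pvariance(s) for s in samples)  # divide by n
avg_corrected = mean(variance(s) for s in samples)     # divide by n - 1

print("population variance:", sigma2)
print("average uncorrected sample variance:", avg_uncorrected)
print("average corrected sample variance:", avg_corrected)
```

Under these assumptions the uncorrected average comes out at half the true variance, while the corrected average matches it. That gap is systematic, which is what makes it bias rather than error.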
What is the trick for comparing performance between very different-looking values (e.g., meters run vs. time run)?
Standard deviation!
(use as a “ruler” to measure distance from the mean)
expressing distance with SD “standardizes” the performances
z-score 1
allows us to compare apples and oranges (eliminates units)
letter z denotes values that have been standardized!!
(with mean & SD)
z = (y − ȳ) / s
z-score = (performance − mean performance) / standard deviation
z-score 2
Comparison shows us which score is more extraordinary
z-scores have NO UNITS
they tell us how far a value is from the mean, measured in standard deviations
2 = 2 SD above the mean
-1.5 = 1.5 SD below the mean
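A small sketch of the z-score formula applied to two events with different units. The function name z_score and all the numbers (means, SDs, performances) are made up for illustration:

```python
def z_score(y, mean, sd):
    """Distance of y from the mean, measured in standard deviations."""
    return (y - mean) / sd

# Hypothetical performances and field statistics.
long_jump = z_score(6.5, mean=6.0, sd=0.25)   # metres
run_800m = z_score(130, mean=136, sd=4)       # seconds

print(long_jump)  # 2.0  -> 2 SD above the mean
print(run_800m)   # -1.5 -> 1.5 SD below the mean

# Compare how extraordinary each result is, ignoring direction.
more_extraordinary = "long jump" if abs(long_jump) > abs(run_800m) else "800 m run"
print(more_extraordinary)
```

For a timed run a lower value is better, so the negative z-score is the good direction here. Comparing absolute values shows the long jump is the more extraordinary of the two.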
shifting data
plus or minus
Only measures of position change (center, min, max)
Neither shape nor spread changes (range, IQR, SD)
rescaling data
multiply or divide
all measures of position (mean, median) and spread change
shape remains constant
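The shifting and rescaling cards can be checked on a tiny data set. This is a sketch with made-up values; summary is a hypothetical helper, and statistics.pstdev is the uncorrected (divide by n) SD:

```python
from statistics import mean, median, pstdev

def summary(data):
    """Return a few measures of position and spread for a list of numbers."""
    return {
        "mean": mean(data),
        "median": median(data),
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
        "sd": pstdev(data),
    }

data = [2, 4, 4, 6, 9]              # made-up values
shifted = [y + 10 for y in data]    # shifting: add a constant
rescaled = [y * 3 for y in data]    # rescaling: multiply by a constant

for name, values in [("original", data), ("shifted", shifted), ("rescaled", rescaled)]:
    print(name, summary(values))
```

In the shifted data the mean, median, min and max all move by 10 while the range and SD stay put. In the rescaled data every measure, position and spread alike, is multiplied by 3.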
standardizing into z-scores shifts data by the mean and rescales them by the standard deviation
Shape stays constant, center changes (mean = 0), spread changes (SD = 1)
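A short sketch of standardizing, assuming made-up data and the sample SD with Bessel's correction (statistics.stdev); standardize is a hypothetical helper name:

```python
from statistics import mean, stdev

def standardize(data):
    """Convert each value to a z-score: (y - mean) / SD."""
    y_bar = mean(data)
    s = stdev(data)   # sample SD with Bessel's correction (n - 1)
    return [(y - y_bar) / s for y in data]

data = [2, 4, 4, 6, 9]    # made-up values
z = standardize(data)

print(z)
print(mean(z))    # approximately 0, up to floating-point rounding
print(stdev(z))   # approximately 1, up to floating-point rounding
```

The z-scores keep the same order and relative spacing as the original values, which is the sense in which the shape is unchanged.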
A statistical model is always wrong. Explain.
it is “wrong” in the sense that it doesn’t match reality exactly