Chance Variability Flashcards
Statistical models
Models that are much simpler than the “real” data-generating process but
(hopefully) capture its key features, at least in terms of the random variability of the data.
The box model
A very simple statistical model for numerical data.
* A collection of N objects (e.g. tickets or balls) is imagined “in a box”.
* Each object bears a number.
* A random sample of a certain number of the objects is taken.
* The sampling may be with or without replacement.
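A minimal sketch of the box model in Python, using only the standard library. The six-ticket box, the sample size of 4, and the seed are illustrative assumptions, not part of the flashcards.

```python
import random

# Hypothetical box of N = 6 numbered tickets (values chosen for illustration).
box = [1, 2, 3, 4, 5, 6]
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# With replacement: each ticket goes back in the box, so repeats are possible.
with_replacement = rng.choices(box, k=4)

# Without replacement: a drawn ticket stays out, so no ticket is drawn twice.
without_replacement = rng.sample(box, k=4)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```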
Random samples
A sample of the appropriate size is taken in such a way that each possible sample is equally likely
Suppose that for a box model with a large number of tickets, the histogram of the list of numbers is “bell-shaped”
we only need to approximate (e.g. with a normal curve)
* certain areas under the histogram boxes and thus
* certain chances/probabilities
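One way to picture this card, as a sketch only: the box below (totals of four dice) and the use of a normal curve with the box's mean and SD are assumptions chosen for illustration. The area under the curve is compared with the exact area of the histogram blocks from 10 to 18.

```python
import itertools
import statistics

# Hypothetical bell-shaped box: one ticket for each of the 6**4 outcomes
# of rolling four dice, labelled with the total.
box = [sum(roll) for roll in itertools.product(range(1, 7), repeat=4)]

mean = statistics.mean(box)
sd = statistics.pstdev(box)

# Exact chance that a single random draw lands between 10 and 18 inclusive.
exact = sum(1 for x in box if 10 <= x <= 18) / len(box)

# Normal-curve area over the same histogram blocks (edges at 9.5 and 18.5).
curve = statistics.NormalDist(mu=mean, sigma=sd)
approx = curve.cdf(18.5) - curve.cdf(9.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```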
Random draw =
Expected value + Chance error
Standard error
The “root-mean-square” (RMS) of the numbers in the error box.
* Measures the typical “size” of the chance errors.
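A small sketch of the root-mean-square calculation for a single draw. The box contents are an assumption for illustration; the error box is built by subtracting the expected value from every ticket.

```python
import math
import statistics

# Hypothetical box (illustrative values only).
box = [1, 2, 3, 4, 5, 6]

# Expected value of a single draw: the mean of the box.
expected_value = sum(box) / len(box)

# Error box: each ticket replaced by its chance error (ticket - expected value).
error_box = [x - expected_value for x in box]

# Root-mean-square: square the errors, take their mean, then the square root.
standard_error = math.sqrt(sum(e ** 2 for e in error_box) / len(error_box))

print("SE of a single draw:", standard_error)
print("population SD of box:", statistics.pstdev(box))
```

For a single draw, the RMS of the error box is the same number as the population SD of the original box.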
sample sum
- taking a sample and then
- computing the sum
(e.g. taking a sample of size n = 2 and then computing its sum is the same as taking a single random draw from a larger box that holds the sum of every possible sample)
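The “larger box” idea can be sketched by listing every possible sample. The three-ticket box and draws with replacement are assumptions for illustration.

```python
import itertools
import math
import statistics

# Hypothetical box and sample size (illustrative values only).
box = [1, 2, 3]
n = 2

# The larger box: one ticket per equally likely ordered sample of size n
# (with replacement), labelled with that sample's sum.
sum_box = [sum(sample) for sample in itertools.product(box, repeat=n)]

ev_sum = statistics.mean(sum_box)    # expected value of the sample sum
se_sum = statistics.pstdev(sum_box)  # standard error of the sample sum

print("tickets in the larger box:", sorted(sum_box))
print("EV(sum):", ev_sum, " n * mean:", n * statistics.mean(box))
print("SE(sum):", se_sum, " sqrt(n) * SD:", math.sqrt(n) * statistics.pstdev(box))
```

A single random draw from `sum_box` has the same chances as drawing two tickets from `box` and adding them.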
SE(mean) gets smaller as…
the sample size gets bigger
What are the new mean and SD if we multiply each value by c?
multiply the old mean by c and the old SD by |c| (an SD is never negative)
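A quick numerical check of the scaling rule, with a negative multiplier to show that the SD scales by the absolute value of c. The values and c = -3 are illustrative assumptions.

```python
import statistics

# Hypothetical list of values and multiplier (illustrative only).
values = [2, 4, 4, 6]
c = -3

scaled = [c * x for x in values]

old_mean, old_sd = statistics.mean(values), statistics.pstdev(values)
new_mean, new_sd = statistics.mean(scaled), statistics.pstdev(scaled)

print("new mean:", new_mean, " c * old mean:", c * old_mean)
print("new SD:  ", new_sd, " |c| * old SD:", abs(c) * old_sd)
```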
sample mean =
sample sum/n
EV(mean) =
the mean of the box (population mean), or EV(sum)/n
SE(mean) =
SD/sqrt(n), where SD is the SD of the box and draws are made with replacement, or SE(sum)/n
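The two formulas for the sample mean can be checked by brute force on a small box. This is a sketch assuming draws with replacement; the box and n = 3 are illustrative.

```python
import itertools
import math
import statistics

# Hypothetical box and sample size (illustrative values only).
box = [1, 2, 3, 4]
n = 3

# Mean of every equally likely ordered sample of size n (with replacement).
sample_means = [sum(s) / n for s in itertools.product(box, repeat=n)]

ev_mean = statistics.mean(sample_means)    # EV(mean) by enumeration
se_mean = statistics.pstdev(sample_means)  # SE(mean) by enumeration

box_mean = statistics.mean(box)
box_sd = statistics.pstdev(box)

print("EV(mean):", ev_mean, " box mean:", box_mean)
print("SE(mean):", se_mean, " SD/sqrt(n):", box_sd / math.sqrt(n))
```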
The sample mean actually “____” as the sample size n → ∞.
converges to the population mean
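A simulation sketch of this convergence. The die-like box, the seed, and the checkpoints are assumptions, and the exact numbers printed depend on the seed.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
box = [1, 2, 3, 4, 5, 6]  # hypothetical box (illustrative)
population_mean = sum(box) / len(box)

checkpoints = {10, 100, 1_000, 10_000, 100_000}
means_at = {}
running_sum = 0

# Draw with replacement and record the sample mean at each checkpoint.
for n in range(1, 100_001):
    running_sum += rng.choice(box)
    if n in checkpoints:
        means_at[n] = running_sum / n

for n in sorted(means_at):
    gap = abs(means_at[n] - population_mean)
    print(f"n = {n:>6}: sample mean = {means_at[n]:.4f}, gap = {gap:.4f}")
```

The gap to the population mean tends to shrink as n grows, although it need not shrink at every single step.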
Gambler’s fallacy
The mistaken belief that if, in a long series of random draws with replacement, a number fails to be drawn in the early part of the sequence, it is more likely to be drawn later on.
The law of large numbers (or law of averages) implies the gambler’s fallacy.
False
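A simulation sketch of why the fallacy fails, assuming a box with two tickets (H and T) and draws with replacement. It looks only at the draws that come straight after a run of three T's and checks how often H turns up next. The run length, seed, and number of draws are illustrative.

```python
import random

rng = random.Random(2)  # fixed seed so the run is reproducible
flips = [rng.choice("HT") for _ in range(200_000)]

run = 3  # length of the "H has failed to appear" streak (illustrative)

# Keep only the draws that immediately follow `run` consecutive T's.
after_tails_run = [
    flips[i]
    for i in range(run, len(flips))
    if all(f == "T" for f in flips[i - run:i])
]

heads_rate = after_tails_run.count("H") / len(after_tails_run)
print(f"draws after {run} T's in a row: {len(after_tails_run)}")
print(f"proportion of H on the next draw: {heads_rate:.4f}")
```

The proportion stays close to 1/2: the box does not “remember” earlier draws. The law of large numbers works by swamping early imbalances with many later draws, not by correcting them.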