1.2 Flashcards
Population
The collection of units to which we want to generalize a set of findings or a statistical model.
(e.g. people, plankton, plants, cities, suicidal authors)
Sample
A smaller (but hopefully representative) collection of units from a population used to determine truths about that population.
The mean is a model of
what happens in the real world: the typical score.
It is not a perfect representation of the data.
A deviation is…
the difference between an observed data point and the mean (observed score minus the mean).
Sum of Squared Errors
- We could add the deviations to find out the total error.
- But the deviations cancel out because some are positive and others negative, so their sum is always zero.
- Therefore, we square each deviation.
- If we add these squared deviations we get the sum of squared errors (SS).
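A minimal sketch of these steps in Python. The five scores are hypothetical, chosen only for illustration:

```python
# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 3, 3, 4]

mean = sum(scores) / len(scores)           # the model of the data: 2.6
deviations = [x - mean for x in scores]    # observed score minus the mean

total_error = sum(deviations)              # cancels out to (approximately) zero
ss = sum(d ** 2 for d in deviations)       # sum of squared errors (SS)

print(f"SS = {ss:.2f}")                    # SS = 5.20
```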
Variance
- The sum of squares is a good measure of overall variability, but it depends on the number of scores.
- We calculate the average variability by dividing by the number of scores minus 1 (which is called the degrees of freedom).
- This value is called the variance (s^2).
s^2 = SS / (N - 1)
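A short Python sketch of the variance, using hypothetical scores. The cross-check uses statistics.variance from the standard library, which also divides by N - 1:

```python
import statistics

scores = [1, 2, 3, 3, 4]                    # hypothetical scores
n = len(scores)
mean = sum(scores) / n
ss = sum((x - mean) ** 2 for x in scores)   # sum of squared errors: 5.2

variance = ss / (n - 1)                     # divide by the degrees of freedom
print(f"s^2 = {variance:.2f}")              # s^2 = 1.30

# Cross-check against the standard library's sample variance.
print(f"{statistics.variance(scores):.2f}")  # 1.30
```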
The variance has one problem:
It is measured in squared units, which makes it difficult to interpret.
The standard deviation
Because the variance is measured in squared units, we take its square root to get back to the original units of measurement: s = √(SS / (N - 1)).
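As a quick illustration with hypothetical scores, the standard deviation is just the square root of the sample variance:

```python
import math
import statistics

scores = [1, 2, 3, 3, 4]                  # hypothetical scores
variance = statistics.variance(scores)    # sample variance, SS / (N - 1)

sd = math.sqrt(variance)                  # back in the original units
print(f"s = {sd:.2f}")                    # s = 1.14
```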
The sum of squares, variance, and standard deviation represent the same thing:
- The ‘fit’ of the mean to the data
- The variability in the data
- How well the mean represents the observed data
- Error
Central Limit Theorem
As the sample size gets large (a common rule of thumb is N > 30), the distribution of the sample means will be approximately normal and centred on the population mean.
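A rough simulation can make this concrete. The sketch below assumes a skewed (exponential) population with mean 1, which is not from the notes and is only there to show that the sample means still behave roughly normally:

```python
import random
import statistics

random.seed(42)          # fixed seed so the run is repeatable

N = 30                   # size of each sample
N_SAMPLES = 2000         # number of samples drawn

# Assumed population: exponential with mean 1 (clearly not normal).
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(N_SAMPLES)
]

centre = statistics.fmean(sample_means)   # should sit close to 1
spread = statistics.stdev(sample_means)

# In a normal distribution about 68% of values lie within 1 SD of the centre.
within_1sd = sum(abs(m - centre) <= spread for m in sample_means) / N_SAMPLES

print(f"mean of sample means = {centre:.3f}")
print(f"share within 1 SD    = {within_1sd:.2f}")
```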
Central Limit Theorem
How can we measure the accuracy of this average?
- We can use the standard deviation of the sample means.
- In principle, we could collect a very large number of samples and calculate the standard deviation of their means around the population mean.
- Because this is tedious and almost impossible, statisticians have found an approximation.
-> approximation = standard error (the sample standard deviation divided by the square root of the sample size: s / √N)
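The sketch below compares the tedious route with the shortcut. The normal population (mean 100, SD 15), the sample size and the number of samples are all assumptions made for illustration:

```python
import math
import random
import statistics

random.seed(7)           # fixed seed so the run is repeatable
N = 25                   # size of each sample


def draw_sample():
    """One sample of N scores from an assumed normal population (mean 100, SD 15)."""
    return [random.gauss(100, 15) for _ in range(N)]


# The tedious route: many samples, then the SD of their means.
sample_means = [statistics.fmean(draw_sample()) for _ in range(5000)]
sd_of_means = statistics.stdev(sample_means)

# The shortcut: estimate the standard error from a single sample.
one_sample = draw_sample()
standard_error = statistics.stdev(one_sample) / math.sqrt(N)

print(f"SD of sample means = {sd_of_means:.2f}")
print(f"standard error     = {standard_error:.2f}")
```

With these assumed values both numbers should land near 15 / √25 = 3, although the single-sample estimate varies from sample to sample.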
Test Statistics
- A statistic for which the frequency of particular values is known.
- Observed values can be used to test hypotheses.
Type I error
- occurs when we believe that there is a genuine effect in our population when, in fact, there isn’t
- The probability of making this error is the α-level (usually .05)
Type II error
- occurs when we believe that there is no effect in the population when, in reality, there is.
- Or, put differently: when we use tests, do not find an effect, but there really is one.
- The probability of making this error is the β-level (often .2)
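Both error rates can be estimated by simulation. The sketch below is only an illustration: it assumes a two-tailed z-test with a known population SD of 1, samples of 30, and a true effect of 0.5 for the Type II case. None of these values come from the notes.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(3)           # fixed seed so the run is repeatable

ALPHA = 0.05
N = 30                   # scores per simulated experiment
N_EXPERIMENTS = 10_000
SIGMA = 1.0              # population SD, assumed known
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)   # about 1.96


def finds_effect(true_mean):
    """Run one simulated experiment; True if the z-test is significant."""
    sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    z = fmean(sample) / (SIGMA / math.sqrt(N))
    return abs(z) > critical_z


# Type I: no effect in the population (mean 0), but the test finds one.
type_1_rate = sum(finds_effect(0.0) for _ in range(N_EXPERIMENTS)) / N_EXPERIMENTS

# Type II: a real effect (mean 0.5), but the test misses it.
type_2_rate = sum(not finds_effect(0.5) for _ in range(N_EXPERIMENTS)) / N_EXPERIMENTS

print(f"Type I rate  = {type_1_rate:.3f}")
print(f"Type II rate = {type_2_rate:.3f}")
```

Under these assumptions the Type I rate should come out close to .05 and the Type II rate close to .2. The Type II rate changes with the effect size and the sample size.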
Examples of Type I / Type II errors
Type I: we believe there is a pregnancy, but there actually isn’t (false positive)
Type II: we believe there is no pregnancy, but there actually is one (false negative)
Type I: COVID test, you think you have it, but you don’t
Type II: COVID test, you think you don’t have it, but you do