Weeks 1-3 Flashcards
Population
The entire collection of events in which you are interested
E.g. all men, all women, all Deakin students
Sample
Subset of the population that is being studied
Parameter
Any value we obtain that is characteristic of the population
E.g. the average income of Australian office workers
Descriptive statistics
Used to describe the data by summarising, determining averages and ranges.
Makes large amounts of data more manageable.
Inferential statistics
Used when we want to answer research questions
I.e. When we infer the behaviour of the population based on the dataset recovered from the sample
The difference between the sample statistic and the corresponding population parameter (because a sample never represents the population perfectly)
Sampling error
Variable
Something that can take on different values
E.g. Age, speed, time
A variable that has a limited number of values
E.g. Gender, set categories
Discrete variable
A variable that can take on any value between the lowest and highest points on the scale
E.g. Time, age, IQ
Continuous variable
Dependent variable
The variable which is observed for differences / changes.
Influenced by the IV.
E.g. Levels of depression in control vs treatment groups
Independent variable
The variable which is manipulated by the researcher.
The IV influences the DV.
E.g. Group membership - participants assigned to either high or low anxiety groups
Measurement data
Data obtained by measuring something (quantitative data).
Generally summarised using the mean, variance, and standard deviation
E.g. Mean age of students
Categorical data
Data sorted into categories.
Generally summarised using percentages and frequencies
E.g. 25% were female, 12% had black hair
Nominal measurement scale
Categories with different names, no underlying scale, and no ordering.
E.g. Religion, hair colour, gender
Ordinal measurement scale
Categories with different names and organised into an ordered sequence, however distance between categories is unknown
E.g. Degree of illness (none, mild, moderate, severe)
Interval measurement scale
Equal distances between points on the scale.
Generally many more points than on an ordinal scale, usually continuous data.
No true zero point.
E.g. Temperature in °C or °F (0° does not mean "no temperature")
Ratio measurement scale
Equal distances between points on the scale AND has true zero point.
E.g. Time, length, age
What are the different kinds of measurement scales?
Nominal
Ordinal
Interval
Ratio
Frequency distribution
How often each score appears in a dataset.
Can be difficult to determine trends in larger datasets.
Same info as a frequency distribution, but graphically illustrated.
Histogram
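Worked example (a minimal sketch): the two cards above can be illustrated by counting how often each score appears and then drawing the same counts as a text histogram. The scores are made up for illustration.

```python
from collections import Counter

# Hypothetical quiz scores, invented for this example
scores = [3, 5, 4, 5, 2, 5, 4, 3, 4, 5]

# Frequency distribution: how often each score appears in the dataset
frequencies = Counter(scores)

# Histogram: the same information shown graphically,
# one '#' per occurrence, listed in order of score
for score in sorted(frequencies):
    print(score, "#" * frequencies[score])
```

The table of counts and the bars carry the same information; the bars just make the shape of the distribution easier to see.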
Stem and leaf plots
Can summarise data in a simple way
Normal distribution
Most scores in the middle, fewer in the extremes
Bimodal distribution
When a frequency distribution has two peaks
Positive skew
Most scores at the low end of the scale
Negative skew
Most scores at the high end of the scale
Kurtosis
Refers to how flat or peaked the distribution appears
Leptokurtic
Distribution characterised by high peak at the centre of the scale
Platykurtic
Distribution is flatter, with fewer scores in the centre
Central tendency
The tendency of a random variable to cluster around its mean, median, or mode
Variability
How good is the mean as a representation of the data?
Low variability
The mean is a good representation of the data
High variability
The mean is a bad representation of the data.
The data points deviate substantially from the mean.
E.g. 12, 1, 78, 10, 148: mean = 49.8
Average deviation
- Calculate the mean
- Calculate how much each score deviates from the mean
- Calculate average of the deviation
Absolute deviations
When only the absolute values of the deviations are used (the negative signs are removed)
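Worked example (a minimal sketch): using the five scores from the high variability card, this shows why the plain average deviation is not useful on its own. The signed deviations cancel out, so absolute values are taken first. Variable names are illustrative only.

```python
# Scores from the high variability card
scores = [12, 1, 78, 10, 148]

# Step 1: calculate the mean
mean = sum(scores) / len(scores)

# Step 2: calculate how much each score deviates from the mean
deviations = [x - mean for x in scores]

# Step 3: average the deviations; positives and negatives cancel,
# so this is zero apart from floating-point rounding
average_deviation = sum(deviations) / len(deviations)

# Using absolute deviations removes the signs and gives a usable value
mean_absolute_deviation = sum(abs(d) for d in deviations) / len(deviations)

print(mean, average_deviation, mean_absolute_deviation)
```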
Variance
Represented by?
Useful for?
Equation?
Measures how far a set of numbers is spread out from their average value
Represented by s² (sample) or σ² (population)
One of the most common measures of variability.
Crucial for inferential statistical methods
s² = Σ(x - x̅)² / (N - 1)
I.e. sum of the squared deviations from the mean divided by N - 1
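Worked example (a minimal sketch): the variance formula applied by hand and checked against Python's standard `statistics` module. The scores are hypothetical. `statistics.variance` divides by N - 1 (sample variance, s²), while `statistics.pvariance` divides by N (population variance, σ²).

```python
import math
import statistics

# Hypothetical scores, invented for this example
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n

# Sum of the squared deviations from the mean
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Sample variance: divide by N - 1
sample_variance = sum_of_squares / (n - 1)

# Population variance: divide by N
population_variance = sum_of_squares / n

# Both hand calculations agree with the standard library
assert math.isclose(sample_variance, statistics.variance(scores))
assert math.isclose(population_variance, statistics.pvariance(scores))
```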
Standard deviation
Equation?
Relationship to variance?
Shows how much values differ from the mean.
Standard deviation is the square root of the variance.
Standard deviation = s = √[Σ(x - x̅)² / (N - 1)]
- Calculate each score's deviation from the mean
- Square the deviations (this removes the negative signs)
- Sum the squared deviations
- Divide the sum by (N - 1)
- Take the square root of the result
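Worked example (a minimal sketch): the steps on this card written out one at a time, using hypothetical scores. The final check against `statistics.stdev` (which also divides by N - 1) is only there to confirm the arithmetic.

```python
import math
import statistics

# Hypothetical scores, invented for this example
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(scores) / len(scores)

# Step 1: deviation of each score from the mean
deviations = [x - mean for x in scores]

# Step 2: square each deviation
squared_deviations = [d ** 2 for d in deviations]

# Step 3: sum the squared deviations
total = sum(squared_deviations)

# Step 4: divide by N - 1 (this value is the variance)
variance = total / (len(scores) - 1)

# Step 5: take the square root to get the standard deviation
standard_deviation = math.sqrt(variance)

assert math.isclose(standard_deviation, statistics.stdev(scores))
```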
Mean = 0, σ = 1
Standard normal distribution
Z-scores
Equation?
Indicates how far from the mean a data point is
z = (x - x̅) / σ
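Worked example (a minimal sketch): converting hypothetical scores to z-scores. The helper name `z_score` is made up for this example, and the scores are treated as a whole population, so `statistics.pstdev` (dividing by N) is used for σ. After conversion the z-scores have a mean of 0 and a standard deviation of 1.

```python
import math
import statistics

# Hypothetical scores, treated here as a complete population
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population standard deviation

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

z_scores = [z_score(x, mean, sd) for x in scores]

# A score of 9 lies two standard deviations above the mean of 5
assert math.isclose(z_score(9, mean, sd), 2.0)

# Standardised scores have mean 0 and standard deviation 1
assert math.isclose(statistics.mean(z_scores), 0.0, abs_tol=1e-9)
assert math.isclose(statistics.pstdev(z_scores), 1.0)
```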
μ
mean of population
σ
Standard deviation of population
's' is commonly used in its place for the standard deviation of a sample
x̅
mean of sample
Setting probable limits on z
Definition?
Use?
Equation?
Allows researchers to set limits on a score to establish a certain degree of confidence in their results.
Researchers usually use 95% limits, i.e. the range within which 95% of scores are expected to fall.
x = μ ± (z × σ)
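Worked example (a minimal sketch): 95% probable limits for a normally distributed score. The population values (μ = 100, σ = 15) are hypothetical. The z-score that cuts off the middle 95% of a normal distribution is about 1.96; here it is looked up with `statistics.NormalDist` (Python 3.8+) rather than typed in.

```python
from statistics import NormalDist

# Hypothetical population values, assumed for illustration
mu = 100
sigma = 15

# The middle 95% leaves 2.5% in each tail, so look up the 97.5th percentile
z = NormalDist().inv_cdf(0.975)  # approximately 1.96

# x = mu +/- (z * sigma)
lower_limit = mu - z * sigma
upper_limit = mu + z * sigma

print(round(z, 2), round(lower_limit, 1), round(upper_limit, 1))
```

Under these assumptions, about 95% of scores are expected to fall between the two limits.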
Sampling error
Difference in means between population and sample
Hypothesis testing
Testing a hypothesis to determine whether a result can be put down to chance (sampling error) or reflects a meaningful effect
Sampling distributions
Degree of variability between samples we can expect to see by chance.
Using sampling distribution we can say how likely it is that we will find a particular sample mean within a population.
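Worked example (a minimal sketch): a small simulation of a sampling distribution. Many samples are drawn from a hypothetical normal population (mean 100, standard deviation 15), and the mean of each sample is recorded. The population values, sample size, and number of samples are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population and sampling settings
MU, SIGMA = 100, 15
SAMPLE_SIZE = 25
NUM_SAMPLES = 2000

def draw_sample_mean():
    """Draw one random sample and return its mean."""
    sample = [random.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)

# The collection of sample means approximates the sampling distribution
sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

# How much sample means vary by chance alone
spread_of_means = statistics.stdev(sample_means)

# Value predicted by theory: sigma divided by the square root of the sample size
expected_spread = SIGMA / SAMPLE_SIZE ** 0.5

print(round(spread_of_means, 2), expected_spread)
```

The sample means cluster around the population mean, and their spread is much smaller than the spread of individual scores.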
Standard error
Average distance between a sample mean and a population mean
Mean difference / Difference expected by chance (standard error)
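Worked example (a minimal sketch): estimating the standard error of the mean as s / √N, then forming the ratio of the mean difference to the difference expected by chance. The sample and the population mean of 4 are hypothetical, and the s / √N estimate is the usual textbook formula rather than something stated on these cards.

```python
import math
import statistics

# Hypothetical sample and population mean, assumed for illustration
sample = [2, 4, 4, 4, 5, 5, 7, 9]
population_mean = 4

n = len(sample)
sample_mean = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (divides by N - 1)

# Standard error: the difference expected by chance
standard_error = s / math.sqrt(n)

# Mean difference divided by the difference expected by chance
mean_difference = sample_mean - population_mean
ratio = mean_difference / standard_error

print(round(standard_error, 3), round(ratio, 3))
```

The larger this ratio, the harder it is to put the observed difference down to chance alone.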
Hypothesis
Specific, testable predictions
Null hypothesis
The hypothesis that there is no difference between certain characteristics in a population.
The starting point for any statistical test.
Type I error
When we erroneously reject a true null hypothesis
I.e. When we say something is significant, but it isn’t
Type II error
When we fail to reject a false null hypothesis
I.e. When we fail to show that something is significant when it really is
One-tailed test
A test in which the direction of the difference is predicted (directional)
Two-tailed test
A test in which no direction is predicted (non-directional)
Directionality
Predicting the direction of difference
E.g. We predict that the sample mean will be higher than the population mean