Week 4 Flashcards
Statistically Significant
- If the probability of a result occurring by chance is .05 or less, the result is deemed statistically significant
- SPSS correlation tables flag correlations significant at the .05 level with one asterisk (*) and those significant at the .01 level with two (**)
- These levels are upper bounds only: if a correlation is significant at .05, the probability of getting a correlation that strong by chance alone is less than .05
- Statistical significance requires an understanding of sampling, probability, and error
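To make the idea of "due to chance" concrete, here is a rough sketch that estimates a probability for a correlation by shuffling. This is a permutation test, which is not how SPSS computes its values, and the hours/score data are made up purely for illustration.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up illustration data: hours studied and exam score for 10 students.
hours = [2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
score = [52, 50, 61, 58, 66, 70, 65, 78, 74, 85]

observed = pearson_r(hours, score)

# Shuffle the scores many times to see how large r gets by chance alone.
rng = random.Random(0)  # seeded so the run is repeatable
shuffled = score[:]
trials = 5_000
as_extreme = 0
for _ in range(trials):
    rng.shuffle(shuffled)
    if abs(pearson_r(hours, shuffled)) >= abs(observed):
        as_extreme += 1

# Proportion of shuffles with a correlation at least as strong as the real one.
p_value = as_extreme / trials
print(f"r = {observed:.2f}, p = {p_value:.4f}, significant at .05: {p_value <= 0.05}")
```

If very few shuffles produce a correlation as strong as the observed one, chance alone is an unlikely explanation, which is what "statistically significant" is getting at.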
Population
The entire membership of the group you are interested in
For example…
• University students
• Australians
• Australians enrolled to vote
• Those suffering agoraphobia
• Left-handed people born in a month with an r in it
Sample
- Representative subset of population
* Samples vary depending on who is included
Can a sample be generalised to a population?
• Is the outcome a fluke?
• Individuals vary
• Groups/samples vary
• How much variation is random chance?
-> Answered by using probability
Inferential statistics
Draws conclusions about a population based on the observation of a sample
• Uses probability to determine the conclusions that can be made
• If nothing else is known, the statistics of a sample (e.g., the mean) are the best estimates of the population parameters (e.g., height of GU students based on this class).
• But samples may fail to provide good estimates of population for two reasons:
Sampling bias and Sampling error
Probability
Probability allows prediction of random events
- unpredictable in the short term
- predictable over the long term
Sample space
list of all possible outcomes
Simple probability
Outcomes that satisfy condition / Sample space
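A small worked example of the ratio above, assuming a fair six-sided die (so every outcome is equally likely) and "the roll is even" as the condition. Both the die and the condition are chosen only for illustration.

```python
from fractions import Fraction

# Sample space: every possible outcome of rolling one fair six-sided die.
sample_space = [1, 2, 3, 4, 5, 6]

# Condition of interest (chosen for illustration): the roll is even.
favourable = [outcome for outcome in sample_space if outcome % 2 == 0]

# Simple probability = outcomes that satisfy the condition / size of sample space.
p_even = Fraction(len(favourable), len(sample_space))

print(p_even)         # 1/2
print(float(p_even))  # 0.5
```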
Probability rules
- Any probability is a number between 0 and 1
-> 0 = will never happen
-> 1 = will always happen
- All possible outcomes together have a total probability of p = 1
- The probability that one or another of two mutually exclusive events occurs is the sum of their individual probabilities (Addition rule)
- The probability an event does not occur is 1 minus the probability it does occur
- The probability that two independent events occur together is the product of the probability of each separate event on its own (Multiplication Rule)
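The rules above can be checked with a fair six-sided die. This is a sketch under two assumptions: the addition rule is applied to mutually exclusive outcomes (one roll cannot show both 1 and 2), and the multiplication rule is applied to two separate, independent rolls.

```python
from fractions import Fraction

# One fair six-sided die: each face has probability 1/6.
p_face = {face: Fraction(1, 6) for face in range(1, 7)}

# All possible outcomes together have a total probability of 1.
total = sum(p_face.values())

# Addition rule (mutually exclusive events): P(1 or 2) = P(1) + P(2).
p_one_or_two = p_face[1] + p_face[2]

# Complement rule: P(not 6) = 1 - P(6).
p_not_six = 1 - p_face[6]

# Multiplication rule (independent events): two separate rolls both show 6.
p_double_six = p_face[6] * p_face[6]

print(total, p_one_or_two, p_not_six, p_double_six)  # 1 1/3 5/6 1/36
```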
Statistical Inference
- foundation of hypothesis testing
• We use sample data to make inferences about population parameters
• This allows the researcher to determine the probability that a sample is from one population and not another
• It enables the researcher to evaluate the veracity (“truth”) of a hypothesis as if a whole population was available instead of just a small (but hopefully representative) sample
Sampling bias
• Due to faulty sampling methods, some important subgroups of the population may be over- or under-represented in our sample
• Systematic variation (e.g., inadvertently got very tall students)
-> avoidable through random sampling
Issues with sampling
• How they are chosen/measured
- Must be representative
- Measures must be reliable and valid
• How big they are
- Larger samples more reliably represent the population
- Samples vary to some degree randomly
• The size and form of this random variation can be estimated
Selecting a sample
• Identify the population of interest
• Identify the potential participant pool
-> Does the pool represent the population?
• Consider recruitment methods
-> Will the recruitment method give equal access to all?
-> Will any restrictions create bias in the sample?
• If the sample is biased, the results won’t be generalisable
Visualising a selected sample
- general population
- target population
- potential participants
- actual sample studied
- funnels down
Types of samples
- sample of convenience
- simple random samples (SRS)
Sample of convenience
- Recruiting those who are available/willing
* Open to selection bias
Simple random samples
• Method most likely to be unbiased
• All members of the population have equal chance of recruitment
• Difficult to achieve in practice
-> If everyone has a number, use random numbers
-> E.g., use student numbers to choose a simple random sample of uni students.
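One way the student-number idea might look in code. The list of 5,000 student numbers is hypothetical, and the fixed seed is only there so the example gives the same result each run.

```python
import random

# Hypothetical sampling frame: 5,000 made-up student numbers.
student_numbers = list(range(1000001, 1005001))

# Seeded generator so the example is repeatable; drop the seed in real use.
rng = random.Random(42)

# random.sample draws without replacement, so every student has the same
# chance of selection and nobody is picked twice.
sample = rng.sample(student_numbers, k=50)

print(len(sample), len(set(sample)))  # 50 50
```

This only gives a simple random sample of the list itself. If the list leaves out part of the population, the sample can still be biased.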
Unbiased measurement
• Reliability – repeated measurement gets a consistent outcome
- A reliable measure is dependable, consistent, stable, trustworthy, predictable, and faithful
• Validity – the extent to which you have actually measured what you are interested in
- A valid measure is accurate, truthful, authentic, genuine, and sound
Relationship between reliability and validity
- A test can be reliable without being valid
- Reliability is a prerequisite for validity
- If a test is valid, it is to some extent reliable
- If a test is neither valid nor reliable, it is irrelevant
Obtaining valid and reliable measures
- Use established and objective measurement methods
- Be consistent in your measurement
- Ensure that you obtain an honest measurement
• Be aware of privacy and sensitivity
-> Ensure that each participant is able to give an informed and honest answer
Reporting sampling methods
• Satisfy the reader that the sample was unbiased
- Describe relevant characteristics of your sample
- Briefly state how they were recruited
• Report limitations in representing population
Report measurement process
- What measures were used?
- Under what conditions were measurements taken?
- Report anything which might influence results
Sample sizing
- The larger the sample size, the better!
- The Law of Large Numbers: the larger the sample, the more closely the statistic calculated from the sample will approximate that calculated from the population
- The statistics are said to be more reliable
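A quick simulation of the Law of Large Numbers, assuming a made-up population of heights (mean 170 cm, SD 10 cm). The numbers are illustrative only. The gap between sample mean and population mean tends to shrink as n grows, though any single run can wobble.

```python
import random
import statistics

rng = random.Random(1)  # seeded so the run is repeatable

# Hypothetical population: 100,000 heights in cm (mean 170, SD 10 assumed).
population = [rng.gauss(170, 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

# Draw samples of increasing size and compare each sample mean
# with the population mean.
for n in (10, 100, 1_000, 10_000):
    sample = rng.sample(population, n)
    error = abs(statistics.fmean(sample) - population_mean)
    print(f"n = {n:>6}: sample mean misses by {error:.2f} cm")
```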
Considerations in sample sizing
• However, the actual sample size used in research represents a trade-off between:
- Having a large sample size for a good estimate of the population
- Resources available to sample the population
- Statistical considerations of what is a sufficiently large sample size
• If the population is normally distributed, a sample of 30 is usually a reasonable size
Sampling error
- no matter how careful the sampling, no two samples from the same population will be identical; by chance there will be natural variation in scores (sampling error).
- if I took a random sample of GU students, it would be a complete fluke if the mean for that sample was exactly the population mean.
- the term ‘sampling error’ implies a mistake but this is misleading … it’s a natural thing and can’t be helped.
- so the question is not whether the sample mean differs from the population mean (it almost always will) but how likely is it that the difference we observed could have occurred by chance?
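To see sampling error happen, the sketch below draws five random samples from the same made-up population of scores (mean 50, SD 10 assumed) and prints how far each sample mean lands from the population mean.

```python
import random
import statistics

rng = random.Random(7)  # seeded so the run is repeatable

# Hypothetical population of 10,000 scores (mean 50, SD 10 assumed).
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Five random samples of 30 from the SAME population.
sample_means = [statistics.fmean(rng.sample(population, 30)) for _ in range(5)]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    # Sampling error = sample mean minus population mean.
    print(f"sample {i}: mean = {m:.2f}, sampling error = {m - population_mean:+.2f}")
```

None of the samples was drawn badly. The differences are just the natural variation described above.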
Measuring sampling error
• We can use patterns in variability to estimate sampling error
- To work out sampling variability and sampling error, we need to consider distribution of statistics from multiple samples
• Our measure of sampling error can be applied to work out by how far we are likely to have missed the population parameter
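A sketch of "the distribution of statistics from multiple samples", using a made-up population (mean 50, SD 10 assumed). The spread of the sample means is compared with the usual shortcut, population SD divided by the square root of n. That shortcut is commonly called the standard error of the mean, a term these notes do not use.

```python
import math
import random
import statistics

rng = random.Random(3)  # seeded so the run is repeatable

# Hypothetical population of 10,000 scores (mean 50, SD 10 assumed).
population = [rng.gauss(50, 10) for _ in range(10_000)]
n = 30

# Distribution of the statistic: the mean from each of 2,000 separate samples.
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(2_000)]

# Spread of those means = typical size of the sampling error.
observed_spread = statistics.stdev(sample_means)

# Shortcut estimate: population SD / sqrt(n).
predicted_spread = statistics.pstdev(population) / math.sqrt(n)

print(f"observed: {observed_spread:.2f}, predicted: {predicted_spread:.2f}")
```

The two figures should land close together, which is why the typical miss can be estimated without actually drawing thousands of samples.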
Summary of sampling in stats
- Samples are imperfect representations of the population
- For your inference to be valid, it must be based on an appropriate sample
- The fundamental rule in sampling data is to reduce bias in how you select your individuals
- Good statistics can provide evidence in support of a conclusion and show it is likely to be true
- Probability is never proof