Sampling CI Flashcards
What is Sampling?
The process of determining whom we are going to study/examine
What is the purpose of Sampling?
To find out information without talking to everyone.
What are two types of sampling?
Nonprobability
Probability
What is Probability sampling?
Systematic technique that is used to select respondents - goal is to create a sample as representative of the population as possible.
When is Probability sampling used most frequently?
Quantitative research
–> Leaving the selection up to chance
What is Non-probability sampling?
Based on the researcher’s subjective judgment rather than random selection
What are traits of Non-probability Sampling?
- Less generalizability; problem with representativeness
- Lower confidence in findings
- Useful when probability sampling can’t be used
- Four common methods
What are the four common methods of Non-probability sampling?
- Purposive
- Convenience
- Snowball
- Quota
What are four traits of Probability Sampling?
- Used to generalize to the population at large
- Works toward representativeness
- Used in all large-scale surveys/observational studies
- Avoids Sampling Bias
What is Sampling Bias?
Selecting atypical folks.
- Numerous ways to introduce bias into your sample.
What is a “Representative” sample?
- Your sample is like the population
- Random selection!
- All members have an equal chance of being selected…
Probability samples are. . .
. . .never perfect.
- More representative than non-probability.
What is an Element?
Individual members of the population to which the study would be generalized
What is the Population?
The entire set of elements to which the study findings will be generalized. Elements don’t have to be individuals; they can be other entities.
What is the Sampling Frame?
LIST of all the elements in a population. Example: to study students, the registrar’s list of students is the sampling frame, and the students selected to be interviewed are the elements.
- Important for representativeness but not easy to acquire
What are examples of Sampling Frames?
- Telephone directories
- Tax records
- Registrar’s list
What is the key question with Probability Sampling?
Who can I generalize these findings to?
What is Parameter?
Summary of a given variable in a population
What is a Statistic?
Summary of a given variable in a sample
What is the Sampling Distribution?
The distribution of a statistic (e.g., the mean) across all the possible random samples of a given size that could be selected
What are random samples?
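Samples selected by chance, so that every element has a known (ideally equal) chance of selection; the goal is to represent the population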
What are four commonly discussed sample types?
- Simple Random
- Systematic
- Stratified
- Multistage Cluster
- -PPS (probability proportionate to size) sampling (a form of cluster sampling)
What is Simple Random Sampling?
- Base of sampling
- -Need a list (sampling frame)
- -Assign a number
- -Select by a random number
- –> Random number list
- Seldom used in this deliberate way; some use of computer generated random numbers
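The list / number / random-pick steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the `simple_random_sample` helper and the 500-student frame are hypothetical, and `random.sample` stands in for the random number list.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the sampling frame without replacement,
    giving every element the same chance of selection."""
    rng = random.Random(seed)      # seed only makes the draw repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame: a registrar-style list of 500 student IDs.
frame = [f"student_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 25, seed=1)
print(sample[:5])
```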
What is Systematic Sampling?
- Determine number needed
- Divide population by sample number desired (we call this our sampling interval, denoted here by ‘k’)
- List and number our elements
- Randomly select start point
- Select every k-th element from the list
- Caution: avoid periodicity!
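A small sketch of these steps, assuming the sampling interval k is the population size divided by the desired sample size (rounded down). The function name and the numbered list of 1,000 elements are made up for illustration.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then every k-th element."""
    k = len(frame) // n                 # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)            # random start point, 0 .. k-1
    return frame[start::k][:n]          # trim in case N is not a multiple of n

# Hypothetical frame: elements numbered 1..1000, so k = 1000 // 50 = 20.
frame = list(range(1, 1001))
picks = systematic_sample(frame, 50, seed=7)
print(picks[:5])
```

If the list itself repeats in a cycle that lines up with k (the periodicity problem), every pick can land on the same kind of element, which is why the ordering of the frame matters.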
What is Stratified Sampling?
- Possible modification of previous techniques
- Random sample from subpopulation
- Better representativeness
- Decreases some sampling error
- -> Homogeneous subsets
- Allows oversampling
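One way to picture stratified sampling, including oversampling, is the sketch below. It assumes each stratum gets its own sampling fraction; the `stratified_sample` helper, the class-year strata, and the fractions are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, fractions, seed=None):
    """Group elements into strata, then draw a simple random sample
    from each stratum using that stratum's sampling fraction."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)
    sample = []
    for name in sorted(strata):                      # fixed order for repeatability
        members = strata[name]
        n = max(1, round(len(members) * fractions[name]))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical population: 300 first-years and 100 seniors.
population = [("first_year", i) for i in range(300)] + [("senior", i) for i in range(100)]
# Seniors are oversampled (30%) relative to first-years (10%).
sample = stratified_sample(population, lambda e: e[0],
                           {"first_year": 0.10, "senior": 0.30}, seed=3)
print(len(sample))
```

When strata are sampled at different rates like this, estimates for the whole population would normally need weights to undo the oversampling.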
What is Cluster Sampling?
- More complex methodologically (not conceptually, I hope)
- Cluster = Groups of elements
- Multi-stage
- –> Basic stages/steps: listing and sampling
- Helps with cost and dispersed populations
- Increases sampling error potential
- -> Two samples: double the error opportunity
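The listing-and-sampling stages can be sketched as a two-stage design: sample clusters first, then sample elements inside each chosen cluster. The schools-and-pupils data and the helper name are invented for the example, and both stages use simple random sampling as an assumption.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select
    elements within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)        # stage 1
    sample = []
    for name in chosen:
        members = clusters[name]                             # list only the chosen clusters
        take = min(n_per_cluster, len(members))
        sample.extend((name, m) for m in rng.sample(members, take))  # stage 2
    return sample

# Hypothetical population: 20 schools (clusters) of 40 pupils each.
clusters = {f"school_{c:02d}": [f"pupil_{c:02d}_{i:02d}" for i in range(40)]
            for c in range(20)}
sample = two_stage_cluster_sample(clusters, n_clusters=5, n_per_cluster=8, seed=11)
print(len(sample))
```

Each stage is its own random draw, which is where the extra opportunity for sampling error comes from.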
What two techniques are used to make experiments Comparable (between control & experimental groups)?
- Randomization
- Matching
What is Randomization?
Recruited folks (who may have been selected using nonprobability sampling techniques) are randomly placed into control and exp. groups
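Random assignment can be sketched as a shuffle followed by a split. This assumes two equal-sized groups; the `randomize` helper and the participant labels are hypothetical.

```python
import random

def randomize(participants, seed=None):
    """Shuffle the recruited participants and split them into
    control and experimental groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical recruits (they could come from a convenience sample).
recruits = [f"participant_{i}" for i in range(1, 21)]
control, experimental = randomize(recruits, seed=5)
print(len(control), len(experimental))
```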
What is Matching?
Assign people to group based on characteristics so groups match
As the sample size goes up, the shape of the sampling distribution takes on an important shape. . .
. . .the normal curve!
What is Sampling Error?
- Variation in values of your sample mean compared to the population mean
- Because of sampling error, we probably won’t always have completely accurate estimates
- Deviation between sample results and population
How can you reduce Sampling Error?
- Increase sample size
- Increase homogeneity
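The effect of sample size on sampling error can be seen in a rough simulation. Everything here is assumed for illustration: the population is 20,000 made-up scores with mean 50 and SD 10, and the helper measures how much the sample mean varies from one random sample to the next.

```python
import random
import statistics

def spread_of_sample_means(population, n, draws=1000, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

# Hypothetical population of scores.
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(20_000)]

small = spread_of_sample_means(population, n=25)
large = spread_of_sample_means(population, n=100)
print(f"n=25: {small:.2f}   n=100: {large:.2f}")
```

Quadrupling the sample size should roughly halve the spread, because the standard error shrinks with the square root of N. A more homogeneous population (a smaller SD) would shrink it as well.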
What are six characteristics of the Normal Curve from the Central Limit Theorem?
- Theoretical distribution of scores
- Perfectly symmetrical
- Bell-shaped
- Unimodal
- Tails extend infinitely in both directions
- Mean, median, and mode are equal
Assumption of normality of a given empirical distribution . . .
. . .makes it possible to describe this “real-world” distribution based on what we know about the (theoretical) normal curve.
What do we use the normal curve assumption for?
To generalize sample findings to a population.
How many cases/how much area falls within 1 standard deviation of the mean?
0.68 of the area, 0.34 on each side of the mean.
How many cases/how much area falls within 2 standard deviations of the mean?
0.95 of the area or 95% of cases
How many cases/how much area falls within 3 standard deviations of the mean?
0.997 of the area or 99.7% of cases
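These areas can be checked against the theoretical normal curve with Python's standard library. The `area_within` helper is just an illustrative name.

```python
from statistics import NormalDist

def area_within(k):
    """Share of a normal curve lying within k standard deviations of the mean."""
    z = NormalDist()                 # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The results come out near 0.68, 0.95, and 0.997. Note that exactly 95% of the area corresponds to 1.96 standard deviations, which is why 1.96 rather than 2 appears in confidence interval formulas.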
What is the Sampling Distribution used as?
An Estimate!
If an infinite number of samples were conducted and some outcome was plotted. . .
The resulting distribution (for means and proportions) would be “normal”
Over the long run, any particular largish random sample estimate (outcome) has a 95% chance of being within. . .
. . .1.96 standard error units of the population parameter it represents.
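A simulation can make this long-run claim concrete. The sketch below assumes a deliberately skewed population (an exponential distribution with mean 1 and SD 1) and counts how often the mean of a sample of 100 lands within 1.96 standard errors of the population mean. The function name and all settings are illustrative.

```python
import math
import random
import statistics

def coverage_within_z(n=100, draws=2000, z=1.96, seed=0):
    """Fraction of sample means that fall within z standard errors
    of the population mean."""
    rng = random.Random(seed)
    pop_mean, pop_sd = 1.0, 1.0          # exponential(rate=1): mean 1, SD 1
    se = pop_sd / math.sqrt(n)           # standard error of the mean
    hits = 0
    for _ in range(draws):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if abs(statistics.fmean(sample) - pop_mean) <= z * se:
            hits += 1
    return hits / draws

print(coverage_within_z())
```

The fraction should come out close to 0.95 even though the population itself is far from normal, because the sample means are approximately normally distributed at this sample size.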
What is the Z-distribution?
Just a special case of the normal distribution.
What is the mean and S.D. of the Z-distribution?
Mean = 0 S.D. = 1
What does the Z-distribution allow us to do?
Use a corresponding z-table to look up critical values
What are the common z-scores for each confidence level (90%, 95%, 99%)?
- 1.65 = 90% CL
- 1.96 = 95% CL
- 2.58 = 99% CL
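Instead of a printed z-table, the critical values can be looked up from the standard normal distribution. This sketch assumes a two-sided interval, so alpha is split between the two tails; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_z(confidence_level):
    """Two-sided critical z-value for a given confidence level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)   # leave alpha/2 in each tail

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CL: z = {critical_z(level):.3f}")
```

The 90% value is about 1.645, which is commonly rounded to 1.65 (some tables print 1.64).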
What is the Confidence Level?
(Complement of the significance level, alpha)
- Long-run probability that confidence intervals built this way contain the population parameter
- We set this ahead of time as 1 - alpha. Most frequently, it’s alpha = 0.05 (a 95% confidence level).
What is the Confidence Interval?
- Range within which the ‘true’ parameter should lie; a range of values around the estimate (point estimate)
- Upper and lower limit for the confidence level
What confidence interval do many of the biomedical books use?
CI = mean +/- 1.96 (standard errors), but this assumes a 95% confidence level (that’s where they are getting the z-score of +/- 1.96).
What does random selection allow us to do?
Connect our sample findings to ‘probability theory’ concepts so we can estimate how accurate our findings are.
I am x% confident that the population parameter falls between a-b. What is the confidence interval? What is the confidence level?
x% = confidence LEVEL (1 - alpha)
Values between a - b = confidence INTERVAL
The larger the confidence level. . .
. . .the wider our confidence interval (CI).
How do you calculate Standard Error?
SE = SD / sqrt(N)
How do you calculate confidence interval?
Mean score +/- (Z-score x SE), where the Z-score is usually 1.96
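Putting the standard error and confidence interval formulas together, a minimal sketch looks like this. The ten scores are invented, and with so few cases a t-value would normally replace z; the short list only keeps the arithmetic visible.

```python
import math
import statistics

def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) limits of mean +/- z * SE, with SE = SD / sqrt(N)."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)   # sample SD over root N
    margin = z * se
    return mean - margin, mean + margin

# Hypothetical exam scores (mean = 80).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 75]
low, high = mean_confidence_interval(scores)             # 95% level
low99, high99 = mean_confidence_interval(scores, z=2.58) # 99% level
print(f"95% CI: {low:.1f} to {high:.1f}")
print(f"99% CI: {low99:.1f} to {high99:.1f}")
```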
What is the Z-score (number of standard errors) for each Confidence level?
90% - 1.65
95% - 1.96
99% - 2.58
Wider the interval…
…weaker the evidence.
Narrower the interval…
…stronger our case.
What does the width of a confidence interval (CI) depend on (three things)?
- alpha/confidence level: The confidence level can be raised (e.g., to 99%) or lowered (e.g., to 90%)
- N: We have more confidence in larger sample sizes so as N increases, the interval narrows
- Variation: more variation = more error
- -> For proportions, % agree closer to 50%
- -> For means, higher standard deviations
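The three influences on interval width can be compared side by side for a proportion. This sketch assumes the usual large-sample standard error for a proportion, sqrt(p(1 - p) / N), which is not stated on the cards; the helper name and the numbers are illustrative.

```python
import math

def proportion_ci_width(p, n, z=1.96):
    """Full width of the interval p +/- z * SE for a sample proportion."""
    se = math.sqrt(p * (1 - p) / n)      # assumed SE formula for a proportion
    return 2 * z * se

base = proportion_ci_width(0.5, 400)                 # 50% agree, N = 400, 95% level
print(f"baseline:        {base:.3f}")
print(f"99% level:       {proportion_ci_width(0.5, 400, z=2.58):.3f}")  # wider
print(f"N = 1600:        {proportion_ci_width(0.5, 1600):.3f}")         # narrower
print(f"20% agree:       {proportion_ci_width(0.2, 400):.3f}")          # narrower
```

A split near 50% gives the largest p(1 - p), so it produces the widest interval for a given N and confidence level.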