Sampling Flashcards

1
Q

census data

A

data describing variables for every single case in the population of interest. Costly.

2
Q

random sample

A

procedure for achieving an equal probability sample: units are drawn at random from the population so that every unit has the same probability of selection.

3
Q

probability/equal probability sample

A

sample in which every population unit has a known probability of selection. Equal probability sample: every unit has the same probability of selection.

4
Q

sampling error

A

Sampling error is how far a sample statistic for a variable falls from the corresponding population value. The larger the sample size, the smaller the sampling error tends to be, so the sample is more representative of and closer to the population.
- due to “random chance”
- the data will be off, but in no way systematically
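
The link between sample size and sampling error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is 10,000 made-up scores, and the helper name `mean_abs_sampling_error` is hypothetical, not something from the course material.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, draws=200, seed=0):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)
    pop_mean = statistics.fmean(population)
    errors = []
    for _ in range(draws):
        # Equal probability draw without replacement
        sample = rng.sample(population, n)
        errors.append(abs(statistics.fmean(sample) - pop_mean))
    return statistics.fmean(errors)

# Hypothetical population: 10,000 invented scores between 0 and 100
_rng = random.Random(42)
population = [_rng.uniform(0, 100) for _ in range(10_000)]

small_n_error = mean_abs_sampling_error(population, n=10)
large_n_error = mean_abs_sampling_error(population, n=1000)
print(small_n_error, large_n_error)  # the second value should be much smaller
```

The errors in either direction come from chance alone, which is why they shrink as n grows rather than pushing the estimate consistently one way.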

5
Q

representative sample

A

a sample whose results are very similar to what one would have found using census data, so the findings can be generalized. No external validity problem.

6
Q

non-response bias

A

Systematic bias arising from missing data on some or all variables for a given sample of cases. Missing data are almost always systematically different from the data that are accounted for, so you cannot generalize your results to the overall population. Not representative.

7
Q

selection bias

A

A biased data set, caused by sampling bias and/or non-response bias, that has both external and internal validity problems; the cases missing from the data set are systematically different on the dependent variable.

8
Q

external validity problem w/ selection bias

A

The bias in one’s sample is correlated with one’s dependent variable, meaning that the sample is systematically different from the population in terms of “y” scores: too many/few high “y” scores or too many/few low “y” scores. The sample clearly has an external validity problem, because it is not representative of the population on “y” and its results cannot be generalized.

9
Q

internal validity problem w/ selection bias

A

The correlation between one’s “y” and any “x” variable will be biased toward zero (toward no correlation); i.e., it will tend to be closer to zero than the correlation one would find in the full set of population data (census). Because the correlation between x and y is used to judge whether “x” has a causal effect on “y”, selection bias in sampling means we cannot make causal inferences.
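
The pull toward zero can be seen in a simulated example. This is a sketch under assumptions: the data are invented (y is x plus random noise), the selection rule "only cases with y above 0 enter the data set" is just one hypothetical form of selection on the dependent variable, and `pearson` is a hand-written helper.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient computed from scratch."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical "census": y depends on x plus noise
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [x + rng.gauss(0, 1) for x in xs]
full_r = pearson(xs, ys)

# Selection on y: cases with low y scores are missing from the data set
kept = [(x, y) for x, y in zip(xs, ys) if y > 0]
selected_r = pearson([x for x, _ in kept], [y for _, y in kept])

print(full_r, selected_r)  # the selected-sample correlation is closer to zero
```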

10
Q

post-stratification weights

A

Identify: calculated after the data are collected, based on things that went wrong with the sample (e.g., non-response bias). They are used to create a more representative sample: if your sample has too many or too few of a certain group or groups, post-stratification weights can count those cases as more or less than one.

Significance: the weights create a sample that looks more representative; however, they can make the bias worse. Subjects who did not respond are most likely different from those who did, so counting the respondents as more or less than one does not create a truly representative sample.
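
A minimal sketch of how such weights are commonly computed: each group's weight is its population share divided by its sample share. The group names, shares, and the function name `post_stratification_weights` are made up for illustration.

```python
def post_stratification_weights(population_shares, sample_counts):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

# Hypothetical: the population is 50/50, but the realized sample is 60/40
population_shares = {"women": 0.5, "men": 0.5}
sample_counts = {"women": 60, "men": 40}

weights = post_stratification_weights(population_shares, sample_counts)
# Over-represented women count as less than one case each,
# under-represented men count as more than one case each.
print(weights)
```

Note that the weights only rebalance the respondents who are in the data; they add no information about the people who did not respond.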

11
Q

sampling weights

A

Sampling weights are calculated before you collect data and are used to produce a population-correct average when one has oversampled a particular group or groups.
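
As a sketch, one standard way to build sampling weights is to weight each case by the inverse of its probability of selection. The numbers below are invented: a population of 900 group-A units (each with value 10) and 100 group-B units (each with value 20), with 50 cases sampled from each group so that group B is oversampled.

```python
def design_weighted_mean(values, selection_probs):
    """Mean in which each case is weighted by 1 / probability of selection."""
    weights = [1 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical sample: 50 cases from group A, 50 cases from group B
values = [10] * 50 + [20] * 50
selection_probs = [50 / 900] * 50 + [50 / 100] * 50

unweighted = sum(values) / len(values)                   # distorted by the oversample
weighted = design_weighted_mean(values, selection_probs)
population_mean = (900 * 10 + 100 * 20) / 1000           # true population average

print(unweighted, weighted, population_mean)
```

Because the selection probabilities are fixed by the sampling design, these weights can be worked out before any data are collected.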

12
Q

stratified random sample

A

identify: a type of sampling that breaks the population into groups called strata and draws a random sample within each group
significance: Gold standard. Stratified random sampling eliminates sampling error on the stratifying variable by ensuring that the sample distribution is the same as the population distribution on that variable. It also reduces sampling error on any variable correlated with the stratifying variable. The sample’s representativeness increases as the sampling error decreases. More dispersed sample.
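
A minimal sketch of the idea, assuming proportionate allocation (the same sampling fraction in every stratum). The urban/rural population and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import Counter

def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)   # proportionate allocation
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 700 urban units and 300 rural units
population = [("urban", i) for i in range(700)] + [("rural", i) for i in range(300)]
sample = stratified_sample(population, stratum_of=lambda u: u[0], fraction=0.1)

counts = Counter(stratum for stratum, _ in sample)
print(counts)  # the 70/30 split of the population is reproduced in the sample
```

Which units are drawn within each stratum still varies by chance, but the split on the stratifying variable does not.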

13
Q

systematic random sample

A

- a random draw picks the starting point, then you move forward through the population at a fixed interval.

When is this most commonly used? 1. economic research. 2. exit polling in politics

ex: pick a random number from 1 to 10, say 7. Interview every 7th person you see; you are systematically choosing who to interview.
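
A short sketch of one common version of this procedure: a random start within the first interval, followed by every k-th unit. The list of 100 numbered people and the function name `systematic_sample` are assumptions made for illustration.

```python
import random

def systematic_sample(units, interval, seed=None):
    """Random start within the first `interval` units, then every interval-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(interval)   # the only random draw in the procedure
    return units[start::interval]

# Hypothetical: 100 people leaving a polling place, numbered in order
people = list(range(1, 101))
sample = systematic_sample(people, interval=10, seed=3)
print(sample)
```

Once the start is drawn, everything else is determined, which is what makes the method practical when there is no list of the population in advance.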

14
Q

cluster sampling

A

1) randomly select clusters. 2) randomly select cases w/in clusters.
ex: select 10 states, select 50 senators within each state. OR select 25 states, select 20 senators w/in each state.
- the second sample as a whole is less clustered and more dispersed because it covers more locations. Its sampling error is lower because the clusters are more representative of the population.

Why use it?

1) less costly (having to travel to fewer places, for ex)
2) forced into cluster sampling by not having a population list.
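
The two stages can be sketched as follows. The 50 clusters of 100 cases each are invented, and `two_stage_cluster_sample` is a hypothetical helper; the two calls mirror the 10 x 50 and 25 x 20 designs in the example, which yield the same total number of cases.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=0):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick cases within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical population: 50 clusters with 100 cases each
clusters = {
    f"cluster_{i:02d}": [f"case_{i:02d}_{j:03d}" for j in range(100)]
    for i in range(50)
}

concentrated = two_stage_cluster_sample(clusters, n_clusters=10, n_per_cluster=50)
dispersed = two_stage_cluster_sample(clusters, n_clusters=25, n_per_cluster=20)

# Same total sample size, but the second design is spread over more locations
print(len(concentrated), len(dispersed))
```

Only the selected clusters need a list of cases, which is why this design works without a full population list.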

15
Q

convenience sample

A

Non-probability sample that gathers a group of cases in whatever way is most convenient. The sample cannot be generalized to the population, which is an external validity problem.

16
Q

quota sampling

A

Non-probability sample that starts off by setting a quota for certain subgroups of your sample.

Significance: makes your sample look like the population, but it is not representative because quotas get filled very selectively; quotas let you construct the sample you want to see, so it only seems representative of the population.

17
Q

snowball sample

A

Non-probability sample that consists of two stages.

  • stage 1: compile a list of people to interview and get their referrals
  • stage 2: interview those referrals and continue this process until (1) your money runs out or (2) you reach the saturation point: all the names referred are already on your compiled list, at which point your snowball becomes a census, because you have interviewed everyone in the population of interest.
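
The two stages and the two stopping rules can be sketched as a simple referral-following loop. The names, the referral network, and the `max_interviews` budget parameter are all hypothetical.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_interviews=None):
    """Interview the seeds, then their referrals, until budget or saturation."""
    interviewed = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        if max_interviews is not None and len(interviewed) >= max_interviews:
            break  # stopping rule 1: the money ran out
        person = queue.popleft()
        interviewed.append(person)
        for name in referrals.get(person, []):
            if name not in seen:      # only new names extend the list
                seen.add(name)
                queue.append(name)
    # stopping rule 2: the queue is empty, so no new names are being referred
    return interviewed

# Hypothetical referral network
referrals = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Ana", "Dee"],
    "Cy": ["Ben"],
    "Dee": ["Cy"],
}

everyone = snowball_sample(["Ana"], referrals)
limited = snowball_sample(["Ana"], referrals, max_interviews=2)
print(everyone, limited)
```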
18
Q

equal probability sample

A

- each case/population unit has the same probability of selection, which gives a representative sample of the population.

probability of selection = sample size / population size
n / n(pop)
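
A worked instance of the formula, using an invented population of 1,000 units and a sample of 50. The function name `selection_probability` is hypothetical.

```python
import random

def selection_probability(sample_size, population_size):
    """Probability of selection = sample size / population size."""
    return sample_size / population_size

# Hypothetical population of 1,000 numbered units
population = list(range(1000))

# random.sample draws without replacement, so every unit has the same chance
sample = random.Random(0).sample(population, 50)

p = selection_probability(len(sample), len(population))
print(p)  # 50 / 1000
```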