Topic 8: Sample Surveys Flashcards

1
Q

What is the population?

A

The full set of units (people or things) being studied. Data on the whole population is collected through a census.

2
Q

What is a sample?

A

A sample is a part (subset) of the population.

3
Q

What are the limitations of a census?

A

Collecting data from every unit of a population:

is hard
costs a lot of money
takes a lot of time
requires a lot of resources

So in practice we usually work from a sample rather than a census.

4
Q

What is a parameter?

A

A parameter is a numerical fact about the population that we are interested in. It is a fixed number, but usually unknown.

E.g. the population mean

5
Q

What is an estimate (or statistic)?

A

A number calculated from the sample values and used to estimate the parameter.

E.g. the sample mean

6
Q

What are 4 common types of bias?

A

Selection bias
Non response bias
Interviewer’s bias
Measurement bias

7
Q

What is selection bias?

A

A systematic tendency to exclude (or include) one type of person from the sample. If that group differs from the rest of the population, the survey results are distorted.

8
Q

What is non response bias?

A

Bias caused by participants who fail to complete the survey. Non respondents can be very different from respondents, which affects the survey results.

9
Q

What is interviewer’s bias?

A

Bias that arises when the interviewer has to choose the participants in the survey, or when the characteristics of the interviewer affect the answers given by the participants.

10
Q

What is measurement bias?

A

Bias that arises when the form of a question in the survey affects the responses to it.

11
Q

What are some examples of measurement bias?

A

Question wording and order, which can influence responses

Recall bias: people forget details

Sensitive questions: people may not tell the truth

Lack of clarity in the question

Attributes of the interview process

12
Q

Will increasing the sample size account for the biases present in survey data collection?

A

No. A larger sample reduces chance error but does not remove bias; it just repeats the same mistake on a larger scale.
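A small simulation can make this concrete. The sketch below is illustrative only: it assumes a hypothetical survey where the true "yes" proportion is 0.5, but "yes" people respond 90% of the time and "no" people only 50% of the time (all of these numbers are made up for the example). The estimate settles near 0.64 rather than 0.5, however many people are contacted.

```python
import random

def respondent_proportion(n_contacted, rng, true_p=0.5,
                          p_respond_yes=0.9, p_respond_no=0.5):
    """Proportion of 'yes' answers among the people who actually respond.

    All default values are hypothetical, chosen only to create non response bias.
    """
    yes = 0
    responded = 0
    for _ in range(n_contacted):
        is_yes = rng.random() < true_p                  # this person's true answer
        p_respond = p_respond_yes if is_yes else p_respond_no
        if rng.random() < p_respond:                    # do they return the survey?
            responded += 1
            yes += is_yes
    return yes / responded

rng = random.Random(0)
for n in (100, 10_000, 100_000):
    # The estimate gets more stable as n grows, but stays well above the true 0.5.
    print(n, round(respondent_proportion(n, rng), 3))
```

The long-run value is 0.45 / 0.70 (about 0.64), which is the share of "yes" people among respondents, not in the population.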

13
Q

How do we pick a good sample?

A

We use a probability method to pick the sample, so that:

The interviewer isn't involved in the selection, and the method of selection is impartial

The chance of any particular individual being chosen can be computed, i.e. there is a defined procedure for selecting the sample, which uses chance

14
Q

What are 3 ways of picking a sample?

A

Multi stage cluster sampling

Quota sampling

Convenience sampling

15
Q

What is multi stage cluster sampling?

A

Because simple random sampling is often impractical, organisations often use multi stage cluster sampling instead. The sample is taken in stages, and clusters or individuals are chosen at random at each stage.

E.g. divide the population into clusters –> randomly select some clusters –> randomly select individuals from each selected cluster
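The stages can be sketched in a few lines of Python. This is a minimal two-stage illustration, not a production sampling design: the function name `multistage_sample`, the suburb and household labels, and the cluster sizes are all hypothetical.

```python
import random

def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Two-stage cluster sample.

    clusters: dict mapping a cluster name to the list of individuals in it.
    Stage 1 picks clusters at random; stage 2 picks individuals at random
    (without replacement) inside each chosen cluster.
    """
    chosen_clusters = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen_clusters:
        members = clusters[name]
        sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return chosen_clusters, sample

# Hypothetical population: 10 suburbs of 50 households each.
clusters = {
    f"suburb_{i}": [f"suburb_{i}_household_{j}" for j in range(50)]
    for i in range(10)
}
rng = random.Random(42)
chosen, sample = multistage_sample(clusters, n_clusters=3, n_per_cluster=5, rng=rng)
print(chosen)
print(len(sample))
```

Because chance is used at every stage, the probability of any household being selected can still be computed, which is what makes this a probability method.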

16
Q

What is quota sampling?

A

A non probability sampling technique where the sample is chosen to have the same proportions of individuals as the entire population with respect to known characteristics or traits.

Because interviewers choose which subjects to survey within each quota, this can introduce unintentional bias.

17
Q

What is convenience sampling?

A

A non probability sampling technique where subjects are selected because they are convenient to access. Not recommended, except for testing the survey (a pilot).

E.g. in psychology experiments, university students often make up the sample because they are easy to recruit –> the demographics of the sample do not match the population

18
Q

Can we avoid all bias?

A

No, there is some level of unavoidable bias. Even when a probability method determines the sample, bias (e.g. non response bias) can easily creep in.

In addition, because the sample is only part of the population, we have chance error.

19
Q

What is the equation for estimates?

A

Estimate = parameter + bias + chance error

OR

Estimate = parameter + non sampling error + sampling error

20
Q

What are some common methods of surveys?

A

Mail
Face to face interviews
phone interviews
Online
Self administered surveys

21
Q

Can chance error occur in sample surveys?

A

Yes. Sample surveys involve chance error because each sample is just one possible draw from the population

Here we use the box model to quantify the likely size of chance error when estimating a proportion using simple random sampling. Standard errors (SE) measure variability across different samples from some population

Each different sample will give a different estimate
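To see chance error in action, the sketch below draws many samples from a 0-1 box and compares the spread of the sample proportions with the usual box-model SE for a proportion, sqrt(p(1 - p) / n). The values p = 0.3 and n = 400 are arbitrary choices for illustration, and the draws are made independently (i.e. with replacement).

```python
import math
import random
import statistics

def sample_proportion(p, n, rng):
    """Proportion of 1's in n independent draws from a 0-1 box with a fraction p of 1's."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(0)
p, n = 0.3, 400            # hypothetical box and sample size

# Each repetition is a different sample, so each gives a different estimate.
estimates = [sample_proportion(p, n, rng) for _ in range(2000)]

observed_spread = statistics.pstdev(estimates)   # variability across samples
box_model_se = math.sqrt(p * (1 - p) / n)        # SE predicted by the box model

print(round(observed_spread, 4), round(box_model_se, 4))
```

The two printed numbers should be close, which is the sense in which the SE measures the likely size of the chance error.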

22
Q

What does proportion of a sample survey mean?

A

When a question asks about the proportion in a sample survey, it refers to the mean of the sample values (the draws from a 0-1 box), not their sum.

23
Q

What affects the accuracy of samples?

A

When sampling with replacement, the SE is determined by the absolute size of the sample.

When sampling without replacement, the SE decreases as the ratio of sample size to population size increases: when a higher proportion of the population is sampled, the variability decreases.

When the sample is only a small part of the population, the population size has almost no effect on the SE of the estimate.

24
Q

Can we apply drawing without replacement to the box model?

A

No, not directly. The box model assumes draws with replacement, so drawing without replacement is a different context.

25
Q

Do we need to adjust SE of the box model to account for without replacement?

A

Yes, we have to do it through a correction factor

26
Q

What are the formulas for SE without replacement?

A

SE without replacement = correction factor x SE with replacement

27
Q

What is the equation for correction factor?

A

sqrt ( (number of tickets - number of draws) / (number of tickets - 1) )

OR

sqrt ( (population size - sample size) / (population size - 1) )
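As a quick worked example, the correction factor can be computed directly from the formula above. The function names here are made up for illustration, and the with-replacement SE of a proportion, sqrt(p(1 - p) / n), is the standard box-model formula rather than something stated on these cards.

```python
import math

def correction_factor(population_size, sample_size):
    """sqrt((population size - sample size) / (population size - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def se_proportion_without_replacement(p, population_size, sample_size):
    """SE without replacement = correction factor x SE with replacement."""
    se_with_replacement = math.sqrt(p * (1 - p) / sample_size)
    return correction_factor(population_size, sample_size) * se_with_replacement

# A sample that is a tiny part of the population: the factor is almost 1.
print(correction_factor(1_000_000, 1000))
# A sample that is half the population: the factor shrinks the SE noticeably.
print(correction_factor(2000, 1000))
```

This matches the earlier card: when the sample is only a small part of the population, the population size has almost no effect on the SE.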

28
Q

What is bootstrapping?

A

It addresses the problem of estimating the population proportion from the sample proportion.

Bootstrapping is estimating the properties of the population by using the properties of a particular sample.

When sampling from a 0-1 box, we replace the unknown proportion of 1’s in the box (population) with the known proportion of 1’s in a particular sample.

29
Q

What are the steps involved in bootstrapping?

A

1) Create an approximate box which has the same proportions of 0’s and 1’s as the sample

2) Use the box model

OR, as the internet says:

Choose a number of bootstrap samples to perform
Choose a sample size
For each bootstrap sample:
    Draw a sample with replacement with the chosen size
    Calculate the statistic on the sample
Calculate the mean of the calculated sample statistics

And then apply to the population
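The resampling version of these steps can be sketched as follows. This is a minimal illustration, assuming a made-up sample of 100 responses with 60 ones; `bootstrap_proportions` is a hypothetical helper name, and the bootstrap sample size is taken to be the original sample size.

```python
import random
import statistics

def bootstrap_proportions(sample, n_boot, rng):
    """Resample the observed 0/1 sample with replacement n_boot times
    and return the proportion of 1's in each resample."""
    n = len(sample)
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=n)   # draw with replacement
        stats.append(sum(resample) / n)       # statistic on this resample
    return stats

sample = [1] * 60 + [0] * 40                  # hypothetical observed sample
rng = random.Random(0)
boot = bootstrap_proportions(sample, n_boot=1000, rng=rng)

# The mean of the bootstrap statistics estimates the population proportion,
# and their spread estimates the SE of the sample proportion.
print(round(statistics.mean(boot), 3), round(statistics.pstdev(boot), 3))
```

The spread should land near sqrt(0.6 x 0.4 / 100), about 0.049, which is what the box model gives when the box is filled with the sample's own proportions.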

30
Q

What are confidence intervals?

A

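A confidence interval is a range of values, calculated from a sample, that is likely to contain the unknown population parameter. The confidence level (e.g. 95%) describes how often intervals built this way would contain the parameter over repeated samples.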

31
Q

What are the confidence intervals equal to? (68%, 95%, 99.7%)

A

68% –> sample proportion ± 1 x SE

95% –> sample proportion ± 2 x SE

99.7% –> sample proportion ± 3 x SE
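A short worked example of these intervals, assuming a hypothetical sample of 1000 with a sample proportion of 0.6. The SE is the bootstrap estimate sqrt(p(1 - p) / n) with the sample proportion standing in for the unknown population proportion; `proportion_ci` is an illustrative name only.

```python
import math

def proportion_ci(sample_proportion, sample_size, multiplier=2):
    """Interval: sample proportion +/- multiplier x SE.

    multiplier 1, 2, 3 corresponds to roughly 68%, 95%, 99.7% confidence.
    """
    se = math.sqrt(sample_proportion * (1 - sample_proportion) / sample_size)
    return (sample_proportion - multiplier * se,
            sample_proportion + multiplier * se)

low, high = proportion_ci(0.6, 1000)          # approximate 95% interval
print(round(low, 3), round(high, 3))
```

Here the SE is about 0.0155, so the approximate 95% interval runs from roughly 0.569 to 0.631.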

32
Q

What do we say when we have a 95% confidence interval?

A

It is a mistake to say that the probability that the interval contains the unknown parameter is 0.95.

Instead, we say that if we work out a series of CIs for a series of samples, then about 95% of the CIs would contain the unknown parameter.

33
Q

How can we simulate a series of CIs? (using a random example)

A

E.g. create a population of size 1000000 in which the proportion of 1s (“yes” votes) is 0.67

Draw a sample of size 1000 from the population, and calculate a 95% CI for the population proportion. Repeat the sampling 100 times, forming 100 CIs. Graph the 100 CIs

Draw a red line to represent the true population proportion (0.67) and count how many CIs cover the red line and how many miss it. We expect about 95% of the CIs to cover the true population proportion

If we draw without replacement, the fpc (finite population correction) applies to the SE
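The simulation described above can be sketched with the standard library alone. This version skips the graph and only counts coverage; it samples without replacement and applies the correction factor, and the helper name `ci_covers` and the random seed are arbitrary choices for illustration.

```python
import math
import random

POP_SIZE = 1_000_000
N_ONES = 670_000                         # "yes" votes, so the true proportion is 0.67
TRUE_P = N_ONES / POP_SIZE
population = [1] * N_ONES + [0] * (POP_SIZE - N_ONES)

def ci_covers(rng, n=1000, multiplier=2):
    """Draw one sample without replacement and report whether its
    approximate 95% CI contains the true population proportion."""
    sample = rng.sample(population, n)   # sampling without replacement
    p_hat = sum(sample) / n
    fpc = math.sqrt((POP_SIZE - n) / (POP_SIZE - 1))
    se = fpc * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - multiplier * se <= TRUE_P <= p_hat + multiplier * se

rng = random.Random(0)
covered = sum(ci_covers(rng) for _ in range(100))
print(covered, "of 100 CIs cover the true proportion")
```

With a sample of 1000 out of 1000000 the correction factor is about 0.9995, so it barely changes the intervals; it matters more when the sample is a large share of the population.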

34
Q

How can we justify the CI formulas?

A

We assume the sample proportion (which estimates the population proportion) approximately follows a normal distribution

Recall that all normal distributions satisfy the “68% - 95% - 99.7%” rule
