Topic 8: Sample Surveys Flashcards by James Makmur

What is the population?

The full set of units (people, items, etc.) being studied. A census collects information from every unit of the population.

What is a sample?

A sample is part of the population (subset of original population)

What are the limitations of a census?

Collecting data from every unit of a population:

is hard
costs a lot of money
takes a lot of time
requires a lot of resources

Thus, we use samples to learn about the population without running a census.

What is a parameter?

A parameter is a numerical fact about the population that we are interested in. It is a fixed number, but usually unknown.

E.g. the population mean

What is an estimate (or statistic)?

A number calculated from the sample values and used to estimate the parameter.

E.g. the sample mean

What are 4 common types of bias?

Selection bias
Non response bias
Interviewer’s bias
Measurement bias

What is selection bias?

A systematic tendency to exclude (or include) one type of person from the sample. If those people differ from the rest of the population, the survey results are distorted.

What is non-response bias?

Caused by participants who fail to complete the survey. Non-respondents can be very different from respondents, which affects the survey results.

What is interviewer’s bias?

When the interviewer chooses who participates in the survey, or when the characteristics of the interviewer affect the answers given by participants.

What is measurement bias?

When the form of the questions in the survey affects the responses.

What are some examples of measurement bias?

Bias in question wording and order, which impacts responses

Recall bias: people forget details

Sensitive questions: People may not tell the truth

Lack of clarity in the question

Attributes of interview process may cause bias

Will increasing the sample size account for the biases present in survey data collection?

No. A larger sample does not remove bias; it just repeats the same mistake on a larger scale.
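
A minimal simulation sketch of this point, assuming a hypothetical non-response rule (units with value 0 respond only half the time). The population, the rule, and the function name `biased_sample_mean` are illustrative assumptions, not part of the original cards.

```python
import random

def biased_sample_mean(population, n, rng):
    """Mean of n responses when 0-valued units often fail to respond."""
    responses = []
    while len(responses) < n:
        unit = rng.choice(population)
        # Hypothetical non-response rule: a unit with value 0 responds
        # only half the time, while a unit with value 1 always responds.
        if unit == 1 or rng.random() < 0.5:
            responses.append(unit)
    return sum(responses) / n

rng = random.Random(0)
population = [1] * 5000 + [0] * 5000  # true proportion of 1s is 0.5

small = biased_sample_mean(population, 100, rng)
large = biased_sample_mean(population, 20_000, rng)
print(f"n=100: {small:.3f}, n=20000: {large:.3f}, truth: 0.500")
```

Under this rule the respondents contain about two thirds 1s, so the large sample settles near 0.667 rather than 0.5: the chance error shrinks, but the bias stays.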

How do we pick a good sample?

We use a probability method to pick the sample so that:

The interviewer isn't involved in the selection, and the method of selection is impartial

The chance of any particular individual being chosen can be computed, i.e. there is a defined procedure for selecting the sample, which uses chance

What are 3 ways of picking a sample?

Multi stage cluster sampling

Quota sampling

Convenience sampling

What is multi stage cluster sampling?

As simple random sampling often isn't practical, organisations often use multi stage cluster sampling instead. The sample is taken in stages, and clusters or individuals are chosen at random at each stage.

E.g. divide the population into clusters --> randomly select some clusters --> randomly select individuals from each selected cluster
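
The staged selection can be sketched in a few lines. This is a two-stage illustration only; the cluster names, sizes, and the function `two_stage_sample` are hypothetical.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: pick clusters at random. Stage 2: pick individuals at random within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return chosen, sample

rng = random.Random(1)
# Hypothetical population: 10 suburbs (clusters) of 50 people each.
clusters = {
    f"suburb_{i}": [f"suburb_{i}_person_{j}" for j in range(50)]
    for i in range(10)
}

chosen, sample = two_stage_sample(clusters, n_clusters=3, n_per_cluster=5, rng=rng)
print(chosen)
print(sample)
```

Because chance is used at every stage, the interviewer has no say in who ends up in the sample.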

What is quota sampling?

A non-probability sampling technique where the sample is made to have the same proportions of individuals as the entire population with respect to known characteristics or traits.

Within each quota the interviewers choose whom to survey, which can introduce unintentional bias.

What is convenience sampling?

A non-probability sampling technique where subjects are selected because they are convenient to access. Not recommended except to test the survey (pilot).

E.g. for psych experiments, uni students often make up the samples because they are easy to recruit --> the sample's demographics may not match the population

Can we avoid all bias?

No, there is some level of unavoidable bias. Even with a probability method determining the sample, bias (e.g. non-response) can easily creep in.

In addition, because the sample is only part of the population, we have chance error.

What is the equation for estimates?

Estimate = parameter + bias + chance error

Estimate = parameter + non sampling error + sampling error

What are some common methods of surveys?

Mail
Face to face interviews
Phone interviews
Online
Self administered surveys

Can chance error occur in sample surveys?

Yes. Sample surveys involve chance error because each sample is just one possible draw from the population

Here we use the box model to quantify the likely size of chance error when estimating a proportion using simple random sampling. Standard errors (SE) measure variability across different samples from the same population

Each different sample will give a different estimate
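
As a rough worked example of how the box model sizes the chance error, assuming draws with replacement from a 0-1 box whose proportion of 1s is p (the function name `se_proportion` is illustrative):

```python
import math

def se_proportion(p, n):
    """SE of a sample proportion for n draws with replacement from a 0-1 box."""
    box_sd = math.sqrt(p * (1 - p))  # SD of a 0-1 box with proportion p of 1s
    return box_sd / math.sqrt(n)     # SE of the mean of n draws

se_400 = se_proportion(0.5, 400)    # 0.5 / 20 = 0.025
se_1600 = se_proportion(0.5, 1600)  # 0.5 / 40 = 0.0125
print(se_400, se_1600)
```

Quadrupling the sample size only halves the SE, since the SE shrinks with the square root of the number of draws.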

What does proportion of a sample survey mean?

When a question asks about the proportion in a sample survey, it refers to the mean of the 0-1 draws in the sample (the sample proportion), not their sum.

What affects the accuracy of samples?

When sampling with replacement, the SE is determined by the absolute size of the sample.

When sampling without replacement, the SE decreases as the ratio of sample size to population size increases: when a higher proportion of the population is sampled, the variability of the estimate decreases.

When the sample is only a small part of the population, the population size has almost no effect on the SE of the estimate.

Can we apply drawing without replacement to the box model?

Not directly, because the box model assumes draws with replacement. Drawing without replacement requires an adjustment to the SE.

Do we need to adjust SE of the box model to account for without replacement?

Yes, we have to do it through a correction factor

What are the formulas for SE without replacement?

SE without replacement = correction factor x SE with replacement

What is the equation for correction factor?

sqrt ( (number of tickets - number of draws) / (number of tickets - 1) ) OR sqrt ( (population size - sample size) / (population size - 1) )
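
A small sketch of the correction factor in use, assuming a 0-1 box with proportion p of 1s; the function names and the example sizes are illustrative.

```python
import math

def correction_factor(population_size, sample_size):
    """sqrt((population size - sample size) / (population size - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def se_without_replacement(p, population_size, sample_size):
    """SE of a sample proportion when drawing without replacement."""
    se_with = math.sqrt(p * (1 - p) / sample_size)  # SE with replacement
    return correction_factor(population_size, sample_size) * se_with

cf_large_pop = correction_factor(1_000_000, 1000)  # sample is a tiny share
cf_small_pop = correction_factor(2000, 1000)       # sample is half the population
print(round(cf_large_pop, 4), round(cf_small_pop, 4))
print(se_without_replacement(0.5, 2000, 1000))
```

When the sample is a tiny share of the population the factor is very close to 1 and can be ignored; when the sample is half the population the SE drops to roughly 71% of its with-replacement value.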

What is bootstrapping?

It addresses the problem of estimating the population proportion from the sample proportion.

Bootstrapping is estimating the properties of the population by using the properties of a particular sample.

When sampling from a 0-1 box, we replace the unknown proportion of 1's in the box (population) by the known proportion of 1's in a particular sample.

What are the steps involved in bootstrapping?

1) Create an approximate box which has the same proportions of 0's and 1's as the sample
2) Use the box model

OR, as the internet says:

1) Choose a number of bootstrap samples to perform
2) Choose a sample size
3) For each bootstrap sample: draw a sample with replacement of the chosen size, then calculate the statistic on that sample
4) Calculate the mean of the calculated sample statistics, and then apply it to the population
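
Both routes can be tried side by side. This is a sketch only: the sample (620 ones out of 1000), the number of bootstrap samples, and the function name `bootstrap_proportions` are made-up values for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)
sample = [1] * 620 + [0] * 380  # hypothetical sample: 62% ones
n = len(sample)
p_hat = sum(sample) / n

# Route 1: approximate box with the sample's proportions, then the box model.
se_box = math.sqrt(p_hat * (1 - p_hat) / n)

# Route 2: resample from the sample itself, with replacement.
def bootstrap_proportions(sample, n_boot, rng):
    """Proportion of 1s in each of n_boot resamples of the same size as the sample."""
    size = len(sample)
    return [sum(rng.choices(sample, k=size)) / size for _ in range(n_boot)]

boot = bootstrap_proportions(sample, n_boot=2000, rng=rng)
boot_mean = statistics.mean(boot)
se_boot = statistics.stdev(boot)  # spread of the resampled proportions

print(f"box model SE: {se_box:.4f}, bootstrap SE: {se_boot:.4f}")
```

The two SEs should agree closely, since resampling with replacement from the sample is the same as drawing from the approximate box.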

What are confidence intervals?

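A confidence interval is a range of values, calculated from a sample, that is used to estimate an unknown population parameter. The confidence level (e.g. 95%) describes how often intervals built this way would contain the parameter over repeated samples.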

What are the confidence intervals equal to? (68%, 95%, 99.7%)

68% --> sample proportion ± 1 x SE
95% --> sample proportion ± 2 x SE
99.7% --> sample proportion ± 3 x SE
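
A short worked example of these intervals, assuming a hypothetical sample of 1000 with sample proportion 0.62 and an SE taken from the bootstrap box, sqrt(p(1-p)/n). The function name `confidence_interval` is illustrative.

```python
import math

def confidence_interval(p_hat, n, multiplier):
    """Sample proportion plus or minus multiplier x SE."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - multiplier * se, p_hat + multiplier * se

# Multipliers 1, 2, 3 correspond to 68%, 95%, 99.7%.
for level, multiplier in [("68%", 1), ("95%", 2), ("99.7%", 3)]:
    low, high = confidence_interval(0.62, 1000, multiplier)
    print(f"{level}: ({low:.4f}, {high:.4f})")

low_95, high_95 = confidence_interval(0.62, 1000, 2)
```

Here the SE is about 0.0153, so the 95% interval runs from roughly 0.589 to 0.651.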

What do we say when we have a 95% confidence interval?

It is a mistake to say that the probability that the interval contains the unknown parameter is 0.95. Instead, we say that if we work out a series of CIs for a series of samples, then about 95% of the CIs would contain the unknown parameter.

How can we simulate a series of CIs? (using a random example)

E.g.

1) Create a population of size 1000000 in which the proportion of 1s ("yes" votes) is 0.67
2) Draw a sample of size 1000 from the population, and calculate a 95% CI for the population proportion
3) Repeat the sampling 100 times, forming 100 CIs
4) Graph the 100 CIs
5) Draw a red line to represent the true population proportion (0.67) and count how many CIs do and do not cover it; we expect about 95% of CIs to cover the true population proportion

If we draw without replacement, the fpc (finite population correction) applies to the SE
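
A minimal sketch of this simulation using only the standard library. The plotting step is left out, the seed and variable names are arbitrary choices, and the samples are drawn without replacement, so the fpc is applied (it is very close to 1 here).

```python
import math
import random

rng = random.Random(2024)
POP_SIZE = 1_000_000
SAMPLE_SIZE = 1000
N_REPEATS = 100

# Population of 0s and 1s with proportion 0.67 of 1s ("yes" votes).
n_ones = round(POP_SIZE * 0.67)
population = [1] * n_ones + [0] * (POP_SIZE - n_ones)
true_p = n_ones / POP_SIZE

# Finite population correction for drawing without replacement.
fpc = math.sqrt((POP_SIZE - SAMPLE_SIZE) / (POP_SIZE - 1))

intervals = []
covered = 0
for _ in range(N_REPEATS):
    sample = rng.sample(population, SAMPLE_SIZE)  # without replacement
    p_hat = sum(sample) / SAMPLE_SIZE
    se = fpc * math.sqrt(p_hat * (1 - p_hat) / SAMPLE_SIZE)
    low, high = p_hat - 2 * se, p_hat + 2 * se    # 95% CI
    intervals.append((low, high))
    if low <= true_p <= high:
        covered += 1

print(f"{covered} of {N_REPEATS} intervals cover the true proportion {true_p}")
```

The count will vary from run to run with a different seed, but it should usually land near 95.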

How can we justify the CI formulas?

We assume the sample proportion (which estimates the population proportion) follows a normal distribution.

Recall that all normal distributions satisfy the "68% - 95% - 99.7%" rule.

(35 cards)