Sampling Error and Bias Flashcards

Question 1

Q

Why is the sampling distribution important?

Answer

A

In practice we never draw a large number of samples; we estimate the population parameter from a single sample or a small number of samples. Our point estimate is therefore one draw from a theoretical sampling distribution, and the variation of that distribution depends on sample size.

Question 2

Q

What is a sampling distribution?

Answer

A

A sampling distribution is a probability distribution of a statistic obtained through a large number of samples drawn from a specific population.
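The idea can be sketched by simulation. This is a minimal illustration, assuming a hypothetical population of 10,000 normally distributed values (mean 50, sd 10); the function name `sample_means` and all numbers are made up for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 values with mean ~50 and sd ~10.
population = [random.gauss(50, 10) for _ in range(10_000)]

def sample_means(n, repeats=2_000):
    """Draw `repeats` simple random samples of size n and return their means."""
    return [statistics.mean(random.sample(population, n)) for _ in range(repeats)]

means_n25 = sample_means(25)
means_n100 = sample_means(100)

# The spread of the sample means is the standard error; it shrinks as n grows.
se_n25 = statistics.stdev(means_n25)
se_n100 = statistics.stdev(means_n100)
print(f"n=25:  spread of sample means ~ {se_n25:.2f}")
print(f"n=100: spread of sample means ~ {se_n100:.2f}")
```

The collection of means for each sample size approximates the sampling distribution of the mean; quadrupling the sample size roughly halves its spread.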

Question 3

Q

What is the central limit theorem?

Answer

A

It tells us that the sampling distribution of the mean will approximate a normal distribution, provided the sample size is sufficiently large and the sample is representative and randomly drawn.
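A rough way to see this is to sample from a clearly non-normal population and watch the sample means become more symmetric as the sample size grows. The sketch below assumes an exponential (right-skewed) population and uses skewness as a crude check of symmetry; both choices are illustrative assumptions, not part of the card.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Mean cubed z-score: about 0 for a symmetric distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return statistics.mean([((v - m) / s) ** 3 for v in values])

def means_of_skewed_samples(n, repeats=3_000):
    """Means of `repeats` random samples of size n from an exponential population."""
    return [
        statistics.mean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]

skew_small = skewness(means_of_skewed_samples(2))   # tiny samples: still skewed
skew_large = skewness(means_of_skewed_samples(50))  # larger samples: closer to 0
print(f"skewness of sample means, n=2:  {skew_small:.2f}")
print(f"skewness of sample means, n=50: {skew_large:.2f}")
```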

Question 4

Q

What is a confidence interval?

Answer

A

It defines a range within which we estimate the true value will fall, accepting some error (typically a 95% level of confidence).
Its width is 2 x ME (the margin of error either side of the point estimate).

Question 5

Q

What does a 95% confidence level mean?

Answer

A

We accept a 5% likelihood that our confidence interval will not contain the true value.

Question 6

Q

What is margin of error?

Answer

A

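The margin of error (ME) is the half-width of the confidence interval: the interval is constructed by taking ME either side of our point estimate (e.g. the mean). At the 95% level, ME = 1.96 x SE.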
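As a worked example of ME = SE x 1.96, the sketch below builds a 95% confidence interval for a mean from summary statistics. The figures (mean 126, sd 12, n 100) and the function name `confidence_interval_95` are hypothetical, and the standard error is assumed to be sd / sqrt(n).

```python
import math

def confidence_interval_95(mean, sd, n):
    """Return (lower, upper, margin_of_error) for a 95% CI around a mean."""
    se = sd / math.sqrt(n)   # standard error of the mean (assumed formula)
    me = 1.96 * se           # margin of error at the 95% level
    return mean - me, mean + me, me

# Hypothetical survey: mean systolic blood pressure 126 mmHg, sd 12, n = 100.
lower, upper, me = confidence_interval_95(mean=126, sd=12, n=100)
print(f"ME = {me:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

The interval width is twice the margin of error, which is what the "2 x ME" note on the confidence interval card refers to.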

Question 7

Q

What is standard error?

Answer

A

The standard deviation of the sampling distribution: a measure of how much our estimate is expected to differ from the true population value from sample to sample.

Question 8

Q

How would you get a precise estimate, with a narrow confidence interval?

Answer

A

Increase sample size

Question 9

Q

When do we use t-scores?

Answer

A

When dealing with small samples (<40), in place of z-scores and the normal distribution.

Question 10

Q

What do we have to do when calculating confidence interval for RR and OR?

Answer

A

We must log-transform the estimate, calculate the confidence interval on the log scale, and then antilog the limits, because RR and OR do not follow a normal distribution.
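The log-then-antilog procedure can be sketched for an odds ratio from a 2x2 table. The card does not say how the standard error is obtained; this example assumes the commonly used Woolf formula, SE(log OR) = sqrt(1/a + 1/b + 1/c + 1/d), and the counts are hypothetical.

```python
import math

def odds_ratio_ci_95(a, b, c, d):
    """95% CI for an odds ratio from a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)                 # 1. log-transform the estimate
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # assumed Woolf standard error
    lower = math.exp(log_or - 1.96 * se_log_or)   # 2. CI on the log scale,
    upper = math.exp(log_or + 1.96 * se_log_or)   # 3. then antilog the limits
    return odds_ratio, lower, upper

# Hypothetical counts: 20/80 among exposed, 10/90 among unexposed.
or_estimate, or_lower, or_upper = odds_ratio_ci_95(20, 80, 10, 90)
print(f"OR = {or_estimate:.2f}, 95% CI {or_lower:.2f} to {or_upper:.2f}")
```

The resulting interval is not symmetric around the odds ratio on the original scale, which is the expected consequence of working on the log scale.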

Question 11

Q

Define sampling frame.

Answer

A

The actual list of the survey population from which the sample is drawn, after inclusion and exclusion criteria have been applied.

Question 12

Q

Define sampling fraction.

Answer

A

Ratio between sample size and population size.

Question 13

Q

What is systematic error?

Answer

A

The sample is not representative of the population because of inaccuracy in the sampling design or measurement procedures. It is a form of bias: predictable and, once identified, avoidable. The errors will likely not form a normal distribution.

Question 14

Q

What is random error?

Answer

A

Not predictable. Caused by natural fluctuations in the sampling or measurement process. Plotted as a histogram, random errors are expected to form an approximately normal distribution.

Question 15

Q

Describe the process of simple random sampling.

Answer

A

Identify the survey population, create the sampling frame by listing and numbering the eligible units, determine the sample size needed, then randomly draw units (e.g. with a random number generator).
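The steps above can be sketched with Python's standard random number generator. The frame of 500 numbered units, the sample size of 20 and the helper name `simple_random_sample` are assumptions for illustration.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw `sample_size` units without replacement; each unit is equally likely."""
    rng = random.Random(seed)  # seed only to make the example repeatable
    return rng.sample(frame, sample_size)

# Hypothetical sampling frame: 500 eligible units, numbered 1 to 500.
frame = [f"unit-{i}" for i in range(1, 501)]
srs = simple_random_sample(frame, sample_size=20, seed=42)
print(srs[:5])
```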

Question 16

Q

What are the advantages of simple random sampling?

Answer

A

Simple; sampling error is easily measured; every unit in the frame has an equal probability of being selected.

Question 17

Q

What are limitations of simple random sampling?

Answer

A

Requires a list of all units, usually obtained from records that may not represent the population (e.g. a telephone directory excludes people without a telephone); logistically challenging (time and cost); important minority groups may be missed by chance.

Question 18

Q

Describe systematic sampling.

Answer

A

Identify the survey population and sampling frame, arrange the units in a sequence (e.g. alphabetically), determine the sample size, divide the size of the sampling frame by the sample size to get the sampling interval, choose a random starting point, then draw units at regular intervals.
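A minimal sketch of the procedure described above, assuming a hypothetical ordered frame of 500 units and a sample of 20 (so the sampling interval is 25). The random start is taken within the first interval, which is an assumption the card leaves implicit.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Take every k-th unit from an ordered frame after a random start."""
    rng = random.Random(seed)
    interval = len(frame) // sample_size  # sampling interval k
    start = rng.randrange(interval)       # random start within the first interval
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical frame: 500 units arranged in sequence.
ordered_frame = sorted(f"unit-{i:03d}" for i in range(1, 501))
sys_sample = systematic_sample(ordered_frame, sample_size=20, seed=3)
print(sys_sample[:3])
```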

Question 19

Q

Advantages of systematic sampling.

Answer

A

Simple, easy to implement, sampling error easily determined, ensures representativeness.

Question 20

Q

Limitations of systematic sampling.

Answer

A

Needs a complete list that is representative of the target population; patterns in the ordering sequence increase the probability of some units being selected.

Question 21

Q

Describe cluster sampling.

Answer

A

1. List the potential clusters, e.g. all schools in a state.
2. List the number of units in each cluster.
3. Calculate the systematic sampling interval (cumulative population / number of desired clusters), e.g. say it is 738.
4. Choose a random start number between 1 and 738.
5. Select the remaining clusters by repeatedly adding the sampling interval to the start number.
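The cluster selection steps can be sketched as follows. The school names and sizes are hypothetical, chosen so that the cumulative population (3,690) divided by 5 clusters gives the interval of 738 used in the card; the helper name `select_clusters_pps` is also an assumption.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def select_clusters_pps(cluster_sizes, n_clusters, seed=None):
    """Systematic selection of clusters along the cumulative population list."""
    rng = random.Random(seed)
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes.values()))
    interval = cumulative[-1] // n_clusters  # sampling interval
    start = rng.randint(1, interval)         # random start between 1 and interval
    selected = []
    for i in range(n_clusters):
        target = start + i * interval
        # Pick the first cluster whose cumulative population reaches the target.
        selected.append(names[bisect_left(cumulative, target)])
    return selected

# Hypothetical clusters (schools) and their populations; total = 3,690.
schools = {"School A": 400, "School B": 1200, "School C": 650,
           "School D": 300, "School E": 1140}
chosen = select_clusters_pps(schools, n_clusters=5, seed=7)
print(chosen)
```

In this sketch a cluster larger than the sampling interval can be selected more than once, because larger clusters occupy a longer stretch of the cumulative list.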

Question 22

Q

Advantages of cluster sampling.

Answer

A

complete list of units not needed, less travel, within clusters all units have equal probability of being selected

Question 23

Q

Limitations of cluster sampling.

Answer

A

Units within a cluster tend to be similar (positive covariance within a cluster), which can introduce bias and increases sampling (standard) error.

Question 24

Q

Describe stratified sampling.

Answer

A

Stratify the sampling frame into homogeneous sub-populations (strata); a sample is then drawn randomly from each stratum.
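A minimal sketch of stratified random sampling, assuming a hypothetical frame of 1,000 people in three age strata and the same 10% sampling fraction applied to every stratum. The helper name `stratified_sample` and the strata are made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, sampling_fraction, seed=None):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)  # classify every unit into a stratum
    return {
        name: rng.sample(units, max(1, round(len(units) * sampling_fraction)))
        for name, units in strata.items()
    }

# Hypothetical frame of (id, age group) pairs: 600 + 300 + 100 = 1,000 units.
people = ([(i, "under 40") for i in range(600)]
          + [(i, "40-64") for i in range(600, 900)]
          + [(i, "65+") for i in range(900, 1000)])
by_stratum = stratified_sample(people, stratum_of=lambda p: p[1],
                               sampling_fraction=0.10, seed=5)
stratum_counts = {name: len(units) for name, units in by_stratum.items()}
print(stratum_counts)
```

Using one sampling fraction for all strata keeps each stratum's share of the sample equal to its share of the frame.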

Question 25

Q

Advantages of stratified sampling.

Answer

A

Information on subgroups; increased precision, so a smaller sample can be used; economical; can have several strata.

Question 26

Q

Limitations of stratified sampling.

Answer

A

More administrative effort to classify every unit into a category; a participant may fall into several subgroups; sampling error is harder to measure; the sample size at stratum level may be low (high random error and loss of precision).

Question 27

Q

What is the sampling fraction?

Answer

A

Used in stratified sampling to ensure probability proportional to size. Sampling fraction = (sample size / population size) x 100.

Question 28

Q

What is multistage sampling?

Answer

A

Uses a combination of methods, e.g. 1. identify the primary sampling units (clusters); 2. select sampling units from within each cluster.

Question 29

Q

What is the optimal sampling method if precision and reduction of sampling error were the priorities?

Answer

A

stratified random sampling with replacement

Question 30

Q

What is the optimal sampling method if logistics is a consideration or if the study is on an intervention targeted at community level?

Question 31

Q

What is the objective of an estimation study?

Answer

A

Estimate a population parameter (mean or prevalence) from a sample.

Question 32

Q

What is the objective of a comparative study?

Answer

A

Compare groups to assess whether there is any statistically or clinically significant difference between them (expressed as a hypothesis test).

Question 33

Q

What do sample size calculations presume?

Answer

A

Simple random sampling, and that the only error is random error.

Question 34

Q

What do we need to know for sample size calculations in surveys?

Answer

A

Confidence level (z statistic, 1.96 for 95%), the standard deviation (if estimating a mean) or the expected proportion/prevalence (if estimating a single proportion), and the desired precision (e.g. 0.05).
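The card lists the inputs but not the formulas. A sketch using the standard textbook forms, n = z^2 x p(1 - p) / d^2 for a proportion and n = z^2 x sd^2 / d^2 for a mean, is shown below; these formulas and the example inputs are assumptions added for illustration, and they presume simple random sampling.

```python
import math

def sample_size_proportion(p, precision, z=1.96):
    """Sample size to estimate a single proportion p to within +/- precision."""
    return math.ceil(z**2 * p * (1 - p) / precision**2)

def sample_size_mean(sd, precision, z=1.96):
    """Sample size to estimate a mean to within +/- precision."""
    return math.ceil(z**2 * sd**2 / precision**2)

# Hypothetical inputs: expected prevalence 50%, precision 0.05, 95% confidence.
n_prevalence = sample_size_proportion(p=0.5, precision=0.05)
# Hypothetical inputs: sd 15 units, precision 2 units, 95% confidence.
n_mean = sample_size_mean(sd=15, precision=2)
print(n_prevalence, n_mean)  # 385 217
```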

Question 35

Q

What do we need to know to calculate sample size of a comparative study?

Answer

A

Threshold for a significant result (e.g. 0.05), power (e.g. 0.8 or 0.9), baseline level in one of the groups (estimated from previous studies), and minimum effect size (minimum RR or OR).
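To show how these four inputs combine, the sketch below uses one common normal-approximation formula for comparing two proportions: n per group = (z_alpha/2 + z_power)^2 x [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The formula choice, the baseline risk of 10% and the minimum RR of 2 are assumptions; other formulas give slightly different answers.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p_baseline, min_rr, alpha=0.05, power=0.8):
    """Approximate sample size per group to detect a risk ratio of min_rr."""
    p1 = p_baseline
    p2 = p_baseline * min_rr                       # risk in the comparison group
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical study: baseline risk 10%, minimum RR of interest 2.
n_power_80 = sample_size_two_proportions(0.10, 2, power=0.8)
n_power_90 = sample_size_two_proportions(0.10, 2, power=0.9)
print(n_power_80, n_power_90)
```

Asking for higher power, or for a smaller minimum effect size, increases the required sample size per group.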

Question 36

Q

What is power?

Answer

A

The probability of correctly rejecting the null hypothesis when it is false.

Question 37

Q

How would you boost the power of a comparative study?

Answer

A

Raise the significance threshold (alpha), increase the effect size, reduce variation, or use a one-tailed test.

Question 38

Q

Describe the properties of the normal distribution?

Answer

A

Symmetrical shape; deviations from the centre are equally likely in the + or - direction; mean and median lie directly in the centre.
For continuous normal distributions, the probabilities of all possible outcomes are represented by the area under the curve.

Question 39

Q

How does a standard normal distribution differ from a normal distribution?

Answer

A

Standard normal is referenced to a standardised scale where mean=0 and variance=1.

Question 40

Q

What is a z-score?

Answer

A

Found on a standardised normal distribution and calculated as (x - mean) / sd. A z-score tells us how many sd's an observation lies above or below the mean, which allows comparison of distributions expressed in different units.
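As a small worked example of the z-score calculation, the sketch below standardises two observations measured in different units. The heights, weights, means and sd values are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical reference values: height in cm, weight in kg.
height_z = z_score(185, mean=175, sd=10)  # 1.0  -> one sd above the mean
weight_z = z_score(60, mean=75, sd=10)    # -1.5 -> one and a half sd below the mean
print(height_z, weight_z)
```

Once both are on the standardised scale they can be compared directly, even though one was measured in centimetres and the other in kilograms.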

Question 41

Q

The 95% reference range lies within how many sd and what z score?

Answer

A

Approximately + and - 2 sd of the mean; precisely, z-scores from -1.96 to +1.96.

Question 42

Q

When do you use a Student's t-test?

Answer

A

When sample sizes are small or the sd of the population is not known.

Question 43

Q

How is the t-distribution used and what does its shape look like?

Answer

A

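Used in t-tests, to construct confidence intervals for the difference between two population means, and in linear regression analysis. It is bell-shaped like the normal distribution but with a lower peak and heavier (longer) tails, especially when the sample is small.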

Question 44

Q

How does a t-score differ from a z-score?

Answer

A

t = (sample mean - population mean) / (sd / root n)

It uses the sd of the sample, not of the population.
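A minimal sketch of the t-score for a sample mean, assuming the numerator is the sample mean minus a hypothesised population mean. The data values and the hypothesised mean of 4 are made up for illustration.

```python
import math
import statistics

def t_score(sample, hypothesised_mean):
    """t = (sample mean - hypothesised mean) / (sample sd / sqrt(n))."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # sd of the sample (n - 1 denominator)
    return (sample_mean - hypothesised_mean) / (sample_sd / math.sqrt(n))

# Hypothetical small sample of 8 observations.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
t = t_score(observations, hypothesised_mean=4)
print(f"t = {t:.3f} with {len(observations) - 1} degrees of freedom")
```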
