Data Sampling Flashcards

1
Q

What are the different sampling methods?

A
  • Population
  • Random
  • Systematic
  • Stratified
  • Cluster
  • Consecutive
  • Convenience
2
Q

What is a Population sample?

A

Measuring everyone in the population (a census), rather than a subset

e.g. Facebook has data on its entire population of users
It is rare to have such a large and rich dataset

3
Q

What is Random sampling?

A

A random process is used to select the sample, so every member of the population has the same chance of being chosen
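
A minimal sketch of a simple random sample, assuming a made-up population of 100 ID numbers (the names `population` and `simple_random_sample` are illustrative only):

```python
import random

# Hypothetical population: ID numbers 1..100 standing in for people.
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the draw is repeatable
# random.sample draws without replacement; each member is equally likely.
simple_random_sample = rng.sample(population, k=10)
print(simple_random_sample)
```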

4
Q

What is a Systematic sample?

A

Apply a rule to pick the sample e.g. every 5th person
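
The "every 5th person" rule could be sketched like this, assuming a hypothetical ordered list of 50 people identified by their position:

```python
# Hypothetical ordered list of 50 people, identified by position 1..50.
people = list(range(1, 51))

step = 5  # the rule: take every 5th person
# Slicing from index step-1 picks the 5th, 10th, 15th, ... person.
systematic_sample = people[step - 1::step]
print(systematic_sample)  # [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
```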

5
Q

What is a Stratified sample?

A

The population is split into different layers/strata and a sample is taken from each one
e.g. 10 from England, 10 from Wales
or 10 women, 10 men
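
A rough sketch of the England/Wales example, assuming two made-up strata of person IDs and a random draw of 10 from each (the stratum sizes are invented for illustration):

```python
import random

rng = random.Random(0)

# Hypothetical strata: person IDs grouped by country.
strata = {
    "England": [f"E{i}" for i in range(200)],
    "Wales": [f"W{i}" for i in range(50)],
}

# Draw the same number at random from every stratum.
stratified_sample = {
    name: rng.sample(members, k=10) for name, members in strata.items()
}
for name, chosen in stratified_sample.items():
    print(name, chosen)
```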

6
Q

What is a Cluster sample?

A

Pick a naturally occurring group (cluster) and sample within it, e.g. go to one hospital and sample there

7
Q

What is a Consecutive sample?

A

Take every eligible person in turn until the target is met, e.g. start on Tuesday and collect people until 100 is reached

8
Q

What is Convenience sampling?

A

Sampling whoever is easiest to reach, e.g. going downstairs to the cafe now to pick a sample

9
Q

What is the Source Population?

A

The overall population that we are trying to study, e.g. patients on chemo in the southwest

Statistically, a population has a value (e.g. its mean) which can almost never be truly known, but we can estimate it

10
Q

What is the study population?

A

The group the sample is actually drawn from, e.g. a database of all patients on chemotherapy in the southwest

11
Q

How does sample size affect the estimated mean?

A

If the sample is random, the larger the sample size, the closer the mean of the sample tends to be to the mean of the actual overall population
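
A small simulation can illustrate this. It is only a sketch: the population of heights (mean 170, SD 10) and the helper `average_error` are assumptions made up for the example.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 100,000 heights (cm), roughly normal.
population = [rng.gauss(170, 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

def average_error(n, repeats=200):
    """Average distance between a random sample's mean and the population mean."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, k=n)
        errors.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(errors)

# The average error should shrink as the sample size grows.
for n in (10, 100, 1000):
    print(n, round(average_error(n), 3))
```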

12
Q

What is the Sampling Distribution of the mean?

A
  1. A small sample (e.g. 10) is taken from a normally distributed population
  2. The mean of this sample is found
  3. Repeat steps 1 and 2 lots of times
  4. Plot the distribution of the means - it will look like a normal distribution
  5. The mean of the sample means ≈ the population mean

AKA if we take loads of repeated samples we can eventually guess something we can’t actually measure
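
The five steps can be sketched as a simulation. The population here (mean 50, SD 8) and the number of repeats are invented for illustration, and the summary numbers stand in for the plot in step 4.

```python
import random
import statistics

rng = random.Random(2)

# Hypothetical normally distributed population (mean 50, SD 8).
population = [rng.gauss(50, 8) for _ in range(50_000)]

# Steps 1-3: take a small sample, record its mean, repeat many times.
sample_means = [
    statistics.fmean(rng.sample(population, k=10)) for _ in range(5_000)
]

# Steps 4-5: summarise the distribution of those means.
print("population mean:     ", round(statistics.fmean(population), 2))
print("mean of sample means:", round(statistics.fmean(sample_means), 2))
print("SD of sample means:  ", round(statistics.stdev(sample_means), 2))
```

The SD of the sample means printed on the last line is what the next cards call the standard error.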

13
Q

What is the Standard Deviation?

A

Descriptive statistic

  1. The standard deviation of a sample from a population
  2. It measures the variability or ‘width/spread’ of the data
  3. It does not systematically get bigger or smaller as the sample gets larger
14
Q

What is the Standard Error?

A

Inferential statistic

  1. Theoretical **standard deviation of the sampling distribution of the mean** (all the means of samples plotted)
  2. Gives a measure of the **precision** of the estimate of the mean (aka the smaller the standard error, the closer a sample mean is likely to be to the true mean of the population, so the more precise your guess is)
  3. It is always smaller than the standard deviation of the sample (for any sample of more than one)
  4. It gets smaller as the samples get larger - the bigger the samples, the closer their means tend to be to the true mean
  5. **It gets larger as the standard deviation of the population gets larger** - for the same sample size, a more varied population gives a bigger standard error, e.g. heights of all humans on earth would have a bigger standard error than heights of all adults on earth
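
To see SD and SE behave differently as the sample grows, here is a sketch that assumes a made-up population of heights (mean 170, SD 10) and a hypothetical helper `sd_and_se`:

```python
import math
import random
import statistics

rng = random.Random(4)

# Hypothetical population of adult heights (cm): mean 170, SD 10.
population = [rng.gauss(170, 10) for _ in range(100_000)]

def sd_and_se(n):
    """SD and SE (SD / sqrt(n)) of one random sample of size n."""
    sample = rng.sample(population, k=n)
    sd = statistics.stdev(sample)
    return sd, sd / math.sqrt(n)

# SD should hover around 10 while SE keeps shrinking.
for n in (10, 100, 1000, 10_000):
    sd, se = sd_and_se(n)
    print(f"n={n:>6}  SD={sd:6.2f}  SE={se:5.2f}")
```
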
15
Q

What is the formula for the standard error?

A

SE = SD/√n

SE is always smaller than SD (for n > 1) because SD has been divided by √n
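
A worked example of the formula, using eight made-up measurements (the values in `sample` are purely illustrative):

```python
import math
import statistics

# Hypothetical sample of 8 measurements.
sample = [4.1, 5.0, 5.6, 4.8, 6.2, 5.1, 4.7, 5.5]

sd = statistics.stdev(sample)      # sample standard deviation
se = sd / math.sqrt(len(sample))   # SE = SD / sqrt(n)
print(f"SD = {sd:.3f}, SE = {se:.3f}")
```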

16
Q

What is skew?

A

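Skew is when a distribution is lopsided (one tail longer than the other), so it is not normally distributed
e.g. word lengths in a text bunch up at the small end (the left) because most words are short, with a long tail of longer words to the right - this is called right (positive) skew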

17
Q

What is the Central Limit Theorem?

A

Predicts that the mean of the distribution of sample means from a population will be the same as the population’s actual mean

The sample means of a population will be approximately normally distributed, even if the population itself is skewed, as long as

  • the sample size is big enough (rule of thumb: >30) (smaller samples will still have some skew)
  • the samples are truly random
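
A quick simulation of this idea, assuming a made-up right-skewed population (exponential values) and a simple hand-written `skewness` helper that is not part of the standard library:

```python
import random
import statistics

rng = random.Random(3)

def skewness(values):
    """Average cubed z-score: about 0 for symmetric data, positive for a right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)

# Hypothetical right-skewed population (exponential waiting times).
population = [rng.expovariate(1.0) for _ in range(50_000)]

# Means of many random samples of size 30.
sample_means = [
    statistics.fmean(rng.sample(population, k=30)) for _ in range(5_000)
]

print("population skewness: ", round(skewness(population), 2))
print("sample-mean skewness:", round(skewness(sample_means), 2))
```

The sample means should come out much less skewed than the population, though with samples of 30 a little skew usually remains.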
18
Q

What is the standard error of a proportion?

A

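Used when the outcome is binary, e.g. responders and non-responders

Valid as long as:
* There are at least 10 successes and at least 10 failures (e.g. at least 10 people responded and at least 10 didn’t)
* The sample is randomly drawn from the population

SE = √(p(1-p)/n), where p is the sample proportion and n is the sample size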
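
A worked example of the formula, assuming a made-up survey with 40 responders out of 100 people (the function name `proportion_se` is just for illustration):

```python
import math

def proportion_se(successes, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    p = successes / n
    return math.sqrt(p * (1 - p) / n)

# Hypothetical survey: 40 responders out of 100 people sampled.
print(round(proportion_se(40, 100), 4))  # 0.049
```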

19
Q

What proportion of values lie within 1, 2 and 3 standard deviations of the mean?

A

For a normal distribution:
68% of values lie within 1 s.d. of the mean
95% within 1.96 s.d. (roughly 2)
99.7% within 3 s.d.
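
These percentages can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 1.96, 2, 3):
    print(f"within {k} s.d.: {share_within(k):.1%}")
```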

20
Q

How are Confidence Intervals calculated?

A

e.g. 95% c.i.

= sample mean +/- (1.96 x SE)

If the sampling were repeated many times, about 95% of the intervals built this way would contain the population mean

It shows how precise the estimate of the real mean is: the tighter the interval, the more precise the estimate
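
A worked example, assuming ten made-up blood pressure readings. It uses the 1.96 multiplier from the card; for a sample this small a slightly larger multiplier is often used in practice, so treat the result as illustrative.

```python
import math
import statistics

# Hypothetical sample of 10 systolic blood pressure readings (mmHg).
sample = [118, 125, 130, 121, 127, 135, 119, 124, 129, 122]

mean = statistics.fmean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

# 95% CI = mean +/- 1.96 x SE
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```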

21
Q

What is the difference between a reference range and a confidence interval?

A

Reference range = descriptive
sample mean +/- (1.96 x S.D)
Describes where about 95% of the individual values in the sample lie (assuming they are roughly normally distributed)

The reference range will always be wider than the confidence interval (for n > 1), because S.D is larger than S.E

Confidence Interval = inferential
sample mean +/- (1.96 x S.E) - is always narrower than the RR
Describes where the population mean lies, 95% of the time
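
The two ranges can be compared side by side. This sketch assumes twelve made-up haemoglobin values and a hypothetical helper `reference_range_and_ci`:

```python
import math
import statistics

def reference_range_and_ci(sample):
    """Return (reference range, 95% confidence interval) as (low, high) pairs."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    se = sd / math.sqrt(len(sample))
    reference_range = (mean - 1.96 * sd, mean + 1.96 * sd)
    confidence_interval = (mean - 1.96 * se, mean + 1.96 * se)
    return reference_range, confidence_interval

# Hypothetical sample of 12 haemoglobin values (g/dL).
sample = [13.1, 14.2, 12.8, 15.0, 13.7, 14.5, 13.3, 14.9, 12.9, 13.8, 14.1, 13.6]
rr, ci = reference_range_and_ci(sample)
print("reference range:", tuple(round(x, 2) for x in rr))
print("95% CI:         ", tuple(round(x, 2) for x in ci))
```

The confidence interval sits inside the reference range because both are centred on the same mean and S.E is S.D divided by √n.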