EAB - Estimation and Significance Tests and P Values Flashcards

1
Q

What is sampling error?

A

Samples provide an incomplete picture of the population.

Different samples will give different estimates, so any one estimate is likely to differ from the true population value. This is called ‘sampling error’.

2
Q

What is sampling distribution?

A

Sample estimates (e.g. means) are calculated from multiple samples from the same population.

The estimates will differ from sample to sample, and the distribution of their values is known as the ‘sampling distribution’.
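
A small simulation can make this concrete. The sketch below assumes a hypothetical population of 10,000 normally distributed values (mean 120, SD 15, chosen only for illustration); the helper `sample_means` is an illustrative name, not something from the card.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population (an assumption for illustration only).
population = [random.gauss(120, 15) for _ in range(10_000)]

def sample_means(population, sample_size, n_samples):
    """Draw n_samples random samples and return the mean of each one."""
    return [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# The 1,000 sample means form an approximation to the sampling distribution.
means = sample_means(population, sample_size=50, n_samples=1000)
print("centre of the sampling distribution:", round(statistics.mean(means), 2))
print("spread of the sampling distribution:", round(statistics.stdev(means), 2))
```

The sample means cluster around the population mean, and their spread is much smaller than the spread of the individual values.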

3
Q

What are two measures we can introduce to deal with uncertainty in drawing conclusions?

A
  • Confidence interval:
    If we are estimating some quantity from our data, for example, the proportion of patients who have a particular attribute, then we can quantify the imprecision in the estimate using a confidence interval.
  • Statistical significance test:
    If we are testing a hypothesis, for example, comparing blood pressure in two groups, then we can do a statistical significance test which helps us to weigh the evidence that the sample difference we have observed is in fact a real difference.
4
Q

What is the relationship between sample size and how close the sample estimate is to the true mean?

A

The bigger the sample size, the closer the sample estimate tends to be to the true mean.

5
Q

What is the relationship between the spread of the data and how close the sample estimate is to the true mean?

A

The smaller the spread of the data (standard deviation), the closer the sample estimate tends to be to the true mean.

6
Q

What is a standard error?

A

A standard error (SE) is an indication of the extent of the sampling error.

Standard error tells us how much a sample mean tends to vary from the population mean (true mean). It provides an estimate of the precision of the sample mean.

7
Q

How do you calculate standard error?

A

For a sample mean, it can be calculated from the standard deviation divided by the square root of the sample size.

(SE = SD / √[𝑁])
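
As a minimal sketch of the formula, the function below computes SE = SD / √N for a list of values. The readings are made-up numbers, and using the sample standard deviation (n - 1 denominator) is an assumption, since the card does not say which SD is meant.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(sample size)."""
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator), an assumption
    return sd / math.sqrt(len(values))

# Hypothetical readings, for illustration only.
readings = [118, 125, 130, 122, 127, 119, 133, 121]
se = standard_error(readings)
print("standard error:", round(se, 2))
```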

8
Q

How can standard error be used to calculate a confidence interval?

A

The true (population) mean can be expected to lie in the range: (sample mean – 1.96 standard errors) to (sample mean + 1.96 standard errors) in 95% of calculations.
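
The range above can be sketched in a few lines. The sample here is simulated (100 values, an assumption made so that the large-sample condition holds), and `mean_ci_95` is an illustrative helper name.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical sample of 100 measurements, for illustration only.
sample = [random.gauss(120, 15) for _ in range(100)]

def mean_ci_95(values):
    """Return (lower, upper): sample mean -/+ 1.96 standard errors."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean - 1.96 * se, mean + 1.96 * se

lower, upper = mean_ci_95(sample)
print(f"95% CI for the mean: {lower:.1f} to {upper:.1f}")
```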

9
Q

What are our assumptions when calculating a 95% confidence interval for a population mean?

A
  • the data are normally distributed, or the sample is large (at least 60)
  • the sample is chosen at random from the population
  • the observations are independent of each other
10
Q

What are our assumptions when calculating a 95% confidence interval for a population proportion?

A
  • the sample is chosen at random from the population
  • the observations are independent of each other
  • the proportion with the characteristic is not close to 0 or 1
  • np and n(1-p) are each greater than 5 (large sample)
11
Q

How do you calculate the standard error for proportion?

A

Multiply the proportion with the characteristic by the proportion without the characteristic:
p(1-p)

Divide by the sample size:
p(1-p)/n

Take the square root to deduce the SE:
√[(𝑝 × (1 − 𝑝)/𝑛)]
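
The three steps can be sketched as follows. The counts (30 of 120 patients with the characteristic) are hypothetical, and `proportion_se` is an illustrative helper name.

```python
import math

def proportion_se(with_characteristic, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    p = with_characteristic / n   # proportion with the characteristic
    product = p * (1 - p)         # step 1: p(1 - p)
    variance = product / n        # step 2: divide by the sample size
    return math.sqrt(variance)    # step 3: take the square root

# Hypothetical example: 30 of 120 patients have the characteristic (p = 0.25).
se = proportion_se(30, 120)
print(round(se, 4))  # 0.0395
```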

12
Q

What is a significance test (and its benefit)?

A

A significance test uses data from a sample to weigh the evidence for or against a hypothesis about a population. There are always two mutually exclusive hypotheses: if the hypothesis being tested is not true, then the opposite hypothesis must be true.

A measure of the evidence for or against the hypothesis is provided by a P value.

13
Q

What is the null hypothesis?

A

The null hypothesis is the baseline hypothesis which is usually of the form ‘there is no difference’ or ‘there is no association’.

The corresponding alternative hypothesis is ‘there is a difference’ or ‘there is an association’.

14
Q

What is a two-sided test (two-tailed test)?

A

It is known as a two-sided or two-tailed test when the alternative hypothesis is general and allows the difference to be in either direction.

15
Q

What is a one-sided test (one-tailed test)?

A

It is known as a one-sided or one-tailed test when the alternative hypothesis is not general and allows the difference to be in only one direction.

Two-sided tests should always be used unless there is clear justification at the outset to use a one-sided test.
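
To see why the choice matters, the sketch below computes both P values for the same test statistic. It assumes a test statistic `z` that follows a standard normal distribution under the null hypothesis; the value 1.8 is hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """Return (two_sided, one_sided_upper) P values for a z statistic."""
    std_normal = NormalDist()  # mean 0, SD 1
    one_sided_upper = 1 - std_normal.cdf(z)        # difference in one direction only
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))   # difference in either direction
    return two_sided, one_sided_upper

two_sided, one_sided = p_values(1.8)  # hypothetical test statistic
print(f"two-sided P = {two_sided:.3f}, one-sided P = {one_sided:.3f}")
```

For a positive z the one-sided P value is half the two-sided one, so the same data can fall on either side of a 0.05 cut-off depending on which test was chosen.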

16
Q

What are the steps in doing a significance test?

A
  1. Specify the hypothesis of interest as a null and alternative hypothesis.
  2. Decide what statistical test is appropriate.
  3. Use the test to calculate the P value.
  4. Weigh the evidence from the P value in favour of the null or alternative hypothesis.
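
The four steps can be sketched for a comparison of blood pressure in two groups. Everything specific here is an assumption for illustration: the data are simulated, and a large-sample two-sample z-test is used as the test in step 2, whereas in practice the appropriate test depends on the data.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical systolic blood pressure readings for two groups of 200.
group_a = [random.gauss(135, 15) for _ in range(200)]
group_b = [random.gauss(120, 15) for _ in range(200)]

# Step 1: null hypothesis: no difference between the population means.
#         Alternative hypothesis: there is a difference (two-sided).

# Step 2: choose a test (a large-sample z-test is assumed here).
def two_sample_z_test(a, b):
    """Return (z, two-sided P value) for the difference in means."""
    se_diff = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Step 3: use the test to calculate the P value.
z, p_value = two_sample_z_test(group_a, group_b)

# Step 4: weigh the evidence from the P value.
print(f"z = {z:.2f}, P = {p_value:.4f}")
```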
17
Q

Describe the types of errors in significance testing.

A

Since a significance test uses sample data to make inferences about populations, the results from a sample may lead to a wrong conclusion:

TYPE 1 ERROR:
this is getting a significant result in a sample when the null hypothesis is in fact true in the underlying population (‘false significant’ result).

We usually set a limit of 0.05 (5%) for the probability of a type 1 error which is equivalent to a 0.05 cut-off for statistical significance.

TYPE 2 ERROR:
this is getting a non-significant result in a sample when the null hypothesis is in fact false in the underlying population (‘false non-significant’ result).

It is widely accepted that the probability of a type 2 error should be no more than 0.20 (20%).
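
The link between the 0.05 cut-off and the type 1 error rate can be checked by simulation. The sketch below assumes a large-sample z-test and draws both groups from the same hypothetical population, so the null hypothesis is true by construction and every significant result is a type 1 error.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(3)  # fixed seed so the sketch is repeatable

def z_test_p(a, b):
    """Two-sided P value from a large-sample z-test for two means."""
    se_diff = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_significant_rate(n_trials=2000, n_per_group=50, alpha=0.05):
    """Proportion of trials with P < alpha when the null hypothesis is true."""
    hits = 0
    for _ in range(n_trials):
        # Both groups come from the same population: no real difference.
        a = [random.gauss(120, 15) for _ in range(n_per_group)]
        b = [random.gauss(120, 15) for _ in range(n_per_group)]
        if z_test_p(a, b) < alpha:
            hits += 1
    return hits / n_trials

rate = false_significant_rate()
print("proportion of false significant results:", rate)
```

The proportion should come out close to 0.05, the chosen cut-off.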

18
Q

Describe what a P value is.

A

A P value is a probability, and therefore lies between 0 and 1. It comes from a statistical test that is testing a particular null hypothesis.

It expresses the weight of evidence in favour of or against the stated null hypothesis.

Precise definition: the P value is the probability, given that the null hypothesis is true, of obtaining data as extreme as or more extreme than that observed.
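
The definition can be applied directly in a simple case. The sketch below uses a hypothetical example that is not in the original card: 16 of 20 patients prefer treatment A, and the null hypothesis is that each patient is equally likely to prefer A or B. It adds up the probabilities of every outcome at least as far from the expected 10 as the observed 16.

```python
from math import comb

def two_sided_binomial_p(successes, n):
    """Exact two-sided P value for the null hypothesis p = 0.5."""
    # Under p = 0.5 the distribution is symmetric around n / 2, so
    # "as extreme or more extreme" means at least as far from n / 2
    # as the observed count, in either direction.
    distance = abs(successes - n / 2)
    extreme = [k for k in range(n + 1) if abs(k - n / 2) >= distance]
    # Each outcome k has probability comb(n, k) / 2**n under the null.
    return sum(comb(n, k) for k in extreme) / 2 ** n

p_value = two_sided_binomial_p(16, 20)
print(round(p_value, 4))  # 0.0118
```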

19
Q

What is the cut-off point for a P value, and what does that indicate?

A

0.05 or 5% is commonly used as a cut-off, such that if the observed P is less than this (P<0.05) we consider that there is good evidence that the null hypothesis is not true. This is directly related to the type 1 error rate.

If 0.05 is the cut-off then P<0.05 is commonly described as statistically significant and P≥0.05 is described as not statistically significant.

20
Q

List some factors that affect the size of the P value.

A
  • the size of the real effect in the population sampled
  • the sample size
  • the variability of the measure involved
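
The effect of each factor can be sketched with a two-sample z-test on two equal-sized groups. This is an illustration under assumptions: the `difference` argument stands in for the observed difference in means (which reflects the real effect), and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_p(difference, sd, n_per_group):
    """Two-sided P value for a difference in means, equal SD and group size."""
    se_diff = sd * math.sqrt(2 / n_per_group)
    z = difference / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))

baseline = z_test_p(difference=5, sd=15, n_per_group=50)
bigger_effect = z_test_p(difference=10, sd=15, n_per_group=50)
bigger_sample = z_test_p(difference=5, sd=15, n_per_group=200)
more_variable = z_test_p(difference=5, sd=30, n_per_group=50)

print(f"baseline:          P = {baseline:.4f}")
print(f"larger effect:     P = {bigger_effect:.4f}")
print(f"larger sample:     P = {bigger_sample:.4f}")
print(f"more variability:  P = {more_variable:.4f}")
```

A larger effect or a larger sample gives a smaller P value, while more variability gives a larger one.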
21
Q

What does clinical significance indicate?

A

This indicates that the difference observed is large enough to be clinically meaningful. It is not necessarily related to statistical significance as it is a clinical judgement and not a mathematical quantity.