Statistics Flashcards

1
Q

Levels of evidence: what is 1a?

A

Meta-analysis of RCTs

2
Q

Levels of evidence: what is 1b?

A

At least 1 RCT

3
Q

Levels of evidence: what is 2a?

A

One well-designed controlled trial which is not randomised

4
Q

Levels of evidence: what is 2b?

A

At least one other type of well-designed quasi-experimental study

5
Q

Levels of evidence: What is 3?

A

Evidence from case, correlation, and comparative studies

6
Q

Levels of evidence: what is 4?

A

Evidence from a panel of experts

7
Q

Define randomised controlled trial

A

Participants randomly allocated to intervention or control group (e.g. standard treatment or placebo)

8
Q

Describe cohort study

A

Observational and prospective
Two (or more) groups are selected according to their exposure to a particular agent and followed up to see how many develop a disease or other outcome
The usual outcome measure is relative risk

9
Q

Describe case-control study

A

Observational and retrospective
Patients with a particular condition are identified and matched with controls
Data is collected on a past exposure to a possible causal agent for the condition
The usual outcome measure is the odds ratio

10
Q

Describe a cross-sectional study

A

Provides a snapshot in time; sometimes called a prevalence study
Provides weak evidence of cause and effect

11
Q

Define relative risk (RR)

A

RR is the ratio of risk in the experimental group (experimental event rate, EER) to the risk in the control group (control event rate, CER)

Also called the risk ratio

RR = EER / CER
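
As a rough worked example of the formula above, the sketch below computes EER, CER and RR from event counts. The function name and the trial numbers are hypothetical, chosen only for illustration.

```python
def relative_risk(events_exp, n_exp, events_ctrl, n_ctrl):
    """Return (EER, CER, RR) from event counts and group sizes."""
    eer = events_exp / n_exp      # experimental event rate
    cer = events_ctrl / n_ctrl    # control event rate
    return eer, cer, eer / cer

# Hypothetical trial: 10/100 events with the intervention, 20/100 with control
eer, cer, rr = relative_risk(10, 100, 20, 100)
print(f"EER={eer:.2f}, CER={cer:.2f}, RR={rr:.2f}")
```

Here the RR is below 1, so the event is less common in the experimental group than in the control group.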

12
Q

A risk ratio of >1 means what?

A

The rate of an event is increased compared to the control

13
Q

A risk ratio of <1 means what?

A

The rate of an event is decreased compared to control

14
Q

Define incidence

A

Number of new cases per population in a given time period

15
Q

Define prevalence

A

Total number of cases per population at a particular point in time

16
Q

What is selection bias?

A
  • Error in assigning individuals to groups leading to differences which may influence the outcome
17
Q

Give some examples of selection bias

A

Sampling bias: subjects not representative of the population
Volunteer bias: people more at risk of a disease may volunteer for a study
Non-responder bias

18
Q

What is recall bias?

A

Difference in the accuracy of the recollections retrieved by study participants
A particular problem in case-control studies

19
Q

What is publication bias?

A

Failure to publish studies because they showed a negative or uninteresting result

20
Q

Define the P value

A

Probability of obtaining a result by chance at least as extreme as the one that was actually observed, assuming the null hypothesis is true

21
Q

What is the null hypothesis?

A

The hypothesis that there is no difference between the groups being compared, e.g. that two treatments are equally effective

22
Q

What is a type 1 error when testing the null hypothesis?

A

The null hypothesis is rejected when it is actually true, i.e. a difference between two groups is reported when none exists (a false positive)

23
Q

What is a type 2 error when testing the null hypothesis?

A

The null hypothesis is not rejected when it is actually false, i.e. a difference that really exists is missed (a false negative)

24
Q

What is the power of a study?

A

The probability of (correctly) rejecting the null hypothesis when it is false i.e. the probability of detecting a statistically significant difference

Power can be increased by increasing the sample size

25
Q

What will influence which significance test you use?

A

Whether the data is parametric (something which can be measured, usually normally distributed) or non-parametric

26
Q

Name two parametric tests

A
  • Student’s t test (paired or unpaired)
  • Pearson’s product-moment correlation coefficient
27
Q

What is paired data?

A

Data obtained from a single group of patients, e.g. measurements before and after an intervention
- A paired t test (parametric) can be used if the differences are normally distributed; otherwise use the Wilcoxon signed-rank test

28
Q

What is unpaired data?

A

Comes from two different groups of patients e.g. comparing response to different interventions in two groups

29
Q

Name some non-parametric tests

A
  • Mann-Whitney U test
  • Wilcoxon signed-rank test
  • Chi-squared test
  • Spearman’s and Kendall’s rank correlation
30
Q

When would you use Mann-Whitney U test?

A
  • Non-parametric data which is unpaired
31
Q

When would you use the Wilcoxon signed-rank test?

A
  • Non-parametric
  • To compare two sets of observations on a single sample
32
Q

When would you use the chi-squared test?

A
  • Non-parametric
  • Used to compare proportions or percentages
33
Q

Define ‘number needed to treat’ (NNT)

A

A measure that indicates how many patients would require an intervention to reduce the expected number of outcomes by one
- Calculated by 1/absolute risk reduction

34
Q

How do you calculate the absolute risk reduction?

A

The difference between the control event rate (CER) and the experimental event rate (EER)

EER = (number who had particular outcome with the intervention) / (total number who had the intervention)

CER = (number who had a particular outcome with the control) / (total number who had the control)
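
The ARR and NNT calculations on the two cards above can be sketched as follows. The function name and counts are hypothetical, and rounding the NNT up to a whole patient is a common convention assumed here rather than something stated on the cards.

```python
import math

def arr_and_nnt(events_ctrl, n_ctrl, events_exp, n_exp):
    """Return (ARR, NNT) from event counts in the control and experimental groups."""
    cer = events_ctrl / n_ctrl    # control event rate
    eer = events_exp / n_exp      # experimental event rate
    arr = cer - eer               # absolute risk reduction
    if arr <= 0:
        raise ValueError("No risk reduction, so NNT is not defined")
    nnt = math.ceil(1 / arr)      # assumption: round up to a whole patient
    return arr, nnt

# Hypothetical trial: 30/100 events with control, 18/100 with the intervention
arr, nnt = arr_and_nnt(30, 100, 18, 100)
print(f"ARR={arr:.2f}, NNT={nnt}")
```

With an ARR of 0.12, 1/ARR is about 8.3, so roughly 9 patients need treating to prevent one event.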

35
Q

Define sensitivity

A

Proportion of patients with the condition who have a positive result

i.e. we need to know how often a test will be positive if a patient has the disease

True positives / (true positives + false negatives)

36
Q

Define specificity

A

Proportion of patients without the condition who have a negative result

i.e. we need to know how often a test will be negative if the patient is healthy

True negatives / (true negatives + false positives)

37
Q

Define positive predictive value

A

The chance that the patient has the condition if the diagnostic test is positive

True positives / (true positives + false positives)

38
Q

Define negative predictive value

A

The chance that the patient does not have the condition if the diagnostic test is negative

True negatives / (true negatives + false negatives)
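
The four formulas on the cards above (sensitivity, specificity, PPV, NPV) all come from the same 2x2 table. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute the four accuracy measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with the condition
        "specificity": tn / (tn + fp),  # negatives among those without it
        "ppv": tp / (tp + fp),          # condition present given a positive test
        "npv": tn / (tn + fn),          # condition absent given a negative test
    }

# Hypothetical screening study: 100 with the condition, 200 without
stats = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=170)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the test, whereas the predictive values also change with how common the condition is in the sample.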

39
Q

Define the likelihood ratio for a positive test result

A

How much the odds of the disease increase when a test is positive

Sensitivity / (1 - specificity)

40
Q

Define the likelihood ratio for a negative test result

A

How much the odds of a disease decrease when a test is negative

(1-sensitivity) / specificity
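
Both likelihood ratios follow directly from sensitivity and specificity. The sketch below uses a hypothetical function name and illustrative values (sensitivity 0.9, specificity 0.8).

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a diagnostic test."""
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    return lr_pos, lr_neg

# Illustrative values only
lr_pos, lr_neg = likelihood_ratios(0.9, 0.8)
print(f"LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}")
```

With these values the LR+ is about 4.5 and the LR- about 0.125: a positive result raises the odds of disease, and a negative result lowers them.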

41
Q

When is the mean used?

A

When the spread of the data is fairly similar on each side of the midpoint, e.g. when the data are “normally distributed”.

42
Q

What is the disadvantage of using the mean as an average?

A

Skewed by extreme values, so it may not give a typical picture of the data
The median is often better in these circumstances

43
Q

When would you use the median?

A

When the data are not symmetrical, i.e. the distribution is skewed

44
Q

When would you use the mode?

A

The mode is the most common value, so it is used when a label is needed for the most frequently occurring event
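
To see how the three averages on the cards above behave with skewed data, here is a small sketch using Python's standard `statistics` module. The lengths of stay are invented for illustration.

```python
import statistics

# Hypothetical lengths of hospital stay in days; one extreme value (40)
stays = [2, 3, 3, 4, 5, 6, 40]

mean_stay = statistics.mean(stays)      # pulled upwards by the outlier
median_stay = statistics.median(stays)  # middle value, robust to the outlier
mode_stay = statistics.mode(stays)      # most frequently occurring value

print(f"mean={mean_stay}, median={median_stay}, mode={mode_stay}")
```

The single long stay pulls the mean up to 9 days, while the median stays at 4 days, which is closer to a typical patient.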

45
Q

What does standard deviation mean?

A
  • Used for data which is normally distributed
  • Provides information on how much data varies around their mean
  • How much a set of values is spread around the average
46
Q

How much of the data values is within 1 standard deviation of the mean?

A

68.3% (for normally distributed data)

47
Q

How much of the data values is within 2 standard deviations of the mean?

A

95.4%

48
Q

How much of the data values is within 3 standard deviations of the mean?

A

99.7%
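
The proportions on the three cards above (about 68%, 95% and 99.7%) hold for a normal distribution and can be checked with the error function. This is a minimal sketch; the function name is hypothetical.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k):.4f}")
```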

49
Q

What is a confidence interval?

A

Used when, instead of just the mean value of a sample, you want a range that is likely to contain the true population value.

  • A range (interval) in which we can be fairly sure (confident) that the “true value” lies.
50
Q

What happens to the confidence intervals in larger studies?

A

The CI is narrower because of the larger sample size
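
A rough illustration of why larger studies give narrower intervals, assuming a 95% CI for a mean calculated with the normal approximation (mean ± 1.96 × SD / √n). The function name and the blood pressure figures are hypothetical.

```python
import math

def ci95_mean(mean, sd, n):
    """Approximate 95% CI for a mean (normal approximation, an assumption here)."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Same hypothetical mean (120 mmHg) and SD (15 mmHg), two sample sizes
small = ci95_mean(120, 15, 25)
large = ci95_mean(120, 15, 400)
print(f"n=25:  {small[0]:.2f} to {small[1]:.2f}")
print(f"n=400: {large[0]:.2f} to {large[1]:.2f}")
```

Because the half-width shrinks with the square root of n, a 16-fold increase in sample size narrows the interval by a factor of 4.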

51
Q

What does a P value of 0.5 mean?

A

If the null hypothesis is true, the probability of seeing a difference at least this large by chance is 0.5 in 1, i.e. 50:50.

52
Q

What does a P value of 0.05 mean?

A

If the null hypothesis is true, the probability of seeing a difference at least this large by chance is 0.05 in 1, i.e. 1 in 20.

53
Q

When do you use parametric tests?

A

To compare samples of normally distributed data

If the data is not normally distributed, do not use parametric tests

54
Q

FSRH grading of evidence
- What is Grade A?

A
  • Based on RCTs
  • At least one meta-analysis or RCT or systematic review rated as 1++ and directly applicable to the target population
55
Q

FSRH grading of evidence
- What is Grade B?

A
  • Based on other robust experimental or observational studies
  • Body of evidence including studies rated as 2++ directly applicable to the target population
56
Q

FSRH grading of evidence
- What is Grade C?

A
  • Evidence is limited but relies on expert opinion and endorsement of respected authorities
57
Q

FSRH grading of evidence
- What is Grade D?

A

Evidence level 3 or 4 (case, correlation etc)

58
Q

What are descriptive studies?

A
  • Describing a population / giving a picture of what is happening
  • e.g. case reports, case series, qualitative reports, surveys
59
Q

What are analytic studies?

A
  • Quantifies the relationship between two factors
  • i.e. effect of an intervention (I) or exposure (E) on an outcome (O) in a population (P)
  • To quantify this effect, we need a comparison group (C)

Was the intervention randomly allocated?
- yes = RCT
- no = observational study (i.e. cohort, cross-sectional, case-control)

60
Q

What are the 3 types of observational studies?

A
  • Cohort (prospective: exposure -> outcome)
  • Case-control (retrospective: exposure <- outcome)
  • Cross-sectional (exposure and outcome simultaneously)
61
Q

What are the advantages of a cohort study?

A
  • Ethically safe
  • Subjects can be matched
  • Can establish timings and directionality of events
  • Administratively easier and cheaper than RCT
62
Q

What are the disadvantages of a cohort study?

A
  • Controls can be difficult to identify
  • Exposure may be linked to a hidden confounder
  • Blinding is difficult
  • Randomisation not present
  • For rare diseases, large sample sizes or long follow-up is necessary
63
Q

What are the advantages of a case-control study?

A
  • Quick and cheap
  • Only feasible method for very rare disorders or those with a long lag between exposure and outcome
  • Fewer subjects needed
64
Q

What are the disadvantages of a case-control study?

A
  • Reliance on recall or records to determine exposure status
  • Confounders
  • Selection of control groups is difficult
  • Potential bias: recall and selection
65
Q

What are the advantages of a cross-sectional study?

A
  • Cheap and simple
  • Ethically safe
66
Q

What are the disadvantages of a cross-sectional study?

A
  • Establishes association at most, not causality
  • Recall bias susceptibility
  • Confounders may be unequally distributed
  • Group sizes may be unequal
67
Q

What are the clinical trial phases?

A

0 - human microdosing studies
1 - healthy people
2 - people with relevant illness in lab setting
3 - people with illness in clinical setting
Market authorisation
4 - post-marketing surveillance studies

68
Q

Name descriptive statistics

A
  • Mean, median, mode, standard deviation
  • Confidence intervals
  • P-values (parametric, non-parametric, chi-squared)
69
Q

Name summary measures

A
  • RR
  • OR
  • RD
  • NNT
70
Q

Name accuracy statistics

A
  • Sensitivity, specificity, predictive values, likelihood ratios
71
Q

What is the variance of a set of data?

A
  • A measure of variability (SD is square root of variance)
  • Calculated by taking the average of squared deviations from the mean
  • Tells you the degree of spread in the data set
  • The more spread the data, the larger the variance is in relation to the mean
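
The calculation described above (average of squared deviations from the mean, with SD as its square root) can be sketched as follows. The data values are made up, and the result is checked against `statistics.pvariance` from the standard library.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

mean = sum(values) / len(values)
# Average of the squared deviations from the mean (population variance)
variance = sum((x - mean) ** 2 for x in values) / len(values)
sd = math.sqrt(variance)

print(f"mean={mean}, variance={variance}, SD={sd}")
print("matches statistics.pvariance:", variance == statistics.pvariance(values))
```

Note that `statistics.variance` divides by n - 1 instead of n, which is the usual choice when estimating the variance of a population from a sample.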
72
Q

How do you calculate risk ratio?

A

RR = risk in treated or exposed group / risk in unexposed or control group

73
Q

How do you calculate the odds ratio?

A

OR = odds of having been exposed to a risk factor in case group / odds of having been exposed to a risk factor in control group
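
A minimal sketch of the odds ratio for a case-control study, following the definition above. The function name and counts are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Hypothetical study: 40 of 100 cases exposed, 20 of 100 controls exposed
or_value = odds_ratio(40, 60, 20, 80)
print(f"OR = {or_value:.2f}")
```

An OR above 1, as here, means exposure was more common among cases than among controls.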

74
Q

What is the absolute risk reduction? (ARR)

A
  • Difference between the event rate in the intervention group and control group

= improvement rate in the intervention group - improvement rate in control group

75
Q

What is the relative risk reduction? (RRR)

A
  • The proportion by which the intervention REDUCES the event rate
    RRR = ARR / control (placebo) event rate, where the event is no improvement
76
Q

What is the number needed to treat?

A

The number of patients who need to be treated for one to get benefit
= 100/ARR when the ARR is expressed as a percentage (1/ARR when it is a proportion)

77
Q

What is a Student’s t-test?

A
  • Tests a hypothesis on the basis of a difference between sample means
  • i.e. the t test gives the probability of seeing a difference in means at least this large if the two populations are the same with respect to the variable tested
  • An example null hypothesis might be there is no difference in the mean BMI of patients undergoing vaginal and c-section deliveries
78
Q

Describe a chi-squared test

A
  • Chi-squared test of proportions
  • Non-parametric test used to compare categorical data (counts or proportions) between groups
  • E.g. investigating the proportion of women taking pre conception folic acid in different socioeconomic groups
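
As an illustration of the folic acid example above, the sketch below computes the chi-squared statistic for a 2x2 table by comparing observed counts with the counts expected if the proportions were equal. The function name and counts are hypothetical, no continuity correction is applied, and the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_squared_2x2(table):
    """Return (chi-squared statistic, p-value) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are two socioeconomic groups,
# columns are (taking folic acid, not taking folic acid)
chi2, p_value = chi_squared_2x2([[60, 40], [40, 60]])
print(f"chi-squared = {chi2:.2f}, p = {p_value:.4f}")
```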
79
Q

Describe ANOVA (analysis of variance)

A
  • ANOVA tests the hypothesis that there is no difference between two or more population means (usually at least 3)
  • Can test for differences without increasing the Type I error rate (which can happen if comparing multiple means by conducting multiple t-tests)
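
The Type I error inflation mentioned above can be estimated with a short sketch. It assumes every pairwise t-test is independent and uses a significance level of 0.05, which is a simplification; the function name is hypothetical.

```python
import math

def familywise_error_rate(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise comparisons.

    Assumes the comparisons are independent, which is only an approximation.
    """
    n_comparisons = math.comb(n_groups, 2)
    return n_comparisons, 1 - (1 - alpha) ** n_comparisons

for groups in (3, 5):
    comparisons, rate = familywise_error_rate(groups)
    print(f"{groups} groups: {comparisons} t-tests, error rate about {rate:.3f}")
```

Under these assumptions, comparing 5 groups with 10 separate t-tests gives roughly a 40% chance of at least one false positive, which is why a single ANOVA is preferred.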