Statistics Flashcards

1
Q

Levels of evidence: what is 1a?

A

Evidence from a meta-analysis of RCTs

2
Q

Levels of evidence: what is 1b?

A

At least 1 RCT

3
Q

Levels of evidence: what is 2a?

A

One well-designed controlled trial which is not randomised

4
Q

Levels of evidence: what is 2b?

A

At least one other type of well-designed quasi-experimental study

5
Q

Levels of evidence: What is 3?

A

Evidence from case, correlation, and comparative studies

6
Q

Levels of evidence: what is 4?

A

Evidence from a panel of experts

7
Q

Define randomised controlled trial

A

Participants randomly allocated to intervention or control group (e.g. standard treatment or placebo)

8
Q

Describe cohort study

A

Observational and prospective
Two (or more) groups are selected according to their exposure to a particular agent and followed up to see how many develop a disease or other outcome
The usual outcome measure is relative risk

9
Q

Describe case-control study

A

Observational and retrospective
Patients with a particular condition are identified and matched with controls
Data is collected on a past exposure to a possible causal agent for the condition
The usual outcome measure is the odds ratio

10
Q

Describe a cross-sectional study

A

Provides a snapshot in time, sometimes called a prevalence study
Provides weak evidence of cause and effect

11
Q

Define relative risk (RR)

A

RR is the ratio of risk in the experimental group (experimental event rate, EER) to the risk in the control group (control event rate, CER)

Also called the risk ratio

RR = EER / CER
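
As a rough illustration of the formula above, the sketch below computes EER, CER and RR from raw counts. The function name and the trial numbers are hypothetical, chosen only to show the arithmetic.

```python
def relative_risk(events_exp, total_exp, events_ctrl, total_ctrl):
    """Return (EER, CER, RR) from raw event counts and group sizes."""
    eer = events_exp / total_exp      # experimental event rate
    cer = events_ctrl / total_ctrl    # control event rate
    return eer, cer, eer / cer

# Hypothetical trial: 10 of 100 treated patients and 20 of 100 controls had the event.
eer, cer, rr = relative_risk(10, 100, 20, 100)
print(f"EER={eer:.2f}  CER={cer:.2f}  RR={rr:.2f}")
```

Here RR is below 1, so the event is less common in the experimental group than in the control group.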

12
Q

A risk ratio of >1 means what?

A

The rate of an event is increased compared to the control

13
Q

A risk ratio of <1 means what?

A

The rate of an event is decreased compared to control

14
Q

Define incidence

A

Number of new cases per population in a given time period

15
Q

Define prevalence

A

Total number of cases per population at a particular point in time

16
Q

What is selection bias?

A

Error in assigning individuals to groups, leading to differences which may influence the outcome

17
Q

Give some examples of selection bias

A

Sampling bias: subjects not representative of the population
Volunteer bias: people more at risk of a disease may volunteer for a study
Non-responder bias

18
Q

What is recall bias?

A

Difference in the accuracy of the recollections retrieved by study participants
A particular problem in case-control studies

19
Q

What is publication bias?

A

Failure to publish results because they were negative or uninteresting

20
Q

Define the P value

A

Probability of obtaining a result by chance at least as extreme as the one that was actually observed, assuming the null hypothesis is true

21
Q

What is the null hypothesis?

A

The hypothesis that there is no difference between the groups being compared, e.g. that two treatments are equally effective

22
Q

What is a type 1 error when testing the null hypothesis?

A

The null hypothesis is rejected when it is actually true, i.e. a difference between two groups is reported when none exists (a false positive)

23
Q

What is a type 2 error when testing the null hypothesis?

A

The null hypothesis is accepted when it is false, i.e. a difference that really exists is missed (a false negative)

24
Q

What is the power of a study?

A

The probability of (correctly) rejecting the null hypothesis when it is false i.e. the probability of detecting a statistically significant difference

Power can be increased by increasing the sample size
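
To see how sample size drives power, here is a minimal sketch using a normal approximation for a two-sided comparison of two group means. It assumes equal group sizes, a known common standard deviation and a standardised effect size; the function name and the effect size of 0.4 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z test.

    effect_size is the true difference in means divided by the common SD.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value, about 1.96 for alpha = 0.05
    shift = effect_size * sqrt(n_per_group / 2)  # mean of the test statistic if the effect is real
    # Probability that the test statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (20, 50, 100, 200):
    print(f"n per group = {n:3d}  power = {approx_power(0.4, n):.2f}")
```

With the effect size held fixed, the printed power rises as the number of participants per group grows.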

25
What will influence which significance test you use?
Whether the data is parametric (something which can be measured, usually normally distributed) or non-parametric
26
Name two parametric tests
- Student's t test (paired or unpaired)
- Pearson's product-moment correlation coefficient
27
What is paired data?
Data obtained from a single group of patients, e.g. measurement before and after an intervention
- If analysed with a parametric test (paired t test), the differences must be normally distributed
28
What is unpaired data?
Comes from two different groups of patients e.g. comparing response to different interventions in two groups
29
Name some non-parametric tests
- Mann-Whitney U test
- Wilcoxon signed-rank test
- Chi-squared test
- Spearman's and Kendall's rank correlation
30
When would you use Mann-Whitney U test?
- Non-parametric data which is unpaired
31
When would you use the Wilcoxon signed-rank test?
- Non-parametric
- To compare two sets of observations on a single sample (i.e. paired data)
32
When would you use the chi-squared test?
- Non-parametric
- Used to compare proportions or percentages
33
Define 'numbers needed to treat' (NNT)
A measure that indicates how many patients would require an intervention to reduce the expected number of outcomes by one
- Calculated as 1 / absolute risk reduction
34
How do you calculate the absolute risk reduction?
The difference between the control event rate (CER) and the experimental event rate (EER)
ARR = CER - EER
EER = (number who had a particular outcome with the intervention) / (total number who had the intervention)
CER = (number who had a particular outcome with the control) / (total number who had the control)
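
The ARR and NNT calculations can be sketched as follows. The counts are hypothetical, the function name is illustrative, and rounding the NNT up to the next whole patient is the usual convention rather than something stated on the card.

```python
import math

def arr_and_nnt(events_ctrl, total_ctrl, events_exp, total_exp):
    """Return (ARR, NNT) from raw counts; NNT is None if there is no risk reduction."""
    cer = events_ctrl / total_ctrl   # control event rate
    eer = events_exp / total_exp     # experimental event rate
    arr = cer - eer                  # absolute risk reduction
    if arr <= 0:
        return arr, None
    # Round away floating-point noise first, then round up to a whole patient
    nnt = math.ceil(round(1 / arr, 9))
    return arr, nnt

# Hypothetical trial: 30 of 200 controls and 18 of 200 treated patients had the outcome.
arr, nnt = arr_and_nnt(30, 200, 18, 200)
print(f"ARR = {arr:.2f}, NNT = {nnt}")
```

In this example 1 / ARR is about 16.7, which is reported as an NNT of 17.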
35
Define sensitivity
Proportion of patients with the condition who have a positive result, i.e. how often a test will be positive if a patient has the disease
Sensitivity = true positives / (true positives + false negatives)
36
Define specificity
Proportion of patients without the condition who have a negative result, i.e. how often a test will be negative if the patient is healthy
Specificity = true negatives / (true negatives + false positives)
37
Define positive predictive value
The chance that the patient has the condition if the diagnostic test is positive
PPV = true positives / (true positives + false positives)
38
Define negative predictive value
The chance that the patient does not have the condition if the diagnostic test is negative
NPV = true negatives / (true negatives + false negatives)
39
Define the likelihood ratio for a positive test result
How much the odds of the disease increase when a test is positive
LR+ = sensitivity / (1 - specificity)
40
Define the likelihood ratio for a negative test result
How much the odds of a disease decrease when a test is negative
LR- = (1 - sensitivity) / specificity
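
The six accuracy measures defined in the cards above all come from the same 2x2 table of test result against true disease status. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute accuracy statistics from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical test evaluated in 100 people with the disease and 200 without.
stats = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=180)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on how common the disease is in the sample, whereas sensitivity, specificity and the likelihood ratios do not.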
41
When is the mean used?
When the spread of the data is fairly similar on each side of the mid point e.g. when the data are “normally distributed”.
42
What is the disadvantage of using the mean as an average?
It is skewed by extreme values, so it may not give a typical picture. The median is often better in these circumstances.
43
When would you use the median?
When the data are not symmetrical, i.e. the distribution is skewed
44
When would you use the mode?
The mode is the most commonly occurring value, so it is used when a label is needed for the most frequent event
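
A small example shows how the three averages behave when the data contain an extreme value. The data set is hypothetical (lengths of hospital stay in days, with one outlier).

```python
from statistics import mean, median, mode

# Hypothetical lengths of stay in days; the 40-day stay is an outlier.
stays = [2, 3, 3, 4, 5, 6, 40]

print("mean:  ", mean(stays))    # pulled upwards by the outlier
print("median:", median(stays))  # middle value, unaffected by the outlier
print("mode:  ", mode(stays))    # most frequently occurring value
```

The mean (9 days) is larger than six of the seven stays, while the median (4 days) gives a more typical picture of this skewed sample.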
45
What does standard deviation mean?
- Used for data which is normally distributed
- Provides information on how much data varies around their mean
- How much a set of values is spread around the average
46
How much of the data values is within 1 standard deviation of the mean?
About 68.3% (68.27% for a normal distribution)
47
How much of the data values is within 2 standard deviations of the mean?
95.4%
48
How much of the data values is within 3 standard deviations of the mean?
99.7%
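
These percentages follow from the normal distribution and can be checked with the error function from the Python standard library. The function name below is hypothetical.

```python
from math import erf, sqrt

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * fraction_within(k):.2f}%")
```

The result only applies to normally distributed data; skewed data can have quite different proportions.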
49
What is a confidence interval?
- A range (interval) in which we can be fairly sure (confident) that the “true value” lies
- Used when you want, rather than just the mean value of a sample, a range that is likely to contain the true population value
50
What happens to the confidence intervals in larger studies?
The CI is narrower because of the larger sample size
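
The narrowing can be illustrated with a 95% confidence interval for a mean. This sketch uses the normal approximation (mean ± 1.96 × SD / √n), which assumes a reasonably large sample; the function name and the blood pressure figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_95_for_mean(sample_mean, sd, n):
    """Approximate 95% confidence interval for a mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.975)   # about 1.96
    margin = z * sd / sqrt(n)         # z times the standard error of the mean
    return sample_mean - margin, sample_mean + margin

# Hypothetical systolic blood pressure: mean 120 mmHg, SD 15 mmHg.
for n in (25, 100, 400):
    low, high = ci_95_for_mean(120, 15, n)
    print(f"n = {n:3d}: {low:.1f} to {high:.1f} mmHg")
```

Because the margin depends on 1/√n, quadrupling the sample size halves the width of the interval.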
51
What does a P value of 0.5 mean?
If there were truly no difference (the null hypothesis), a difference at least this large would arise by chance with probability 0.5, i.e. 1 in 2 (50:50).
52
What does a P value of 0.05 mean?
If there were truly no difference (the null hypothesis), a difference at least this large would arise by chance with probability 0.05, i.e. 1 in 20.
53
When do you use parametric tests?
To compare samples of normally distributed data. If the data are not normally distributed, do not use parametric tests.
54
FSRH grading of evidence - What is Grade A?
- Based on RCTs
- At least one meta-analysis, systematic review or RCT rated as 1++ and directly applicable to the target population
55
FSRH grading of evidence - What is Grade B?
- Based on other robust experimental or observational studies
- Body of evidence including studies rated as 2++ directly applicable to the target population
56
FSRH grading of evidence - What is Grade C?
- Evidence is limited but relies on expert opinion and endorsement of respected authorities
57
FSRH grading of evidence - What is Grade D?
Evidence level 3 or 4 (case, correlation etc)
58
What are descriptive studies?
- Describe a population / give a picture of what is happening
- e.g. case reports, case series, qualitative reports, surveys
59
What are analytic studies?
- Quantify the relationship between two factors
- i.e. the effect of an intervention (I) or exposure (E) on an outcome (O) in a population (P)
- To quantify this effect, we need a comparison group (C)
Was the intervention randomly allocated?
- yes = RCT
- no = observational study (i.e. cohort, cross-sectional, case-control)
60
What are the 3 types of observational studies?
- Cohort (prospective: exposure -> outcome)
- Case-control (retrospective: exposure <- outcome)
- Cross-sectional (exposure and outcome simultaneously)
61
What are the advantages of a cohort study?
- Ethically safe
- Subjects can be matched
- Can establish timings and directionality of events
- Administratively easier and cheaper than RCT
62
What are the disadvantages of a cohort study?
- Controls can be difficult to identify
- Exposure may be linked to a hidden confounder
- Blinding is difficult
- Randomisation not present
- For rare diseases, large sample sizes or long follow-up are necessary
63
What are the advantages of a case-control study?
- Quick and cheap
- Only feasible method for very rare disorders or those with a long lag between exposure and outcome
- Fewer subjects needed
64
What are the disadvantages of a case-control study?
- Reliance on recall or records to determine exposure status
- Confounders
- Selection of control groups is difficult
- Potential bias: recall and selection
65
What are the advantages of a cross-sectional study?
- Cheap and simple
- Ethically safe
66
What are the disadvantages of a cross-sectional study?
- Establishes association at most, not causality
- Susceptible to recall bias
- Confounders may be unequally distributed
- Group sizes may be unequal
67
What are the clinical trial phases?
Phase 0: human microdosing studies
Phase 1: healthy people
Phase 2: people with relevant illness in lab setting
Phase 3: people with illness in clinical setting
(Market authorisation)
Phase 4: post-marketing surveillance studies
68
Name descriptive statistics
- Mean, median, mode, standard deviation
- Confidence intervals
- P-values (parametric, non-parametric, chi-squared)
69
Name summary measures
- RR (risk ratio)
- OR (odds ratio)
- RD (risk difference)
- NNT (number needed to treat)
70
Name accuracy statistics
- Sensitivity, specificity, predictive values, likelihood ratios
71
What is the variance of a set of data?
- A measure of variability (SD is the square root of the variance)
- Calculated by taking the average of squared deviations from the mean
- Tells you the degree of spread in the data set
- The more spread the data, the larger the variance is in relation to the mean
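
The calculation can be checked on a small made-up data set using Python's `statistics` module. Dividing by n gives the population variance described above; for a sample, the usual estimate divides by n - 1 instead, which the module exposes as `variance`.

```python
from statistics import mean, pvariance, pstdev, variance

data = [4, 8, 6, 5, 7]  # hypothetical measurements
m = mean(data)

# Average of squared deviations from the mean, computed by hand
by_hand = sum((x - m) ** 2 for x in data) / len(data)

print("mean:               ", m)
print("variance (by hand): ", by_hand)
print("population variance:", pvariance(data))  # divides by n
print("sample variance:    ", variance(data))   # divides by n - 1
print("population SD:      ", pstdev(data))     # square root of the population variance
```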
72
How do you calculate risk ratio?
RR = risk in treated or exposed group / risk in unexposed or control group
73
How do you calculate the odds ratio?
OR = odds of having been exposed to a risk factor in case group / odds of having been exposed to a risk factor in control group
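
A minimal sketch of the odds ratio calculation for a case-control study follows. The function name and the counts are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
result = odds_ratio(exposed_cases=40, unexposed_cases=60,
                    exposed_controls=20, unexposed_controls=80)
print(f"OR = {result:.2f}")
```

An OR above 1, as here, means exposure was more common among cases than among controls.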
74
What is the absolute risk reduction? (ARR)
- The difference between the event rate in the intervention group and in the control group
- ARR = improvement rate in the intervention group - improvement rate in the control group
75
What is the relative risk reduction? (RRR)
- The proportion by which the intervention REDUCES the event rate
- RRR = ARR / event rate in the control (placebo) group, where the event is no improvement
76
What is the number needed to treat?
The number of patients who need to be treated for one to get benefit
= 1/ARR (or 100/ARR when the ARR is expressed as a percentage)
77
What is a Student's t-test?
- Tests a hypothesis on the basis of a difference between sample means
- i.e. the t test gives the probability of seeing a difference in means at least this large if the two populations were the same with respect to the variable tested
- An example null hypothesis might be that there is no difference in the mean BMI of patients undergoing vaginal and c-section deliveries
78
Describe a chi-squared test
- Chi-squared test of proportions
- Non-parametric test used to compare categorical data (counts or proportions) between groups
- E.g. investigating the proportion of women taking pre-conception folic acid in different socioeconomic groups
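
For the simplest case, two groups and a yes/no outcome, the chi-squared statistic can be computed by hand from a 2x2 table. The sketch below uses the standard shortcut formula without a continuity correction; the p-value expression holds only for 1 degree of freedom. The function name and the counts are hypothetical.

```python
from math import erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Chi-squared test for a 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-squared with 1 degree of freedom
    return chi2, p_value

# Hypothetical data: folic acid taken / not taken in two socioeconomic groups.
#   group 1: 30 yes, 70 no    group 2: 50 yes, 50 no
chi2, p = chi_squared_2x2(30, 70, 50, 50)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

For larger tables or small expected counts, a statistics package would normally be used instead of this shortcut.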
79
Describe ANOVA (analysis of variance)
- ANOVA tests the hypothesis that there is no difference between two or more population means (usually at least 3)
- Can test for differences without increasing the Type I error rate (which can happen if comparing multiple means by conducting multiple t-tests)
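
The inflation of the Type I error rate from repeated t-tests can be estimated as 1 - (1 - alpha)^m for m tests. This sketch assumes the pairwise tests are independent, which is only an approximation when they share groups; the function name is hypothetical.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive)."""
    n_tests = comb(n_groups, 2)  # every pair of groups gets its own t-test
    return n_tests, 1 - (1 - alpha) ** n_tests

for groups in (2, 3, 4, 5):
    tests, error = familywise_error(groups)
    print(f"{groups} groups: {tests:2d} t-tests, familywise error about {error:.2f}")
```

With three groups, three pairwise t-tests at alpha = 0.05 already give roughly a 14% chance of at least one false positive, which is why a single ANOVA is preferred.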