Statistics and studies Flashcards

Question 1

Q

What is selection bias?

Answer

A

Error in assigning individuals to groups leading to differences which may influence the outcome.

Question 2

Q

What is recall bias?

Answer

A

Difference in the accuracy of the recollections retrieved by study participants, possibly depending on whether or not they have the disorder.

This is a particular problem in case-control studies.

Question 3

Q

What is publication bias?

Answer

A

Failure to publish results from valid studies, often as they showed a negative or uninteresting result. Important in meta-analyses where studies showing negative results may be excluded.

Question 4

Q

What is work-up bias?

Answer

A

In studies which compare new diagnostic tests with gold standard tests, work-up bias can be an issue. Sometimes clinicians may be reluctant to order the gold standard test unless the new test is positive, as the gold standard test may be invasive (e.g. tissue biopsy).

Question 5

Q

What is expectation bias?

Answer

A

Only a problem in non-blinded trials. Observers may subconsciously measure or report data in a way that favours the expected study outcome.

Question 6

Q

What is the Hawthorne effect?

Answer

A

Describes a group changing its behaviour due to the knowledge that it is being studied.

Question 7

Q

What is late look bias?

Answer

A

Gathering information at an inappropriate time e.g. studying a fatal disease many years later when some of the patients may have died already

Question 8

Q

What is procedure bias?

Answer

A

Occurs when subjects in different groups receive different treatment

Question 9

Q

What is lead time bias?

Answer

A

Occurs when two tests for a disease are compared: the new test diagnoses the disease earlier, but there is no effect on the outcome of the disease.

Question 10

Q

What are the phases of a clinical trial?

Answer

A

Phase 1 - Determines pharmacokinetics, pharmacodynamics and side-effects prior to larger studies (first in human; normally conducted in healthy volunteers, except in cancer trials)

Phase 2 - Assesses efficacy and dosage
- 2a: Dosing
- 2b: Efficacy

Phase 3 - Assesses effectiveness, normally in an RCT

Phase 4 - Post-marketing surveillance. Monitors for long-term effectiveness and side-effects

Question 11

Q

What do confidence intervals show?

Answer

A

A range of values within which the true effect of an intervention is likely to lie

Question 12

Q

What does the standard error of the mean show?

Answer

A

A measure of the spread expected for the mean of the observations, i.e. how ‘accurate’ the calculated sample mean is as an estimate of the true population mean

Question 13

Q

How is the standard error of the mean calculated?

Answer

A

SEM = SD / square root (n)

where SD = standard deviation and n = sample size
therefore the SEM gets smaller as the sample size (n) increases
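
As a minimal sketch of the formula above, the following Python shows the SEM halving when the sample size is quadrupled. The function name and the SD and sample sizes are illustrative assumptions, not values from any study.

```python
import math

def standard_error_of_mean(sd, n):
    """Return SEM = SD / sqrt(n). Hypothetical helper, for illustration only."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Illustrative values only: the same SD with two different sample sizes.
sem_small = standard_error_of_mean(sd=10.0, n=25)
sem_large = standard_error_of_mean(sd=10.0, n=100)
print(sem_small, sem_large)
```

Quadrupling n from 25 to 100 halves the SEM, because n sits under a square root.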

Question 14

Q

How is the lower limit of the 95% confidence interval calculated?

Answer

A

lower limit = mean - (1.96 * SEM)

Note: if the sample is small (n < 100), use the critical value from the Student's t distribution in place of 1.96

Question 15

Q

How is the upper limit of the 95% confidence interval calculated?

Answer

A

upper limit = mean + (1.96 * SEM)
Note: if the sample is small (n < 100), use the critical value from the Student's t distribution in place of 1.96
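
The lower and upper limits can be computed together. The sketch below is one possible way to do it; the function name, the `critical_value` parameter and the example figures are assumptions made for illustration.

```python
import math

def confidence_interval_95(mean, sd, n, critical_value=1.96):
    """Return (lower, upper) limits as mean -/+ critical_value * SEM.

    critical_value defaults to 1.96 (large samples); for a small sample
    a t-distribution critical value could be passed in instead.
    """
    sem = sd / math.sqrt(n)
    margin = critical_value * sem
    return mean - margin, mean + margin

# Illustrative figures only: mean 120, SD 15, sample of 225.
lower, upper = confidence_interval_95(mean=120.0, sd=15.0, n=225)
print(round(lower, 2), round(upper, 2))
```

Here the SEM is 15 / 15 = 1, so the interval is roughly 118.04 to 121.96.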

Question 16

Q

What is a confounder?

Answer

A

In statistics, a confounder is a variable which correlates with both the exposure and the outcome within a study, leading to a spurious association between them.

Question 17

Q

What is correlation?

Answer

A

Correlation is used to test for association between variables

Question 18

Q

What is regression?

Answer

A

Once correlation has been demonstrated, regression can be performed.

Regression can be used to predict the value of a dependent variable from one or more independent variables.

Question 19

Q

What is nominal data?

Answer

A

Observed values can be put into set categories which have no particular order or hierarchy.

e.g. Birthplace

Question 20

Q

What is ordinal data?

Answer

A

Observed values can be put into set categories which themselves can be ordered (for example NYHA classification of heart failure symptoms)

Question 21

Q

What is discrete data?

Answer

A

Observed values are confined to certain values, usually a finite number of whole numbers (for example the number of asthma exacerbations in a year)

Question 22

Q

What is continuous data?

Answer

A

Data can take any value within a certain range (for example weight)

Question 23

Q

What is binomial data?

Answer

A

Data may take one of two values (for example gender)

Question 24

Q

What is a hazard ratio?

Answer

A

The hazard ratio (HR) is similar to relative risk but is used when risk is not constant with respect to time. It is typically used when analysing survival over time.

Question 25

Q

What is incidence?

Answer

A

The incidence is the number of new cases per population in a given time period.

Question 26

Q

What is the prevalence?

Answer

A

The prevalence is the total number of cases per population at a particular point in time. It can be divided into two types:
- point prevalence = number of cases in a defined population / number of people in a defined population at the same time
- period prevalence = number of identified cases during a specified period of time / total number of people in that population

Question 27

Q

What is intention to treat analysis?

Answer

A

All patients assigned to one arm of an RCT are analysed in that arm, regardless of whether or not they completed the assigned treatment

Question 28

Q

What % of values lie within one standard deviation of the mean in a normal distribution?

Answer

A

68.3% of values lie within 1 SD of the mean

Question 29

Q

What % of values lie within two standard deviations of the mean in a normal distribution?

Answer

A

95.4% of values lie within 2 SD of the mean

Question 30

Q

What % of values lie within three standard deviations of the mean in a normal distribution?

Answer

A

99.7% of values lie within 3 SD of the mean
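
The three percentages for 1, 2 and 3 SD can be checked numerically. This sketch uses the standard identity that the proportion of a normal distribution within k SD of the mean is erf(k / sqrt(2)); the function name is an assumption for illustration.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the percentage of values within 1, 2 and 3 SD of the mean.
for k in (1, 2, 3):
    print(k, round(100 * proportion_within(k), 1))
```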

Question 31

Q

What is the number needed to treat?

Answer

A

Average number of patients who need to be treated for one to benefit, compared with a control in a clinical trial.

Question 32

Q

How is the number needed to treat calculated?

Answer

A

NNT = 1 / (absolute risk reduction), rounded up to the next whole number
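
A short sketch of this calculation, assuming the absolute risk reduction is supplied as a proportion (for example 0.08 for 8%). The function name and the example value are hypothetical.

```python
import math

def number_needed_to_treat(absolute_risk_reduction):
    """Return 1 / ARR, rounded up to the next whole number."""
    if absolute_risk_reduction <= 0:
        raise ValueError("absolute risk reduction must be greater than 0")
    return math.ceil(1 / absolute_risk_reduction)

# Illustrative value only: an absolute risk reduction of 8%.
nnt = number_needed_to_treat(0.08)
print(nnt)
```

With an ARR of 0.08, 1 / 0.08 = 12.5, which rounds up to 13 patients.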

Question 33

Q

What is the experimental event rate?

Answer

A

Experimental event rate (EER) = (Number who had particular outcome with the intervention) / (Total number who had the intervention)

Question 34

Q

What is the control event rate?

Answer

A

Control event rate (CER) = (Number who had a particular outcome with the control) / (Total number who had the control)

Question 35

Q

What is an odds ratio?

Answer

A

Odds are a ratio of the number of people who incur a particular outcome to the number of people who do not incur the outcome.
The odds ratio is the odds of the outcome in one group divided by the odds of the outcome in another group.
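
As a hedged illustration, the odds in each group and their ratio can be computed from simple counts. The function names and the counts below are made up for the example.

```python
def odds(events, non_events):
    """Odds = number with the outcome / number without the outcome."""
    return events / non_events

def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds of the outcome in group A divided by the odds in group B."""
    return odds(events_a, non_events_a) / odds(events_b, non_events_b)

# Hypothetical counts: 30 of 100 have the outcome in group A, 20 of 100 in group B.
ratio = odds_ratio(30, 70, 20, 80)
print(round(ratio, 2))
```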

Question 36

Q

What is relative risk?

Answer

A

Relative risk (RR) is the ratio of risk in the experimental group (experimental event rate, EER) to risk in the control group (control event rate, CER). The term relative risk ratio is sometimes used instead of relative risk.

To recap
EER = rate at which events occur in the experimental group
CER = rate at which events occur in the control group
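
A minimal sketch of the calculation, using hypothetical counts (60 of 100 patients with the outcome in the experimental group, 40 of 100 in the control group). The function names are assumptions for illustration.

```python
def event_rate(events, total):
    """Proportion of a group in which the outcome occurred."""
    return events / total

def relative_risk(eer, cer):
    """Relative risk = experimental event rate / control event rate."""
    return eer / cer

# Hypothetical trial counts, for illustration only.
eer = event_rate(60, 100)
cer = event_rate(40, 100)
rr = relative_risk(eer, cer)
print(round(rr, 2))
```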

Question 37

Q

What does a relative risk > 1 mean?

Answer

A

If the risk ratio is > 1 then the rate of an event (e.g. experiencing significant pain relief) is increased compared to controls.

Question 38

Q

What does a relative risk < 1 mean?

Answer

A

If the risk ratio is < 1 then the rate of an event (e.g. experiencing significant pain relief) is decreased compared to controls.

Question 39

Q

How do you calculate the relative risk increase or relative risk reduction?

Answer

A

Relative risk reduction (RRR) or relative risk increase (RRI) is calculated by dividing the absolute risk change by the control event rate
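
The rule above can be sketched as follows. The function name, the returned label and the example rates are illustrative assumptions; the label is "RRI" when the experimental event rate exceeds the control event rate and "RRR" otherwise.

```python
def relative_risk_change(eer, cer):
    """Return (label, value), where value = absolute risk change / CER."""
    change = abs(eer - cer) / cer
    label = "RRI" if eer > cer else "RRR"
    return label, change

# Hypothetical event rates, for illustration only.
print(relative_risk_change(eer=0.60, cer=0.40))  # event rate raised by the intervention
print(relative_risk_change(eer=0.15, cer=0.20))  # event rate lowered by the intervention
```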

Question 40

Q

What is reliability?

Answer

A

Reliability is used in statistics to imply consistency of a measure.

Question 41

Q

What is validity?

Answer

A

Validity is determined by whether a test accurately measures what it is supposed to measure.

Question 42

Q

What is a true positive?

Answer

A

Disease is present AND Positive test

Question 43

Q

What is false positive?

Answer

A

Disease absent BUT Positive test

Question 44

Q

What is a true negative?

Answer

A

Disease not present AND Negative test

Question 45

Q

What is false negative?

Answer

A

Disease present BUT Negative test

Question 46

Q

What is specificity?

Answer

A

Proportion of patients without the condition who have a negative test result

True negative / (True negative + False Positive)

Question 47

Q

What is sensitivity?

Answer

A

Proportion of patients with the condition who have a positive test result

True positive / (True positive + False negative)

Question 48

Q

What is the positive predictive value?

Answer

A

The chance that the patient has the condition if the diagnostic test is positive

True positive / (True positive + False positive)

Question 49

Q

What is the negative predictive value?

Answer

A

The chance that the patient does not have the condition if the diagnostic test is negative

True negative / (True negative + False negative)
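
The four formulas for sensitivity, specificity, positive predictive value and negative predictive value can be applied to one 2x2 table. This is only a sketch: the function name, the dictionary keys and the counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute test performance measures from the counts of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with disease
        "specificity": tn / (tn + fp),  # negatives among those without disease
        "ppv": tp / (tp + fp),          # disease present given a positive test
        "npv": tn / (tn + fn),          # disease absent given a negative test
    }

# Hypothetical counts: 100 people with the disease, 900 without.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(name, round(value, 3))
```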

Question 50

Q

What are the ten Wilson and Jungner criteria for a screening test?

Answer

A

The condition should be an important public health problem
There should be an acceptable treatment for patients with recognised disease
Facilities for diagnosis and treatment should be available
There should be a recognised latent or early symptomatic stage
The natural history of the condition, including its development from latent to declared disease should be adequately understood
There should be a suitable test or examination
The test or examination should be acceptable to the population
There should be an agreed policy on whom to treat
The cost of case-finding (including diagnosis and subsequent treatment of patients) should be economically balanced in relation to the possible expenditure on medical care as a whole
Case-finding should be a continuous process and not a ‘once and for all’ project

Question 51

Q

What does a null hypothesis state?

Answer

A

A null hypothesis (H0) states that two treatments are equally effective

Question 52

Q

What is the P value?

Answer

A

Probability of obtaining a result by chance at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.

Question 53

Q

What is a normal distribution?

Answer

A

Normal (Gaussian) distributions: mean = median = mode

Question 54

Q

What is a positively skewed distribution (right skewed) ?

Answer

A

Positively skewed distribution: mean > median > mode

Question 55

Q

What is a negatively skewed distribution (left skewed) ?

Answer

A

Negatively skewed distribution mean < median < mode
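
The ordering of mean, median and mode in skewed data can be seen with a small made-up data set. The values below are invented for illustration, and the left-skewed set is simply the mirror image of the right-skewed one.

```python
import statistics

def summarise(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

# Invented data with a long right tail (positively skewed).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the same data, giving a long left tail (negatively skewed).
left_skewed = [15 - x for x in right_skewed]

print(summarise(right_skewed))  # expect mean > median > mode
print(summarise(left_skewed))   # expect mean < median < mode
```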

Question 56

Q

What does a funnel plot demonstrate?

Answer

A

Demonstrates the existence of publication bias in meta-analyses.

A symmetrical, inverted funnel shape indicates that publication bias is unlikely.

An asymmetrical funnel indicates a relationship between treatment effect and study size. This indicates either publication bias or a systematic difference between smaller and larger studies (‘small study effects’).

Question 57

Q

What determines the type of significance testing you can use?

Answer

A

Whether the data are parametric (something which can be measured, usually normally distributed)
Or
Non-parametric

Question 58

Q

When should a Student's paired t-test be done?

Answer

A

Parametric data

Paired data refers to data obtained from a single group of patients, e.g. measurement before and after an intervention.

Question 59

Q

When should a Student's unpaired t-test be done?

Answer

A

Parametric data

Unpaired data comes from two different groups of patients, e.g. comparing response to different interventions in two groups

Question 60

Q

When should a Mann-Whitney U test be done?

Answer

A

Non-parametric
compares ordinal, interval, or ratio scales of unpaired data

Question 61

Q

When should a Wilcoxon signed-rank test be done?

Answer

A

Non-parametric
compares two sets of observations on a single sample, e.g. a ‘before’ and ‘after’ test on the same population following an intervention

Question 62

Q

When should a Chi squared test be done?

Answer

A

Non-parametric
used to compare proportions or percentages e.g. compares the percentage of patients who improved following two different interventions

Question 63

Q

What type of study can be used to estimate the prevalence of a condition?

Answer

A

Cross sectional study

Question 64

Q

What is the relationship between prevalence and incidence?

Answer

A

Prevalence = incidence X duration of disease
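
A brief worked sketch of this relationship, which holds approximately when the disease is in a steady state and prevalence is low. The function name and the figures are assumptions chosen for illustration.

```python
def prevalence_from_incidence(incidence_per_person_year, mean_duration_years):
    """Approximate prevalence as incidence multiplied by disease duration."""
    return incidence_per_person_year * mean_duration_years

# Illustrative figures: 2 new cases per 1,000 people per year, lasting 10 years on average.
prevalence = prevalence_from_incidence(0.002, 10)
print(round(prevalence, 3))
```

With these figures the estimated prevalence is about 2% of the population.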

Answer 65

A

Parametric: Pearson’s correlation coefficient
Non-parametric: Spearman’s rank correlation

Answer 66

A

Superiority
Equivalence
Non-inferiority (only the lower confidence interval needs to lie in range - not the upper)

Answer 67

A

Large sample size needed

Answer 68

A

Equivalence margin is defined (-delta to +delta) on a specified outcome. If the confidence interval of the difference between the two drugs lies within the equivalence margin then the drugs may be assumed to have a similar effect
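
The check described above can be sketched in a few lines. The function name and the example interval and margin are hypothetical; the confidence interval is assumed to have been calculated already.

```python
def within_equivalence_margin(ci_lower, ci_upper, delta):
    """True if the whole confidence interval lies between -delta and +delta."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return -delta <= ci_lower and ci_upper <= delta

# Hypothetical confidence intervals for the difference between two drugs.
print(within_equivalence_margin(-1.2, 0.8, delta=2.0))  # interval inside the margin
print(within_equivalence_margin(-1.2, 2.5, delta=2.0))  # upper limit exceeds +delta
```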

Answer 69

A

The null hypothesis (H0) is rejected when it is true

Answer 70

A

The null hypothesis (H0) is accepted when it is not true

Answer 71

A

The probability of rejecting the null hypothesis, when the null hypothesis is not true

Power = 1 - the probability of a type II error

Answer 72

A

Low sample size needed

Answer 73

A

Ia - evidence from meta-analysis of randomised controlled trials
Ib - evidence from at least one randomised controlled trial
IIa - evidence from at least one well designed controlled trial which is not randomised
IIb - evidence from at least one well designed experimental trial
III - evidence from case, correlation and comparative studies
IV - evidence from a panel of experts

Answer 74

A

ARR = control event rate - experimental event rate

Answer 75

A

SD = square root of variance

Answer 76

A

Standard error of the mean = standard deviation / square root (number of patients)

Answer 77

A

Relative risk reduction = (CER - EER) / CER

Answer 78

A

Odds ratio

Answer 79

A

The number needed to treat to prevent one bad outcome

Answer 80

A

the chance that the patient has the condition if the diagnostic test is positive

Answer 81

A

probability that subjects with a negative screening test don’t have the disease.

Answer 82

A

Confidence intervals