Stats definitions Flashcards

1
Q

Should studies be registered?

A

Yes, all studies should be registered on a publicly accessible database

2
Q

What is the null hypothesis?

A

Hypothesis stating that there is no real difference between the two groups, and any difference is due to chance

3
Q

What is the alternative hypothesis?

A

Hypothesis stating that there is a real difference between the two groups

4
Q

What is bias?

A

Any systematic tendency, other than the experimental intervention, that influences the results of the trial and causes the effect to be over- or underestimated

5
Q

What is blinding?

A

Technique used to reduce bias by hiding the allocated intervention from the patient (single), the patient and clinician (double), or the patient, clinician and data analyst (triple)

6
Q

What is evidence based medicine?

A

Using current best evidence judiciously to make the best decisions for the patient

7
Q

What is efficacy?

A

Performance of an intervention under ideal and controlled circumstances

8
Q

What is effectiveness?

A

Performance of an intervention under real world conditions

Mnemonic: "effective" is the layman's word, so it reflects the real world, while "efficacious" is the scientific term

9
Q

What is alpha?

A

Alpha = probability of rejecting Ho when Ho is actually true = false positive rate = probability of a type 1 error

10
Q

What is alpha usually set as

A

0.05

11
Q

What is beta?

A

Beta = probability of failing to reject Ho when it should have been rejected = false negative rate = probability of a type 2 error

12
Q

What is power (definition and as maths terms)

A

Power is the probability that the study rejects Ho when H1 is true, i.e. detects a real effect (true positive)

Power = 1 - beta

13
Q

What is power usually set as?

A

0.8

14
Q

How can you increase power in a study?

A

By increasing the sample size
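
The link between sample size and power can be sketched numerically. The block below is a minimal illustration, not a replacement for a proper sample size calculation: it assumes a two-sided, two-sample z-test with equal group sizes and a known standard deviation, and the function name `approx_power` and the effect size of 0.5 are hypothetical choices made for the example.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardised difference (difference in means / SD).
    Assumes equal group sizes and a known SD (normal approximation).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    # Expected value of the z statistic when H1 is true
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region under H1
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Power rises as the number of participants per group rises
for n in (20, 64, 100):
    print(n, round(approx_power(0.5, n), 2))
```

With no true effect (`effect_size = 0`) the function returns alpha, which matches the definition of a type 1 error rate.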

15
Q

What is a type 1 error?

A

False positive.

You reject Ho even though Ho was true

16
Q

What are causes of T1 error?

A

Bias
Confounding
Data dredging (cherry picking with multiple hypothesis testing)

17
Q

What is a T2 error?

A

False negative

You fail to reject Ho even though Ho was false

18
Q

What are causes of a T2 error?

A

Sample size too small

Variance is too large.

19
Q

What studies do heterogeneity and homogeneity apply to?

A

Systematic reviews (and meta-analyses)

20
Q

What is heterogeneity?

A

the amount of incompatibility of trials included in the review, whether clinical (studies clinically different) or statistical (results different from each other)

21
Q

What is homogeneity?

A

When studies included in a systematic review are clinically and statistically similar

22
Q

What is incidence v prevalence?

A

Incidence = number of NEW cases occurring in a population over a specific period of time

Prevalence = proportion of the population with the condition at a specific point in time (or over a specified period)

23
Q

What is internal validity?

A

Indicates how well the study backs the conclusion

i.e. the extent to which the study methodology accomplishes what it set out to accomplish

24
Q

What is external validity?

A

the generalisability of the results to non-study population

depends on incl/exclusion criteria, patient demographics

25
Q

What is a confounding variable?

A

A variable which is not the experimental variable but that may affect trial results (i.e. independent factor associated with both exposure and outcome)

26
Q

What are ways to reduce confounding variables?

A

stratification, regression or randomisation

27
Q

What is a confidence interval

A

The range that, if the study were repeated many times, would contain the true population value 95% of the time (for a 95% CI)

28
Q

What is the NNT

A

Number needed to treat

So the number of patients that need to be treated in order for one to benefit

29
Q

What is the ARR

A

Absolute Risk Reduction

So the amount by which your therapy reduces the risk of a bad outcome

So if a drug reduces risk from 50% to 30%, ARR = 0.5-0.3 = 0.2

30
Q

How do you calculate the NNT based on ARR

A

NNT = 1/ARR
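
As a worked sketch of the two formulas, the block below reuses the 50% to 30% example from the ARR card. The function names are hypothetical, and rounding the NNT up to a whole patient is a common convention assumed here rather than something the cards state.

```python
import math


def absolute_risk_reduction(cer: float, eer: float) -> float:
    """ARR = risk in the control arm minus risk in the experimental arm."""
    return cer - eer


def number_needed_to_treat(cer: float, eer: float) -> int:
    """NNT = 1 / ARR, rounded up to a whole patient (assumed convention)."""
    arr = absolute_risk_reduction(cer, eer)
    if arr <= 0:
        raise ValueError("No risk reduction, so the NNT is undefined")
    # Round first to absorb floating-point noise before rounding up
    return math.ceil(round(1 / arr, 9))


arr = absolute_risk_reduction(0.5, 0.3)
nnt = number_needed_to_treat(0.5, 0.3)
print(f"ARR = {arr:.2f}, NNT = {nnt}")  # ARR = 0.20, NNT = 5
```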

31
Q

What is the NNH and what should it be ideally

A

Number Needed to Harm

Ideally as BIG as POSSIBLE.

32
Q

What is the Hazard Rate

A

The probability of the endpoint occurring within a time interval (among those still event-free at its start), divided by the duration of that time interval

33
Q

What is the Hazard ratio and what does the hazard ratio give you that other stats don't?

A

Hazard in the intervention group / Hazard in the control group

It tells you the effect of an intervention on an outcome in the CONTEXT OF TIME

34
Q

What is the absolute risk?

A

Incidence rate of the outcome

= number of participants with the outcome in an arm (control or experimental) / total number of participants in that arm

35
Q

What is the absolute risk reduction (ARR)?

A

AR in control - AR in experimental group

36
Q

What is the Relative Risk?

A

prob (risk) of event in exposed group : prob. of event in non-exposed group

= EER/CER (experimental event rate / control event rate)

37
Q

What is RR used in?

A

In cohort studies - it requires knowing the total number of people at risk (exposed)

So in retrospective case control studies it cannot be calculated

38
Q

What is the Odds ratio?

And what is it used in

A

Ratio of odds of outcome occurring in experimental group vs. control group

Cohort or case control

39
Q

What is the difference between OR and RR?

A

OR = ratio of two odds

RR = ratio of two probabilities
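
The difference is easiest to see on a 2x2 table. The sketch below uses made-up counts (20/100 events in the exposed group, 10/100 in the unexposed group) and hypothetical function names, purely for illustration.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Ratio of two probabilities.

    a, b = events / non-events in the exposed (experimental) group
    c, d = events / non-events in the unexposed (control) group
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Ratio of two odds, using the same 2x2 layout as relative_risk."""
    return (a / b) / (c / d)


# Illustrative counts only: 20 of 100 exposed and 10 of 100 unexposed have the event
rr = relative_risk(20, 80, 10, 90)
odds = odds_ratio(20, 80, 10, 90)
print(f"RR = {rr:.2f}, OR = {odds:.2f}")  # RR = 2.00, OR = 2.25
```

The OR sits further from 1 than the RR here; the two only approximate each other when the outcome is rare.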

40
Q

When would you use RR over OR?

A

RR requires the total number of people at risk in the denominator, so it can only be used when that is known (cohort studies and trials), not in case-control studies

41
Q

What is sensitivity

A

True positives on test / everyone with the disease

42
Q

What is specificity

A

True negatives/ everyone without disease

43
Q

What is PPV

A

True +ve on test / everyone testing +ve on test

44
Q

What is NPV

A

True -ve on test / everyone testing -ve on test

i.e. the proportion of those with a negative result who do not have the disease
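
The four test-accuracy measures can be computed from the same 2x2 table. This is a minimal sketch with invented counts; the function name `diagnostic_metrics` and the numbers are assumptions for illustration only.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives / everyone with the disease
        "specificity": tn / (tn + fp),  # true negatives / everyone without the disease
        "ppv": tp / (tp + fp),          # true positives / everyone testing positive
        "npv": tn / (tn + fn),          # true negatives / everyone testing negative
    }


# Illustrative counts: 100 people with the disease, 900 without
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity divide by columns of disease status, while PPV and NPV divide by rows of test result, which is why the latter two change with prevalence.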

45
Q

What are two key advantages of intention-to-treat analysis?

A

Represents what happens in real life

Ensures maintenance of the comparability between groups obtained through randomisation (maintaining sample size, eliminating bias)

46
Q

What is another important piece of stats data you need when looking at OR/RR?

A

the confidence interval

47
Q

When is the confidence interval for absolute difference and OR/RR indicative of no significant difference, and why?

A

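Absolute difference: when it crosses 0 (a difference of 0 means the two groups are the same)

RR/OR: when it crosses 1 (a ratio of 1 means the risk or odds are equal in the two groups)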
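
A rough sketch of the "crosses 1" check for a relative risk is given below. It assumes the usual large-sample log method for the standard error of ln(RR); the function name and the counts are hypothetical, and real analyses would normally rely on a statistics package.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def rr_confidence_interval(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Return (RR, lower, upper) using the log method (large-sample approximation).

    a, b = events / non-events in the exposed group
    c, d = events / non-events in the unexposed group
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


# Same RR of 2.0 in both tables, but only the larger study excludes 1
for counts in ((20, 80, 10, 90), (40, 160, 20, 180)):
    rr, lower, upper = rr_confidence_interval(*counts)
    verdict = "not significant" if lower <= 1 <= upper else "significant"
    print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}: {verdict}")
```

The point estimate is identical in both tables; the interval narrows with more participants, which is why the CI has to be read alongside the RR or OR.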

48
Q

What does it mean when OR or RR = 1

A

That there was no difference between the two groups

49
Q

What contexts is the HR useful for?

A

Time-to-event analysis, also known as survival analysis

50
Q

What does it mean when we say the HR looks at the context of time?

A

It compares, between the intervention and control groups, the probability that an individual who has not yet had the event would experience it at a particular given time point after the intervention

i.e. at any particular time

51
Q

what is the confidence interval?

A

Range within which the population mean value will lie 95% of the time

= 95% sure the population mean is contained within the CI

52
Q

Why do we need to calculate the CI

A

Because the sample point estimate (mean) will almost always differ from the true population mean
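
A minimal sketch of a 95% CI around a sample mean follows. It assumes a normal approximation (a t-distribution would be more appropriate for a sample this small), and the function name and data are made up for illustration.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level: float = 0.95):
    """Return (lower, upper) for the mean using a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    point_estimate = mean(sample)
    standard_error = stdev(sample) / len(sample) ** 0.5
    return point_estimate - z * standard_error, point_estimate + z * standard_error


# Illustrative measurements only
sample = [4.8, 5.1, 5.0, 4.9, 5.2]
lower, upper = mean_confidence_interval(sample)
print(f"mean = {mean(sample):.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The interval is centred on the sample mean and widens when the data are more variable or the sample is smaller.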

53
Q

What is a per protocol analysis?

A

only data from subjects who complied with trial protocol through to completion are considered

54
Q

what are +s and -s of per protocol study

A

ADVANTAGES: shows true treatment effect (accurate representation of event)

DISADVANTAGES

  • attrition bias - loss of randomisation
  • exclusion bias - excludes patients who have had bad side effects / have failed to improve so stopped taking the drug
55
Q

What is intention to treat analysis

A

all randomised subjects are included in the analysis, regardless of their completion / adherence to study

56
Q

What are +s and -s of intention to treat analysis

A

ADVANTAGES:

  • mirrors real life results (effectiveness > efficacy)
  • ensures maintenance of the comparability between groups obtained through randomisation, maintaining sample size and eliminating bias

DISADVANTAGE:

  • reduces statistical power and may thus fail to demonstrate a real effect

57
Q

what is another name for a TIME TO EVENT CURVE

A

a KAPLAN-MEIER curve

58
Q

What are stat ways to analyse a time to event curve / survival distribution???

A

Cox proportional Hazard
Log-rank
Wilcoxon two-sample test

59
Q

What does each downward step represent on a KAPLAN-MEIER curve?

A

Each downward step represents an event experienced by a patient

60
Q

What does each small vertical tick represent on a KM curve?

A

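Each vertical tick represents a censored observation, i.e. a patient whose follow-up ended before the event was observed (lost to follow-up, death from another cause when death is not the endpoint, or study period ends)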
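
How steps and censoring interact can be sketched with a small hand-rolled Kaplan-Meier estimator. This is an illustrative implementation under simple assumptions (no confidence bands, patients censored at an event time are counted as still at risk at that time); the function name and data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    events[i] is True if the event occurred at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events:
            # Downward step: only events change the survival estimate
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without causing a step
        at_risk -= n_events + n_censored
    return curve


# Five patients: events at t = 2, 3 and 5; censored at t = 3 and 8
for t, s in kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False]):
    print(t, round(s, 2))
```

Censored patients never produce a step, but they shrink the number at risk, so later events cause larger drops.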

61
Q

what is a set of criteria you can use for causality?

A

Bradford Hill criteria

62
Q

What is the Relative Risk Reduction

A

Proportion of risk reduction attributable to an intervention

RRR = (CER-EER)/CER
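
A short sketch contrasting RRR with ARR is shown below. The first pair of risks reuses the 50% to 30% example from the ARR card; the second pair (5% to 3%) and the function name are assumptions added for illustration.

```python
def relative_risk_reduction(cer: float, eer: float) -> float:
    """RRR = (CER - EER) / CER."""
    return (cer - eer) / cer


# Same relative reduction, very different absolute reduction
for cer, eer in ((0.50, 0.30), (0.05, 0.03)):
    rrr = relative_risk_reduction(cer, eer)
    arr = cer - eer
    print(f"CER = {cer:.2f}, EER = {eer:.2f}: RRR = {rrr:.0%}, ARR = {arr:.2f}")
```

Both scenarios give an RRR of 40%, yet the ARR is ten times smaller when the baseline risk is low, which is why the RRR alone can overstate benefit.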

63
Q

What are good outcomes?

A

Outcomes determined A PRIORI

That are patient oriented and meaningful to pt and HC

64
Q

What do the Bradford Hill criteria for assessment of causality include?

A
  • Strength of effect (the larger an association, the more likely it is to be causal)
  • Biological plausibility
  • Coherence (with other studies)
  • Consistency
  • Temporality….
65
Q

What is log-rank used for

A

To statistically compare the survival curves of two groups. Assumes proportional hazards

66
Q

What are proportional hazards

A

The assumption that the HR remains constant over time

67
Q

What is Cox prop hazard model used for

A

Compares two groups but includes other factors (covariates) - similar to multiple regression
Also assumes proportional hazards

68
Q

What model is used to analyse a KM curve when you CANNOT assume proportional hazards?

A

Wilcoxon test - as it gives more weight to deaths/events early on in time

69
Q

What is the benefit of a Wilcoxon test

A

It gives more weight to deaths or events early on in time

So it is good at detecting EARLY differences between the two groups

70
Q

What is a linear regression analysis?

A

Looks at whether there is a linear relationship between dependent variable and independent variables

71
Q

What type of linear regression analysis exist

A

Univariable (1 independent variable only)

Multivariable (>1 independent variable)

72
Q

What number do we look at for linear regression

A

R2

73
Q

What does R2 indicate

A

Coefficient of determination

Measures % of variation in dependent variable that is explained by the variation in the independent variable

= fraction of variation in dependent variable that is explained by the model

So if R2 = 52% in a model of inhaler use against smoking, smoking accounts for 52% of the variation in inhaler use. Other factors need to be considered for the remaining 48%.
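
For a univariable model, R2 can be computed directly from a least-squares fit. The sketch below is illustrative only: the function name and the four data points are made up, and it covers the single-predictor case rather than multivariable regression.

```python
def r_squared(x, y) -> float:
    """Coefficient of determination for a least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the fitted line
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total variation in the dependent variable
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Illustrative data only
print(round(r_squared([1, 2, 3, 4], [1, 3, 2, 4]), 2))  # 0.64
```

Here 64% of the variation in y is explained by x, leaving 36% to other factors or noise.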

74
Q

what are strategies to reduce confounders?

A
  • Randomization (aim is random distribution of confounders between study groups)
  • Restriction (restrict entry to study of individuals with confounding factors - incl/excl criteria, although risks bias in itself)
  • Matching (of individuals or groups, aim for equal distribution of confounders)
  • Stratification (confounders are distributed evenly within each stratum)
  • multivariate analysis (only works if you can identify and measure the confounders)
75
Q

when is logistic regression used?

A

Used to estimate the association of INDEPENDENT VARIABLES to a BINARY DEPENDENT VARIABLE (the outcome)

I.e. the outcome MUST be a YES/NO situation

76
Q

what are methods to deal with missing data in an intention to treat protocol?

A
  • Worst-case scenario
  • Hot deck imputation (fill in missing values from similar subjects with complete records)
  • Last observation carried forward
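
Of the methods listed, last observation carried forward is the simplest to sketch. The block below assumes each participant's visits are stored as a list with `None` marking a missed visit; that data layout, the function name and the readings are hypothetical.

```python
def last_observation_carried_forward(values):
    """Replace each missing value (None) with the most recent observed value.

    Leading missing values stay None because there is nothing to carry forward.
    """
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Illustrative readings for one participant who missed visits 2, 4 and 5
print(last_observation_carried_forward([120, None, 115, None, None]))
# [120, 120, 115, 115, 115]
```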