Statistics Flashcards

1
Q

Numerical data types

A

Discrete - whole numbers only

Continuous - any value on a scale, e.g. weight, height, length

2
Q

What % of a normally distributed population lies within 1 SD of the mean?

A

68%

3
Q

What % of a normally distributed population lies within 2 SD of the mean?

A

95%

4
Q

What % of a normally distributed population lies within 3 SD of the mean?

A

99.7%
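
The 68%, 95% and 99.7% figures can be checked numerically. This is a minimal Python sketch, assuming the population is normally distributed; the function name is illustrative only.

```python
import math

def fraction_within(k):
    """Fraction of a normal population lying within k SDs of the mean."""
    # For a standard normal variable Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

The flashcard values are the usual rounded forms of these fractions.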

5
Q

Parametric data
- criteria?
- Advantages

A

Continuous numerical data
Population has normal distribution
Groups being compared have the same variance and SD (homogeneity of variance)

Parametric tests have greater power

6
Q

Features of non-parametric tests

A

Emphasis on rank
Doesn’t require specific distribution
Less power

7
Q

Central Limit theorem

A

For a skewed population, if sample size N > 30, the distribution of sample means can be assumed to be approximately normal
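
A quick simulation can illustrate the theorem. The sketch below assumes an exponential population (strongly skewed, mean 1, SD 1) and repeatedly draws samples of size 30; the choice of distribution, the seed and the function name are illustrative assumptions.

```python
import random
import statistics

def sample_means(n, repeats, seed=0):
    """Means of `repeats` samples of size n from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]

means = sample_means(n=30, repeats=2000)

# The population has mean 1 and SD 1, so the sample means should centre
# near 1 with a spread of about 1 / sqrt(30), roughly 0.18.
centre = statistics.fmean(means)
spread = statistics.stdev(means)

# Rough normality check: share of sample means within 1 SD of their centre.
share_within_1sd = sum(abs(m - centre) <= spread for m in means) / len(means)
print(round(centre, 2), round(spread, 2), round(share_within_1sd, 2))
```

Even though individual values are skewed, the share of sample means within 1 SD should land near the 68% expected of a normal distribution.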

8
Q

2 groups
Unpaired
Parametric

What tests?

A

Equal variance - Student's t-test

Not equal variance - Welch's t-test
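
The two tests differ only in how the standard error is built. This is a hedged sketch of the two test statistics using the standard textbook formulas; it computes t only (no p value), and the function names are illustrative.

```python
import math
import statistics

def student_t(a, b):
    """t statistic using a pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

def welch_t(a, b):
    """t statistic using separate variances (no equal-variance assumption)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(va / na + vb / nb)
    return (statistics.fmean(a) - statistics.fmean(b)) / se

small_spread = [1, 2, 3]
large_spread = [2, 4, 6, 8, 10]
print(student_t(small_spread, large_spread), welch_t(small_spread, large_spread))
```

When the groups have equal sizes the two statistics coincide; they diverge when group sizes and variances both differ, as in the example data.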

9
Q

2 groups
Unpaired
Non-parametric

A

Mann-Whitney U test

10
Q

2 groups
Paired
Parametric

A

Paired t test

11
Q

2 groups
Paired
Non-parametric

A

Wilcoxon signed rank test

12
Q

3 or more groups
Unpaired
Parametric

A

One way ANOVA

13
Q

3 or more groups
Unpaired
Non-parametric

A

Kruskal-Wallis test

14
Q

3 or more groups
Paired
Parametric

A

One-way repeated measures ANOVA

15
Q

3 or more groups
Paired
Non-parametric

A

Friedman test

16
Q

Test association between 2 qualitative variables

N > 50

A

Chi Squared test

17
Q

Test association between 2 qualitative variables

N < 50

A

Fisher's exact test

18
Q

Test linear relationship between 2 variables

Parametric

A

Pearson’s correlation

19
Q

Test monotonic relationship between 2 variables

Non-parametric

A

Spearman’s rank correlation

20
Q

Difference between test statistic and p value

A

Test statistic - standardised value used for hypothesis testing

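p value - probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis is true. It is compared against the significance level, which is the accepted type 1 error probability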

21
Q

When to use Z statistic

A

Known population mean + SD

Sample size > 30

Z = z score. Needs population mean and SD

22
Q

When to use t statistic

A

Population mean and SD unknown

23
Q

When to use F statistic

A

ANOVA

24
Q

Statistical Power
- What is it

A

Probability that a study will detect a predetermined difference between 2 groups
= probability of correctly rejecting the null hypothesis (accepting the alternative)

1 - power = chance of false negative = probability of type 2 error

25
Q

Changes that will increase power

A

Increase sample size
Increase significance level (e.g. from 0.05 to 0.1)
Increase the difference to be detected

Reduce SD
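
Each of these levers can be tried out in a rough calculation. The sketch below assumes a two-sided, two-sample z-test with equal group sizes and a known common SD (a normal approximation, not an exact t-test power calculation); the function name and example numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_per_group, difference, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test."""
    std_normal = NormalDist()
    standard_error = sd * sqrt(2 / n_per_group)
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # Ignores the very small chance of rejecting in the wrong direction.
    return std_normal.cdf(difference / standard_error - z_critical)

baseline = approx_power(n_per_group=64, difference=0.5, sd=1.0)
bigger_sample = approx_power(n_per_group=128, difference=0.5, sd=1.0)
looser_alpha = approx_power(n_per_group=64, difference=0.5, sd=1.0, alpha=0.1)
bigger_difference = approx_power(n_per_group=64, difference=1.0, sd=1.0)
smaller_sd = approx_power(n_per_group=64, difference=0.5, sd=0.5)

for label, value in [
    ("baseline", baseline),
    ("larger sample", bigger_sample),
    ("alpha 0.1", looser_alpha),
    ("larger difference", bigger_difference),
    ("smaller SD", smaller_sd),
]:
    print(f"{label:18s} {value:.2f}")
```

Under these assumptions the baseline sits at roughly 80% power, and every one of the four changes raises it.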

26
Q

Deciding significance level

A

If consequences of type 1 error are serious - use small significance level, reduce type 1 error

If consequences of false negative are high, use higher significance level, increase power, reduce type 2 error chance

27
Q

Drawback of post-hoc analysis

A

Type 1 error chance increases (selectively looking for positives, and the error risk compounds with each additional comparison)
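
How quickly the risk grows can be sketched with the standard family-wise error formula, 1 - (1 - alpha)^k. This assumes the k comparisons are independent, which real post-hoc analyses often are not; the function name is illustrative.

```python
def familywise_error(alpha, comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons

for k in (1, 5, 10, 20):
    print(f"{k:2d} comparisons at alpha 0.05: {familywise_error(0.05, k):.0%}")
```

With ten independent comparisons at a 0.05 significance level, the chance of at least one false positive is already about 40%.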

28
Q

Drawback of trying to mitigate type 1 error risk in post-hoc analysis

A

Mitigation means using a smaller significance level

This increases the requirement for power - if N is not increased, then type 2 error increases

29
Q

Event rate is also called?

A

Absolute risk

30
Q

NNT formula

A

1/ARR

i.e. 1 divided by absolute risk reduction
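
A worked example of the formula, as a minimal sketch. The event rates are made-up illustrative numbers, and the function name is hypothetical.

```python
def number_needed_to_treat(control_event_rate, treatment_event_rate):
    """NNT = 1 / ARR, where ARR is the absolute risk reduction."""
    arr = control_event_rate - treatment_event_rate
    if arr <= 0:
        raise ValueError("treatment does not reduce the event rate")
    # In practice the result is usually rounded up to a whole patient.
    return 1 / arr

# Illustrative rates: 20% of controls and 15% of treated patients have the event.
nnt = number_needed_to_treat(0.20, 0.15)
print(round(nnt, 1))
```

Here the ARR is 0.05, so about 20 patients need to be treated to prevent one event.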

31
Q

Odds calculation

A

Number of events: number of non events

e.g.
20% event rate
1:4 odds

32
Q

Formula for probability from odds

A

Prob = odds/(odds+1)
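
The conversion in both directions can be sketched in a few lines, using the 20% event rate example from the odds card. Function names are illustrative.

```python
def odds_from_probability(p):
    """Odds = events : non-events = p / (1 - p)."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Prob = odds / (odds + 1)."""
    return odds / (odds + 1)

odds = odds_from_probability(0.20)        # 1:4 odds, i.e. 0.25
back_to_probability = probability_from_odds(odds)
print(round(odds, 2), round(back_to_probability, 2))
```

A 20% event rate gives odds of 0.25 (1:4), and converting back returns the original 20%.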

33
Q

Hazard vs hazard ratio

A

Hazard - conditional probability of event given patient has survived to that point in time. Used in time-to-event analysis

Hazard ratio - ratio of two groups' hazards. Ratio should remain constant with time

34
Q

Differences between hazard ratio and median ratio

A

Hazard ratio - odds of survival of one group to the other (odds on winning)

Median ratio - margin of victory (compares the points in time where probability of survival is 0.5 in each arm)

35
Q

Tests used to compare Kaplan-Meier curves

A

Log rank test - compares two survival curves

Cox proportional hazards regression model - factors in other variables to explain the hazard

36
Q

Study type?
Start with groups defined by outcome (cases and controls)
Trace backwards and determine exposure

A

Case Control

37
Q

Study for rare disorders

Study for disease with long lag time between exposure and outcome

A

Case control

38
Q

Bias affecting case control

A

Recall bias

Selection bias

39
Q

Start with exposure
Measure if disease occurred

A

Cohort

Measure newly occurring disease = prospective

Look back in time at exposure, see if disease developed = retrospective

40
Q

Study for rare risk factor/exposure

A

Cohort study

41
Q

Study where only individuals who have experienced an event are included

A

Self controlled case series

42
Q

Sampling to use in homogeneous population

A

Simple random sample

43
Q

Sample to use in heterogeneous population with homogeneous subgroups

A

Stratified random sample

44
Q

Sample to use for population that has heterogeneous subgroups, which are similar to each other

A

Cluster sampling

e.g. geography

45
Q

Screening detects disease earlier, but disease course no different

Survival appears longer when it is not

A

Lead time bias

46
Q

Screening detects earlier cancers which might not develop into end disease

A

Length time bias

47
Q

Assume subjects remain in randomised group, regardless of crossover

A

Intention to treat analysis

48
Q

Limitation of intention to treat analysis

A

Can underestimate treatment effect

49
Q

Analyse patients who strictly adhered to protocol
(exclude those who dropped out)

A

Per protocol analysis

Likely to show exaggerated treatment effect

50
Q

Way to reduce confounders

A

Randomisation

51
Q

Inaccurate way to select patients for trial

Produce sample that is not representative of population

A

Selection bias

52
Q

Investigator knows which arm next patient will receive - can change who is allocated

A

Allocation bias

Conceal allocation (blind the investigator to the sequence) to reduce

53
Q

Data for study collected so that some members of population are less likely to be included (e.g. email surveys and older people)

A

Ascertainment bias = sampling bias

54
Q

Difference in how information is obtained/recorded

A

Interviewer bias

55
Q

Poor recollection of previous events

A

Recall bias

56
Q

Lack of response from some patients changing sample characteristics

A

Non-response bias = transfer bias

57
Q

Participants leave trial/lost to follow up

A

Attrition bias

58
Q

Differences that occur due to knowledge of intervention allocation

A

Performance bias

59
Q

Participants report positive effect if they know they are being observed

A

Hawthorne effect

60
Q

Nocebo

A

Negative expectations cause the control group to experience more negative effects

61
Q

Rely too heavily on initial piece of information for all subsequent decisions

A

Anchoring bias

62
Q

Not all research makes it into analysis

A

Publication bias

63
Q

Establish dosing, pharmacokinetics

A

Phase 1

64
Q

Establish significant A/E

A

Phase 2

65
Q

Establish efficacy

A

Phase 3

66
Q

Establish long term f/u and surveillance of A/E

A

Phase 4

67
Q

Ways to assess heterogeneity in meta-analysis

A

How different the results of different studies are

1) Overlap of confidence intervals

2) I2 statistic - >75% = high heterogeneity
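
I2 is commonly computed from Cochran's Q statistic as I2 = max(0, (Q - df) / Q) x 100%, where df is the number of studies minus 1. Q is not covered on these cards, so the sketch below assumes it has already been calculated; the function name and input values are illustrative.

```python
def i_squared(q, n_studies):
    """I2 (%) from Cochran's Q and the number of studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100

print(i_squared(q=20.0, n_studies=5))   # Q well above df
print(i_squared(q=3.0, n_studies=5))    # Q below df
```

When Q is no larger than its degrees of freedom, the observed spread between studies is no more than chance would predict and I2 is reported as 0%.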

68
Q

Random effects model

A

Assumes the true effect varies between studies, treating the included studies as a sample of all possible studies (including ones not in the meta-analysis)

69
Q

Propensity score
- What is it
- What does it reduce

A

Score from 0-1: the probability, based on other variables, that a patient is assigned to a particular group (e.g. smoking -> age, socioeconomic status)

Aims to match groups

Reduces selection bias and confounding

70
Q

When do you use a propensity score?
What does it reduce?

A

No randomised studies

Reduces treatment selection bias due to knowledge of treatment

71
Q

When can allocation bias occur?

When can allocation concealment be applied

A

Random allocation requires
- Generating a random sequence - if this is not done, it can lead to allocation bias
- Implementing the random sequence so that it is concealed - allocation concealment is applied here

72
Q

Example of allocation concealment

A

envelopes

73
Q

Difference in allocation vs performance bias in terms of timing

A

Allocation bias occurs before allocation
- randomisation
- concealment

Performance occurs after randomisation
- recording of results due to knowledge of treatment

74
Q

Coefficient of variation

A

Ratio of standard deviation to mean (SD / mean)
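
A short worked example, as a sketch with made-up data. It uses the population SD (`statistics.pstdev`); the sample SD (`statistics.stdev`) could be substituted when the data are a sample.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data: mean 5, population SD 2

coefficient_of_variation = statistics.pstdev(values) / statistics.fmean(values)
print(f"{coefficient_of_variation:.0%}")
```

Because the SD is divided by the mean, the result is unitless and can be compared across measurements on different scales.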

75
Q

I2 is a measure of difference in what between studies?

A

Variance

76
Q

Formula for variance

A

For each value, subtract the mean and square the result

Variance = average of these squared differences
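
The two steps can be written out directly and checked against the standard library. The data are made up, and this sketch computes the population variance (dividing by n); the sample variance divides by n - 1 instead.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data

mean = sum(values) / len(values)
squared_differences = [(x - mean) ** 2 for x in values]
variance = sum(squared_differences) / len(values)

# Cross-check against the standard library's population variance.
library_variance = statistics.pvariance(values)
print(variance, library_variance)
```

The SD is the square root of this value, which returns the spread to the original units.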