Statistics Flashcards

1
Q

Numerical data types

A

Discrete - whole numbers only

Continuous - any value, scale - weight, height, length

2
Q

What % of a normally distributed population lies within 1 SD of the mean?

A

68%

3
Q

What % of a normally distributed population lies within 2 SD of the mean?

A

95%

4
Q

What % of a normally distributed population lies within 3 SD of the mean?

A

99.7%
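
The 68 / 95 / 99.7 figures can be checked from the normal distribution itself. A minimal sketch, assuming a perfectly normal population; the function name `fraction_within` is only illustrative:

```python
import math

def fraction_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd SDs of the mean."""
    # For a standard normal, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

The exact values are roughly 68.3%, 95.4% and 99.7%, which the cards round to 68%, 95% and 99.7%.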

5
Q

Parametric data
- criteria?
- Advantages

A

Continuous numerical data
Population has normal distribution
Groups being compared have similar variance and SD (homogeneity of variance)

Parametric tests have greater power (when their assumptions are met)

6
Q

Features of non-parametric tests

A

Emphasis on rank
Doesn’t require specific distribution
Less power

7
Q

Central Limit theorem

A

Even for a skewed population, the distribution of sample means is approximately normal when the sample size is large (rule of thumb: N > 30)
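
A small simulation can illustrate the theorem. This is a sketch under stated assumptions: the skewed population is taken to be exponential with mean 1 (not from the cards), and `sample_means` is a hypothetical helper name.

```python
import random
import statistics

def sample_means(n: int, replicates: int, seed: int = 0) -> list[float]:
    """Means of `replicates` samples of size n from a right-skewed exponential(1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replicates)
    ]

means = sample_means(n=30, replicates=5000)
# Theory: sample means centre on the population mean (1.0)
# with SD close to population SD / sqrt(n) = 1 / sqrt(30).
print("mean of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:", round(statistics.stdev(means), 3))
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though the individual values are heavily skewed.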

8
Q

2 groups
Unpaired
Parametric

What tests?

A

Equal variance - Student's t-test

Unequal variance - Welch's t-test
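
The difference between the two tests is how the standard error and degrees of freedom are computed. A minimal sketch using only the standard library; the function names and the example data are illustrative assumptions, and the p value lookup is left out:

```python
import math
import statistics

def student_t(a, b):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, na + nb - 2

def welch_t(a, b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Each group's variance divided by its own sample size.
    va, vb = statistics.variance(a) / na, statistics.variance(b) / nb
    se = math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, df

group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(student_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With equal group sizes the two t statistics coincide; Welch's version reports fewer degrees of freedom when the variances differ, which makes it more conservative.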

9
Q

2 groups
Unpaired
Non-parametric

A

Mann-Whitney U test

10
Q

2 groups
Paired
Parametric

A

Paired t test

11
Q

2 groups
Paired
Non-parametric

A

Wilcoxon signed rank test

12
Q

3 or more groups
Unpaired
Parametric

A

One-way ANOVA

13
Q

3 or more groups
Unpaired
Non-parametric

A

Kruskal-Wallis test

14
Q

3 or more groups
Paired
Parametric

A

One-way repeated measures ANOVA

15
Q

3 or more groups
Paired
Non-parametric

A

Friedman test

16
Q

Test association between 2 qualitative variables

N > 50

A

Chi Squared test
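
The statistic compares observed counts with the counts expected if the two variables were unrelated. A minimal sketch, assuming the data are given as a contingency table (list of rows); the function name and the example table are made up for illustration:

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical 2x2 table: rows = exposed / unexposed, columns = event / no event.
print(chi_squared([[20, 30], [30, 20]]))
```

The statistic is then compared with the chi-squared distribution for the given degrees of freedom. When counts are small, the approximation becomes unreliable and an exact test is preferred.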

17
Q

Test association between 2 qualitative variables

N < 50

A

Fisher's exact test

18
Q

Test linear relationship between 2 variables

Parametric

A

Pearson’s correlation

19
Q

Test relationship (monotonic, not necessarily linear) between 2 variables

Non-parametric

A

Spearman’s rank correlation
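
Spearman's coefficient is Pearson's coefficient applied to the ranks of the data, which is why it does not need a linear relationship. A minimal sketch with hand-written helpers (the names `pearson`, `ranks` and `spearman` are illustrative, and the data are invented):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # perfectly monotonic but not linear
print(pearson(x, y), spearman(x, y))
```

For this curved but strictly increasing data, Spearman's coefficient is 1 while Pearson's is slightly below 1.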

20
Q

Difference between test statistic and p value

A

Test statistic - standardised value used for hypothesis testing

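p value - probability of getting a test statistic at least as extreme as the one observed if the null hypothesis is true (i.e. by chance alone). It is compared with the significance level, which is the accepted type 1 error probability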

21
Q

When to use Z statistic

A

Known population mean + SD

Sample size > 30

Z = z score. Need population mean and sd
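
A worked sketch of the Z statistic for a sample mean, assuming the population mean and SD are known. The numbers and function names are illustrative only:

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Z = (sample mean - population mean) / (population SD / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def two_sided_p(z):
    """Two-sided p value for a Z statistic under the standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical example: population mean 100, SD 15, sample of 36 with mean 103.
z = z_statistic(103, 100, 15, 36)
print(z, two_sided_p(z))
```

Here the standard error is 15 / 6 = 2.5, so Z = 3 / 2.5 = 1.2.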

22
Q

When to use t statistic

A

Population mean and SD unknown

23
Q

When to use F statistic

A

ANOVA

24
Q

Statistical Power
- What is it

A

Probability that the study will detect a predetermined difference between 2 groups, if that difference truly exists
= probability of correctly rejecting the null hypothesis (accepting the alternative hypothesis)

1 - power = chance of false negative = probability of type 2 error

25
Changes that will increase power
Increase sample size; increase significance level (e.g. from 0.05 to 0.1); increase the difference to be detected; reduce SD
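
Each of these levers can be seen in an approximate power formula. A rough sketch, assuming a two-sided comparison of two group means with a normal approximation and equal group sizes; `approx_power` and the example numbers are illustrative, not a full sample-size calculation:

```python
from statistics import NormalDist

def approx_power(diff, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a mean difference `diff` between two groups."""
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5      # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)       # two-sided critical value
    return z.cdf(abs(diff) / se - z_crit)   # ignores the negligible opposite tail

baseline = approx_power(diff=5, sd=10, n_per_group=64)
print("baseline:", round(baseline, 2))
print("larger N:", round(approx_power(5, 10, 128), 2))
print("alpha 0.1:", round(approx_power(5, 10, 64, alpha=0.1), 2))
print("larger difference:", round(approx_power(8, 10, 64), 2))
print("smaller SD:", round(approx_power(5, 8, 64), 2))
```

The baseline comes out at roughly 0.8, and each change listed on the card pushes the power above it.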
26
Deciding significance level
If consequences of a type 1 error are serious - use a small significance level to reduce type 1 error. If consequences of a false negative are high - use a higher significance level to increase power and reduce the chance of type 2 error
27
Drawback of post-hoc analysis
Type 1 error chance increases (selectively looking for positives; the error chance multiplies with each additional comparison)
28
Drawback of trying to mitigate type 1 error risk in post-hoc analysis
Making the total significance level smaller increases the requirement for power - if N is not increased, then type 2 error increases
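
Both the problem and the usual fix can be put into numbers. A minimal sketch, assuming the comparisons are independent and using the Bonferroni correction as one common way to shrink the per-test significance level (the cards do not name a specific correction):

```python
def familywise_error(alpha, k):
    """Chance of at least one false positive across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(alpha, k):
    """Per-test significance level that keeps the overall level at about alpha."""
    return alpha / k

print(familywise_error(0.05, 10))   # about 0.40 with ten uncorrected comparisons
print(bonferroni_alpha(0.05, 10))   # each test is then judged at 0.005
```

Testing each comparison at the stricter level keeps the overall type 1 error near 5%, but each test is less likely to detect a real effect unless the sample size grows.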
29
Event rate is also called?
Absolute risk
30
NNT formula
1/ARR, i.e. 1 divided by absolute risk reduction
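
A worked sketch of the formula, assuming the absolute risk reduction is the control event rate minus the treatment event rate; the function name and the rates are invented for illustration:

```python
def number_needed_to_treat(control_event_rate, treatment_event_rate):
    """NNT = 1 / ARR, where ARR = control event rate - treatment event rate."""
    arr = control_event_rate - treatment_event_rate
    if arr <= 0:
        raise ValueError("no absolute risk reduction, so NNT is not defined")
    return 1 / arr

# Hypothetical trial: 20% of controls and 15% of treated patients have the event.
print(number_needed_to_treat(0.20, 0.15))  # ARR = 0.05, so NNT is about 20
```

In practice the result is usually rounded up to the next whole patient.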
31
Odds calculation
Number of events : number of non-events, e.g. a 20% event rate gives odds of 1:4
32
Formula for probability from odds
Prob = odds/(odds+1)
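
The conversion works in both directions. A minimal sketch with illustrative function names, using the 20% event rate example:

```python
def odds_from_probability(p):
    """Odds = probability of event / probability of no event."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Prob = odds / (odds + 1)."""
    return odds / (odds + 1)

odds = odds_from_probability(0.20)   # 1:4, i.e. about 0.25
print(odds, probability_from_odds(odds))
```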
33
Hazard vs hazard ratio
Hazard - conditional probability of the event at a point in time, given the patient has survived to that point; used in time-to-event analysis. Hazard ratio - ratio of the two groups' hazards; the ratio should remain constant with time
34
Differences between hazard ratio and median ratio
Hazard ratio - odds of survival of one group relative to the other (odds of winning). Median ratio - margin of victory (compares the points in time at which the probability of survival is 0.5 in each arm)
35
Tests used to compare Kaplan-Meier curves
Log-rank test - compares two survival curves. Cox proportional hazards regression model - factors in other variables to explain the hazard
36
Study that starts with a group (cases and controls), then traces backwards to determine exposure
Case Control
37
Study for rare disorders; study for diseases with a long lag time between exposure and outcome
Case control
38
Bias affecting case control
Recall bias; selection bias
39
Start with exposure, then measure whether disease occurred
Cohort. Measure newly occurring disease = prospective. Look back in time at exposure and see if disease developed = retrospective
40
Study for rare risk factor/exposure
Cohort study
41
Study where only individuals who have experienced an event are included
Self controlled case series
42
Sampling to use in a homogeneous population
Simple random sample
43
Sample to use in a heterogeneous population with homogeneous subgroups
Stratified random sample
44
Sample to use for a population that has heterogeneous subgroups which are similar to each other
Cluster sampling, e.g. by geography
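
The three sampling schemes differ only in what gets picked at random. A rough sketch with hypothetical helper names and an invented population; real survey designs involve more detail than this:

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, seed=0):
    """Every individual has the same chance of selection."""
    return random.Random(seed).sample(population, k)

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Sample the same fraction at random from within every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in population:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

def cluster_sample(population, cluster_of, n_clusters, seed=0):
    """Pick whole clusters at random and include everyone in them."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for item in population:
        clusters[cluster_of(item)].append(item)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for c in chosen for item in clusters[c]]

# Invented population of 100 people: (id, age band, region).
people = [(i, "under 65" if i < 80 else "65+", i % 5) for i in range(100)]
by_age = stratified_sample(people, lambda p: p[1], fraction=0.1)
by_region = cluster_sample(people, lambda p: p[2], n_clusters=2)
print(len(simple_random_sample(people, 10)), len(by_age), len(by_region))
```

Stratifying guarantees that the smaller age band is represented in proportion, while cluster sampling trades some precision for the convenience of visiting only two regions.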
45
Screening detects disease earlier, but the disease course is no different. Survival appears longer when it is not
Lead time bias
46
Screening detects earlier (slower-growing) cancers which might not develop into end disease
Length time bias
47
Assume subjects remain in randomised group, regardless of crossover
Intention to treat analysis
48
Limitation of intention to treat analysis
Underestimates treatment effect
49
Analyse patients who strictly adhered to protocol (exclude those who dropped out)
Per protocol analysis. Likely to show exaggerated treatment effect
50
Way to reduce confounders
Randomisation
51
Inaccurate way of selecting patients for a trial, producing a sample that is not representative of the population
Selection bias
52
Investigator knows which arm next patient will receive - can change who is allocated
Allocation bias. Blind the investigator to the allocation (allocation concealment) to reduce
53
Data for study collected so that some members of the population are less likely to be included (e.g. email surveys and older people)
Ascertainment bias = sampling bias
54
Difference in how information is obtained/recorded
Interviewer bias
55
Poor recollection of previous events
Recall bias
56
Lack of response from some patients changing sample characteristics
Non-response bias = transfer bias
57
Participants leave trial/lost to follow up
Attrition bias
58
Differences that occur due to knowledge of intervention allocation
Performance bias
59
Participants report positive effect if they know they are being observed
Hawthorne effect
60
Nocebo
Negative expectations cause the control group to have a more negative effect
61
Rely too heavily on initial piece of information for all subsequent decisions
Anchoring bias
62
Not all research makes it into analysis
Publication bias
63
Establish dosing, pharmacokinetics
Phase 1
64
Establish significant A/E
Phase 2
65
Establish efficacy
Phase 3
66
Establish long-term follow-up and surveillance of adverse events
Phase 4
67
Ways to assess heterogeneity in meta-analysis
How different the results of different studies are: 1) overlap of confidence intervals; 2) I2 statistic - >75% = high heterogeneity
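
The I2 statistic is commonly derived from Cochran's Q, which is not named on the card and is brought in here as an assumption. A minimal sketch, with Q taken as already calculated and an illustrative function name:

```python
def i_squared(q, n_studies):
    """I2 (%) = max(0, (Q - df) / Q) * 100, with df = number of studies - 1."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

print(i_squared(20, 6))  # Q well above df: high heterogeneity
print(i_squared(3, 6))   # Q below df: no heterogeneity beyond chance
```

It expresses the share of the variation between study results that is due to real differences rather than chance.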
68
Random effects model
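Assumes the true effect varies between studies; treats the included studies as a sample of all possible studies, so takes into account other studies that may not have been included in the meta-analysis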
69
Propensity score - What is it? What does it reduce?
Score from 0-1 that looks at other variables predicting whether a patient is assigned to a particular group (e.g. for smoking: age, socioeconomic status). Aims to match groups. Reduces selection bias and confounding
70
When do you use a propensity score? What does it reduce?
When there are no randomised studies. Reduces treatment selection bias due to knowledge of treatment
71
When can allocation bias occur? When can allocation concealment be applied?
Random allocation requires: 1) generating a random sequence - if this is not done, allocation bias can occur; 2) implementing the random sequence so that it is concealed - allocation concealment can be applied here
72
Example of allocation concealment
Sealed opaque envelopes
73
Difference in allocation vs performance bias in terms of timing
Allocation bias occurs before allocation (at randomisation) and is reduced by concealment. Performance bias occurs after randomisation (e.g. in recording of results) due to knowledge of treatment
74
Coefficient of variation
Standard deviation / mean (ratio of SD to mean)
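
A short sketch of the ratio, assuming the sample standard deviation is used; the data and function name are invented for illustration:

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.fmean(values)

weights_kg = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(weights_kg))
```

Because it is unitless, the coefficient allows spread to be compared between measurements on different scales.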
75
I2 is a measure of difference in what between studies?
Variance
76
Formula for variance
(Each value - mean) squared, then take the average of those values
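
The two steps translate directly into code. A minimal sketch of the population form described on the card (dividing by N); the sample variance divides by N - 1 instead. The function name and data are illustrative:

```python
import statistics

def population_variance(values):
    """Average of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))      # squared deviations sum to 32, 32 / 8 = 4
print(statistics.pvariance(data))     # standard library equivalent
print(statistics.variance(data))      # sample variance, 32 / 7
```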