Statistics Flashcards

1
Q

Pearson coefficient

A

Measures the strength and direction of the linear relationship between two continuous variables, ie linear correlation. Ranges from -1 to +1
0 = no linear relationship
Between 0 and 1 = positive linear relationship; between -1 and 0 = negative linear relationship
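
Worked example: a minimal Python sketch of how r can be computed from two lists of numbers. The function name `pearson_r` and the data are made up for illustration and are not part of the cards.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases in step with hours
falling = [10, 8, 6, 4, 2]   # decreases in step with hours

print(pearson_r(hours, rising))   # close to +1 (perfect positive)
print(pearson_r(hours, falling))  # close to -1 (perfect negative)
```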

2
Q

Kappa coefficient

A

Cohen’s kappa coefficient is a statistic that is used to measure inter-rater reliability for qualitative items. It is generally thought to be a more robust measure than simple percent agreement calculation, as κ takes into account the possibility of the agreement occurring by chance.
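
Worked example: a rough Python sketch of the usual kappa formula, κ = (observed agreement - chance agreement) / (1 - chance agreement). The function name `cohens_kappa` and the ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Made-up ratings: the raters agree on 8 of 10 items
rater_a = ["yes"] * 5 + ["no"] * 5
rater_b = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))  # 80% raw agreement, 50% expected by chance
```

Here percent agreement is 0.8 but kappa is lower, because half of that agreement would be expected by chance alone.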

3
Q

Linear regression test

A

Looks at a hypothesised cause-and-effect relationship (regression by itself does not prove causation)
Estimates the effect of one CONTINUOUS variable on another. Tries to determine a specific mathematical equation to describe the relationship (line of best fit)
Simple: one continuous IV and one continuous DV, eg effect of income on longevity
Multiple: 2 or more continuous IVs and one continuous DV, eg effect of income and minutes of exercise per day on longevity
Logistic regression: continuous IV and binary DV, eg what is the effect of drug dosage on survival
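
Worked example: a minimal sketch of simple linear regression by ordinary least squares, assuming one IV and one DV. The function name `fit_line` and the income/longevity figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: income in thousands, longevity in years
income = [20, 40, 60, 80]
longevity = [70, 74, 78, 82]

slope, intercept = fit_line(income, longevity)
print(slope, intercept)  # about 0.2 extra years per extra thousand, starting from 66
```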

4
Q

What do ANOVA and t tests have in common?

A

Parametric
Compare differences between group means
Test the effect of a categorical variable on a quantitative DV
ANOVA: one or more categorical IVs (a one-way ANOVA has one IV with 3+ groups), one quantitative DV
MANOVA: one or more categorical IVs and 2+ quantitative DVs, eg what is the effect of flower species on petal length, petal width, and stem length?
Repeated measures ANOVA compares the same group at various time points

5
Q

Correlation tests

A

Check whether variables are related without hypothesizing a cause-and-effect relationship, ie if you know one, can you predict the other?

Pearson's r: 2 continuous variables, eg how are latitude and temperature related?
Spearman's r (rank correlation): 2 ranked/ordinal variables

6
Q

Chi squared test

A

Chi square test of independence: tests if 2 categorical variables are related to each other
eg is the species of flower related to petal size (as a category, eg small/large)?
eg are there more sporting injuries in basketball compared to netball (compares proportions of people who are injured)?
Chi square goodness of fit test: tests whether observed frequencies are significantly different to what was expected (eg equal frequencies/proportions). Null hypothesis would be that there is no difference between the observed and expected proportions in each category
Fisher's exact test: like chi squared, but used when the expected count is <5 in one or more cells of the data set
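
Worked example: a minimal sketch of the chi square statistic for a test of independence, computed as the sum of (observed - expected)² / expected over all cells, with no continuity correction. The function name `chi_square_statistic` and the injury counts are invented for illustration.

```python
def chi_square_statistic(table):
    """Chi square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Made-up counts: rows = basketball, netball; columns = injured, not injured
table = [[30, 70],
         [20, 80]]

stat = chi_square_statistic(table)
print(round(stat, 3))
```

With 1 degree of freedom the critical value at the 0.05 level is 3.841, so a statistic below that would not reject the null hypothesis for these made-up counts.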

7
Q

Kruskal Wallis test

A

Non-parametric version of ANOVA
3+ groups (categories) and one quantitative outcome variable

8
Q

Wilcoxon signed rank test

A

non parametric version of paired t test

9
Q

Mann-Whitney U test

A

Non-parametric version of the independent t test

10
Q

Bonferroni correction

A

Post hoc test. The Bonferroni correction is a multiple-comparison correction used when several dependent or independent statistical tests are being performed simultaneously
If there are more than 2 groups in a variable and the null hypothesis is rejected with the first statistical test, the pairwise comparisons that follow need a Bonferroni correction to figure out which 2 groups are significantly different from each other. A Bonferroni correction is when you divide your original significance level (usually .05) by the number of tests you are performing
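
Worked example: a minimal sketch of the division described above. The function name `bonferroni` and the p-values are invented for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and which tests stay significant."""
    # Divide the original significance level by the number of tests
    threshold = alpha / len(p_values)
    significant = [p < threshold for p in p_values]
    return threshold, significant

# Made-up p-values from three pairwise comparisons (A vs B, A vs C, B vs C)
p_values = [0.004, 0.03, 0.20]

threshold, significant = bonferroni(p_values)
print(round(threshold, 4))  # 0.05 / 3
print(significant)
```

Note that 0.03 would pass an uncorrected 0.05 threshold but does not pass the corrected one.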

11
Q

Absolute risk

A

the number of events in a group, divided by the number of people in that group

12
Q

ARR (absolute risk reduction, aka attributable risk, risk difference)

A

Absolute risk in control group - absolute risk in treatment group

13
Q

relative risk

A

absolute risk in treatment/ absolute risk in control

14
Q

relative risk reduction

A

Risk difference/ absolute risk in control
(ARC – ART) / ARC

15
Q

relative risk reduction

A

1- relative risk
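
Worked example: a minimal sketch tying together absolute risk, ARR, relative risk and RRR from the preceding cards. The function name `risk_measures` and the trial counts are invented for illustration.

```python
def risk_measures(events_control, n_control, events_treatment, n_treatment):
    """Absolute and relative risk measures for a two-arm trial."""
    arc = events_control / n_control        # absolute risk in control group
    art = events_treatment / n_treatment    # absolute risk in treatment group
    arr = arc - art                         # absolute risk reduction
    rr = art / arc                          # relative risk
    rrr = arr / arc                         # relative risk reduction
    return {"ARC": arc, "ART": art, "ARR": arr, "RR": rr, "RRR": rrr}

# Made-up trial: 20 of 100 controls and 10 of 100 treated patients have the event
m = risk_measures(20, 100, 10, 100)
print(m)
# Both RRR definitions on the cards agree: (ARC - ART) / ARC equals 1 - RR
print(abs(m["RRR"] - (1 - m["RR"])) < 1e-12)
```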

16
Q

Odds ratio

A

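Odds = WITH/WITHOUT

= probability of outcome occurring / probability of outcome not occurring

Odds ratio = odds that a case was exposed / odds that a control was exposed
= (A/C) / (B/D)
= cross product = AD/BC
(2x2 table: A = exposed cases, B = exposed controls, C = unexposed cases, D = unexposed controls)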
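
Worked example: a minimal sketch showing that the two forms of the odds ratio give the same number. It assumes the usual 2x2 layout (A = exposed cases, B = exposed controls, C = unexposed cases, D = unexposed controls); the function names and counts are invented for illustration.

```python
def odds_ratio_cross_product(a, b, c, d):
    """Odds ratio as the cross product AD/BC."""
    return (a * d) / (b * c)

def odds_ratio_from_odds(a, b, c, d):
    """Odds ratio as (odds a case was exposed) / (odds a control was exposed)."""
    odds_case_exposed = a / c
    odds_control_exposed = b / d
    return odds_case_exposed / odds_control_exposed

# Made-up case-control counts
a, b, c, d = 30, 10, 70, 90

or_cross = odds_ratio_cross_product(a, b, c, d)
or_odds = odds_ratio_from_odds(a, b, c, d)
print(round(or_cross, 3), round(or_odds, 3))
```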

17
Q

Prevalence

A

PREVALENCE = all cases / total population
Prevalence depends on: incidence, recovery rate, and death rate (ie influenced by both the rate at which new cases are occurring and the average duration of the disease)
Prevalence ≈ (Incidence Rate) x (Average Duration of Disease), an approximation that holds when prevalence is low and stable
Point prevalence- at a specific moment in time
Period prevalence- over a specific period of time

18
Q

incidence

A

INCIDENCE = new cases per time period / population at risk
Population at risk = total population who can get the disease minus those who already have the disease
Incidence reflects the rate at which new cases of disease are being added to the population (and becoming prevalent cases).
Incidence rate: new cases per unit of person-time at risk
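
Worked example: a minimal sketch of the prevalence and incidence formulas on the last two cards. The function names and the population figures are invented for illustration.

```python
def prevalence(all_cases, total_population):
    """All existing cases divided by the total population."""
    return all_cases / total_population

def incidence(new_cases, total_population, existing_cases):
    """New cases in the period divided by the population at risk."""
    # People who already have the disease cannot become new cases
    at_risk = total_population - existing_cases
    return new_cases / at_risk

# Made-up town: 1000 people, 50 already have the disease, 19 new cases this year
prev = prevalence(50, 1000)
inc = incidence(19, 1000, 50)
print(prev, round(inc, 3))  # 5% prevalence, 2% one-year incidence
```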

19
Q

standard deviation

A

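Measures variation/dispersion of a dataset relative to the mean
68-95-99.7 rule: in a normal distribution about 68%, 95% and 99.7% of values lie within 1, 2 and 3 SD of the mean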
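
Worked example: a short sketch that checks the 68-95-99.7 figures against the standard normal distribution using Python's `statistics.NormalDist`. The helper name `share_within` is invented for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```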

20
Q

confidence interval

A

The 95% confidence interval is a range of values that you can be 95% confident contains the true mean of the population.
To calculate the confidence interval, start by computing the mean and standard error of the sample.
The narrower the interval (upper and lower values), the more precise our estimate.
As a general rule, as the sample size increases the confidence interval should become narrower.
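
Worked example: a minimal sketch of a 95% confidence interval for a mean, assuming the common normal approximation (mean ± 1.96 × standard error, with standard error = sample SD / √n). For small samples a t-distribution multiplier is normally used instead of 1.96. The function name `mean_ci_95` and the data are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci_95(sample):
    """Approximate 95% CI for the mean; returns (lower, upper, standard_error)."""
    m = mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n
    se = stdev(sample) / sqrt(len(sample))
    margin = 1.96 * se  # assumption: normal approximation
    return m - margin, m + margin, se

# Made-up sample with mean 5
sample = [2, 4, 4, 4, 5, 5, 7, 9]

lower, upper, se = mean_ci_95(sample)
print(round(lower, 2), round(upper, 2), round(se, 3))
```

Because the standard error has √n in the denominator, a larger sample with the same spread gives a narrower interval.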

21
Q

Case control study

A

A case control study starts with those who have the disease (cases) and looks backwards to see whether they had the past exposure in question, compared with controls who do not have the disease
- Efficient design for the study of RARE diseases
- Requires fewer subjects than other study designs
- Best design for diseases with long latent periods
- Can evaluate multiple possible/potential exposures

22
Q

Type 1 error

A

False positive (incorrectly rejects a true null hypothesis)
Pr(type 1 error) = alpha
The alpha level (α) is the p-value threshold below which you reject the null hypothesis. An alpha of 0.05 means you are willing to accept a 5% chance of rejecting the null hypothesis when it is actually true.
Can reduce the risk of a type 1 error by using a lower alpha, eg α = 0.01 means a 1% chance of a type 1 error

23
Q

Type 2 error

A

False negative (fails to reject a false null hypothesis)
ie saying there is no effect when there is one
· The probability of making a type II error = Beta (β), and this is related to the power of the statistical test (power = 1- β). You can decrease your risk of committing a type II error by increasing the power of the test.
· Power is increased by increasing sample size

24
Q

internal validity

A

Internal validity: the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

Supported by: sound study design, minimal systematic bias, allocation concealment, randomisation, blinding, an appropriate comparator, intention-to-treat analysis

25
Q

external validity

A

External validity: the validity of applying the conclusions of a scientific study outside the context of that study. In other words, it is the extent to which the results of a study can be generalized to and across other situations, people, stimuli, and times

26
Q

Left skew/negative skew

A

Tail on the left: mean < median < mode (mode = most frequent value = peak)

27
Q

Right skew/positive skew

A

Tail on the right: mode (peak) < median < mean
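
Worked example: a small sketch showing the typical ordering of mean, median and mode for the two skew cards, using Python's `statistics` module. The data sets are invented for illustration; real skewed data usually, but not always, follows this ordering.

```python
from statistics import mean, median, mode

# Made-up right-skewed data: most values are small, with a long tail to the right
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data, giving a long tail to the left
left_skewed = [16 - x for x in right_skewed]

for name, data in (("right", right_skewed), ("left", left_skewed)):
    print(name, "mode:", mode(data), "median:", median(data), "mean:", mean(data))
```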

28
Q

sensitivity

A

Ability to detect disease
A sensitive test, when negative, rules disease out
True positives / all those with disease = TP / (TP + FN)

29
Q

specificity

A

Ability to detect those without disease
A specific test, when positive, rules disease in
True negatives / all those without disease = TN / (TN + FP)

30
Q

Positive predictive value

A

Likelihood of having disease when the test is positive
True positives / all positive tests = TP / (TP + FP)

31
Q

Negative predictive value

A

Likelihood of not having disease when the test is negative
True negatives / all negative tests = TN / (TN + FN)

32
Q

Positive Likelihood Ratio

A

How likely a positive test is in a patient with disease compared with a patient without disease
sensitivity / (1 - specificity)

33
Q

Negative Likelihood Ratio

A

How likely a negative test is in a patient with disease compared with a patient without disease
(1 - sensitivity) / specificity
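
Worked example: a minimal sketch computing the diagnostic test measures from the last six cards from one 2x2 table. The function name `diagnostic_measures` and the counts are invented for illustration.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Diagnostic accuracy measures from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)   # true positives / all with disease
    specificity = tn / (tn + fp)   # true negatives / all without disease
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PPV": tp / (tp + fp),     # disease present given a positive test
        "NPV": tn / (tn + fn),     # disease absent given a negative test
        "LR+": sensitivity / (1 - specificity),
        "LR-": (1 - sensitivity) / specificity,
    }

# Made-up study: 100 people with disease, 200 without
d = diagnostic_measures(tp=90, fp=20, fn=10, tn=180)
for name, value in d.items():
    print(name, round(value, 3))
```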

34
Q

Number needed to treat

A

1/ absolute risk reduction
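
Worked example: a minimal sketch of NNT from trial counts. Rounding up to the next whole patient is a common convention and is an assumption here, not something stated on the card. The function name and counts are invented for illustration.

```python
from fractions import Fraction
from math import ceil

def number_needed_to_treat(events_control, n_control, events_treatment, n_treatment):
    """NNT = 1 / ARR, rounded up to a whole number of patients (assumed convention)."""
    # Exact fractions avoid floating-point error before the rounding step
    arr = Fraction(events_control, n_control) - Fraction(events_treatment, n_treatment)
    if arr <= 0:
        raise ValueError("No absolute risk reduction, so NNT is not defined here")
    return ceil(1 / arr)

# Made-up trial: 12 of 100 controls and 9 of 100 treated patients have the event
nnt = number_needed_to_treat(12, 100, 9, 100)
print(nnt)  # ARR = 3/100, 1/ARR = 33.3, rounded up
```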

35
Q

Clinical trials

A

Preclinical

· In vitro/ animal

Phase 0/ Pilot

· Preliminary pharmacokinetics/pharmacodynamic data
· Micro dosing /subtherapeutic dosing

· Very small

Phase I

· Safety
· Dosage, side effects
· Further PK/PD information

· Small groups (<100)
· Healthy volunteers

Phase II

· Safety and Efficacy
· Dose requirements/dose response

· Larger groups, several hundred (100-300)
· Case series/ small RCT

Phase III

· Efficacy compared to current standard treatment
· Several hundred to thousands (300-5000)
· Individuals with disease
· >1 RCT usually needed

Phase IV

· Continued pharmacovigilance/post-marketing surveillance
· Cost-effectiveness
· Longer term / rare effects
· After marketing

· Effectiveness in general population

36
Q

Standard error

A

Standard error measures the amount of variability in the sample mean (SE = SD / √n); it indicates how closely the population mean is likely to be estimated by the sample mean

37
Q

Bias best avoided by

A

Randomisation
Blinding
intention to treat analysis

38
Q

confounding best avoided by

A

randomisation
matching on variables eg sex, age

39
Q

magnitude of effect in various studies

A

· Case control = odds ratio
· Cohort = relative risk
· RCT
o Absolute risk difference
o Relative risk difference
o NNT

40
Q

Pre test probability

A

Prevalence

Those with disease / population

41
Q

when does the OR approximate the RR

A

When the condition is rare (low prevalence), because the odds and the risk are then almost equal
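
Worked example: a small sketch comparing OR and RR for a rare and a common outcome, using cohort-style counts so both can be calculated. The function name `rr_and_or` and all counts are invented for illustration.

```python
def rr_and_or(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Relative risk and odds ratio from cohort-style counts."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    # Odds = with the outcome / without the outcome
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return risk_exposed / risk_unexposed, odds_exposed / odds_unexposed

# Rare outcome: 2 per 1000 exposed vs 1 per 1000 unexposed
rare_rr, rare_or = rr_and_or(2, 1000, 1, 1000)
# Common outcome: 400 per 1000 exposed vs 200 per 1000 unexposed
common_rr, common_or = rr_and_or(400, 1000, 200, 1000)

print(round(rare_rr, 3), round(rare_or, 3))      # nearly identical
print(round(common_rr, 3), round(common_or, 3))  # OR overstates the RR
```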