Stats Flashcards
Significance level
The probability of rejecting the null hypothesis given that it is true (a type I error)
AKA alpha
Power
Probability that a test will reject a false null hypothesis
Factors influencing power:
- Sample size
- Standard deviation
- Effect size
- Alpha
Power = 1 - beta, where beta is the probability of a type II error
Increasing the effect size raises power faster than increasing the sample size, because standard errors of estimation decrease only with the square root of the sample size
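A rough sketch of how these factors interact, using a normal approximation to the power of a two-sided, two-sample z-test. It assumes equal group sizes and a known common standard deviation; the function name `approx_power` and the example numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test (equal n, known sd)."""
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)      # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)    # critical value for the chosen alpha
    shift = abs(effect) / se             # true difference in standard-error units
    # Probability that the test statistic lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

base = approx_power(effect=5, sd=10, n_per_group=50)
double_effect = approx_power(effect=10, sd=10, n_per_group=50)
double_n = approx_power(effect=5, sd=10, n_per_group=100)
print(round(base, 3), round(double_n, 3), round(double_effect, 3))
```

In this sketch, doubling the effect size raises power more than doubling the sample size, because the shift grows linearly with the effect but only with the square root of n.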
Effect size
Effect size is a quantitative measure of the strength of a phenomenon
Examples of effect sizes are:
- correlation between two variables
- the regression coefficient in a regression
- the difference between the two means
- the risk with which something happens
Regression coefficient
The constant that represents the rate of change of one variable as a function of changes in the other
It is the slope a in y = ax + b
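A minimal sketch of estimating the coefficient a (and the intercept b) by ordinary least squares. The helper name `least_squares_line` and the data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]          # constructed to lie exactly on y = 2x + 1
a, b = least_squares_line(xs, ys)
print(a, b)
```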
Hazard ratio
The ratio of the hazard rate (the instantaneous rate of a harmful event) in the experimental arm to the hazard rate in the comparator arm
It is a measure at a specific time point, whereas relative risk is cumulative over a period of time.
Odds ratio
Represents the odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure.
OR = (exposed with outcome / unexposed with outcome) divided by (exposed without outcome / unexposed without outcome)
OR=1 Exposure does not affect odds of outcome
OR>1 Exposure associated with higher odds of outcome
OR<1 Exposure associated with lower odds of outcome
Used for case-control studies
Applies only to sample tested, not overall population
Not a measure of probability, unlike relative risk
In rare diseases, odds ratio approximates to relative risk
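A small sketch of the odds ratio and relative risk computed from a 2x2 table, to show the rare-disease approximation. The cell labels a, b, c, d and the counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

rare = (20, 980, 10, 990)       # outcome in 2% of exposed, 1% of unexposed
common = (600, 400, 300, 700)   # outcome in 60% of exposed, 30% of unexposed

print(odds_ratio(*rare), relative_risk(*rare))
print(odds_ratio(*common), relative_risk(*common))
```

With the rare outcome the two measures are close (about 2.02 and 2.0); with the common outcome the odds ratio (3.5) overstates the relative risk (2.0).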
Relative risk
Probability of an event when exposed divided by probability of event when not exposed
(Events in exposed / all exposed) divided by (events in unexposed / all unexposed)
P-value
Probability of obtaining a result equal to or “more extreme” than what was actually observed, when the null hypothesis is true
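Differs from alpha, which is the significance level chosen before the test; the null hypothesis is rejected when the p-value is below alpha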
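As a worked illustration, the sketch below computes a one-sided p-value for a coin-flip experiment under the null hypothesis that the coin is fair. The function name `binomial_tail_p` and the scenario are illustrative assumptions.

```python
from math import comb

def binomial_tail_p(k, n, p0=0.5):
    """P(X >= k) for X ~ Binomial(n, p0): a one-sided p-value."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Observed: 9 heads in 10 flips. "As or more extreme" means 9 or 10 heads.
p_value = binomial_tail_p(9, 10)
alpha = 0.05                      # significance level, fixed in advance
reject_null = p_value < alpha
print(p_value, reject_null)
```

Here the p-value is 11/1024, which is below the chosen alpha.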
Absolute risk reduction
The difference in risk of an outcome with and without intervention
Relative risk reduction
Absolute risk reduction divided by control event rate
- Takes into account control rate of event
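A short sketch of both reductions computed from event counts. The function name and the trial numbers are hypothetical.

```python
def risk_reductions(control_events, control_total, treated_events, treated_total):
    """Return (absolute risk reduction, relative risk reduction)."""
    cer = control_events / control_total    # control event rate
    eer = treated_events / treated_total    # event rate with intervention
    arr = cer - eer
    rrr = arr / cer
    return arr, rrr

# Same relative reduction, very different absolute reduction
arr_high, rrr_high = risk_reductions(20, 100, 15, 100)     # 20% vs 15%
arr_low, rrr_low = risk_reductions(20, 1000, 15, 1000)     # 2% vs 1.5%
print(arr_high, rrr_high)
print(arr_low, rrr_low)
```

Both hypothetical trials give a relative risk reduction of 25%, but the absolute risk reduction is 5 percentage points in one and 0.5 in the other.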
Selection bias
Individuals being more likely to be selected for study than others
Berksonian bias is a type of selection bias that arises when both the disease and the exposure affect participant selection, e.g. in a case-control study
Bias
Systematic deviation of results or inferences from truth
Spectrum bias
Failure of a diagnostic test to account for variation in the population, e.g. evaluating diagnostic tests on biased patient samples, leading to an overestimate of the sensitivity and specificity of the test.
In an ideal world, every variation would be included proportionally within the study and stratified for according to probability of an outcome.
Omitted variable bias
Bias that appears in estimates of parameters in a regression analysis when the assumed specification omits an independent variable that should be in the model.
Detection bias
Systematic differences between groups in how outcomes are determined
N.B. differs from selection bias, where participants with an influential characteristic are more likely to be recruited/selected
Funding bias
May lead to selection of outcomes, test samples, or test procedures that favor a study’s financial sponsor
Reporting bias
Selective revealing or suppression of information
- More likely in self-reporting surveys for habits perceived as positive or negative
Exclusion bias
Systematic exclusion of certain individuals from the study.
Attrition bias
Arises due to a loss of participants, e.g. loss to follow-up during a study. A loss of less than 5% is of little concern; over 20% poses a serious threat to validity
Recall bias
Arises due to differences in the accuracy or completeness of participant recollections of past events
Differs from reporting bias in that it relates to memory rather than perception
Response bias
Systematic difference between survey answers and actual participant experiences
Includes recall bias and reporting bias
Observer bias
Systematic difference between a true value and the value actually observed, due to the observer
Confidence interval
Often loosely described as “a range that the true value lies within”
Strictly, if the experiment were repeated many times and a CI calculated each time, 95% of those intervals would contain the true value, if the confidence level is 95%.
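A small simulation can illustrate the repeated-sampling interpretation: draw many samples, build a 95% interval from each, and count how often the interval contains the true mean. This is a sketch under simplifying assumptions (normally distributed data, known standard deviation, so a z-interval); all names and numbers are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(true_mean=10.0, sd=2.0, n=30, repeats=2000, level=0.95, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit * sd / sqrt(n)             # known-sd z-interval (assumption)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = fmean(sample)
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / repeats

observed_coverage = coverage()
print(observed_coverage)
```

The printed fraction should be close to 0.95; any single interval either contains the true mean or does not.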
Performance bias
Systematic differences between groups in the care that is provided, or in exposure to factors other than the interventions of interest
Absolute risk
Number of events over total population
Number needed to treat
Number of people who need to be treated to prevent one additional adverse outcome
= 1/Absolute risk reduction
= 1/(AR control - AR treatment)
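A minimal sketch of the calculation from event counts. Exact fractions are used to avoid floating-point rounding, and the result is rounded up to a whole person, which is a common convention assumed here; the function name and counts are hypothetical.

```python
from fractions import Fraction
from math import ceil

def number_needed_to_treat(control_events, control_total, treated_events, treated_total):
    """NNT = 1 / (AR control - AR treatment), rounded up to a whole person."""
    ar_control = Fraction(control_events, control_total)
    ar_treatment = Fraction(treated_events, treated_total)
    arr = ar_control - ar_treatment
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is not defined in this sketch")
    return ceil(1 / arr)

nnt_a = number_needed_to_treat(20, 100, 15, 100)   # ARR = 5%
nnt_b = number_needed_to_treat(30, 100, 23, 100)   # ARR = 7%
print(nnt_a, nnt_b)
```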
Relative risk
Rate of events with treatment divided by rate of events in controls
Differs from absolute risk, which is the number of events in a group divided by the total number of people in that group
Confounder
Factor other than the intervention that is associated with both the exposure and the outcome, and so may distort the apparent effect
Incidence
Number of new cases occurring within a period of time
Prevalence
Actual number of cases alive, with the disease either during a period of time (period prevalence) or at a particular date in time (point prevalence)
Intention to treat analysis
Patients analysed in the groups to which they were randomly allocated, regardless of the treatment they ultimately received
“Once randomised, always randomised”
Per protocol analysis
Only those who completed study protocol are analysed
Subject to attrition bias if high dropout rate
Negative predictive value
If the test is negative, chances that the patient does not have the disease
Positive predictive value
If the test is positive, chances that the patient has the disease
Sensitivity
If the patient has the disease, probability that the test is positive
Specificity
If the patient does not have the disease, probability that the test is negative
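The four measures above (negative and positive predictive value, sensitivity, specificity) can all be read off a 2x2 table of test result against disease status. The sketch below uses hypothetical counts to show that predictive values change with prevalence while sensitivity and specificity do not.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """tp/fp/fn/tn: true positives, false positives, false negatives, true negatives."""
    return {
        "sensitivity": tp / (tp + fn),   # positive test given disease
        "specificity": tn / (tn + fp),   # negative test given no disease
        "ppv": tp / (tp + fp),           # disease given positive test
        "npv": tn / (tn + fn),           # no disease given negative test
    }

# Same hypothetical test (90% sensitive, 90% specific) in two populations
high_prev = diagnostic_metrics(tp=450, fp=50, fn=50, tn=450)   # prevalence 50%
low_prev = diagnostic_metrics(tp=9, fp=99, fn=1, tn=891)       # prevalence 1%
print(high_prev)
print(low_prev)
```

At 1% prevalence the positive predictive value in this example falls to about 8%, even though the test itself is unchanged.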
Internal validity
The extent to which you are able to say that the tested variable caused the result
- must account for confounders
External validity
Ability to apply results of the study to the clinical context
ROC curve
Assesses a test's ability to discriminate between two outcomes
Chi-squared test
A statistical test for heterogeneity
Assesses whether variability between results is compatible with chance alone
Exposure and outcome must be categorical
Assumes large sample size
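A hand-rolled sketch of the chi-squared statistic for a 2x2 table, comparing observed counts with the counts expected if exposure and outcome were unrelated. It applies no continuity correction, and the p-value formula used is specific to one degree of freedom; the function name and counts are illustrative.

```python
from math import erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n   # count expected by chance
            stat += (observed[i][j] - expected) ** 2 / expected
    p_value = erfc(sqrt(stat / 2))   # upper tail of chi-squared with 1 degree of freedom
    return stat, p_value

# Hypothetical: outcome in 30/100 exposed vs 15/100 unexposed
stat, p = chi_squared_2x2(30, 70, 15, 85)
print(round(stat, 3), round(p, 4))
```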
Fisher’s exact test
Statistical significance test used when exposure and outcome are categorical. Can be used for all sample sizes, but tends to be selected for small samples
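A sketch of the exact calculation for a 2x2 table: with the row and column totals held fixed, every possible table has a hypergeometric probability, and the two-sided p-value sums the probabilities of all tables no more likely than the observed one. That two-sided convention is one common choice and is assumed here; the function name and counts are illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small tolerance so ties with the observed table are counted despite rounding
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

p_small_sample = fisher_exact_two_sided(3, 1, 1, 3)
print(p_small_sample)
```

For this tiny table the p-value is 34/70, about 0.49, so the apparent association is well within what chance would produce.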
T-test
Statistical significance test used to compare a continuous outcome between two exposure groups (a categorical exposure with two levels)
Assumes Gaussian distribution
Mann-Whitney/ Wilcoxon Rank Sum
Statistical significance test used to compare a continuous outcome between two exposure groups (a categorical exposure with two levels)
Does not assume Gaussian distribution (non-parametric)
ANOVA
Statistical significance test used to compare a continuous outcome between more than two exposure groups (a categorical exposure with more than two levels)
Assumes Gaussian distribution
Pearson test
Statistical significance test used to compare a continuous set of exposures with a continuous outcome
Assumes Gaussian distribution
Spearman test
Statistical significance test used to compare a continuous set of exposures with a continuous outcome
Does not assume Gaussian distribution (non-parametric, based on ranks)
Linear regression
Statistical test used to compare a continuous set of exposures with a continuous outcome
Logistic regression
Statistical test used to compare a continuous set of exposures with a binary outcome
Cox proportional hazard test
Statistical test used when exposure is categorical or continuous and outcome is time to an event
Allows multi-variable adjustments to account for confounders (unlike log rank test)