Definitions Flashcards
What is Accuracy?
How close the sample statistic is on average to the population parameter that it estimates
What is age standardisation?
Adjustment to minimise the effects of differences in age composition, when comparing summary statistics across different populations
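One common way to make this adjustment is direct standardisation, where each population's age-specific rates are weighted by a shared standard population. A minimal sketch, assuming made-up rates and a hypothetical two-band standard population (none of these figures come from the cards):

```python
def directly_standardised_rate(age_specific_rates, standard_weights):
    """Weighted sum of age-specific rates.

    standard_weights are the standard population's share in each
    age band and are assumed to sum to 1.
    """
    return sum(rate * weight
               for rate, weight in zip(age_specific_rates, standard_weights))


# Hypothetical data: two age bands (younger, older).
standard_weights = [0.6, 0.4]
rates_a = [0.01, 0.05]  # population A, outcome rate per person
rates_b = [0.02, 0.04]  # population B, outcome rate per person

rate_a = directly_standardised_rate(rates_a, standard_weights)
rate_b = directly_standardised_rate(rates_b, standard_weights)
```

Because both populations are weighted by the same age structure, any remaining difference between rate_a and rate_b is not due to age composition.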
What is ANOVA?
(Analysis of Variance) = A statistical test used to compare means in three or more groups (one way for unmatched data and repeated measures for matched data)
What is a bar chart?
A graph used to present a categorical variable; frequencies within each group of observations are represented by the heights of the corresponding bars
What is a baseline/reference group?
The group (usually the unexposed group) with which other exposure groups are compared
What is bias?
Systematic departure from the true value which can give misleading results; includes selection, loss to follow up, measurement (recall and interviewer) and performance bias
What is a binary/dichotomous variable?
A categorical variable which can only take one of two values
What is blinding?
Subjects and/or outcome assessors are unaware of treatment allocation in a randomised controlled trial
What is a box and whisker plot?
A graph used to present a continuous outcome by a categorical exposure; boxes for each category represent medians and inter-quartile ranges and whiskers represent the extreme values for the outcome
What are case-control studies?
Study designed such that subjects are recruited on the basis of the presence (case) or absence (control) of an outcome, then an exposure is measured retrospectively
What is a categorical variable?
Values indicate category membership; can be ordinal or nominal
What is a causal factor?
Exposure which causes an outcome i.e. must precede the outcome
What is central tendency?
Location of a distribution including mean, median and mode
What is chance?
Variation that is due to random fluctuations
What is the chi-squared test?
Statistical test used to compare two unmatched categorical variables; an ordinal version also exists
What is clinical equipoise?
A state of uncertainty where it is believed to be equally likely that either of two treatment options may be better
What is the clinical iceberg?
Phenomenon whereby health practitioners are only aware of the relatively small proportion of diseases that present to them
What is Cochran's Q test?
Statistical test used to compare two categorical variables when data are matched
What is a cohort study?
(also known as longitudinal or follow up study) = Participants are identified as a sample from a population, then collection of exposure and outcome data depends on whether the study is prospective or historical
What is concealment?
Random allocation is hidden from investigators in randomised controlled trials making it impossible for them to have any influence over allocation of participants in treatment groups
What is a confidence interval?
Interval with a given probability (e.g. 95%) of containing the true value of a population parameter; measures the precision of the sample statistic
What is a confounder?
A third variable which provides an alternative explanation for the observed association between an exposure and outcome
What is confounding?
Association with a third variable which provides an alternative explanation for the observed association between an exposure and outcome
What is a contingency table?
Table showing the frequencies of observations for two categorical variables such that sub-categories of one variable (exposure) are indicated in rows and sub-categories of the other variable (outcome) are indicated in columns
What is a continuous variable?
A numerical variable which can potentially take an infinite number of distinct values
What is a correlation coefficient?
Measure of association that indicates the degree to which variables change together; can be Pearson's (parametric) or Spearman's (non-parametric)
What is critical appraisal?
Judgement made as to the quality of published articles e.g. regarding whether the appropriate study design and statistical methods have been chosen
What is a cross sectional study?
Study that examines the association between exposure and outcome at a particular point in time
What is a crude association?
Also known as an unadjusted association. Estimated association between exposure and outcome, before possible confounding variables are taken into account
What is demography?
Study of populations, especially with reference to size, density, mortality, fertility, growth, age distribution and the interaction of these with social and economic factors
What is denominator?
The lower portion of a fraction used to calculate a rate or ratio
What is a descriptive study?
A study concerned with describing a variable in terms of time, place or person
What is detection bias?
Form of measurement bias that may occur when the outcome assessor is not blinded
What is a diagnostic test?
A test performed to aid diagnosis of an outcome (usually a disease), often compared with a gold standard in reliability studies
What is a discrete variable?
A numerical variable representing counts, which cannot take on any intermediate values
What is the dose-response effect/trend?
Pattern of association observed between exposure (dose) and outcome (response) including linear trend and threshold effects
What is ecological fallacy?
Bias that may occur because an association observed between variables on an aggregate level does not necessarily represent the association that exists at an individual level
What is an ecological study?
Study in which the unit of analysis is populations or groups of people, rather than individuals
What are eligibility criteria?
The criteria that must be met by subjects eligible for inclusion in a study
What is epidemiology?
Study of the distribution and determinants of health-related conditions or events in specified populations and the application of this study to the control of health problems
What is ethical approval?
Approval that must be sought from a local or regional ethics committee before a randomised controlled trial can be undertaken
What is Evidence-based healthcare?
The conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients
What is exposure/ explanatory variable/ independent variable/ x-variable/ risk factor/treatment group/intervention group?
A variable whose influence on the outcome variable is of interest
What is Fisher's exact test?
Statistical test used to compare two unpaired dichotomous variables in small datasets
What is Friedman test?
A statistical test used to compare distributions between three or more groups, when variables are matched and not normally distributed
What is the geometric mean?
A back transformation (antilog/exponential) of a mean value which has been calculated on logged data
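The back transformation can be sketched in a few lines. This is a minimal illustration using natural logs and made-up values; the function name is hypothetical:

```python
import math


def geometric_mean(values):
    """Mean of the natural logs, back-transformed with exp.

    All values must be strictly positive, since log is undefined otherwise.
    """
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))


# Made-up positively skewed data.
gm = geometric_mean([1, 10, 100])
```

For these values the arithmetic mean is 37, while the geometric mean is about 10, which shows how the log scale reduces the pull of the largest value.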
What is a gold standard?
Measurement method widely accepted as being the best available, often used in reliability studies
What is the hierarchy of evidence?
Simple guide to assessment of the evidence provided by different study designs: randomised controlled trials (highest level) -> cohort -> case-control -> cross-sectional -> ecological -> descriptive; although quality of evidence also depends on quality of the study design and execution
What is a histogram?
Graphical representation of the frequency distribution of a continuous variable with areas of the bars representing the frequencies within each grouping interval
What is a historical/retrospective cohort study?
Outcome status for a defined subset of the population is ascertained at baseline and then linked to pre-existing historical data on exposure, usually from routine records, so that the cohort’s experience of outcome risk can be reconstructed
What is a hypothesis?
Idea expressed in such a way that it can be tested and refuted
What is hypothesis testing?
Statistical methods used to determine how likely observed differences in data are due to chance rather than real differences
What is incidence rate?
Rate of occurrence of new cases of an outcome, which is dependent on the number of new cases, total number in the population, and the time interval of interest
What is (statistical) inference?
Drawing conclusions about some unknown aspect of a population, based on statistics derived from a random sample from that population
What is informed consent?
Consent given by the subject or responsible person for participation in a study, usually a randomised controlled trial
What is intention to treat analysis?
Participants in a randomised controlled trial analysed according to their treatment group allocation, regardless of whether they completed the trial
What is interaction?
An interaction between an exposure and confounder exists if the association between an outcome and exposure varies across the categories of the confounding variable
What is an intercept?
Point where a linear regression line crosses the y-axis i.e. value of the outcome when the exposure is zero
What is an inter-quartile range?
A measure of variability = the spread of data around the median; it is the distance between the lower quartile (25th centile) and the upper quartile (75th centile) of a distribution
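As a worked illustration, the quartiles and their difference can be computed with Python's standard library. The data are made up, and the "inclusive" quartile convention is an assumption; other conventions give slightly different cut points:

```python
import statistics

# Made-up, already sorted sample.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1
```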
What is an interventional study?
A study where an investigator tests whether modifying or changing something (‘intervening’) alters the outcome, usually a randomised controlled trial or experimental study
What is interviewer bias?
Form of measurement bias where an interviewer inquires more deeply about exposure in those with the outcome compared to those without
What is kappa?
Measures the agreement between two or more examiners or methods when the variables are both categorical; if one or more of the variables are ordinal a modified version called the weighted kappa should be used
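For two examiners, the usual (Cohen's) form of kappa compares observed agreement with the agreement expected by chance: kappa = (observed - expected) / (1 - expected). A minimal unweighted sketch, assuming a made-up 2x2 agreement table and a hypothetical function name:

```python
def cohens_kappa(table):
    """Unweighted kappa for a square agreement table.

    table[i][j] = number of subjects rated category i by examiner A
    and category j by examiner B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Proportion of subjects on which the examiners agree (the diagonal).
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Made-up data: rows = examiner A (yes, no), columns = examiner B (yes, no).
table = [[40, 10],
         [5, 45]]
kappa = cohens_kappa(table)
```

Here observed agreement is 0.85 and chance agreement is 0.5, so kappa is about 0.7.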
What is Kruskal-Wallis test?
A statistical test used to compare distributions between three or more groups when variables are unmatched and not normally distributed
What are limits of agreement/Bland-Altman method?
Gives an indication of the agreement between two examiners or methods when the variables are continuous
What is linear regression?
Regression method appropriate for continuous outcomes that are approximately normally distributed, which produces regression coefficients
What is the linear regression equation?
Equation derived from fitting a linear regression model, usually denoted y = a + bx
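A minimal sketch of how a and b can be estimated by ordinary least squares, assuming made-up data that lie exactly on a line; the function name is hypothetical:

```python
def fit_line(x, y):
    """Least-squares estimates of intercept a and slope b in y = a + bx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # regression coefficient (slope)
    a = y_bar - b * x_bar    # intercept: value of y when x is zero
    return a, b


# Made-up exposure (x) and outcome (y) values.
x = [0, 1, 2, 3]
y = [1, 3, 5, 7]
a, b = fit_line(x, y)
```

For these data the fitted equation is y = 1 + 2x, so the outcome rises by 2 for each unit increase in exposure.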
What is a linear regression line?
A diagrammatic presentation (usually overlaid on a scatter plot) of a linear regression equation
What is a linear trend?
A type of dose-response effect whereby there is a systematic increase in the risk of outcome with increasing or decreasing level of exposure
What is logarithmic transformation?
Conversion of data to their natural log values, with the aim of achieving an approximate normal distribution
What is a logistic regression?
Regression method appropriate for binary outcomes where the resulting regression coefficients can be exponentiated to produce odds ratios, multinomial and ordinal versions also exist
What is a loss to follow up bias?
Bias due to subjects being lost over a follow up period, where the loss may be associated with the exposure and/or outcome
What is the Mann-Whitney U/Wilcoxon rank sum test?
Statistical test used to compare distributions between two groups, when variables are unpaired and not normally distributed
What is matched data?
Data that are not independent, often repeated measurements on the same person, or measurements from people who are related such as siblings or twins
What is McNemar's test?
Statistical test used to compare two paired dichotomous variables
What is a mean/average/arithmetic mean?
Measure of the central tendency calculated by summing all values and dividing by the total number of observations
What is measurement bias?
Bias in how the exposure and/or outcome is measured or classified that results in different quality of information collected between those with and without the outcome; includes detection, interviewer & recall bias
What is a median?
Measure of central tendency, which is calculated as middle value when all the values are arranged in order, useful for summarising data that are not normally distributed
What is Mendelian randomisation?
Use of observational studies to obtain an estimate of the causal effect of a modifiable exposure on an outcome, through identification of a candidate gene which is related to exposure
What is meta analysis?
Statistical technique for combining estimates of exposure-outcome associations from more than one study, weighting according to size of the study
What is mode?
Measure of central tendency which is calculated as the most frequently occurring of all values, but is rarely used in epidemiology
What is multiple regression?
Regression modelling when there is more than one exposure, or there is one exposure plus a number of confounders
What is a nominal variable?
Unordered categorical variable i.e. categories have no order to them
What are non-parametric/ distribution free methods?
Set of tests based on ranking observations in order of magnitude and testing these rankings rather than the actual values of the observations; suitable for data which are not normally distributed
What is normal?
Statistically = data follows a normal distribution; Clinically = the likely values for an individual
What is Normal distribution?
Continuous symmetrical frequency distribution where both tails extend to infinity and the shape is determined by the mean and standard deviation
What is a null hypothesis?
Hypothesis that there is no association between outcome and exposure
What is a numerator?
The upper portion of a fraction used to calculate a rate or ratio
What is a numerical variable?
Values are numbers as opposed to categories
What is an observational study?
Study that does not involve any form of intervention, where the investigator just observes and records exposure and outcome information
What are odds?
Number of people with the outcome divided by the number of people without the outcome
What is an odds ratio?
Ratio of odds of outcome amongst exposed subjects to the odds of outcome amongst unexposed subjects
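Both quantities can be read off a 2x2 contingency table. A minimal sketch, assuming made-up counts; the function names are hypothetical:

```python
def odds(with_outcome, without_outcome):
    """Odds = number with the outcome / number without the outcome."""
    return with_outcome / without_outcome


def odds_ratio(exposed_with, exposed_without,
               unexposed_with, unexposed_without):
    """Odds of outcome in the exposed divided by odds in the unexposed."""
    return (odds(exposed_with, exposed_without)
            / odds(unexposed_with, unexposed_without))


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
odds_exposed = odds(20, 80)
odds_unexposed = odds(10, 90)
or_estimate = odds_ratio(20, 80, 10, 90)
```

The odds are 0.25 in the exposed and about 0.11 in the unexposed, giving an odds ratio of 2.25.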
What is an ordinal variable?
An ordered categorical variable i.e. categories can take values that are ranked according to an ordered classification
What is an outcome?
(response variable/dependent variable/y-variable/case-control status/disease group) = variable whose association with an exposure is of interest
What is paired data?
Special case of matched data when there are only two groups
What is a parameter?
Numerical quantity measuring some aspect of a population e.g. the mean
What are parametric methods?
Methods that assume the data follow a specified underlying distribution, usually the normal distribution
What is performance bias?
Bias arising due to an unequal provision of care between the treatment and control group in a randomised controlled trial, which may occur if subjects and assessors are not blinded
What is per-protocol analysis/on treatment analysis?
Analysis restricted to those who completed a randomised controlled trial according to protocol, which defeats the main purpose of random allocation and may invalidate the results; it is preferable to use intention to treat analysis
What is person-years at risk?
Sum of the number of years that each individual has been under observation, sometimes used as a denominator for calculating incidence rates
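A short illustration of person-years used as the denominator of an incidence rate, assuming made-up follow-up times and case counts:

```python
# Made-up follow-up time in years for each of five individuals.
follow_up_years = [2, 5, 3.5, 4, 0.5]

# Person-years at risk: the sum of each individual's observation time.
person_years = sum(follow_up_years)

# Made-up number of new cases observed during follow-up.
new_cases = 3

# Incidence rate, expressed per 1000 person-years.
rate_per_1000 = new_cases / person_years * 1000
```

Here 3 new cases over 15 person-years gives a rate of 200 per 1000 person-years.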
What is a Pie chart?
Graph used to represent a categorical variable; frequencies within each group of observations are represented by the areas of segments in a circular diagram
What is a Placebo?
Inert medication or procedure i.e. a drug having no pharmacological effect, but intended to give patients the perception that they are receiving treatment for their complaint
What is the placebo effect?
Phenomenon whereby a patient's symptoms can be alleviated by an otherwise ineffective treatment, as they expect the treatment to work
What is a Placebo group/control group?
Group of patients in a randomised controlled trial who receive no treatment other than standard care or a placebo
What is poisson regression?
Regression method appropriate for outcomes that are count variables; the resulting regression coefficients are usually exponentiated to produce rate ratios
What is power?
Ability of a study to demonstrate an association between variables if one exists i.e. the probability of observing evidence against the null hypothesis if it is false
What is precision?
Amount of variation in the sample statistic, with greater variation indicating less precision
What is prevalence?
Total number of individuals who have an outcome at a particular time divided by the total population at risk at that time i.e. proportion with outcome at a particular time
What is a proportion?
Number of occurrences of an event divided by total number of observations
What is a prospective cohort study?
Healthy individuals are recruited (though some may already have the outcome at baseline), exposure status recorded, then subjects are followed up to see whether those who were exposed develop the outcome at a different rate to those who were not exposed
What is publication bias?
Tendency for studies that find associations between variables to get published, and those that do not find associations not to get published
What is a P-value?
Probability that the difference between exposure groups would be at least as big as that observed if the null hypothesis of no difference is true i.e. if the difference has arisen due to chance
What is R squared?
Proportion of variance in one variable that is explained by the variation in another, given by the square of the correlation coefficient (R)
What is a random sample?
Sample drawn from a population such that all members of the population have an equal chance of being chosen
What is randomisation?
(/random allocation) In randomised controlled trials, allocation of subjects to either the intervention or control group, by chance alone, to ensure that groups are similar with respect to the distribution of confounding factors
What is a randomised controlled trial?
Study in which subjects are randomly allocated to either a treatment or control group, followed up, then outcomes measured and compared
What is range?
Measure of variability which is the difference between the largest and smallest values in a distribution
What is Rate?
Measure of the frequency of occurrence of an event e.g. incidence rate
What is rate ratio?
Quantification of the association between an exposure and discrete/count outcome, calculated using poisson regression models
What is recall bias?
Form of measurement bias occurring in retrospective studies, whereby recall of information is different in those with the outcome compared to those without
What is a reference range?
Measure of variability which indicates the amount of variation between individual observations in a sample and hence likely values for an individual in the population/can inform as to whether a patient is “clinically” normal.
What is regression/regression modelling?
Finds the best mathematical model to describe the outcome (y) with respect to the exposure (x)
What is regression coefficient?
(/slope) = Estimate of the change in outcome (y) for a unit change in exposure (x); in a linear regression this is the gradient of the linear regression line, denoted by b in the linear regression equation (y= a + bx)
What is a reliability/inter-observer study?
Compares the measurements made by two or more examiners or methods with respect to the agreement between them
What is reverse causality?
Alternative explanation for an exposure-outcome association, whereby the outcome causes the exposure rather than the other way around
What is risk/incidence/cumulative incidence?
Number of new cases with the outcome in a particular time period divided by the number of people who did not have the outcome at the outset i.e. proportion of new cases in a time period
What is risk difference?
Difference in risk between exposed and non-exposed groups
What is a risk ratio/relative risk?
Risk of developing the outcome in the exposed group compared to risk of developing the outcome in the unexposed group
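Risk, risk difference and risk ratio can be worked through together on one 2x2 table. A minimal sketch, assuming made-up counts; the function name is hypothetical:

```python
def risk(new_cases, at_risk):
    """Proportion of those initially without the outcome who develop it."""
    return new_cases / at_risk


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed develop the outcome.
risk_exposed = risk(20, 100)
risk_unexposed = risk(10, 100)

risk_difference = risk_exposed - risk_unexposed
risk_ratio = risk_exposed / risk_unexposed
```

The exposed group has twice the risk of the unexposed group (risk ratio 2), which corresponds to 10 extra cases per 100 people (risk difference 0.1).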
What is sample size calculation/power calculation?
Mathematical process of deciding how many subjects should be included in a study, to be determined at the outset
What is a sample statistic?
Estimate of a population parameter, based on a sample drawn from that population
What is sampling?
Process of selecting a number of subjects from all the subjects in the target population
What is sampling distribution?
Distribution of sample statistics if repeated samples were drawn from the same population
What is a scatter plot?
Graph used to present two continuous variables, whereby each point represents the exposure and outcome values for an individual
What is a selected sample?
Random sample of individuals that have been selected from the target population
What is selection bias?
Systematic difference in the characteristics of the subjects selected randomly to take part in a study and those who are not
What is sensitivity?
Proportion of sample with the outcome, who are correctly classified by a diagnostic test that has been compared to a gold standard (e.g. in reliability study)
What is a sign test/ Wilcoxon signed rank test?
Statistical test used to compare distributions between two groups, when variables are paired but not normally distributed
What is skewed?
An asymmetrical frequency distribution, which is either positively skewed (long tail to the right) or negatively skewed (long tail to the left)
What is specificity?
Proportion of sample without the outcome, who are correctly classified by a diagnostic test that has been compared to a gold standard (reliability study)
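Sensitivity and specificity both come from cross-tabulating the diagnostic test against the gold standard. A minimal sketch, assuming made-up counts:

```python
# Made-up counts from comparing a diagnostic test with a gold standard.
true_positives = 90    # have the outcome, test positive
false_negatives = 10   # have the outcome, test negative
true_negatives = 160   # do not have the outcome, test negative
false_positives = 40   # do not have the outcome, test positive

# Proportion of those with the outcome who are correctly classified.
sensitivity = true_positives / (true_positives + false_negatives)

# Proportion of those without the outcome who are correctly classified.
specificity = true_negatives / (true_negatives + false_positives)
```

With these counts the test detects 90% of those with the outcome and correctly rules out 80% of those without it.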
What is standard deviation?
Measure of variability, indicating how widely dispersed the individual observations are in a distribution
What is standard error?
Measure of the precision of the sample mean as an estimate of the population mean; the standard deviation of the sampling distribution of a sample statistic
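For a sample mean, the standard error is the sample standard deviation divided by the square root of the sample size, and an approximate 95% confidence interval is the mean plus or minus 1.96 standard errors. A minimal sketch, assuming made-up data and the normal approximation (for a sample this small a t-based multiplier would normally be preferred):

```python
import math
import statistics

# Made-up sample of a continuous measurement.
sample = [4, 6, 8, 10, 12]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)          # sample standard deviation (n - 1)
se = sd / math.sqrt(len(sample))       # standard error of the mean

# Approximate 95% confidence interval using the normal multiplier 1.96.
ci_lower = mean - 1.96 * se
ci_upper = mean + 1.96 * se
```

A larger sample with the same standard deviation would give a smaller standard error and therefore a narrower interval, i.e. greater precision.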
What is a standard normal distribution?
Special case of the normal distribution where the mean is zero and the standard deviation is one
What is statistical significance?
A p-value less than a specified level, usually 5%, suggesting that the null hypothesis can be rejected; although statistical results have traditionally been interpreted in this way, it is now considered preferable to avoid the term statistical significance!
What are statistics?
Science of collecting, summarising, analysing (e.g. estimating the strength of association between two variables) and interpreting data
What is a stratified analysis?
Analysis undertaken separately in each of a number of subgroups
What is a study sample?
Subgroup of subjects from the selected sample that actually agree to take part in the study
What is a subgroup?
Subdivision of the sample into groups
What is survival analysis?
Statistical modelling of the time to an event which does not assume that rates are constant over time
What is a systematic review?
Review of a clearly formulated question that uses systematic methods to identify, select and critically appraise relevant research
What is a target population?
Collection of individuals for which it is of interest to draw inferences or be able to generalise to, often defined in terms of geographical location
What is temporal?
Referring to time
What is a test statistic?
Quantity calculated from the data which is used to assess the strength of evidence against the null hypothesis
What is the threshold effect?
Type of dose-response effect, whereby the risk of the outcome is only increased in subjects whose exposure is above or below a certain level
What is a t-test?
A statistical test used to compare means between two groups, when variables are approximately normally distributed; can be unpaired or paired test
What is variance?
Measure of variability; the variance is the square of the standard deviation
What is the Wilcoxon test?
The Wilcoxon rank sum test is equivalent to the Mann-Whitney U test for two unpaired groups; the Wilcoxon signed rank test is used, like the sign test, for two paired groups
What is a z-test?
Statistical test used to compare means between two groups, when variables are approximately normally distributed, and sample is not too small
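A minimal sketch of an unpaired two-sample z-test, assuming made-up summary statistics and using the standard normal distribution to obtain a two-sided p-value; the function name is hypothetical:

```python
import math
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return the z statistic and two-sided p-value for a difference in means."""
    # Standard error of the difference between two independent means.
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se_diff
    # Probability of a difference at least this extreme under the null hypothesis.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up summary statistics for two groups of 50 subjects each.
z, p_value = two_sample_z_test(10, 2, 50, 9, 2, 50)
```

With these figures the standard error of the difference is 0.4, so z is 2.5 and the two-sided p-value is a little over 0.01.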