Stats Flashcards

1
Q

Bias

A

Any factor that moves the findings of a study away from the truth

2
Q

Binary data

A

Data where there are only two possible values such as survived/died; also known as dichotomous data

3
Q

Blinding in a randomized controlled trial

A

When the treatment allocation is concealed from either the subject or the assessor or both

4
Q

Case-control studies

A

Observational study that starts with cases with a disease and compares them with controls without the disease to investigate possible risk factors

5
Q

Chi-squared goodness of fit test

A

A statistical test used to investigate whether a frequency distribution follows a specific theoretical distribution

6
Q

Chi-squared test

A

A statistical test used to investigate the association between two categorical variables

7
Q

Cluster Analysis

A

A statistical method used to identify groups or clusters of individuals who have common features in terms of known variables

8
Q

Cluster randomization

A

When groups of individuals are allocated to treatments so that all subjects in a group receive the same treatment

9
Q

Cohort study

A

Observational study that starts with a sample of individuals who are disease-free and measures possible causal factors at baseline and over time. The cohort of subjects is followed and their disease status is observed to investigate which factors are linked to the disease

10
Q

Confidence interval (CI)

A

A range of values that indicates the precision of an estimate; for a 95% CI we can be 95% confident that the interval contains the true value

11
Q

Continuous data

A

Data that lie on a continuum and so can take any value between two limits

12
Q

Cox proportional hazards regression

A

A multifactorial regression model used with a time-to-event outcome

13
Q

Crossover trial

A

A single group study where each patient receives each of two or more treatments in turn so that they act as their own control

14
Q

Degrees of freedom (DF or df)

A

A quantity used in statistical testing and modelling that is related to the size of the sample and the number of parameters that have been estimated

15
Q

Dummy variables

A

Used in regression modelling to enable a categorical predictor variable to be included, by converting a variable with n categories into n–1 binary variables, where one category is the reference category
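
As a rough illustration of the n–1 coding described above, the sketch below converts a categorical variable into binary columns using plain Python. The function name `dummy_code` and the smoking-status categories are hypothetical, chosen only for this example.

```python
def dummy_code(values, reference):
    """Convert a categorical variable into n-1 binary (0/1) dummy variables.

    `reference` is the category that gets no column of its own;
    it is represented by zeros in every column.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference category not present in the data")
    kept = [c for c in categories if c != reference]
    # One column per non-reference category, in sorted order.
    return kept, [[1 if v == c else 0 for c in kept] for v in values]


# Hypothetical data: three categories become two dummy variables.
columns, rows = dummy_code(["never", "ex", "current", "never"], reference="never")
print(columns)  # ['current', 'ex']
print(rows)     # [[0, 0], [0, 1], [1, 0], [0, 0]]
```

Each regression coefficient for a dummy variable is then interpreted relative to the reference category.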

16
Q

Equivalence trial

A

A trial that aims to see if a new treatment is no better or worse than an existing one

17
Q

Fisher’s exact test

A

A statistical test that can be used to investigate the association between two categorical variables when the sample is small

18
Q

Forest plot

A

A graph used to display individual study estimates and confidence intervals, and the pooled estimate and confidence interval in a meta-analysis

19
Q

Gold standard test

A

A diagnostic test that is regarded as definitive, i.e. it gives the correct answer

20
Q

Funnel plot

A

A simple graphical method for exploring the results from studies to see if publication bias might be present

21
Q

Hazard ratio

A

In survival analysis, the ratio of the hazards (risks of the outcome) in two groups

22
Q

Heterogeneity

A

Where there is statistical variability between estimates such as may be found in a meta-analysis

23
Q

Incidence

A

The number of new cases of a given condition occurring within a specific time period

24
Q

Indirect standardization

A

Gives the standardized mortality ratio (SMR), which is the ratio of the observed number of deaths in the comparison population to the number expected if that population had the same age-specific death rates as the standard population
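
The observed/expected calculation can be sketched in a few lines of Python. The age bands, population sizes, and death rates below are made-up numbers, and the function name is hypothetical; this is only meant to show the arithmetic.

```python
def standardized_mortality_ratio(observed_deaths, population_by_age, standard_rates_by_age):
    """SMR = observed deaths / deaths expected under the standard population's rates."""
    expected = sum(
        population_by_age[age] * standard_rates_by_age[age]
        for age in population_by_age
    )
    return observed_deaths / expected


# Made-up comparison population (people per age band)
population = {"0-39": 10_000, "40-64": 5_000, "65+": 2_000}
# Made-up age-specific death rates in the standard population
standard_rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.04}

# Expected deaths are 10 + 25 + 80 = 115, so 150 observed deaths give an SMR of about 1.30.
smr = standardized_mortality_ratio(150, population, standard_rates)
print(round(smr, 2))
```

An SMR above 1 suggests more deaths than expected from the standard rates; it is often reported multiplied by 100.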

25
Q

Intention to treat analysis

A

Statistical analysis where patients are analysed in the treatment group to which they were originally randomly allocated even if they did not actually receive that treatment

26
Q

Logistic regression

A

A multifactorial regression model used with a binary outcome

27
Q

Logrank test

A

A statistical test used to compare time-to-event data in two or more groups

28
Q

Meta-analysis

A

A statistical analysis which combines the results of several independent studies examining the same question

29
Q

Multifactorial methods

A

Statistical models fitted to datasets with one outcome variable and several predictor variables; used to disentangle effects

30
Q

Multiple regression

A

A multifactorial regression model used with a continuous outcome

31
Q

Negative predictive value

A

The proportion of those found negative on a diagnostic test who are truly negative

32
Q

Normal distribution

A

A continuous probability distribution with a symmetrical bell shape, which is followed by many naturally occurring variables

33
Q

Number needed to harm

A

The number of patients who need to be treated in order that one additional patient has a negative outcome

34
Q

Number needed to treat

A

The number of patients who need to be treated in order that one additional patient has a positive outcome
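
The card does not give a formula, but the number needed to treat is commonly computed as the reciprocal of the absolute risk reduction. The sketch below assumes the outcome is an adverse event that treatment prevents, and follows the usual convention of rounding up to a whole patient; the function name and event rates are hypothetical.

```python
import math


def number_needed_to_treat(control_event_rate, treated_event_rate):
    """NNT = 1 / absolute risk reduction, rounded up to a whole patient."""
    absolute_risk_reduction = control_event_rate - treated_event_rate
    if absolute_risk_reduction <= 0:
        # Treatment is no better than control, so NNT is not meaningful here.
        raise ValueError("no risk reduction; consider number needed to harm instead")
    return math.ceil(1 / absolute_risk_reduction)


# Made-up rates: 30% of controls and 22% of treated patients have the event.
# The absolute risk reduction is 0.08, and 1 / 0.08 = 12.5, which rounds up to 13.
print(number_needed_to_treat(0.30, 0.22))  # 13
```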

35
Q

Observational study

A

A study in which subjects are observed, with exposures and outcomes measured, without any intervention by the researcher

36
Q

Odds

A

The probability of an event occurring divided by the probability of it not occurring

37
Q

Odds ratio

A

A measure of the difference in odds between two groups, calculated by dividing the odds in one group by the odds in another group
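
A minimal sketch of the calculation, assuming each group is summarised by a count of events and a group total. The function names and the counts are made up for illustration.

```python
def odds(probability):
    """Odds = probability of the event / probability of no event."""
    return probability / (1 - probability)


def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds of the event in group A divided by the odds in group B."""
    return odds(events_a / total_a) / odds(events_b / total_b)


# Made-up counts: 20 of 100 in group A and 10 of 100 in group B have the event.
# Odds in A are 0.2 / 0.8 = 0.25; odds in B are 0.1 / 0.9, about 0.11.
# The odds ratio is therefore about 2.25.
result = odds_ratio(20, 100, 10, 100)
print(round(result, 2))
```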

38
Q

One-way analysis of variance

A

A statistical test used to compare the means from three or more independent samples

39
Q

Parallel group trial

A

A trial in which subjects are allocated to receive one of two or more possible treatments and the comparison of different treatments is made between treatment groups

40
Q

Pearson’s correlation

A

A measure of the strength of linear relationship between two continuous variables

41
Q

Placebo

A

An inert treatment which is indistinguishable from the active treatment

42
Q

Poisson regression

A

A multifactorial regression model used to model rates

43
Q

Positive predictive value

A

The proportion of those found positive on a diagnostic test who are truly positive

44
Q

Posterior distribution

A

A probability distribution obtained by combining prior evidence with new information

45
Q

Power

A

The probability that a statistical test will find a significant difference if a real difference of a given size exists, i.e. the null hypothesis is not true

46
Q

Predictor variable

A

In regression analysis, a variable which is used to predict the value of an outcome variable

47
Q

Prevalence

A

The proportion of individuals with a condition within a specific population at a given time (point prevalence) or over a given time period (period prevalence)

48
Q

Principal components analysis

A

A statistical method used to reduce a dataset with many inter-correlated variables to a smaller set of uncorrelated variables that explain the overall variability almost as well

49
Q

Publication bias

A

A bias that occurs when the papers which are published on a topic are an incomplete subset of all the studies which have been conducted on that topic

50
Q

Rank correlation

A

A non-parametric measure of the relationship between two variables, using the ranks of the data rather than the data values themselves

51
Q

Receiver operating characteristic (ROC) curve

A

A graph plotting the sensitivity against 1–specificity for a diagnostic test at different cut-off points

52
Q

Relative risk (RR)

A

A measure of the difference in risk between two groups, calculated by dividing the risk in the exposed group by the risk in the unexposed group (also known as risk ratio)
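
The division described above can be sketched as follows, assuming each group is summarised by a count of events and a group total. The function name and counts are hypothetical.

```python
def relative_risk(events_exposed, total_exposed, events_unexposed, total_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = events_exposed / total_exposed
    risk_unexposed = events_unexposed / total_unexposed
    return risk_exposed / risk_unexposed


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed develop the disease.
# Risks are 0.2 and 0.1, so the exposed group has about twice the risk.
rr = relative_risk(20, 100, 10, 100)
print(round(rr, 2))
```

A value of 1 means the risk is the same in both groups.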

53
Q

Risk ratio

A

A measure of the difference in risk between two groups, calculated by dividing the risk in the exposed group by the risk in the unexposed group (also known as relative risk)

54
Q

Selection bias

A

A statistical bias introduced by the way in which subjects are selected for a research study

55
Q

Sensitivity

A

The proportion of those who have the disease who are correctly identified by the diagnostic test as positive

56
Q

Sensitivity analysis

A

A way of testing assumptions made in statistical analyses by doing several analyses based on different assumptions, and comparing the results

57
Q

Significance level

A

The probability that a statistical test rejects the null hypothesis when no real difference exists, i.e. the null hypothesis is true (type 1 error)

58
Q

Simple linear regression

A

A statistical method to estimate the nature of the linear relationship between two continuous variables

59
Q

Skewed data

A

Data that do not follow a symmetrical distribution

60
Q

Specificity

A

The proportion of those who do not have the disease who are correctly identified by the diagnostic test as negative
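
Sensitivity, specificity, and the two predictive values all come from the same 2×2 table of test result against true disease status. The sketch below shows one way to compute them together; the function name and the counts are made up, with `tp`, `fp`, `fn`, `tn` standing for true positives, false positives, false negatives, and true negatives.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Summarise a diagnostic test from counts in a 2x2 table."""
    return {
        # Among those with the disease, the proportion testing positive
        "sensitivity": tp / (tp + fn),
        # Among those without the disease, the proportion testing negative
        "specificity": tn / (tn + fp),
        # Among those testing positive, the proportion truly positive
        "ppv": tp / (tp + fp),
        # Among those testing negative, the proportion truly negative
        "npv": tn / (tn + fn),
    }


# Made-up counts: 100 people with the disease, 200 without.
summary = diagnostic_summary(tp=90, fp=30, fn=10, tn=170)
for name, value in summary.items():
    print(name, round(value, 3))
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested.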

61
Q

Standard deviation (SD)

A

A measure of dispersion used for continuous data; is equal to the square root of the variance

62
Q

Standard error (SE)

A

A measure of precision of an estimated quantity that is equal to the standard deviation of the sampling distribution of the quantity
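
For a sample mean, the standard error is usually estimated as the sample standard deviation divided by the square root of the sample size. The sketch below also builds an approximate 95% confidence interval as mean ± 1.96 × SE, which assumes a large sample; with small samples a multiplier from the t distribution would normally replace 1.96. The function name and data are hypothetical.

```python
import math
import statistics


def mean_with_ci(sample, z=1.96):
    """Return the mean, its standard error, and an approximate 95% CI."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean, se, (mean - z * se, mean + z * se)


# Made-up measurements
mean, se, (lower, upper) = mean_with_ci([4.0, 5.0, 6.0, 5.0, 5.0])
print(mean, round(se, 3), round(lower, 2), round(upper, 2))
```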

63
Q

Stem and leaf plot

A

A graph which uses the data values themselves to depict the shape of a frequency distribution

64
Q

Superiority trial

A

A trial which aims to see if one treatment is better than another

65
Q

t test

A

A statistical test used to compare the means from two independent samples

66
Q

Transformation

A

A function applied to a dataset to better fit a specific probability distribution, for example applying a logarithmic transformation to skewed data to make it fit a Normal distribution

67
Q

Two-way analysis of variance

A

A statistical method used to investigate the effects of two factors on a continuous outcome

68
Q

Type 1 error

A

Getting a significant result in a sample when the null hypothesis is in fact true in the underlying population

69
Q

Type 2 error

A

Getting a non-significant result in a sample when the null hypothesis is in fact false in the underlying population (‘false non-significant’ result)

70
Q

Variable

A

A quantity that is measured or observed in an individual and which varies from person to person

71
Q

Washout period

A

The time interval between the administration of different treatments in subjects in a crossover trial that prevents there being any carry-over effects of the current treatment when the next treatment starts

72
Q

Wilcoxon matched pairs test

A

A statistical test comparing ordinal data from paired samples

73
Q

Wilcoxon rank sum test

A

A statistical test comparing ordinal data from two independent groups; equivalent to the Mann-Whitney U test

74
Q

Z-test for proportions

A

A statistical test used to compare proportions from two independent samples

75
Q

Stratification for prognostic factors

A

When there are important prognostic factors that need to be accounted for in a particular trial, the random allocation can be stratified so that the treatment groups are balanced for those factors

76
Q

Minimization

A

Allocation takes place in a way that best maintains balance in important prognostic factors. At all stages of recruitment, the next patient is allocated to that treatment which minimizes the overall imbalance in prognostic factors
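
One simple variant of this procedure can be sketched as below: for each arm, count the existing patients who share a factor level with the new patient, then choose the arm with the smallest total. This is only an illustrative sketch; the function name, data layout, and random tie-breaking are assumptions, and real trials often add weighting or a random element to the allocation.

```python
import random


def minimization_allocate(new_patient, allocated, arms=("A", "B"), rng=random):
    """Pick the arm that minimises total imbalance across prognostic factors.

    new_patient: dict of factor -> level, e.g. {"sex": "F", "age": "65+"}
    allocated:   list of (arm, patient_dict) for patients already in the trial
    """
    totals = {}
    for arm in arms:
        # Count matches between the new patient's levels and patients already in this arm.
        totals[arm] = sum(
            1
            for existing_arm, patient in allocated
            if existing_arm == arm
            for factor, level in new_patient.items()
            if patient.get(factor) == level
        )
    lowest = min(totals.values())
    candidates = [arm for arm in arms if totals[arm] == lowest]
    return rng.choice(candidates)  # ties are broken at random


# Made-up trial so far: arm A already has two women, one of them aged 65+.
allocated = [
    ("A", {"sex": "F", "age": "65+"}),
    ("A", {"sex": "F", "age": "<65"}),
    ("B", {"sex": "M", "age": "65+"}),
]
# Arm A scores 3 matches and arm B scores 1, so the new patient goes to B.
print(minimization_allocate({"sex": "F", "age": "65+"}, allocated))  # B
```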

77
Q

Advantages of parallel group study design

A

The comparison of the treatments takes place concurrently

Can be used for any condition, especially an acute condition which is cured or self-limiting such as an infection

No problem of carry-over effects

78
Q

Disadvantages of parallel group study designs

A

The comparison is between patients and so usually needs a bigger sample size than the equivalent crossover trial

79
Q

Advantages of crossover study designs

A

Treatments are compared within patients and so differences between patients are accounted for explicitly

Usually need fewer subjects than the equivalent parallel group trials

Can be used to test treatments for chronic conditions

80
Q

Disadvantages of crossover study designs

A

Cannot be used for many acute illnesses

Carry-over effects need to be controlled

Likely to take longer than the equivalent parallel designs

Statistical analysis is more complicated if subjects do not complete all periods

81
Q

Zelen Randomised Consent Design

A

Subjects are randomly allocated to treatment or usual care

Only those subjects who are allocated to treatment are invited to participate and to give their consent

Subjects allocated to usual care (control) are not asked to give their consent

Among the treatment group, some subjects will refuse, so this design results in three treatment groups:

  1. Usual care (allocated)
  2. Intervention
  3. Usual care (but allocated to intervention)

The analysis is performed with patients analysed in their original randomized groups, i.e. 1 versus 2 + 3 (an intention to treat analysis)