Critical Numbers Flashcards

1
Q

What is a sample?

A
  • We rarely collect information on everyone of interest
  • So we take a representative sample from the population of interest (population = the group of people we are interested in, not the whole population)
  • We describe our sample using descriptive statistics
  • We make inference about our population using inferential statistics
2
Q

What is bias?

A
  • Arises when imperfections in the research process cause our findings to deviate from the truth
  • Can occur in all studies
  • Can occur intentionally or unintentionally
  • Impacts the validity and reliability
  • We should consider it when critically evaluating the research of others
3
Q

What is sampling bias?

A

Sample does not represent population of interest

4
Q

What is recall bias?

A

Inaccurate recall of past events/exposures/behaviours

5
Q

What is information bias?

A

Incorrect measurement, e.g. a miscalibrated machine

6
Q

What is the Hawthorne effect?

A

Participants change their behaviours when they know they are being observed

7
Q

What is attrition bias?

A

Differential dropout from studies e.g. sicker patients have to drop out so we end up only measuring the healthier participants

8
Q

What is confounding?

A
  • Confounding variables obscure the real effect of an exposure on an outcome
  • They are related to both the exposure and the outcome
  • If unaccounted for, confounding can be a form of bias
9
Q

What is an experimental study design?

A

The researchers have intervened in some way

10
Q

What is an observational study design?

A

The researchers have not intervened, merely observed

11
Q

What is a retrospective observational study design?

A

Looking back into the past

12
Q

What is a cross-sectional observational study design?

A

A single snapshot in time

13
Q

What is a prospective observational study design?

A

Following up over time

14
Q

What is a randomised controlled trial?

A
  • Randomly allocate participants to different interventions and follow up
  • Experimental and prospective
15
Q

What is a cluster randomised controlled trial?

A

Participants randomised in groups (e.g. by GP centre or therapist) rather than at the individual level

16
Q

What is a crossover randomised controlled trial?

A

Participants receive both interventions in a randomised order.

17
Q

What is a multi-arm and factorial randomised controlled trial?

A

Two or more interventions evaluated in a single study

18
Q

What is an adaptive randomised controlled trial?

A

Accruing information is used to inform planned design adaptations

19
Q

What are the benefits of a randomised controlled trial?

A
  • Randomisation reduces potential for confounding
  • Can reduce bias
  • Can determine causal effects
20
Q

What are the negatives of a randomised controlled trial?

A
  • Randomisation can be unfeasible or unethical
  • Require expert management and oversight, especially in ‘high risk’ interventions
  • Expensive
21
Q

What is a cohort study?

A
  • Non-randomised (one group may be exposed, the other unexposed)
  • Observational
  • Typically prospective
22
Q

What are the benefits of a cohort study?

A
  • Useful when random allocation not possible
  • Can work for rare exposures – select participants on the basis of exposure
  • Can examine multiple outcomes
23
Q

What are the negatives of a cohort study?

A
  • May require long follow-up
  • Can be expensive
  • Not ideal for rare outcomes
24
Q

What is a case-control study?

A
  • Non-randomised
  • Observational
  • Retrospective (participants are selected on the basis of the outcome, as cases and controls, and we look back to find the exposure)
25
Q

What are the benefits of a case-control study?

A
  • Faster: use past data so do not require long follow-up
  • Useful for rare outcomes: select participants on the basis of outcome
  • Cheaper
26
Q

What are the negatives of a case-control study?

A
  • More prone to bias or poor quality data
  • Harder to show causal relationship
  • Not ideal for rare exposures
27
Q

What is a cross-sectional study?

A
  • Non-randomised
  • Observational
  • Single time point
    Look at a sample and record, at the same time point, who is exposed or unexposed and who has the outcome or not
28
Q

What are the benefits of a cross-sectional study?

A
  • Relatively quick
  • Cheap
  • Can assess multiple exposures/outcomes
29
Q

What are the negatives of a cross-sectional study?

A
  • Susceptible to bias
  • Cannot prove causality
  • Not ideal for rare exposures/outcomes
30
Q

What is an ecological study and what are the pros and cons?

A

The unit of observation is a group (aggregate) rather than an individual,
e.g. electoral ward, country

Some pros:
- Large-scale comparisons
- Can quantify geographical or temporal trends

Some cons:
- Ecological fallacy
- Cannot make inference at the individual level

31
Q

What can categorical variables be?

A

- Binary
- Ordinal
- Nominal

32
Q

What can numeric variables be?

A

Discrete or continuous

33
Q

What is binary (categorical data)?

A

Only two categories (e.g. positive and negative)

34
Q

What is ordinal (categorical data)?

A

Categories with natural order (e.g. stage of cancer)

35
Q

What is nominal (categorical data)?

A

Categories with no natural order (e.g. blood group)

36
Q

What is discrete (numeric data)?

A

Observations can only take certain numerical values (e.g. number of children)

37
Q

What is continuous (numeric data)?

A

Observations can take any value within a range (e.g. height)

38
Q

What is a proportion?

A

The number with a characteristic or outcome divided by the total number. Used to describe probability or risk (scale 0-1)

39
Q

What is a percentage?

A

Proportion multiplied by 100

40
Q

What are odds?

A

The number with an exposure or outcome divided by the number without.
The ratio of the probability of an event occurring to the probability of it not occurring
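
A minimal sketch of how proportion, percentage and odds relate, using hypothetical counts (20 people with the outcome, 80 without) that are not from the course material:

```python
# Hypothetical counts, chosen only for illustration.
with_outcome = 20
without_outcome = 80
total = with_outcome + without_outcome

proportion = with_outcome / total        # risk, on a 0-1 scale
percentage = proportion * 100            # the same quantity on a 0-100 scale
odds = with_outcome / without_outcome    # number with / number without

# Equivalent definition of odds from the probability itself.
odds_from_probability = proportion / (1 - proportion)

print(proportion, percentage, odds, odds_from_probability)
```

Both ways of calculating the odds give the same answer, and the odds are larger than the proportion because the denominator is smaller.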

41
Q

The incidence of health-related events or outcomes is often presented as a rate. What is a rate?

A

A rate is the frequency of events per another unit of measurement (often person-time at risk). This allows us to account for variation, e.g. in how long individuals are followed up.
Once an outcome has occurred, an individual will no longer be at risk, either forever or for some period of time.
Person-time at risk is not always known and may be approximated
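
As a rough illustration, a rate per person-time could be calculated like this. The follow-up records are hypothetical, and real person-time at risk is often only approximated:

```python
# Hypothetical follow-up data: (years at risk, had the event?) per person.
follow_up = [(2.0, False), (5.0, True), (3.5, False), (1.5, True), (8.0, False)]

events = sum(1 for _, had_event in follow_up if had_event)
person_years = sum(years for years, _ in follow_up)

rate = events / person_years      # events per person-year
rate_per_1000 = rate * 1000       # events per 1,000 person-years

print(events, person_years, rate_per_1000)
```

Each person contributes only the time they were at risk, so people followed for different lengths of time can be combined in one figure.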

42
Q

What is the risk difference?

A

Difference in proportions between groups
If there is no difference this will be 0

43
Q

What is the risk ratio AKA relative risk?

A

The risk in one group divided by the risk in the other
If there is no difference the ratio will be 1
Ratios > 1 indicate higher risk/odds in the group of interest
Ratios < 1 indicate lower risk/odds in the group of interest
The more common the outcome, the more apparent the difference between risk and odds ratios

44
Q

What is an odds ratio?

A

Odds in one group divided by the odds in the other

If there is no difference the ratio will be 1
Ratios > 1 indicate higher risk/odds in the group of interest
Ratios < 1 indicate lower risk/odds in the group of interest
The more common the outcome, the more apparent the difference between risk and odds ratios
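
The risk difference, risk ratio and odds ratio can all be worked out from one 2x2 table. A sketch with hypothetical counts (the outcome is deliberately common here so that the two ratios differ):

```python
# Hypothetical 2x2 table (counts are illustrative only).
exposed_with, exposed_without = 30, 70
unexposed_with, unexposed_without = 10, 90

risk_exposed = exposed_with / (exposed_with + exposed_without)
risk_unexposed = unexposed_with / (unexposed_with + unexposed_without)

risk_difference = risk_exposed - risk_unexposed   # 0 would mean no difference
risk_ratio = risk_exposed / risk_unexposed        # 1 would mean no difference

odds_exposed = exposed_with / exposed_without
odds_unexposed = unexposed_with / unexposed_without
odds_ratio = odds_exposed / odds_unexposed        # 1 would mean no difference

print(round(risk_difference, 3), round(risk_ratio, 3), round(odds_ratio, 3))
```

With these counts the risk ratio is about 3 and the odds ratio about 3.86. If the outcome were rarer in both groups, the two ratios would sit closer together.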

45
Q

What is the mean?

A

Sum of the values divided by the count

46
Q

What is the median?

A

Order the values, then take the midpoint

47
Q

What is the mode?

A

The most common value
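
The three averages can be checked with Python's standard `statistics` module. The values below are hypothetical:

```python
import statistics

values = [2, 3, 3, 5, 7, 10]  # hypothetical observations

mean = statistics.mean(values)      # sum of the values divided by the count
median = statistics.median(values)  # midpoint of the ordered values
mode = statistics.mode(values)      # most common value

print(mean, median, mode)
```

With an even number of values the median is the average of the two middle values (here 3 and 5).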

48
Q

How is the mean typically reported?

A

With the standard deviation

49
Q

How is the median typically reported?

A

With a central range

50
Q

What is standard deviation?

A
  • Standard deviation – describes dispersion of values around the mean
  • When describing samples the mean is denoted by x̄ and the SD by s
  • When describing populations the mean is denoted by µ and the SD by σ
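
A short sketch of the two SDs, using hypothetical values. It assumes the usual convention that the sample SD (s) divides by n - 1 and the population SD (σ) divides by n, which the card itself does not state:

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

sample_mean = statistics.mean(values)
s = statistics.stdev(values)        # sample SD (divides by n - 1)
sigma = statistics.pstdev(values)   # population SD (divides by n)

print(sample_mean, round(s, 3), sigma)
```

The sample SD is always slightly larger than the population SD calculated from the same values.
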
51
Q

When reporting the median how do we quantify the variability of the data?

A

Range: the lowest value and the highest value
Centiles: the median is the 50th centile. We can describe the spread using centiles around it, e.g. the 5th to 95th centiles give the 90% central range

52
Q

What is the IQR?

A

Interquartile range:
- the 25th to 75th centile, which gives the 50% range
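
A sketch of the quartiles and IQR using `statistics.quantiles` on hypothetical data. Different software uses slightly different centile methods, so exact values may vary a little between packages:

```python
import statistics

values = list(range(1, 12))  # hypothetical data: 1, 2, ..., 11

q1, q2, q3 = statistics.quantiles(values, n=4)  # 25th, 50th and 75th centiles
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

The middle cut point (q2) is the median, and the IQR is the width of the middle 50% of the data.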

53
Q

What is a normal distribution curve?

A

The Gaussian distribution or the “bell-shaped curve”

54
Q

If the data are normally distributed, what will happen to the mean and median?

A

They will be the same

55
Q

What happens to the normal distribution curve if the SD is bigger?

A

The curve is more widely spread and the apex is lower

56
Q

What does positively/right skewed mean?

A

The distribution has a long tail to the right (towards higher values), so the mean is pulled upwards and the median is lower than the mean

57
Q

What does negatively/left skewed mean?

A

The distribution has a long tail to the left (towards lower values), so the mean is pulled downwards and the median is higher than the mean

58
Q

Is the mean affected by skew?

A

Yes, it is ‘pulled out’ by extreme values

59
Q

Is the median affected by skew?

A

No, it will always have 50% of the data to either side

60
Q

What is a parametric/ non-parametric statistical model?

A

Parametric – make distributional assumptions
Non-parametric – make no distributional assumptions (distribution-free)

61
Q

What shape is the normal distribution?

A

Symmetric (mean, median and mode are equal)

62
Q

What is the 68-95-99.7 rule?

A

68% of values lie within 1 SD of the mean
95% of values lie within 2 SD of the mean
99.7% of values lie within 3 SD of the mean
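
The rule can be checked against the Normal distribution itself. A sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a Normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The percentages in the rule are rounded. Exactly 95% of values lie within about 1.96 SD of the mean, which is why 1.96 appears in 95% confidence interval calculations.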

63
Q

What is correlation?

A

Correlation – a measure of linear relationship between variables
- Quantified by the correlation coefficient r
- r is bound between -1 and 1
- The closer to 1/-1, the stronger the correlation
- The closer to 0, the weaker the correlation
- Can be positive (as one variable increases, so does the other)
- Or negative (as one variable increases, the other decreases)
- The ordering of the variables does not matter
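
A minimal sketch of the Pearson correlation coefficient written out by hand. The function name `pearson_r` and the paired measurements are hypothetical:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical paired measurements.
height = [150, 160, 165, 172, 180]
weight = [52, 58, 63, 70, 79]

r = pearson_r(height, weight)
r_swapped = pearson_r(weight, height)

print(round(r, 3), round(r_swapped, 3))
```

Swapping the two variables gives the same r, which is what "the ordering of the variables does not matter" means.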

64
Q

Why do we take all these measurements on data?

A
  • We can assess Normality
  • We can identify outliers (also useful for identifying data entry errors)
  • We can determine whether data might benefit from transformation
  • We can assess collinearity
  • We can choose a method of analysis best suited to our research question and data:
    • Parametric – make distributional assumptions
    • Non-parametric – make no assumptions (distribution-free)
65
Q

What is statistical inference?

A
  • Descriptive statistics relate to the sample
  • Inferential statistics relate to the population
  • We infer properties of the population by using sample statistics to derive estimates of population parameters and test hypotheses
  • When making inference from a sample we need to account for uncertainty in our sample estimates
66
Q

What is the problem with random sampling?

A

It produces variation between samples, which we need to account for when making inference

67
Q

What is the central limit theorem?

A

If we were to take repeated samples and calculate the mean each time, those sample means would be approximately Normally distributed around the true population mean (provided the samples are reasonably large), even if the population itself is not Normally distributed
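
A small simulation can illustrate the idea. This sketch assumes a right-skewed exponential population with mean 1; the sample size of 50 and the 2,000 repeats are arbitrary choices:

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population (mean 1)."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

sample_means = [sample_mean(50) for _ in range(2000)]

centre = statistics.mean(sample_means)    # should sit close to 1
spread = statistics.stdev(sample_means)   # should sit close to 1 / sqrt(50)

print(round(centre, 3), round(spread, 3))
```

Although individual values from this population are strongly skewed, the sample means cluster roughly symmetrically around the population mean.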

68
Q

What is the standard error?

A

The standard error is a type of standard deviation
(It is the standard deviation of the sampling distribution)
(Both are measures of spread)

The standard Deviation is for Describing
The standard Error is for Estimating

  • The standard error indicates how different a sample mean is likely to be from the population mean
  • It tells us the precision of estimation
  • The smaller the standard error of the mean, the more precise our estimate of the mean
    i.e. the closer it is likely to be to the true population mean
69
Q

How do we calculate the standard error?

A

SE = SD / √n

70
Q

What does the standard error calculation tell us?

A

The bigger the SD, the bigger the standard error
The bigger the sample size, the smaller the standard error

This makes sense because the less variable the data are, the more precise our estimation.
The more people we sample, the better the representation and therefore the more precise our estimation.
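
A sketch of the calculation with hypothetical numbers, showing both effects:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

# Hypothetical values, for illustration only.
se_small_sample = standard_error(15, 25)    # SD 15, n = 25
se_large_sample = standard_error(15, 100)   # same SD, four times the sample
se_large_sd = standard_error(30, 100)       # double the SD, n = 100

print(se_small_sample, se_large_sample, se_large_sd)
```

Because n sits under a square root, the sample size has to be quadrupled to halve the standard error.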

71
Q

What is a confidence interval?

A

We can use the sample mean and standard error of the mean and properties of the Normal distribution to calculate a range of values we can be confident includes the true mean
This is called the confidence interval
We are now no longer just describing our sample – we are now making inference about the population parameter

72
Q

What factors affect confidence interval width?

A
  • Variability in the sample (SD)
  • Sample size (n)
  • The desired level of confidence: typically we use 95% but it could be 90%, 99%, etc.
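
A sketch of how these three factors change the width of a confidence interval for a mean. It assumes the Normal approximation (mean ± z × SE); the function name and the sample values are hypothetical, and for small samples a t-distribution is commonly used instead:

```python
import math
from statistics import NormalDist

def confidence_interval(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    se = sd / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return mean - z * se, mean + z * se

# Hypothetical sample: mean 100, SD 15.
ci_95 = confidence_interval(100, 15, n=100)
ci_99 = confidence_interval(100, 15, n=100, level=0.99)
ci_95_larger_n = confidence_interval(100, 15, n=400)

for low, high in (ci_95, ci_99, ci_95_larger_n):
    print(round(low, 2), round(high, 2))
```

A higher confidence level widens the interval, while a larger sample (or a smaller SD) narrows it.
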
73
Q

What else can we calculate confidence intervals for?

A

Means
Differences in means
Proportions
Differences in proportions
Correlation coefficients
Relative risks
Odds ratios

74
Q

What is a hypothesis test?

A
  • We can perform a statistical test to determine how likely it is that the result we have observed is ‘real’
  • Or if it is more likely there is no true difference and we are just seeing chance variation
  • To do this we test the hypothesis of no difference between groups
  • We then weigh up the strength of the evidence against that hypothesis
  • And come to a conclusion
75
Q

What is probability?

A
  • Probability values range from 0 to 1
    (though as you’ve seen we often multiply by 100 to express it as a percentage)
  • A probability of 0 means an event is impossible
  • A probability of 1 means an event is certain
  • So the smaller the probability the less likely the outcome
76
Q

What is the first step in completing a hypothesis test?

A
  • Define the null hypothesis:
  • This is typically the theory we want to disprove
  • We will assume this hypothesis is true until we see sufficient evidence to the contrary
  • Denoted H0
  • In our example:
    H0 = no difference in mean IQ between groups
77
Q

What is the second step in completing a hypothesis test?

A
  • Define the alternative hypothesis:
  • This is the opposite theory to the null
  • Denoted HA or H1
  • In our example:
    HA = there is a difference in mean IQ between groups
78
Q

What is the third step in completing a hypothesis test?

A

Choose a significance level for the test:
- This is how we determine whether our result is statistically significant
- It is also the probability we make a false positive conclusion and reject the null hypothesis when it is in fact true
- So we need to minimise this risk
- Typically it is set around 0.05 (so 5%)

79
Q

What is the fourth step in completing a hypothesis test?

A

Perform an appropriate statistical test:
- This gives us a test statistic
- We then compare that test statistic to the distribution we would expect under the null hypothesis and work out the probability of our result if the null were true

80
Q

What is the fifth step in completing a hypothesis test?

A

Decision time:
We use the probability value from the statistical test to weigh up the strength of the evidence against the null hypothesis
We call this probability value the p-value
The p-value is the probability of seeing an effect of the observed magnitude or greater if the null hypothesis were true
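
The five steps can be sketched for the mean-IQ example. The summary statistics are hypothetical, and the test shown is a two-sample z-test based on the Normal approximation, chosen here only for illustration (the cards do not name a specific test):

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics for the two groups.
mean_a, sd_a, n_a = 103.0, 15.0, 100
mean_b, sd_b, n_b = 98.0, 15.0, 100

# Steps one and two: H0 = no difference in mean IQ, HA = there is a difference.
alpha = 0.05  # step three: significance level

# Step four: test statistic, compared with the standard Normal under H0.
difference = mean_a - mean_b
se_difference = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
z = difference / se_difference

# Step five: two-sided p-value and decision.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = p_value < alpha

print(round(z, 3), round(p_value, 4), reject_null)
```

With these numbers the p-value is below 0.05, so the null hypothesis of no difference would be rejected.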

81
Q

What happens if the p-value is high?

A

The result is probable under the null hypothesis… so we do not have sufficient evidence to reject the null hypothesis (this does not prove the null hypothesis is true)

82
Q

What happens if the p-value is smaller than our significance level (so < 0.05 in our example)?

A

We reject the null hypothesis
- The smaller the p-value, the less likely it is we would see our observed result under the null hypothesis

83
Q

What does the confidence interval give us with hypothesis testing?

A
  • Gives our plausible range for the true population difference
  • Can be used to determine statistical (and clinical) significance
  • Thus is more informative than the p-value alone
84
Q

What is the difference between clinical and statistical significance?

A
  • Statistical significance just means an observed result is unlikely to be due to chance alone
  • Clinical significance means the result is practically important