Stats - box and whisker plots, forest plots Flashcards

1
Q

What diagram is used to display information about the range, median and the quartiles?

A

Box and Whisker plots

2
Q

What are the 5 lines on the box plot and what do they represent (top to bottom)?
What else is on the graph?

A

1) highest value that isn’t an outlier
(2,3,4 make up the box)
2) Upper quartile (Q3)
3) Median (Q2)
4) Lower quartile (Q1)
5) lowest value that isn’t an outlier

Outlier dots - more than 1.5 IQR from the end of the box (important definition of outliers!)

3
Q

What is the definition of an outlier on the Box and Whisker plot?
What does this correspond to?

A

Value more than 1.5 IQR from the end of the box

The 1.5 multiplier corresponds to approximately ±2.7SD and 99.3% coverage of the data for a normal distribution.
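The ±2.7SD and 99.3% figures can be checked numerically. This is a minimal sketch using the standard normal distribution from Python's `statistics` module; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1), so values are in SD units.
z = NormalDist()

q1 = z.inv_cdf(0.25)   # lower quartile of a normal distribution
q3 = z.inv_cdf(0.75)   # upper quartile of a normal distribution
iqr = q3 - q1          # interquartile range

# Outlier fences: 1.5 IQR beyond each end of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Proportion of a normal distribution lying between the two fences.
coverage = z.cdf(upper_fence) - z.cdf(lower_fence)

print(f"fences at about ±{upper_fence:.2f} SD")
print(f"coverage about {coverage * 100:.1f}%")
```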

4
Q

What is another name for the interquartile range? How is it defined?

A

Midspread - equal to the difference between the 3rd and 1st quartiles.

5
Q

The median divides the data into two halves. How do you calculate the median from a set of ordered numbers?

A

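The median is the value at position (n+1)/2, e.g. if there are 11 numbers in a set, the median is the 6th value. If n is even, this position falls between two values and the median is their average.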

6
Q

What are the 1st and 3rd quartile?
How are they calculated in a set of ordered numbers?

A

1st quartile - equivalent to the 25th percentile - the value at position (n+1)/4 (e.g. in a set of 11 this would be the 3rd value)

3rd quartile - equivalent to the 75th percentile - the value at position 3(n+1)/4 (e.g. in a set of 11 this would be the 9th value)
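The position rules for the median and quartiles can be sketched as below. The data set is hypothetical, and the linear interpolation used for fractional positions is an assumption (the cards only cover whole-number positions). The sketch assumes at least 3 values.

```python
def quartile_positions(n):
    """1-based positions of Q1, the median and Q3 using the (n+1) rule."""
    return (n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4


def value_at(ordered, position):
    """Value at a 1-based position in an ordered list.

    Fractional positions are linearly interpolated between neighbours
    (an assumption; other conventions exist).
    """
    lower = int(position)
    frac = position - lower
    if frac == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


# Hypothetical data set of 11 values, sorted before use.
data = sorted([7, 1, 12, 5, 9, 3, 15, 8, 4, 10, 6])

pos_q1, pos_median, pos_q3 = quartile_positions(len(data))
q1 = value_at(data, pos_q1)
median = value_at(data, pos_median)
q3 = value_at(data, pos_q3)
iqr = q3 - q1  # interquartile range: Q3 minus Q1

print(pos_q1, pos_median, pos_q3)
print(q1, median, q3, iqr)
```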

7
Q

How do you calculate the interquartile range?

A

Q3 minus Q1

8
Q

What is a percentile (centile)?
How does this relate to quartile data?

A

This is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value (or score) below which 20% of the observations may be found.

75% of the data set is below Q3
50% of the data set is less than Q2
25% of the data set is below Q1
50% of the data set is between Q1 and Q3

9
Q

What is skewing in a box and whisker plot?

A

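If a distribution is symmetric, the observations are spread evenly either side of the median, so the median sits near the middle of the box and the whiskers are of similar length.

If most of the observations are concentrated on the low end of the scale, the distribution is skewed right (a histogram of the data would peak on the left, with the long tail pointing toward the high values).

Conversely, if most of the observations are concentrated on the high end of the scale, the distribution is skewed left.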
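One way to put a number on what the box shows is a quartile-based skewness coefficient (Bowley's), which is not mentioned in the cards and is used here only as an illustration. It compares the distance from the median to each quartile: positive suggests right skew, negative suggests left skew, and zero suggests a symmetric box. The data sets are hypothetical.

```python
from statistics import quantiles


def bowley_skew(values):
    """Quartile skewness: (Q3 + Q1 - 2 * median) / (Q3 - Q1)."""
    # quantiles(..., n=4) returns [Q1, median, Q3] using the (n+1) rule by default.
    q1, q2, q3 = quantiles(values, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)


# Hypothetical data concentrated at the low end, with a long tail of high values.
right_skewed = [1, 2, 2, 3, 3, 4, 5, 8, 12, 20, 35]
# Hypothetical evenly spaced data.
symmetric = list(range(1, 12))

print(bowley_skew(right_skewed))  # positive: skewed right
print(bowley_skew(symmetric))     # zero: symmetric
```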

10
Q

What is a forest plot also known as?

What is it used to display and how does this work?

A

Blobbogram

It is the main method for illustrating the results of a meta-analysis. It takes all the relevant studies which ask the same question, identifies a statistic common to them and displays the results on a single image. Doing this allows direct comparison.

11
Q

What is on the horizontal axis of a forest plot?

A

The horizontal axis usually represents the statistic reported by the studies being profiled. This could be a relative statistic such as an odds ratio (OR) or a relative risk (RR), or an absolute one such as Absolute Risk Reduction (ARR) or Standardised Mean Difference (SMD).

12
Q

What is on the vertical axis of a forest plot?

A

The vertical line is known as the ‘line of null (or no) effect’. This line is placed at the value where there is no association between an exposure and outcome or no difference between two interventions (in the above example it is placed at 1). Remember that relative statistics like OR or RR have a null effect value of 1, whereas absolute statistics like Absolute Risk, ARR or SMD have a null difference value of 0.

14
Q

What 2 components are on each study line (horizontal) in a forest plot?

A

A point estimate of the study result (represented by a black box). The size of the boxes represents the weight or the relative contribution of each study to the overall meta-analysis. This weight is typically determined by the precision of the study’s estimate, which is often related to the sample size and the variance of the outcome.

A horizontal line representing the 95% confidence interval of the study result, with each end of the line representing a boundary of the confidence interval. Note which lines cross the line of no effect (any study line which crosses the line of null effect does not illustrate a statistically significant result).
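The "does the line cross the line of null effect" check can be sketched as a simple comparison. The study names and confidence intervals below are hypothetical; the null value is 1 because the example uses odds ratios (it would be 0 for an absolute statistic).

```python
def crosses_null(ci_low, ci_high, null_value):
    """True if the confidence interval includes the null value."""
    return ci_low <= null_value <= ci_high


NULL_FOR_RATIOS = 1.0  # OR and RR; use 0.0 for ARR or SMD

# Hypothetical studies: (name, odds ratio, 95% CI lower, 95% CI upper)
studies = [
    ("Study A", 0.70, 0.55, 0.90),
    ("Study B", 1.05, 0.80, 1.30),
    ("Study C", 0.85, 0.60, 1.20),
]

for name, estimate, low, high in studies:
    if crosses_null(low, high, NULL_FOR_RATIOS):
        verdict = "crosses the null line (not statistically significant)"
    else:
        verdict = "does not cross the null line (statistically significant)"
    print(f"{name}: OR {estimate} [{low}, {high}] {verdict}")
```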

15
Q

What does the diamond represent on a forest plot?

A

The diamond represents the point estimate and confidence interval when you combine and average all the individual studies together. If you drew a vertical line through the vertical points of the diamond, that line would mark the point estimate of the averaged studies. The horizontal points of the diamond represent the 95% confidence interval of this combined point estimate. If the horizontal tips of the diamond cross the vertical line, the combined result is potentially not statistically significant (as you cannot be certain that the null value isn’t the true value).

16
Q

What is the name given to the influence an individual study has had on the pooled result in a forest plot?

A

The weight (in %) indicates the influence an individual study has had on the pooled result. In general, the bigger the sample size AND the narrower the confidence interval (CI), the higher the percentage weight, the larger the box, and the more influence the study has on the pooled result.
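To make the link between precision, weight and the diamond concrete, here is a minimal sketch of one common weighting scheme, fixed-effect inverse-variance pooling. The cards do not say which method a given forest plot uses, so this choice is an assumption, and the estimates and standard errors are hypothetical mean differences.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance pooling (assumed method, not taken from the cards).

    Returns the pooled estimate, its 95% confidence interval and the
    percentage weight of each study.
    """
    weights = [1 / se ** 2 for se in std_errors]  # more precise = heavier
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    percent_weights = [100 * w / total for w in weights]
    return pooled, ci, percent_weights


# Hypothetical studies: mean differences with their standard errors.
estimates = [0.5, 0.2, 0.3]
std_errors = [0.5, 0.25, 0.1]

pooled, ci, percent_weights = pool_fixed_effect(estimates, std_errors)
print(f"pooled estimate (centre of the diamond): {pooled:.3f}")
print(f"95% CI (tips of the diamond): {ci[0]:.3f} to {ci[1]:.3f}")
for se, pct in zip(std_errors, percent_weights):
    print(f"SE {se}: weight {pct:.1f}%")
```

In this sketch the study with the smallest standard error (the narrowest confidence interval) receives by far the largest weight, which is why it would be drawn with the largest box.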

17
Q

In a forest plot, what is heterogeneity?

What are the two types called?

A

Heterogeneity refers to variability between studies and can affect the ability to combine the data of the individual studies. There are two types of heterogeneity: clinical heterogeneity and statistical heterogeneity.

18
Q

What is clinical heterogeneity in a forest plot?

A

Clinical heterogeneity refers to the variability caused by differences in clinical variables, such as the patient population, interventions, outcome measures or setting of the included studies. Clinicians determine clinical heterogeneity, which means that it will always be a rather subjective decision. Readers should also consider these differences and subjectively decide whether the clinical heterogeneity is small enough for meta-analysis to be appropriate.

19
Q

What is statistical heterogeneity in a forest plot?

A

Statistical heterogeneity is the variability in effect estimates between the studies and can be quantified by various statistics. Forest plots ONLY present the statistical heterogeneity. The simplest statistic is the I², which quantifies the heterogeneity from 0 to 100%. There is no clear point beyond which there is too much heterogeneity. Some use a rule of thumb stating that around 25% is low heterogeneity, around 50% medium and around 75% high heterogeneity.
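As a rough illustration of how I² is obtained, the sketch below derives it from Cochran's Q using the usual formula I² = max(0, (Q - df) / Q) × 100, where df is the number of studies minus one. The inverse-variance weights and the example numbers are assumptions for illustration only.

```python
def i_squared(estimates, std_errors):
    """I² (in %) from Cochran's Q, using inverse-variance weights."""
    weights = [1 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical studies that disagree strongly with each other.
varied = i_squared([0.1, 0.5, 0.9], [0.1, 0.1, 0.1])
# Hypothetical studies whose estimates are close relative to their precision.
consistent = i_squared([0.5, 0.2, 0.3], [0.5, 0.25, 0.1])

print(f"varied studies: I² = {varied:.2f}%")
print(f"consistent studies: I² = {consistent:.2f}%")
```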

20
Q

Heterogeneity in a forest plot is indicated by ‘eyeball analysis’ in which 3 ways?

A

1) If the CIs are widely spread out and the point estimates vary considerably, this suggests heterogeneity (i.e., the studies show varying results)
2) Large deviations of individual studies from the overall estimate may indicate heterogeneity
3) If larger studies (larger boxes) show different effects than smaller studies (smaller boxes), this may indicate heterogeneity

21
Q

When reading a meta-analysis, what 3 things do we need to assess (in terms of limitations)?

Which of these does a forest plot help with?
What else might be needed?

A

Heterogeneity
The pooled result
Publication bias

A forest plot does a great job in illustrating the first two of these (heterogeneity and the pooled result). However, it cannot display potential publication bias to readers. A funnel plot can do that instead.