1. Interpreting Data Flashcards

1
Q

How do you calculate the standard deviation?

A

Square root the average squared distance from the mean.
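
A minimal sketch of that calculation, using hypothetical heights. It divides by n, matching "average squared distance"; many packages divide by n − 1 when estimating the SD from a sample.

```python
import math

def population_sd(values):
    """Square root of the average squared distance from the mean."""
    mean = sum(values) / len(values)
    squared_distances = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_distances) / len(values))

# Hypothetical data, chosen only for illustration.
heights_cm = [160, 165, 170, 175, 180]
sd = population_sd(heights_cm)
print(round(sd, 2))
```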

2
Q

What is the interquartile range?

When would you use IQR over standard deviation?

When would you use the mean or the median?

How should the following data be summarised (pic):

A. Median and standard deviation

B. Mean and interquartile range

C. Mean and standard deviation

D. Median and interquartile range

A

25th to 75th centile

If you have outliers

Median if you have outliers, otherwise either will do

D (always use IQR with median)
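
A rough illustration of why outliers favour the median and IQR. The data are hypothetical lengths of stay with one extreme value; the function name `summarise` is made up for this sketch.

```python
import statistics

def summarise(values):
    """Return both pairs of summary statistics for comparison."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "sd": statistics.pstdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,  # 25th to 75th centile
    }

# Hypothetical lengths of stay in days, with one outlier (40).
stays = [2, 3, 3, 4, 4, 5, 5, 6, 40]
summary = summarise(stays)
print(summary)
```

The single outlier pulls the mean well above the median and inflates the SD, while the median and IQR barely notice it.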

3
Q

How should this data be summarised?

A. Median and standard deviation

B. Mean and interquartile range

C. Mean and standard deviation

D. Median and interquartile range

What is this distribution called?

A

C

Gaussian

4
Q

In a Gaussian distribution (pic) what happens if you change age to include older/younger people?

A

Older = curve shifts R, but shape stays same. Younger = shifts L

5
Q

In a Gaussian distribution (pic) what happens if you change the standard deviation (decrease/increase)?

A

If SD increases = curve flatter and wider. If SD decreases = taller and narrower. Centre point and area underneath always stay the same.
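
A quick numerical check of how the SD changes the curve, using Python's `statistics.NormalDist` with a hypothetical mean of 50. A larger SD gives a lower peak (flatter, wider curve); a smaller SD gives a higher peak (taller, narrower curve).

```python
from statistics import NormalDist

MEAN = 50  # hypothetical centre point; it does not move when the SD changes

def peak_height(sd):
    """Height of the Gaussian curve at its centre."""
    return NormalDist(mu=MEAN, sigma=sd).pdf(MEAN)

def share_within_one_sd(sd):
    """Area under the curve between mean - SD and mean + SD."""
    curve = NormalDist(mu=MEAN, sigma=sd)
    return curve.cdf(MEAN + sd) - curve.cdf(MEAN - sd)

for sd in (5, 10, 20):
    print(sd, round(peak_height(sd), 4), round(share_within_one_sd(sd), 3))
```

The share of the area within one SD of the mean is the same for every SD, which reflects the total area under the curve staying fixed.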

6
Q

What are the 3 reference ranges that Gaussian distributions can be used to show?

What is the standard error?

What is the standard error of the mean?

A

99% range (0.5th to 99.5th centile) = mean ± 2.58 SDs

95% range (2.5th to 97.5th centile) = mean ± 1.96 SDs

90% range (5th to 95th centile) = mean ± 1.64 SDs

The ranges get narrower going down the list (99% to 90%).

The standard deviation of the sampling distribution of an estimate; it’s a measure of the statistical accuracy of that estimate.

The standard deviation of the distribution of all possible sample means. Estimated from a single sample as:

Standard error of the mean = standard deviation / √sample size
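
The multipliers and the standard error formula above can be reproduced as a sketch. The function names and the example SD and sample size are hypothetical.

```python
import math
from statistics import NormalDist

def sd_multiplier(coverage):
    """SDs either side of the mean that cover `coverage` of a Gaussian."""
    upper_centile = 0.5 + coverage / 2  # e.g. 0.975 for the 95% range
    return NormalDist().inv_cdf(upper_centile)

def standard_error_of_mean(sd, sample_size):
    """Standard deviation divided by the square root of the sample size."""
    return sd / math.sqrt(sample_size)

for coverage in (0.99, 0.95, 0.90):
    print(coverage, round(sd_multiplier(coverage), 2))

# Hypothetical sample: SD of 4.0 from 100 people.
print(standard_error_of_mean(sd=4.0, sample_size=100))
```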

7
Q

How do you calculate the 95% confidence interval (CI) of a sample mean?

If the 95% CI for BMI was 21.4 - 22.6, what 2 ways could you describe the results?

A

95% CI = sample mean ± 1.96 x standard error

  1. We would expect 95% of samples of the same size to have a mean BMI between 21.4 and 22.6.
  2. In the population we are 95% sure that the mean BMI could be as low as 21.4 or as high as 22.6
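
A minimal sketch of the 95% CI formula, assuming hypothetical BMI values and estimating the SD from the sample (dividing by n − 1). The function name `ci95_of_mean` is made up for this example.

```python
import math
import statistics

def ci95_of_mean(sample):
    """95% CI = sample mean ± 1.96 x standard error."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(len(sample))
    return mean - 1.96 * standard_error, mean + 1.96 * standard_error

# Hypothetical BMI measurements, for illustration only.
bmi = [19.8, 21.2, 22.5, 23.1, 20.7, 24.0, 22.2, 21.9, 23.4, 21.2]
lower, upper = ci95_of_mean(bmi)
print(round(lower, 1), round(upper, 1))
```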
8
Q

95% confidence interval for the mean weight of a sample of 30 adult men is 75kg to 81kg. Which is the correct definition?

A. In the population we are 95% sure that the mean weight could be as low as 75kg or as high as 81kg

B. In the population the mean weight will be between 75kg and 81kg

C. In the population 95% of men will weigh between 75kg and 81kg

D. In this study 95% of men weighed between 75kg and 81kg

A

A

(C is just the 95% range, not the CI)

9
Q

When do you use standard deviation or standard error?

What happens to the 95% range as the sample size increases (in pic)?

What happens to the 95% CI?

A

Use SD for ranges (for individual values) and SE for CIs (for means)

Stays the same

Gets narrower (because it is calculated using the SE, and the SE gets smaller as sample size increases, since you divide by the square root of the sample size)
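
A sketch of that contrast, assuming a hypothetical mean of 22.0 and SD of 3.0 that do not change as more people are sampled (in real data the sample SD would wobble slightly).

```python
import math

MEAN, SD = 22.0, 3.0  # hypothetical values, held fixed for illustration

def range95(mean, sd):
    """95% range for individual values: uses the SD."""
    return mean - 1.96 * sd, mean + 1.96 * sd

def ci95(mean, sd, sample_size):
    """95% CI for the mean: uses the SE = SD / sqrt(sample size)."""
    standard_error = sd / math.sqrt(sample_size)
    return mean - 1.96 * standard_error, mean + 1.96 * standard_error

for n in (25, 100, 400):
    low, high = ci95(MEAN, SD, n)
    print(n, range95(MEAN, SD), (round(low, 2), round(high, 2)))
```

Quadrupling the sample size halves the width of the CI, while the 95% range is unchanged.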

10
Q

How would you describe this relationship?

What is r?

A

Birth weight is positively correlated with gestational age.

Correlation coefficient, always between -1 and 1
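
As a sketch, r can be computed by hand as Pearson's correlation coefficient. The gestational ages and birth weights below are hypothetical, not the data from the card's picture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient; always between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(spread_x * spread_y)

# Hypothetical gestational age (weeks) and birth weight (kg).
weeks = [34, 36, 37, 38, 39, 40, 41]
weight_kg = [2.1, 2.6, 2.9, 3.1, 3.3, 3.5, 3.6]
r = pearson_r(weeks, weight_kg)
print(round(r, 2))
```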

11
Q

What would the r value be in A - C?

How is linear regression represented?

What is the equation?

A

A) r = 0, no correlation

B) r = 1, perfect positive correlation

C) r = -1, perfect negative correlation

Line of best fit

y = a + bx

(y = outcome/dependent variable; x = predictor/independent variable; b = slope, i.e. difference in y / difference in x; a = intercept, i.e. where the line, if continued, crosses the y axis at x = 0)
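
A minimal least-squares sketch of fitting y = a + bx, assuming ordinary least squares is the method behind the line of best fit (the card does not say). The data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how much y changes per unit change in x.
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: value of y where the line crosses x = 0.
    a = mean_y - b * mean_x
    return a, b

# Hypothetical points lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)

# Predict the outcome (y) for a new value of the predictor (x).
print(a + b * 10)
```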

12
Q

Predicting gestational age from crown rump length.
Which regression should you be doing?

A

A (whatever we’re predicting goes on the vertical axis, and what we’re using to predict goes on the horizontal axis)

13
Q

Predicting PAPP-A from gestational age
Which regression should you be doing?

A

B

14
Q

What 2 things would you look at to determine whether an observed difference was due to chance or statistically significant?

A

CIs and p-values

15
Q

What is the p-value?

Using the data (pic), how would you calculate the probability of observing at least 17 heads or at least 17 tails (the p-value)? We don’t know which side the coin is biased to.

What does it mean if the p-value is <0.05?

When can p-values be calculated?

A

The probability of observing a result as or more extreme than the sample result if the underlying assumption in the population is true.

0.008 + 0.008 = 0.016 (This is a two-tailed p-value; if we thought it was biased to heads it’d just be 0.008 = one-tailed p-value)
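
The tail probabilities can be sketched with the binomial distribution for a fair coin. The number of tosses is not given in the text, so 22 is an assumption, chosen only because it gives roughly 0.008 per tail.

```python
from math import comb

def tail_probability(n_tosses, at_least):
    """P(at least `at_least` heads) for a fair coin tossed `n_tosses` times."""
    favourable = sum(comb(n_tosses, k) for k in range(at_least, n_tosses + 1))
    return favourable / 2 ** n_tosses

N_TOSSES = 22  # ASSUMPTION: not stated in the card, picked to match ~0.008

one_tailed = tail_probability(N_TOSSES, 17)  # biased towards heads only
two_tailed = 2 * one_tailed                  # heads or tails (symmetric coin)
print(round(one_tailed, 4), round(two_tailed, 4))
```

Under this assumption each tail is about 0.0085, so adding the two tails gives a two-tailed p-value a little under 0.02, in line with the card's rounded figures.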

The result is statistically significant. If p > 0.05, a chance effect can’t be ruled out.

When comparing 2 means (are they different?), testing an association (do the observed results differ from the expected results?), or in regression (is the slope different from 0?)

16
Q

How would you calculate the numbers in the expected table?

What is the p-value if the 95% CI excludes or contains 0?

What about for the 99% CI and 90% CI?

A

E.g. for yes yes box: 111 x 41 / 163
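
The same arithmetic as a sketch: each expected count is row total × column total / grand total. Which of 111 and 41 is the row total is not stated, so the argument names are an assumption; the product is the same either way.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected count for one cell if there were no association."""
    return row_total * column_total / grand_total

# Numbers from the card's example for the yes/yes box.
yes_yes = expected_count(row_total=111, column_total=41, grand_total=163)
print(round(yes_yes, 2))
```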

If excludes 0 then p < 0.05

If contains 0 then p ≥ 0.05

99%: if excludes 0 then p<0.01. If contains 0 then p ≥ 0.01

90%: if excludes 0 then p<0.1. If contains 0 then p≥0.1

17
Q

The p-value for the difference in birth weight of children born to smokers compared with non-smokers is 0.02. Which is the correct 95% confidence interval for the difference in birth weight?

  1. -0.70 to 0.06kg
  2. -0.06 to 0.70kg
  3. 0.06 to 0.70kg
A
  3. Since p is <0.05, the 95% CI must exclude 0, and only option 3 does
18
Q

In a study, a group of patients took statins and another group placebo. The mean difference in LDL cholesterol was 1 mmol/L:

The 95% CI was 0.2 to 1.8

The 99% CI was -0.1 to 2.1

Which is correct?

  1. P-value is less than 0.01
  2. P-value is less than 0.05 but greater than 0.01
  3. P-value is greater than 0.05
A
  2. The 99% CI crosses 0 so p > 0.01, but the 95% CI doesn’t cross 0 so p < 0.05
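
The link between a CI and its p-value can be sketched as a small helper, applied here to the two intervals from the statin card. The function name `p_value_bound` is hypothetical.

```python
def p_value_bound(ci_lower, ci_upper, level):
    """Bound on the p-value implied by whether a CI for a difference excludes 0."""
    alpha = round(1 - level, 10)  # e.g. 0.05 for a 95% CI
    excludes_zero = ci_lower > 0 or ci_upper < 0
    return f"p < {alpha:g}" if excludes_zero else f"p >= {alpha:g}"

print(p_value_bound(0.2, 1.8, level=0.95))   # 95% CI excludes 0
print(p_value_bound(-0.1, 2.1, level=0.99))  # 99% CI contains 0
```

Taken together, the two bounds place the p-value between 0.01 and 0.05.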