1. Interpreting Data Flashcards

Question 1

Q

How do you calculate the standard deviation?

Answer

A

Take the square root of the average squared distance from the mean.
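The calculation above can be sketched in a few lines of Python. The data values are made up for illustration. The card does not say whether the "average" divides by n or by n - 1; the usual convention for a sample is n - 1, so that choice is an assumption and is left as an option here.

```python
import math

def standard_deviation(values, sample=True):
    """Square root of the average squared distance from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_distances = [(v - mean) ** 2 for v in values]
    # Assumption: divide by n - 1 for a sample, by n for a whole population.
    divisor = n - 1 if sample else n
    return math.sqrt(sum(squared_distances) / divisor)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(standard_deviation(data, sample=False))  # population version
print(standard_deviation(data))                # sample version
```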

Question 2

Q

What is the interquartile range?

When would you use IQR over standard deviation?

When would you use the mean or the median?

How should the following data be summarised (pic):

A. Median and standard deviation

B. Mean and interquartile range

C. Mean and standard deviation

D. Median and interquartile range

Answer

A

25th to 75th centile

If you have outliers

Median if you have outliers, otherwise either will do

D (always use IQR with median)
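A small sketch of why the median and IQR are preferred when there are outliers. The two data sets are hypothetical and differ only in one extreme value. Quartiles can be computed in several slightly different ways; this uses the default method of Python's statistics.quantiles, which is only one possible convention.

```python
import statistics

def summarise(values):
    # quantiles(..., n=4) returns the 25th, 50th and 75th centiles
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": q2,
        "iqr": (q1, q3),
    }

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]          # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # same data, one outlier

print(summarise(clean))
print(summarise(with_outlier))
```

The mean and SD change a lot when the outlier is added, while the median and IQR do not move.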

Question 3

Q

How should this data be summarised?

A. Median and standard deviation

B. Mean and interquartile range

C. Mean and standard deviation

D. Median and interquartile range

What is this distribution called?

Answer

A

C

Gaussian

Question 4

Q

In a Gaussian distribution (pic) what happens if you change age to include older/younger people?

Answer

A

Older = curve shifts right, but the shape stays the same. Younger = curve shifts left.

Question 5

Q

In a Gaussian distribution (pic) what happens if you change the standard deviation (decrease/increase)?

Answer

A

If SD decreases = curve taller and narrower. If SD increases = curve flatter and wider. The centre point and the area underneath always stay the same.

Question 6

Q

What are the 3 reference ranges that Gaussian distributions can be used to show?

What is the standard error?

What is the standard error of the mean?

Answer

A

99% range (0.5th to 99.5th centile) = mean ± 2.58 SDs

95% range (2.5th to 97.5th centile) = mean ± 1.96 SDs

90% range (5th to 95th centile) = mean ± 1.64 SDs

Ranges get narrower as you go down the list
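The multipliers in the three ranges above come from the standard Gaussian distribution. A minimal sketch that recovers them and builds a range; the mean of 100 and SD of 10 are hypothetical values.

```python
from statistics import NormalDist

def reference_range(mean, sd, coverage=0.95):
    """Central range containing `coverage` of a Gaussian distribution."""
    # e.g. coverage 0.95 -> upper centile 0.975 -> multiplier about 1.96
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mean - z * sd, mean + z * sd

for coverage in (0.99, 0.95, 0.90):
    low, high = reference_range(100, 10, coverage)  # hypothetical mean and SD
    print(coverage, round(low, 1), round(high, 1))
```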

The standard deviation of the sampling distribution of an estimate (approximately Gaussian for large samples); it’s a measure of the statistical accuracy of an estimate.

The standard deviation of the distribution of all possible sample means. Estimated from a single sample as:

Standard error of the mean = standard deviation / √sample size
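The formula above, written out as a short sketch. The sample values are hypothetical, and the SD is the sample SD (dividing by n - 1), which is an assumption about the convention being used.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Standard deviation divided by the square root of the sample size."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print(standard_error_of_mean(sample))
```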

Question 7

Q

How do you calculate the 95% confidence interval (CI) of a sample mean?

If the 95% CI for BMI was 21.4 - 22.6, what 2 ways could you describe the results?

Answer

A

95% CI = sample mean ± 1.96 × standard error

We would expect 95% of samples of the same size to have a mean BMI between 21.4 and 22.6.
In the population we are 95% sure that the mean BMI could be as low as 21.4 or as high as 22.6
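A worked sketch of the confidence interval formula above. The card gives only the interval (21.4 to 22.6), not the underlying numbers, so the mean, SD and sample size below are hypothetical values chosen so that the result matches that interval.

```python
import math

def confidence_interval_95(sample_mean, sd, n):
    """95% CI = sample mean +/- 1.96 x standard error."""
    standard_error = sd / math.sqrt(n)
    margin = 1.96 * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: mean BMI 22.0, SD 3.06, sample of 100 people
low, high = confidence_interval_95(22.0, 3.06, 100)
print(round(low, 1), round(high, 1))
```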

Question 8

Q

95% confidence interval for the mean weight of a sample of 30 adult men is 75kg to 81kg. Which is the correct definition?

A. In the population we are 95% sure that the mean weight could be as low as 75kg or as high as 81kg

B. In the population the mean weight will be between 75kg and 81kg

C. In the population 95% of men will weigh between 75kg and 81kg

D. In this study 95% of men weighed between 75kg and 81kg

Answer

A

(C describes the 95% range, not the CI)

Question 9

Q

When do you use standard deviation or standard error?

What happens to the 95% range as the sample size increases (in pic)?

What happens to the 95% CI?

Answer

A

Use SD for ranges (for individual values) and SE for CIs (for means)

Stays the same

Gets narrower (because it is calculated using the SE, and the SE gets smaller as the sample size increases, since you divide by the square root of the sample size)
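A quick numerical sketch of the two answers above. It assumes the SD stays fixed at a hypothetical value of 3.0 while the sample size grows; in a real study the sample SD would vary a little from sample to sample.

```python
import math

SD = 3.0  # hypothetical standard deviation, held fixed

def half_widths(n, sd=SD, z=1.96):
    range_half_width = z * sd              # uses SD: describes individual values
    ci_half_width = z * sd / math.sqrt(n)  # uses SE: describes the mean
    return range_half_width, ci_half_width

for n in (25, 100, 400):
    range_hw, ci_hw = half_widths(n)
    print(n, round(range_hw, 2), round(ci_hw, 2))
```

The 95% range half-width is the same in every row, while the 95% CI half-width halves each time the sample size is multiplied by four.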

Question 10

Q

How would you describe this relationship?

What is r?

Answer

A

Birth weight is positively correlated with gestational age.

Correlation coefficient, always between -1 and 1
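A minimal sketch of how r could be computed, assuming r here means the Pearson correlation coefficient (the card does not name it). The gestational age and birth weight values are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient, always between -1 and 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

gestational_age = [34, 36, 38, 40, 42]     # weeks, hypothetical
birth_weight = [2.1, 2.6, 3.0, 3.4, 3.6]   # kg, hypothetical
print(pearson_r(gestational_age, birth_weight))
```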

Question 11

Q

What would the r value be in A - C?

How is linear regression represented?

What is the equation?

Answer

A

A) r = 0, no correlation

B) r = 1, perfect positive correlation

C) r = -1, perfect negative correlation

Line of best fit

y = a + bx

(y = outcome/dependent variable; x = predictor/independent variable; b = slope = difference in y / difference in x; a = intercept = where the line, if continued, crosses the y axis, i.e. the value of y when x = 0)
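A short sketch of fitting y = a + bx. It assumes the line of best fit is found by ordinary least squares, which is the usual method but is not stated on the card. The data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # b: how much y changes for a one-unit change in x
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # a: value of y where the line crosses the y axis (x = 0)
    a = mean_y - b * mean_x
    return a, b

xs = [0, 1, 2, 3]  # predictor (hypothetical)
ys = [1, 3, 5, 7]  # outcome (hypothetical)
a, b = fit_line(xs, ys)
print(a, b)
```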

Question 12

Q

Predicting gestational age from crown rump length.
Which regression should you be doing?

Answer

A

A (whatever we’re predicting goes on the vertical axis, and what we’re using to predict goes on the horizontal axis)

Question 13

Q

Predicting PAPP-A from gestational age
Which regression should you be doing?

Question 14

Q

What 2 things would you look at to determine whether an observed difference was due to chance or statistically significant?

Answer

A

CIs and p-values

Question 15

Q

What is the p-value?

Using the data (pic), how would you calculate the probability of observing at least 17 heads or at least 17 tails (the p-value)? We don’t know which side the coin is biased to.

What does it mean if the p-value is <0.05?

When can p-values be calculated?

Answer

A

The probability of observing a result as or more extreme than the sample result if the underlying assumption in the population is true.

0.008 + 0.008 = 0.016 (This is a two-tailed p-value; if we thought it was biased to heads it’d just be 0.008 = one-tailed p-value)

Statistical significance. If ≥0.05, can’t rule out a chance effect.

Comparing 2 means (are they different?), association (are the observed results different from the expected results?), regression (is the slope different from 0?)
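The coin calculation in the second answer above can be sketched as a sum of binomial probabilities for a fair coin. The number of tosses is in the picture and not in the text, so the 22 used here is an assumption; it gives a one-tailed probability of roughly 0.008.

```python
from math import comb

def prob_at_least(k, n_tosses):
    """P(at least k heads) in n_tosses tosses of a fair coin."""
    favourable = sum(comb(n_tosses, i) for i in range(k, n_tosses + 1))
    return favourable / 2 ** n_tosses

N_TOSSES = 22  # assumption: not stated on the card

one_tailed = prob_at_least(17, N_TOSSES)  # at least 17 heads
two_tailed = 2 * one_tailed               # at least 17 heads OR at least 17 tails
print(one_tailed, two_tailed)
```

Doubling works because a fair coin is symmetric, so the chance of at least 17 tails equals the chance of at least 17 heads.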

Question 16

Q

How would you calculate the numbers in the expected table?

What is the p-value if the 95% CI excludes or contains 0?

What about for the 99% CI and 90% CI?

Answer

A

E.g. for the yes/yes box: row total × column total / overall total = 111 × 41 / 163

If excludes 0 then p < 0.05

If contains 0 then p ≥ 0.05

99%: if excludes 0 then p<0.01. If contains 0 then p ≥ 0.01

90%: if excludes 0 then p<0.1. If contains 0 then p≥0.1
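The expected-count rule in the first answer above can be sketched as follows. Only the numbers 111, 41 and 163 appear on the card; treating them as a row total, a column total and the overall total of a 2 × 2 table, and deriving the other two totals (52 and 122) by subtraction, are assumptions.

```python
def expected_count(row_total, column_total, overall_total):
    """Expected count for one cell if there is no association."""
    return row_total * column_total / overall_total

def expected_table(row_totals, column_totals):
    overall_total = sum(row_totals)
    return [[expected_count(r, c, overall_total) for c in column_totals]
            for r in row_totals]

# yes/yes box from the card
print(round(expected_count(111, 41, 163), 2))

# Assumed 2 x 2 layout: the remaining totals are 163 - 111 and 163 - 41
table = expected_table([111, 163 - 111], [41, 163 - 41])
print(table)
```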

Question 17

Q

The p-value for the difference in birth weight of children born to smokers compared with non-smokers is 0.02. Which is the correct 95% confidence interval for the difference in birth weight?