Interpreting Data Flashcards

1
Q

How do you calculate standard deviation?

A

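Take the mean, find each value's distance from the mean and square it, average the squared distances (dividing by n − 1 for a sample), then take the square root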
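
A minimal sketch of the calculation, step by step. The data values are made up for illustration, and the n − 1 divisor assumes the values are a sample rather than a whole population.

```python
import math

def sample_sd(values):
    """Sample standard deviation, computed step by step."""
    n = len(values)
    mean = sum(values) / n                          # 1. take the mean
    squared = [(v - mean) ** 2 for v in values]     # 2. squared distance from the mean
    variance = sum(squared) / (n - 1)               # 3. average them (n - 1 for a sample)
    return math.sqrt(variance)                      # 4. take the square root

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values
sd = sample_sd(data)
print(round(sd, 3))
```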

2
Q

When would using the median be more appropriate? [1]

A

The median is better when the distribution is skewed, because it avoids the influence of outliers

3
Q

When would you use IQR vs STD? [1]

A

When there are extreme outliers or a skewed distribution, it is better to use the IQR, to avoid the influence of outliers
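
A small illustration of why the median and IQR hold up better than the mean and SD. The numbers are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`; other quartile conventions give slightly different values.

```python
import statistics

data = [4, 5, 5, 6, 6, 7, 7, 8]     # hypothetical measurements
with_outlier = data + [60]          # same data plus one extreme value

def summarise(values):
    q1, q2, q3 = statistics.quantiles(values, n=4)   # quartiles
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": q2,
        "iqr": q3 - q1,
    }

clean = summarise(data)
skewed = summarise(with_outlier)
print(clean)
print(skewed)
```

The single outlier doubles the mean and inflates the SD many times over, while the median and IQR barely move.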

4
Q
A
5
Q
A
6
Q

What value is changing between the different colours in this Gaussian distribution?

A

mean value

7
Q

What value is changing between the different colours in this Gaussian distribution?

A

Standard deviation

8
Q

What is the standard deviation measuring? [1]

A

It essentially measures the average distance of the values from the mean, and is therefore a measure of the spread of the results you have obtained. This is shown visually on the image here.

9
Q

What are the 95%, 99% and 90% reference ranges in terms of STD?

learn

A
10
Q

What are the 95%, 99% and 90% reference ranges in terms of STD?

learn

A

99% range (0.5th to 99.5th centile) = mean ± 2.58 SDs

95% range (2.5th to 97.5th centile) = mean ± 1.96 SDs

90% range (5th to 95th centile) = mean ± 1.64 SDs
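
The multipliers above can be recovered from the standard Gaussian distribution. A short sketch using the standard library; the helper name `sd_multiplier` is just for illustration.

```python
from statistics import NormalDist

def sd_multiplier(coverage):
    """Number of SDs either side of the mean that spans `coverage` of a Gaussian."""
    upper_centile = 1 - (1 - coverage) / 2     # e.g. 0.975 for a 95% range
    return NormalDist().inv_cdf(upper_centile)

multipliers = {c: round(sd_multiplier(c), 2) for c in (0.99, 0.95, 0.90)}
print(multipliers)
```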

11
Q

Repeated sampling from a population

If the sample size isn’t too small then the distribution of the sample mean will be []

A

If the sample size isn’t too small then the distribution of the sample mean will be Gaussian
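
A rough simulation of this idea. The population here is an assumption chosen for illustration: a skewed exponential distribution with mean 1 and SD 1, sampled 2000 times with 50 values per sample.

```python
import random
import statistics

random.seed(0)   # fixed seed so the sketch is repeatable

def sample_mean(n):
    # Draw n values from a skewed population (exponential, mean 1, SD 1).
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(50) for _ in range(2000)]

centre = statistics.mean(means)    # close to the population mean
spread = statistics.stdev(means)   # close to 1 / sqrt(50)

# For a Gaussian, about 95% of values lie within 1.96 SDs of the centre.
within = sum(abs(m - centre) <= 1.96 * spread for m in means) / len(means)
print(round(centre, 2), round(spread, 2), round(within, 2))
```

Even though the individual values are strongly skewed, the sample means cluster symmetrically around the population mean.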

12
Q

What is the STD of the distribution of the sample mean called? [1]

A

standard error

13
Q

How do you calculate standard error of the mean? [1]

A
14
Q

What would the standard error of the following be?

n=163
mean=22
standard deviation=4

A
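
The answer to this card is missing from the source. As a sketch, the usual formula is standard error = standard deviation / √n, which for the numbers in the question gives roughly 0.31:

```python
import math

n = 163
sd = 4

standard_error = sd / math.sqrt(n)   # SE = SD / sqrt(n)
print(round(standard_error, 3))
```

The mean (22) is not needed for the standard error itself; it is only used later to centre the confidence interval.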
15
Q

How do you calculate the 95% confidence interval (CI) of a sample mean?

A

95% CI = sample mean ± 1.96 × standard error

key !!
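
A worked sketch of the formula, reusing the numbers from the standard error card (n = 163, mean = 22, standard deviation = 4). The function name `ci95` is only for illustration.

```python
import math

def ci95(mean, sd, n):
    se = sd / math.sqrt(n)     # standard error of the mean
    margin = 1.96 * se         # half-width of the 95% CI
    return mean - margin, mean + margin

low, high = ci95(mean=22, sd=4, n=163)
print(round(low, 1), round(high, 1))
```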

16
Q

What does a confidence interval for the mean tell you?

E.g. if 21.4-22.6

A

If the result is 21.4 to 22.6:

We would expect 95% of samples of the same size to have a mean BMI between 21.4 and 22.6

17
Q

95% confidence interval

Which is the correct definition?

In the population we are 95% sure that the mean weight could be as low as 75kg or as high as 81kg

In this study 95% of men weighed between 75kg and 81kg

A

In the population we are 95% sure that the mean weight could be as low as 75kg or as high as 81kg

18
Q

What is the difference between reference range and CI? [1]

A

The reference range is defined as the interval within which 95% of values of a reference population fall, such that 2.5% lie below it and 2.5% lie above it.

Confidence interval is more concerned with the idea that the mean could be as low as/high as the two values given.

19
Q

What is the difference between when you use STD and when you use SE? [2]

A

Use standard deviation for ranges (for individual values)

Use standard error for confidence intervals (for means)

20
Q

What is the Pearson correlation coefficient (r) for each of the following?

A

Perfect positive correlation: r = 1

Perfect negative correlation: r = -1

No correlation: r = 0
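
A minimal sketch of Pearson's r computed by hand, with three made-up data sets that land on these three values. The data are hypothetical and chosen so the results are exact.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])    # rises in a straight line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])    # falls in a straight line
r_none = pearson_r(x, [3, 1, 4, 1, 3])    # no linear trend
print(r_pos, r_neg, r_none)
```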

21
Q

If the data do not follow a Gaussian distribution or there is a non-linear monotonic relationship [] correlation can be used

A

If the data do not follow a Gaussian distribution or there is a non-linear monotonic relationship Spearman’s rank correlation can be used
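
Spearman's rank correlation is Pearson's correlation applied to the ranks of the data. A sketch on a made-up monotonic but non-linear relationship (y = x cubed); the simple `ranks` helper assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    # Rank 1 = smallest value; assumes no ties, to keep the sketch short.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    return pearson_r(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]        # monotonic but not a straight line

rho = spearman_rho(x, y)       # perfect monotonic relationship
r = pearson_r(x, y)            # below 1 because the relationship is curved
print(round(rho, 3), round(r, 3))
```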

22
Q

x

A
23
Q

Pearson’s correlation and linear regression describe which type of associations? [1]

A

Pearson’s correlation and linear regression only describe linear associations

24
Q

Different regression curves can be fitted where the data pattern is

What type of regression model would you use for the following?

A

Polynomial: quadratic (y = b0 + b1*x + b2*x^2)

25
Q

Different regression curves can be fitted where the data pattern is

What type of regression model would you use for the following?

A

Polynomial: quadratic (y = b0 + b1*x + b2*x^2), cubic (y = b0 + b1*x + b2*x^2 + b3*x^3) etc

26
Q

Different regression curves can be fitted where the data pattern is

What type of regression model would you use for the following?

A

Exponential growth (y = b^x; b > 1)

27
Q

Different regression curves can be fitted where the data pattern is

What type of regression model would you use for the following?

A

Exponential decay (y = b^x; 0 < b < 1)
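
One way to fit an exponential curve, sketched under the assumption that the model is exactly y = b^x with no extra scale factor: take logs so it becomes a straight line through the origin, log(y) = x * log(b), and estimate log(b) by least squares. The data are generated for illustration.

```python
import math

def fit_base(xs, ys):
    """Least-squares estimate of b in y = b**x, fitted on the log scale."""
    # log(y) = x * log(b) is a line through the origin with slope log(b).
    log_b = sum(x * math.log(y) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return math.exp(log_b)

xs = [1, 2, 3, 4]
growth = fit_base(xs, [2 ** x for x in xs])      # data generated with b = 2
decay = fit_base(xs, [0.5 ** x for x in xs])     # data generated with b = 0.5
print(growth, decay)
```

A fitted b above 1 indicates growth and a fitted b between 0 and 1 indicates decay.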

28
Q

Different regression curves can be fitted where the data pattern is

What type of regression model would you use for the following?

A

Sigmoid

29
Q

Multivariate linear regression

State whether each variable is dependent or independent

A
30
Q

What is meant by degrees of freedom? [1]

How do you calculate degrees of freedom? [1]

E.g. in the example we have 150 girls and 156 boys. What are the degrees of freedom? [1]

A

These are the number of values that are free to vary

When comparing two groups, add up (n − 1) for each group. In the example we have 150 girls and 156 boys, so there are 149 + 155 = 304 degrees of freedom.
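
The same arithmetic as a short sketch, assuming a two-sample comparison where each group loses one degree of freedom for its estimated mean:

```python
def two_sample_df(n1, n2):
    # One degree of freedom is used up by each group's mean.
    return (n1 - 1) + (n2 - 1)

df = two_sample_df(150, 156)
print(df)
```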

31
Q

If the 95% CI for a difference excludes 0 then p []?

If the 95% CI for a difference includes 0 then p []?

A

If the 95% CI for a difference excludes 0 then p < 0.05

If the 95% CI for a difference includes 0 then p ≥ 0.05

32
Q

In a study, a group of patients took statins and another group placebo. The mean difference in LDL cholesterol was 1 mmol/L:

The 95% CI was 0.2 to 1.8
The 99% CI was -0.1 to 2.1

Which is correct?
P-value is less than 0.01
P-value is less than 0.05 but greater than 0.01
P-value is greater than 0.05

A

P-value is less than 0.05 but greater than 0.01

The 95% CI (0.2 to 1.8) excludes 0, so p < 0.05. The 99% CI (-0.1 to 2.1) includes 0, so p > 0.01.
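
A rough check of the statin example. This sketch assumes a Gaussian approximation: it backs out the standard error from the 95% CI, then converts the mean difference to a two-sided p-value. The original analysis may have used a t distribution, so the exact figure is only indicative.

```python
from statistics import NormalDist

diff = 1.0                    # mean difference in LDL cholesterol (mmol/L)
ci_low, ci_high = 0.2, 1.8    # reported 95% CI

se = (ci_high - ci_low) / (2 * 1.96)            # CI width is 2 x 1.96 x SE
z = diff / se                                   # distance from 0 in SEs
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided p-value
print(round(p_value, 3))
```

The result falls between 0.01 and 0.05, which matches a 95% CI that excludes 0 and a 99% CI that includes it.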

33
Q

What analysis could you use where two measurements are made on the same group of people? [1]

A

Paired t-test:

Calculate individual differences – is the mean of those different to 0?
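
A sketch of the paired t-test calculation on made-up before and after measurements for five people. It stops at the t statistic and degrees of freedom; the p-value would then be read from a t distribution, which is not in the Python standard library.

```python
import math
import statistics

before = [10, 12, 9, 11, 13]    # hypothetical first measurements
after = [12, 14, 10, 13, 16]    # hypothetical second measurements, same people

diffs = [a - b for a, b in zip(after, before)]     # individual differences
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))

t_stat = mean_diff / se_diff    # how many SEs the mean difference is from 0
df = len(diffs) - 1
print(round(t_stat, 2), df)
```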

34
Q

T-tests assume data follows a Gaussian distribution.

If the data don't follow a Gaussian distribution, what test would you perform:

Instead of 2-sample t-test? [1]

Instead of paired t-test? [1]

A

Instead of 2-sample t-test do Mann-Whitney U test (Wilcoxon rank sum test)

Instead of paired t-test do Wilcoxon signed rank test

35
Q

For example, in the free thyroxine samples in pregnant women shown above, the mean is 14 and the standard deviation is 1.7. Calculate the 95% reference range [1]

A

The 95% reference range is 14 ± 1.96 × 1.7 = 10.7 to 17.3.

The observed middle 95% of women in this sample is 10.9-17.5. This therefore means 2.5% of women in the sample had a level of thyroxine less than 10.9 and 2.5% had a level of thyroxine greater than 17.5.
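
The reference range arithmetic as a short sketch; `reference_range` is a hypothetical helper name, and the calculation assumes the values are roughly Gaussian.

```python
def reference_range(mean, sd, multiplier=1.96):
    # mean ± multiplier × SD; 1.96 gives the 95% range
    return mean - multiplier * sd, mean + multiplier * sd

low, high = reference_range(14, 1.7)
print(round(low, 1), round(high, 1))
```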

36
Q

What is linear regression used for? [1]

A

Used to predict the value of one variable based on the value of another variable.

37
Q

How do you calculate linear regression?

A

Y = a + bX.

Y = outcome (dependent variable)

Measured/affected during experiment

X = predictor (independent variable).

Factor changed during experiment.

b = slope ((Y2 − Y1) / (X2 − X1)).

a = intercept.

NOTE: this only works for linear associations.
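
With more than two points, the slope and intercept are usually estimated by least squares rather than from a single pair of points. A minimal sketch on made-up data that lie exactly on a line:

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

x = [1, 2, 3, 4, 5]       # hypothetical predictor values
y = [3, 5, 7, 9, 11]      # hypothetical outcomes

a, b = fit_line(x, y)
predicted = a + b * 6     # predict the outcome at x = 6
print(a, b, predicted)
```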

38
Q

How would you interpret this linear regression? [1]

A

UPDRS decreases by 0.7 for every unit increase in kinesia score

39
Q

How would you interpret this linear regression? [1]

A

UPDRS decreases by 0.7 for every unit increase in kinesia score

40
Q

How can you adapt linear regression for multiple factors? [1]

A

Independent variables can be continuous, categorical or a mix

Example: kinesia score according to age and sex

41
Q
A

B

What is being predicted should always be on the vertical axis

42
Q

What do you use a T-test and chi-squared test for? [2]

A

T-test can be used for comparing continuous variables

Chi-squared test for comparing categorical variables
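
A sketch of the chi-squared statistic for a table of counts, comparing observed counts with those expected if the two categorical variables were unrelated. The 2 × 2 table is made up, and no continuity correction is applied.

```python
def chi_squared(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

observed = [[10, 20],     # hypothetical counts: group 1, outcome yes / no
            [20, 10]]     # hypothetical counts: group 2, outcome yes / no
chi2 = chi_squared(observed)
print(round(chi2, 2))
```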

43
Q

What is the link between CIs and p values?

A

It is important to note the link between CIs and p-values. In general, we say that CIs and p-values are consistent: if the 95% CI for a difference excludes zero then the p-value will be less than 0.05 (statistically significant); if the 95% CI contains zero then the p-value will be greater than 0.05 (not statistically significant).

44
Q

When do you use Pearson vs Spearman correlation? [2]

A

Pearson correlation evaluates the linear relationship between two continuous variables.

Spearman correlation evaluates the monotonic relationship between two continuous or ordinal variables.