Interpreting Data Flashcards

1
Q

What are the two main types of data?

A

Qualitative and quantitative

2
Q

What are the two types of quantitative data?

A

Discrete and continuous

3
Q

What are the two types of qualitative data?

A

Nominal (unordered) and ordinal (ordered)

4
Q

What is nominal data split into?

A

Binary and categorical

5
Q

What is the median?

A

The middle value when the values are ordered from smallest to largest

6
Q

What is the median?

2, 3, 6, 7, 10, 11, 14

A

7
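As a rough illustration of this card, here is a minimal Python sketch of finding the median by sorting and picking the middle value. The function name `median_by_hand` is made up for this example, and averaging the two middle values when the count is even is the usual convention rather than something stated on the card.

```python
def median_by_hand(values):
    """Return the middle value of the data once it is sorted."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: there is a single middle value.
        return ordered[mid]
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_hand([2, 3, 6, 7, 10, 11, 14]))  # 7
```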

7
Q

What is the mode?

A

Most common value

8
Q

What is the mean?

A

The average. It is the sum of all the values divided by the number of values.

9
Q

Calculate the mean.

2, 3, 4, 7, 8, 8, 11

A

6.1
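The arithmetic behind this answer can be checked with a short Python sketch. It uses the seven values from the card; the variable names are only illustrative.

```python
from statistics import mean

values = [2, 3, 4, 7, 8, 8, 11]

total = sum(values)            # 43
count = len(values)            # 7
mean_by_hand = total / count   # 43 / 7

print(round(mean_by_hand, 1))  # 6.1
print(round(mean(values), 1))  # 6.1, same result from the standard library
```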

10
Q

What does standard deviation mean?

A

A measure of spread: roughly the typical distance of the values from the mean

11
Q

How is standard deviation calculated?

A

Subtract the mean from each value and square the result. Add up these squared differences and divide by the number of values (or by n - 1 when estimating from a sample). Then take the square root of this answer.
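The steps in this answer can be sketched in Python as follows. The data set is borrowed from the earlier "Calculate the mean" card purely for illustration, and the n - 1 version is the common sample estimate, included here as an assumption about what a course might also expect.

```python
from math import sqrt
from statistics import pstdev, stdev

values = [2, 3, 4, 7, 8, 8, 11]  # illustrative data from the mean card
n = len(values)
mean_value = sum(values) / n

# Step 1: squared distance of each value from the mean.
squared_deviations = [(x - mean_value) ** 2 for x in values]

# Step 2: divide the sum by n, then take the square root.
population_sd = sqrt(sum(squared_deviations) / n)

# Variant: divide by n - 1 when the data are a sample from a larger population.
sample_sd = sqrt(sum(squared_deviations) / (n - 1))

print(population_sd, pstdev(values))  # the two numbers should agree
print(sample_sd, stdev(values))       # likewise for the n - 1 version
```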

12
Q

What centile is the median?

A

50th

13
Q

What is the interquartile range?

A

The range from the 25th centile to the 75th centile, which covers the middle 50% of the values

14
Q

When is it better to use a median rather than a mean?

A

To avoid the influence of outliers, i.e. if there is an outlier that is very different to the rest of the data.
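A small Python sketch can make the effect of an outlier concrete. The second list is hypothetical: it is the data from the mean card with the largest value replaced by an extreme one.

```python
from statistics import mean, median

typical = [2, 3, 4, 7, 8, 8, 11]
with_outlier = [2, 3, 4, 7, 8, 8, 110]  # hypothetical extreme last value

print(round(mean(typical), 1), median(typical))            # 6.1 7
print(round(mean(with_outlier), 1), median(with_outlier))  # 20.3 7
```

The mean is dragged upwards by the single extreme value, while the median does not move.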

15
Q

When is it better to use IQR rather than the standard deviation?

A

To avoid the influence of outliers
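The same idea can be sketched for spread. The helper `iqr` is a made-up name, the data are hypothetical, and quartile conventions differ between textbooks and software; this sketch simply uses the default method of Python's `statistics.quantiles`.

```python
from statistics import quantiles, stdev


def iqr(values):
    """Width of the interquartile range (75th centile minus 25th centile)."""
    q1, _, q3 = quantiles(values, n=4)  # default quartile method, an assumption
    return q3 - q1


typical = [2, 3, 4, 7, 8, 8, 11]
with_outlier = [2, 3, 4, 7, 8, 8, 110]  # hypothetical extreme last value

print(iqr(typical), iqr(with_outlier))      # unaffected by the outlier
print(stdev(typical), stdev(with_outlier))  # strongly affected by the outlier
```

With these numbers the interquartile range is the same for both lists, while the standard deviation becomes many times larger.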

16
Q

What is the Gaussian distribution determined by?

A

Mean and standard deviation

17
Q

If the mean is reduced from 120 to 110, what happens to the Gaussian distribution?

A

It shifts to the left.

18
Q

If the mean is increased from 120 to 130, what happens to the Gaussian distribution?

A

It shifts to the right.

19
Q

What happens to the Gaussian distribution if the standard deviation is decreased from 15 to 10?

A

The curve becomes narrower and taller

20
Q

What happens to the Gaussian distribution if the standard deviation is increased from 15 to 20?

A

The curve becomes wider and flatter

21
Q

What is a useful property of Gaussian distributions?

A

A constant proportion of values will lie within any specified number of Standard Deviations above or below the mean (reference ranges).

22
Q

In a Gaussian distribution, what percentage of values lies within one standard deviation of the mean?

A

68%

23
Q

In a Gaussian distribution, what percentage of values lies within 1.64 standard deviations of the mean?

A

90%

24
Q

In a Gaussian distribution, what percentage of values lies within 1.96 standard deviations of the mean?

A

95%

25
Q

What is the 99% range? How is it calculated?

A

0.5th centile to 99.5th centile

Mean +/- 2.58 SDs

26
Q

What is the 95% range? How is it calculated?

A

2.5th centile to 97.5th centile

Mean +/- 1.96 SDs

27
Q

What is the 90% range? How is it calculated?

A

5th centile to 95th centile

Mean +/- 1.64 SDs
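The 90%, 95% and 99% ranges can be sketched in Python as below. The mean of 120 and SD of 15 are illustrative values borrowed from the earlier Gaussian cards, not data from a real study, and `NormalDist` from the standard library is used only to check how much of the curve each range covers.

```python
from statistics import NormalDist

mean, sd = 120, 15  # illustrative values only
dist = NormalDist(mu=mean, sigma=sd)

multipliers = {"90%": 1.64, "95%": 1.96, "99%": 2.58}
ranges = {}
coverage = {}
for label, z in multipliers.items():
    # Range = mean +/- (multiplier x standard deviation).
    low, high = mean - z * sd, mean + z * sd
    ranges[label] = (low, high)
    # Proportion of a Gaussian distribution that falls inside the range.
    coverage[label] = dist.cdf(high) - dist.cdf(low)
    print(label, round(low, 1), round(high, 1), round(coverage[label], 3))
```

The multipliers are rounded, so the computed coverage is very close to, but not exactly, 90%, 95% and 99%.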

28
Q

If the sample size isn’t too small then the distribution of the sample mean will be…?

A

Gaussian

29
Q

What is the standard error?

A

The standard deviation of the (Gaussian) sampling distribution of an estimate, such as the sample mean, is called the standard error. It is a measure of the statistical accuracy of the estimate.

30
Q

What is the standard error of the mean?

A

The standard deviation of the distribution of all possible sample means. This distribution cannot be observed in practice, so the standard error is estimated from a single sample.

31
Q

How is standard error of the mean estimated?

A

Standard deviation divided by the square root of the sample size.

32
Q

How is the 95% confidence interval of a sample mean calculated?

A

95% CI = sample mean +/- (1.96 x standard error)
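A minimal Python sketch of this calculation, combining it with the standard error estimate from the previous card. The function name and the sample figures (mean 120, SD 15, 100 people) are hypothetical.

```python
from math import sqrt


def confidence_interval_95(sample_mean, sample_sd, n):
    """95% CI for a mean, using the 1.96 multiplier from the card."""
    standard_error = sample_sd / sqrt(n)  # SD / square root of sample size
    margin = 1.96 * standard_error
    return sample_mean - margin, sample_mean + margin


low, high = confidence_interval_95(sample_mean=120, sample_sd=15, n=100)
print(round(low, 2), round(high, 2))  # 117.06 122.94
```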

33
Q

What does the 95% confidence interval mean?

A

If we repeatedly took samples of the same size and calculated a 95% confidence interval from each, we would expect about 95% of those intervals to contain the population mean.
Informally: we are 95% confident that the population mean could be as low as ___ or as high as ___.

34
Q

When calculating confidence intervals and ranges, what should be used for each?

A

Standard deviation for ranges

Standard error for intervals

35
Q

When the sample size increases, the 95% range…

A

Stays the same

36
Q

When the sample size increases, the 95% confidence interval…

A

Gets narrower
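The contrast between the 95% range and the 95% confidence interval can be sketched numerically. The SD of 15 and the sample sizes are hypothetical, and the sketch holds the SD fixed, whereas in practice the estimated SD varies a little from sample to sample.

```python
from math import sqrt

sd = 15  # hypothetical standard deviation, held fixed for illustration
sample_sizes = [25, 100, 400]

range_widths = []
ci_widths = []
for n in sample_sizes:
    # 95% range uses the standard deviation, so n does not appear.
    range_widths.append(2 * 1.96 * sd)
    # 95% confidence interval uses the standard error, SD / sqrt(n).
    ci_widths.append(2 * 1.96 * sd / sqrt(n))

for n, rw, cw in zip(sample_sizes, range_widths, ci_widths):
    print(n, round(rw, 2), round(cw, 2))
```

Each fourfold increase in sample size halves the width of the confidence interval, while the range is unchanged.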

37
Q

What is ‘r’? What two values is it always between?

A

Correlation coefficient

-1 and 1

38
Q

What does r=1 tell you?

A

Perfect positive correlation

39
Q

What does r=-1 tell you?

A

Perfect negative correlation

40
Q

What does r=0 tell you?

A

No linear correlation
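The three values of r on these cards can be reproduced with a short Python sketch of the Pearson correlation coefficient. The function name `pearson_r` and the tiny data sets are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


print(pearson_r([1, 2, 3], [2, 4, 6]))                   # perfect positive
print(pearson_r([1, 2, 3], [6, 4, 2]))                   # perfect negative
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))     # no linear trend
```

The last example is a curve (y is x squared), which shows that r = 0 rules out a straight-line relationship, not every kind of relationship.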

41
Q

What is the equation for a linear regression?

A

y = a + bx,

where y is the outcome and x is the predictor

42
Q

What does the line of best fit do?

A

Minimises the sum of the squared vertical distances between the data points and the line
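A minimal Python sketch of fitting y = a + bx by least squares, using the standard closed-form formulas for the slope and intercept. The function names and the five data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimises squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]  # predictor (horizontal axis)
ys = [2, 4, 5, 4, 5]  # outcome (vertical axis)
a, b = least_squares_line(xs, ys)
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

Nudging either `a` or `b` away from the fitted values and recomputing `sum_squared_residuals` should always give a larger total.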

43
Q

Regression - whatever we are predicting, should it be on the vertical or horizontal axis?

A

Vertical

44
Q

Statistical significance - what does this mean and how is it determined?

A

An observed sample difference between groups might be due to chance. Statistically significant means the result is unlikely to be due to chance.
It is assessed using confidence intervals and p-values.

45
Q

What does a p-value mean?

A

A p-value for a result is the probability of observing a result as extreme as, or more extreme than, the sample result if the null hypothesis (the underlying assumption about the population, e.g. no difference) is true.

46
Q

What does the p-value have to be less than to be statistically significant?

A

0.05 (the conventional threshold)

47
Q

When can p-values be calculated?

A

When there is a comparison:
Two means: are they different, i.e. is their difference different from 0?
Association: are the observed results different from those expected?
Regression: is the slope different from 0?

48
Q

How are p-values calculated?

A

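Using a statistical test suited to the comparison, e.g. the chi-squared test for an association between categorical variables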

49
Q

If the 95% CI for a difference excludes 0 then what can be said about the p-value?

A

p<0.05

50
Q

If the 95% CI for a difference contains 0 then what can be said about the p-value?

A

p≥0.05
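The link between the 95% CI and the p-value on the last two cards can be illustrated with a rough Python sketch. It assumes a simple normal-approximation (z) test for a difference, which is a choice made for this example and not necessarily the test a course would use; the function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def ci_and_p_value(difference, standard_error):
    """95% CI and two-sided p-value for a difference, using a normal approximation."""
    margin = 1.96 * standard_error
    ci = (difference - margin, difference + margin)
    z = abs(difference) / standard_error
    p = 2 * (1 - NormalDist().cdf(z))  # probability of a result at least this extreme
    return ci, p


ci_a, p_a = ci_and_p_value(difference=5, standard_error=2)
ci_b, p_b = ci_and_p_value(difference=3, standard_error=2)

print(ci_a, round(p_a, 3))  # CI excludes 0, p below 0.05
print(ci_b, round(p_b, 3))  # CI contains 0, p at or above 0.05
```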