Interpreting Data Flashcards

Question 1

Q

What are the two main types of data?

Answer

A

Qualitative and quantitative

Question 2

Q

What are the two types of quantitative data?

Answer

A

Discrete and continuous

Question 3

Q

What are the two types of qualitative data?

Answer

A

Nominal (unordered) and ordinal (ordered)

Question 4

Q

What is nominal data split into?

Answer

A

Binary and categorical

Question 5

Q

What is the median?

Answer

A

The middle value when the values are ordered from smallest to largest (with an even number of values, the mean of the two middle values)

Question 6

Q

What is the median of the following values?

2, 3, 6, 7, 10, 11, 14

Answer

A

7 (the 4th of the 7 ordered values)
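
The median of the seven values on this card can be checked in code. The sketch below uses Python's standard `statistics` module; the variable names are illustrative and not part of the original cards.

```python
import statistics

values = [2, 3, 6, 7, 10, 11, 14]

# Sort the values, then take the middle one (the 4th of 7).
ordered = sorted(values)
middle_by_hand = ordered[len(ordered) // 2]

# statistics.median does the same, and averages the two middle
# values when the list has an even length.
median_value = statistics.median(values)

print(middle_by_hand, median_value)
```

Indexing the middle element directly only works for an odd number of values, which is why the library call is the safer habit.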

Question 7

Q

What is the mode?

Answer

A

Most common value

Question 8

Q

What is the mean?

Answer

A

The average. It is the sum of all the values divided by the number of values.

Question 9

Q

Calculate the mean.

2, 3, 4, 7, 8, 8, 11

Answer

A

43 / 7 ≈ 6.14
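
The mean of the values on this card can be worked through in code. This is a minimal sketch using the standard `statistics` module; the mode is included because the same list also illustrates the "most common value" card. Variable names are illustrative.

```python
import statistics

values = [2, 3, 4, 7, 8, 8, 11]

# Mean = sum of the values divided by how many values there are.
total = sum(values)                  # 43
mean_by_hand = total / len(values)   # 43 / 7
mean_value = statistics.mean(values)

# Mode = most common value (8 appears twice here).
mode_value = statistics.mode(values)

print(round(mean_by_hand, 2), mode_value)
```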

Question 10

Q

What does standard deviation mean?

Answer

A

A measure of spread: roughly the typical distance of the values from the mean

Question 11

Q

How is standard deviation calculated?

Answer

A

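Subtract the mean from each value and square the result. Add up these squared differences and divide by the number of values (or by the number of values minus 1 when working from a sample). Then take the square root of this answer.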
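
The calculation can be sketched step by step in Python. The data below are made up for illustration (chosen so the mean is 5), and both the divide-by-n and divide-by-(n - 1) forms are shown because textbooks differ on which one they call "the" standard deviation.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean = 5

mean = sum(values) / len(values)

# Step 1: squared difference of each value from the mean.
squared_diffs = [(v - mean) ** 2 for v in values]

# Step 2 and 3: average the squared differences, then square root.
sd_population = math.sqrt(sum(squared_diffs) / len(values))        # divide by n
sd_sample = math.sqrt(sum(squared_diffs) / (len(values) - 1))      # divide by n - 1

print(sd_population, round(sd_sample, 3))
```

The standard library offers the same two forms as `statistics.pstdev` (population) and `statistics.stdev` (sample).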

Question 12

Q

What centile is the median?

Answer

A

The 50th centile

Question 13

Q

What is the interquartile range?

Answer

A

The range from the 25th centile to the 75th centile, i.e. the middle 50% of the values

Question 14

Q

When is it better to use a median rather than a mean?

Answer

A

To avoid the influence of outliers, i.e. if there is an outlier that is very different to the rest of the data.

Question 15

Q

When is it better to use IQR rather than the standard deviation?

Answer

A

To avoid the influence of outliers
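
The effect of an outlier on each summary can be seen with a small experiment. This is a sketch on made-up numbers; `summarise` is a hypothetical helper, and the quartiles come from `statistics.quantiles` with its default method, which is only one of several quartile conventions.

```python
import statistics

data = [4, 5, 5, 6, 6, 7, 7, 8]
with_outlier = data + [100]   # hypothetical extreme value

def summarise(values):
    """Return mean, median, SD and IQR for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "iqr": q3 - q1,
    }

before = summarise(data)
after = summarise(with_outlier)

for name in before:
    print(name, round(before[name], 2), round(after[name], 2))
```

In this example the mean and standard deviation move a long way when the outlier is added, while the median and IQR barely change.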

Question 16

Q

What is the Gaussian distribution determined by?

Answer

A

Mean and standard deviation

Question 17

Q

If the mean is reduced from 120 to 110, what happens to the Gaussian distribution?

Answer

A

It shifts to the left.

Question 18

Q

If the mean is increased from 120 to 130, what happens to the Gaussian distribution?

Answer

A

It shifts to the right.

Question 19

Q

What happens to the Gaussian distribution if the standard deviation is decreased from 15 to 10?

Answer

A

The curve becomes narrower and taller

Question 20

Q

What happens to the Gaussian distribution if the standard deviation is increased from 15 to 20?

Answer

A

The curve becomes wider and flatter
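
The shifts and shape changes described on these cards can be checked numerically. The sketch below uses `statistics.NormalDist` with the means and standard deviations quoted in the questions, and compares the height of each curve at its own mean; the variable names are illustrative.

```python
from statistics import NormalDist

baseline = NormalDist(mu=120, sigma=15)
narrower = NormalDist(mu=120, sigma=10)   # smaller SD
wider = NormalDist(mu=120, sigma=20)      # larger SD
shifted = NormalDist(mu=110, sigma=15)    # smaller mean, same SD

# Height of each curve at its own mean (the peak of the bell).
peak_baseline = baseline.pdf(120)
peak_narrower = narrower.pdf(120)
peak_wider = wider.pdf(120)
peak_shifted = shifted.pdf(110)

print(peak_narrower > peak_baseline > peak_wider)
print(abs(peak_shifted - peak_baseline) < 1e-12)
```

Changing the mean moves the peak without changing its height, whereas changing the standard deviation changes the height and width but not the position.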

Question 21

Q

What is a useful property of Gaussian distributions?

Answer

A

A constant proportion of values will lie within any specified number of standard deviations above or below the mean. This is the basis of reference ranges.

Question 22

Q

If you go one standard deviation either side of the mean, what percentage of values does this cover?

Answer

A

About 68%

Question 23

Q

If you go 1.64 standard deviations either side of the mean, what percentage of values does this cover?

Answer

A

About 90%

Question 24

Q

If you go 1.96 standard deviations either side of the mean, what percentage of values does this cover?

Answer

A

About 95%

Question 25

Q

What is the 99% range? How is it calculated?

Answer

A

0.5th centile to 99.5th centile

Mean +/- 2.58 SDs

Question 26

Q

What is the 95% range? How is it calculated?

Answer

A

2.5th centile to 97.5th centile

Mean +/- 1.96 SDs

Question 27

Q

What is the 90% range? How is it calculated?

Answer

A

5th centile to 95th centile

Mean +/- 1.64 SDs
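
The multipliers 1.64, 1.96 and 2.58 can be checked against the Gaussian curve itself. This is a minimal sketch using `statistics.NormalDist`; `proportion_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def proportion_within(z):
    """Proportion of a Gaussian lying within z SDs either side of the mean."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)

for z in (1, 1.64, 1.96, 2.58):
    print(z, round(100 * proportion_within(z), 1))
```

The printed percentages come out at roughly 68, 90, 95 and 99, which is why those multipliers define the 90%, 95% and 99% ranges.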

Question 28

Q

If the sample size isn’t too small then the distribution of the sample mean will be…?

Answer

A

Approximately Gaussian (normal)

Question 29

Q

What is the standard error?

Answer

A

The standard deviation of the sampling distribution of an estimate (approximately Gaussian for a sample mean) is called the standard error. It is a measure of the statistical precision of the estimate.

Question 30

Q

What is the standard error of the mean?

Answer

A

The standard deviation of the distribution of all possible sample means. We cannot draw every possible sample in practice, so it is estimated from a single sample.

Question 31

Q

How is standard error of the mean estimated?

Answer

A

The sample standard deviation divided by the square root of the sample size.

Question 32

Q

How is the 95% confidence interval of a sample mean calculated?

Answer

A

95% CI = sample mean +/- (1.96 x standard error)
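
The standard error and the 95% confidence interval can be worked through on a small sample. The measurements below are made up for illustration. With a sample this small, a t-based multiplier would normally replace 1.96; 1.96 is kept here only to match the card.

```python
import math
import statistics

sample = [118, 125, 130, 112, 121, 127, 119, 124]  # made-up measurements

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)          # sample SD (divides by n - 1)

# Standard error of the mean = SD / square root of the sample size.
standard_error = sample_sd / math.sqrt(n)

# 95% CI = sample mean +/- (1.96 x standard error).
ci_lower = sample_mean - 1.96 * standard_error
ci_upper = sample_mean + 1.96 * standard_error

print(sample_mean, round(standard_error, 2), round(ci_lower, 1), round(ci_upper, 1))
```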

Question 33

Q

What does the 95% confidence interval mean?

Answer

A

If we repeated the study many times with samples of the same size, we would expect about 95% of the intervals calculated in this way to contain the true population mean.
Loosely: we are 95% confident that the population mean could be as low as ___ or as high as ___.

Question 34

Q

When calculating confidence intervals and ranges, what should be used for each?

Answer

A

Standard deviation for ranges

Standard error for intervals

Question 35

Q

When the sample size increases, the 95% range…

Answer

A

Stays the same

Question 36

Q

When the sample size increases, the 95% confidence interval…

Answer

A

Gets narrower
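
The contrast between the range and the confidence interval can be sketched by holding the mean and standard deviation fixed and varying only the sample size. The values 120 and 15 are assumptions for illustration, and `range_95` and `ci_95` are hypothetical helper names. In real data the estimated standard deviation fluctuates from sample to sample, but it does not shrink systematically as n grows.

```python
import math

mean, sd = 120, 15   # assumed values, for illustration only

def range_95(mean, sd):
    """95% range for individual values: uses the standard deviation."""
    return (mean - 1.96 * sd, mean + 1.96 * sd)

def ci_95(mean, sd, n):
    """95% CI for the mean: uses the standard error, SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return (mean - 1.96 * se, mean + 1.96 * se)

def width(interval):
    return interval[1] - interval[0]

for n in (25, 100, 400):
    print(n, round(width(range_95(mean, sd)), 2), round(width(ci_95(mean, sd, n)), 2))
```

Quadrupling the sample size halves the width of the confidence interval, because the standard error depends on the square root of n.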

Question 37

Q

What is ‘r’? What two values is it always between?

Answer

A

Correlation coefficient

-1 and 1

Question 38

Q

What does r=1 tell you?

Answer

A

Perfect positive correlation

Question 39

Q

What does r=-1 tell you?

Answer

A

Perfect negative correlation

Question 40

Q

What does r=0 tell you?

Answer

A

No linear correlation
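
The three reference values of r can be reproduced on toy data. The sketch below computes the Pearson correlation coefficient from its definition; `pearson_r` is a hypothetical helper and the data are made up. The last case shows that r = 0 rules out a straight-line relationship only, since the points follow a perfect U-shape.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
r_positive = pearson_r(xs, [2 * x + 1 for x in xs])    # points on a rising line
r_negative = pearson_r(xs, [-3 * x + 4 for x in xs])   # points on a falling line
r_curved = pearson_r(xs, [x ** 2 for x in xs])         # perfect U-shape

print(r_positive, r_negative, r_curved)
```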

Question 41

Q

What is the equation for a linear regression?

Answer

A

y = a + bx,

where y is the outcome, x is the predictor, a is the intercept and b is the slope

Question 42

Q

What does the line of best fit do?

Answer

A

Minimises the sum of the squared vertical distances between the data points and the line
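
The least-squares idea can be sketched for the line y = a + bx. The data are made up, and `fit_line` and `sum_squared_residuals` are hypothetical helper names; the formulas are the standard closed-form least-squares estimates for one predictor.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a (intercept) and b (slope) in y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]               # illustrative predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]    # illustrative outcome values

a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Nudging the slope or intercept away from the fit makes the total larger.
worse_slope = sum_squared_residuals(xs, ys, a, b + 0.1)
worse_intercept = sum_squared_residuals(xs, ys, a + 0.1, b)

print(round(a, 2), round(b, 2), best < worse_slope, best < worse_intercept)
```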

Question 43

Q

In regression, should the variable we are predicting go on the vertical or the horizontal axis?

Answer

A

The vertical (y) axis

Question 44

Q

Statistical significance - what does this mean and how is it determined?

Answer

A

An observed sample difference between groups might be due to chance. Statistically significant means the result is unlikely to be due to chance alone.
It is assessed using confidence intervals and p-values.

Question 45

Q

What does a p-value mean?

Answer

A

A p-value for a result is the probability of observing a result as extreme as, or more extreme than, the sample result if the underlying assumption about the population (the null hypothesis, e.g. no difference) is true.
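
The definition can be made concrete with a two-sided test based on the Gaussian distribution. This is a sketch under stated assumptions: the difference and its standard error are hypothetical numbers, the test statistic is assumed to be approximately Gaussian, and `two_sided_p_value` is a made-up helper name.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard Gaussian test statistic z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical example: observed difference in means and its standard error.
difference = 5.0
standard_error = 2.0

# How many standard errors the difference lies from 0 (the "no difference" value).
z = difference / standard_error
p_value = two_sided_p_value(z)

# 95% CI for the difference, for comparison with the p-value.
ci_lower = difference - 1.96 * standard_error
ci_upper = difference + 1.96 * standard_error

print(round(p_value, 4), round(ci_lower, 2), round(ci_upper, 2))
```

A statistic of exactly 1.96 gives a two-sided p-value of about 0.05, which is why a 95% confidence interval and a 0.05 threshold lead to matching conclusions.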

Question 46

Q

What does the p-value have to be less than to be statistically significant?

Answer

A

0.05 (the conventional threshold)

Question 47

Q

When can p-values be calculated?

Answer

A

When there is a comparison:
Two means: are they different, i.e. is their difference different from 0?
Association: are the observed results different from those expected?
Regression: is the slope different from 0?

Question 48

Q

How are p-values calculated?

Answer

A

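Using a statistical test suited to the comparison, e.g. the chi-squared test for an association between categorical variables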
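
A chi-squared test on a 2x2 table can be sketched with the standard library alone. The counts are made up and `chi_squared_2x2` is a hypothetical helper. No continuity correction is applied, and the p-value step relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard Gaussian variable, so this shortcut does not extend to larger tables.

```python
import math
from statistics import NormalDist

def chi_squared_2x2(table):
    """Chi-squared statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if there were no association.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the tail area follows from the Gaussian CDF.
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(statistic)))
    return statistic, p_value

# Made-up counts: rows are two groups, columns are outcome yes / no.
observed_counts = [[30, 20], [15, 35]]
statistic, p_value = chi_squared_2x2(observed_counts)

print(round(statistic, 2), p_value < 0.05)
```

The statistic grows as the observed counts move away from the expected counts, and the p-value shrinks accordingly.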

Question 49

Q

If the 95% CI for a difference excludes 0 then what can be said about the p-value?

Answer

A

p < 0.05 (statistically significant at the 5% level)

Question 50

Q

If the 95% CI for a difference contains 0 then what can be said about the p-value?

Answer

A

p > 0.05 (not statistically significant at the 5% level)