Data Flashcards

1
Q

What is the median?

A

The middle value when values are ordered from the smallest to the largest

2
Q

What is the mode?

A

The most common value

3
Q

What is the mean?

A

The average value: sum of all values divided by the number of values
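
As a quick check of the three averages on the cards so far, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up purely for illustration.

```python
import statistics

# Hypothetical sample values, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(values)      # sum of all values / number of values
median_value = statistics.median(values)  # middle value of the ordered data
mode_value = statistics.mode(values)      # most common value

print(mean_value, median_value, mode_value)
```

With an even number of values there is no single middle value, so `statistics.median` takes the mean of the two middle values.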

4
Q

What is the standard deviation?

A

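This is a measure of how spread out the values are around the mean: the square root of the average squared distance from the mean (roughly, the typical distance of a value from the mean)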
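
The calculation can be sketched step by step as below. The data are hypothetical, and the sketch uses the population form of the SD (dividing by the number of values); `statistics.stdev` would divide by n - 1 instead, which is the usual choice for a sample.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean_value = statistics.mean(values)

# Square each value's distance from the mean, average them, take the square root
squared_distances = [(v - mean_value) ** 2 for v in values]
sd_by_hand = math.sqrt(sum(squared_distances) / len(values))

# The library's population SD should agree with the hand calculation
sd_library = statistics.pstdev(values)

print(sd_by_hand, sd_library)
```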

5
Q

What is the interquartile range?

A

This is the difference between the 75th centile and the 25th centile, i.e. the range covered by the middle 50% of the values
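
A minimal sketch of the calculation, assuming hypothetical data and the "inclusive" quartile method of Python's `statistics.quantiles` (other quartile conventions can give slightly different centiles):

```python
import statistics

values = [1, 3, 5, 7, 9, 11, 13, 15, 17]  # hypothetical data

# With n=4, quantiles() returns three cut points: 25th, 50th and 75th centiles
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

iqr = q3 - q1  # 75th centile minus 25th centile

print(q1, q3, iqr)
```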

6
Q

What are the best values to use to avoid the influence of outliers?

A

If the data is not symmetrical (skewed, or with outliers): use the median rather than the mean, and the IQR rather than the standard deviation

If the data IS symmetrical: use the mean and the SD
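
To see why, this sketch adds a single extreme value to a small hypothetical data set and compares how far the mean and the median move:

```python
import statistics

values = [4, 5, 5, 6, 7]      # hypothetical, roughly symmetrical data
with_outlier = values + [60]  # the same data plus one extreme value

# How much does each summary change when the outlier is added?
mean_shift = statistics.mean(with_outlier) - statistics.mean(values)
median_shift = statistics.median(with_outlier) - statistics.median(values)

print("mean moved by", mean_shift)
print("median moved by", median_shift)
```

The mean is pulled a long way towards the outlier, while the median barely moves.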

7
Q

What is the Gaussian distribution?

A

This is a curve representing symmetrical data - its shape and position are determined entirely by the mean and the standard deviation

The graph is of a symmetrical bell shape curve

(The peak of the curve represents the mean)

8
Q

What effect does a changing standard deviation have on the Gaussian distribution?

A

A change in the SD will cause the curve to become flatter and wider (larger SD) or thinner and taller (smaller SD)
BUT the curves will all have the same area beneath them

9
Q

What effect does a changing mean have on the Gaussian distribution?

A

A change in the mean will cause the shape of the curve to remain the same but the location of the curve will shift further left or right (the peak of the curve represents the mean)
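
The effects of changing the SD and the mean can be checked numerically. This is a rough sketch using `statistics.NormalDist`; the means and SDs are arbitrary, and `area` is a hypothetical helper that approximates the area under the curve with a simple midpoint sum.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)     # larger SD, same mean
shifted = NormalDist(mu=3, sigma=1)  # larger mean, same SD

# Height of the curve at its peak (the peak sits at the mean)
peak_narrow = narrow.pdf(narrow.mean)
peak_wide = wide.pdf(wide.mean)
peak_shifted = shifted.pdf(shifted.mean)

def area(dist, lo=-20.0, hi=20.0, steps=4000):
    """Approximate the area under the curve with a midpoint sum."""
    dx = (hi - lo) / steps
    return sum(dist.pdf(lo + (i + 0.5) * dx) for i in range(steps)) * dx

print(peak_narrow, peak_wide, peak_shifted)
print(area(narrow), area(wide), area(shifted))
```

Doubling the SD halves the peak height, shifting the mean leaves the peak height unchanged, and the area stays at 1 in every case.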

10
Q

Why is the Gaussian distribution useful?

A

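A constant proportion of values will lie within any specified number of SDs above or below the mean, whatever the mean and SD are e.g. about 95% of values lie within 1.96 SDs of the mean. Because the Gaussian distribution is symmetrical, this proportion is split equally above and below the mean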
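
A minimal sketch of this property, using `statistics.NormalDist`. `proportion_within` is a hypothetical helper name, and the mean of 100 and SD of 15 are arbitrary.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a Gaussian distribution lying within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The proportion depends only on the number of SDs, not on the mean or SD
print(proportion_within(1.96))
print(proportion_within(1.96, mu=100, sigma=15))
```

Both calls give approximately 0.95.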

11
Q

What is the ‘reference range’?

A

This is the range of values that contains a stated proportion of the population, calculated as a number of SDs above and below the mean e.g. mean +/- 1.96 SD covers 95% of values, so the reference range runs from the 2.5th to the 97.5th centile

Commonly, the reference range is the central 95% of values
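
As an illustration, a 95% reference range can be sketched from a mean and SD, assuming the values follow a Gaussian distribution. The mean of 140 and SD of 2.5 are hypothetical.

```python
from statistics import NormalDist

mean_value, sd = 140.0, 2.5  # hypothetical mean and SD of a measurement

# 95% reference range: mean +/- 1.96 SD
lower = mean_value - 1.96 * sd
upper = mean_value + 1.96 * sd

# The same limits expressed as the 2.5th and 97.5th centiles
dist = NormalDist(mean_value, sd)
lower_centile = dist.inv_cdf(0.025)
upper_centile = dist.inv_cdf(0.975)

print(lower, upper)
print(lower_centile, upper_centile)
```

The two approaches agree closely; 1.96 is a rounded version of the exact multiplier.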

12
Q

Why are samples used to estimate data?

What is the role of confidence values?

A

It is not practical or feasible to measure the data from every single person in the country - so use a sample instead and then use this to estimate the values for the entire population

The confidence value (confidence interval) tells us how reliable the information from the sample is as an estimate for the whole population i.e. how precisely the sample estimates the population values

13
Q

How can the sample size be determined from the results?

A

With a large enough sample size, the distribution of the sample mean is approximately Gaussian, even if the individual values are not

14
Q

What is meant by standard error?

A

Standard error is the standard deviation of the sampling distribution of an estimate - this is a measure of the statistical accuracy (precision) of that estimate

15
Q

What is the standard error of the mean?

How is this calculated?

A

This is the standard deviation of the distribution of all possible sample means - your sample is only one sample of all potential samples which could all provide differing results so you must account for this level of error

Calculated as: SD / square root of the sample size
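
A minimal sketch of the calculation on a hypothetical sample, using the sample SD (n - 1 in the denominator) from Python's `statistics` module:

```python
import math
import statistics

sample = [21, 23, 22, 20, 24, 22, 23, 21, 22]  # hypothetical sample

sd = statistics.stdev(sample)      # sample SD
sem = sd / math.sqrt(len(sample))  # SD / square root of the sample size

print(sd, sem)
```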

16
Q

What is meant by a confidence interval for the mean if e.g. Gaussian distribution?

How can this be calculated?

A

A confidence interval of 95% for a Gaussian distribution means that you can expect 95% of all possible sample means to lie within 1.96 standard errors of the true population mean

So if you have a mean of 22
and a standard error of 0.3
the 95% confidence interval is 22 +/- 1.96 x 0.3 = 22 +/- 0.59, i.e. about 21.41 to 22.59

17
Q

What is meant by the confidence interval?

A

E.g. a 95% confidence interval means that we are 95% sure that the true mean lies within the stated range

18
Q

What is the difference between the standard deviation and the standard error?

A

SD - this indicates the amount of dispersion within a sample - used to calculate reference ranges for individual values

SE - measures the precision of a sample estimate (e.g. the sample mean) - used to calculate confidence intervals for sample means

19
Q

What is the effect of an increasing or decreasing sample size on confidence intervals?

A

An increase in the sample size (if the mean and the SD remain the same) will result in a narrower confidence interval - a larger sample size allows you to be more confident. A decrease in the sample size will result in a wider confidence interval
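
The effect can be sketched by holding the mean and SD constant and changing only the sample size. `ci_95` is a hypothetical helper, and the mean of 22 and SD of 3 are made-up values.

```python
import math

def ci_95(mean_value, sd, n):
    """95% confidence interval for a mean, using mean +/- 1.96 standard errors."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean_value - 1.96 * se, mean_value + 1.96 * se

# Same hypothetical mean and SD, increasing sample size
for n in (25, 100, 400):
    low, high = ci_95(22.0, 3.0, n)
    print(n, round(low, 2), round(high, 2), "width:", round(high - low, 2))
```

Because the standard error is the SD divided by the square root of the sample size, quadrupling the sample size halves the width of the interval.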

20
Q

What are the different types of correlations shown on a graph and what value depicts this?

A

Positive correlation: r is above 0 (r = 1 is a perfect positive correlation)
Negative correlation: r is below 0 (r = -1 is a perfect negative correlation)
No correlation: r = 0

r = the correlation coefficient and is always between -1 and 1
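
As a rough sketch, the (Pearson) correlation coefficient can be calculated by hand. `pearson_r` is a hypothetical helper, and the data sets are made up to give the three extreme cases.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, relative to how much each varies on its own
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # y rises in step with x
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # y falls in step with x
r_none = pearson_r(xs, [2, 1, 3, 1, 2])   # no linear relationship

print(r_up, r_down, r_none)
```

Real data usually fall somewhere between these extremes.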

21
Q

What can be used to determine whether results are statistically significant?

A

Use the confidence intervals and p-values

22
Q

What is a p-value and what is its significance?

A

A p-value for a result is the probability of observing a result as or more extreme than the sample result if the underlying assumption about the population (the null hypothesis) is true

If the p-value is below 0.05 then, by convention, the result has statistical significance i.e. below 0.05 means that the results are unlikely to be due to a chance effect
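
One common way to obtain a p-value is sketched below, assuming the estimate follows a Gaussian distribution around the null value. `two_sided_p_value` is a hypothetical helper and the inputs are made up.

```python
from statistics import NormalDist

def two_sided_p_value(estimate, standard_error, null_value=0.0):
    """Probability of a result as or more extreme than the estimate, in either
    direction, if the true value equals null_value."""
    z = (estimate - null_value) / standard_error  # distance in standard errors
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(two_sided_p_value(2.5, 1.0))  # far from the null value: small p-value
print(two_sided_p_value(0.5, 1.0))  # close to the null value: large p-value
```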

23
Q

When is the confidence interval statistically significant?

A

The confidence interval indicates statistical significance if it excludes the 'no effect' value: 0 for a difference in means (e.g. a confidence interval of 1 to 4) or 1 for a ratio (e.g. a relative risk)

24
Q

What is the relationship between the confidence interval and the p value?

A

These are generally consistent with each other

So if the CI excludes zero, then the p-value will be less than 0.05 (values are statistically significant)
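
This consistency can be checked with a small sketch, assuming a Gaussian sampling distribution. `summarise` is a hypothetical helper, and the two differences in means (both with a standard error of 1.0) are made up.

```python
from statistics import NormalDist

def summarise(difference, standard_error):
    """Return the 95% CI limits and two-sided p-value for a difference (null value 0)."""
    low = difference - 1.96 * standard_error
    high = difference + 1.96 * standard_error
    z = abs(difference) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(z))
    return low, high, p_value

for difference in (2.5, 1.5):
    low, high, p_value = summarise(difference, 1.0)
    excludes_zero = low > 0 or high < 0
    print(difference, "CI excludes 0:", excludes_zero, "p < 0.05:", p_value < 0.05)
```

In both cases the two checks give the same answer: the first difference is significant on both, the second on neither.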