stats Flashcards

1
Q

define the median

A

The central value when the data are placed in order. With an even number of values, it is halfway between the two middle values.

2
Q

when is it best to use the median

A

The median is not affected by outliers and reflects what most people experience, so it is most useful when the data are skewed.
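A minimal sketch of why the median resists outliers, using Python's standard `statistics` module. The income figures are made up for illustration.

```python
from statistics import mean, median

# Hypothetical incomes (in thousands); the last value is an outlier.
incomes = [20, 22, 25, 27, 30, 200]

mean_income = mean(incomes)      # 54, pulled upwards by the single outlier
median_income = median(incomes)  # 26.0, halfway between 25 and 27

print(mean_income, median_income)
```

Five of the six values are below the mean, while the median still sits in the middle of the typical values.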

3
Q

what is the mode

A

The most frequent value. It reflects what most people experience, but it is rarely used.
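A short illustration with the standard `statistics` module. The data are invented, and `multimode` is shown only to cover the case where several values tie.

```python
from statistics import mode, multimode

shoe_sizes = [5, 6, 6, 7, 7, 7, 8]   # hypothetical data
most_common = mode(shoe_sizes)       # 7 appears three times

# When several values tie for most frequent, multimode returns all of them.
tied = multimode([1, 1, 2, 2, 3])    # both 1 and 2 appear twice
```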

4
Q

what is the range

A

the difference between the highest and lowest values

5
Q

what is the interquartile range

A

The data are divided into quarters by the quartiles. The interquartile range is the difference between the upper quartile (Q3) and the lower quartile (Q1), i.e. the spread of the middle 50% of values.
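The range and the interquartile range can be sketched as follows. The data are hypothetical, and note that textbooks and software use slightly different conventions for locating quartiles; this uses the default method of `statistics.quantiles`.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   # hypothetical, already ordered

data_range = max(data) - min(data)           # highest minus lowest: 10

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = quantiles(data, n=4)            # 3.0, 6.0, 9.0 for this data
iqr = q3 - q1                                # width of the middle 50%: 6.0
```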

6
Q

what is standard deviation

A

A measure of how far, on average, the observations lie from the mean (strictly, the square root of the mean squared deviation from the mean). It can be used to identify abnormal results or “outliers”.
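As a rough worked example (the numbers are made up), the standard deviation can be computed by hand and compared with the standard library. The last line shows that the literal "average distance from the mean" is a slightly different quantity.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
m = mean(data)                    # 5

# Population SD by hand: square root of the mean squared deviation.
sd_by_hand = sqrt(sum((x - m) ** 2 for x in data) / len(data))   # 2.0

sd_population = pstdev(data)   # divides by n
sd_sample = stdev(data)        # divides by n - 1, the usual choice for a sample

# Mean absolute deviation: the plain average distance from the mean.
mean_abs_dev = sum(abs(x - m) for x in data) / len(data)         # 1.5
```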

7
Q

describe a box and whisker diagram

A

The box spans the lower quartile to the upper quartile, and the median is shown by a line across the box.

The whiskers (lines) extend to the highest and lowest results, excluding outliers.

Outliers are values more than 1.5 x the interquartile range beyond the upper or lower edge of the box. They are shown as dots.
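The 1.5 x interquartile range rule can be sketched as below. The data and variable names are illustrative only, and the quartiles use the default convention of `statistics.quantiles`, which may differ slightly from a given plotting package.

```python
from statistics import quantiles

data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 30]   # hypothetical results

q1, med, q3 = quantiles(data, n=4)   # edges of the box and the median line
iqr = q3 - q1

# Values beyond these limits are drawn as dots rather than covered by whiskers.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_limit or x > upper_limit]
inside = [x for x in data if lower_limit <= x <= upper_limit]

# Whiskers stop at the most extreme values that are not outliers.
whisker_low, whisker_high = min(inside), max(inside)
```

Here only the value 30 is flagged, and the whiskers run from 2 to 12.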

8
Q

describe how data is distributed in a bell shaped curve

A

About two-thirds (68%) of the data lie within 1 standard deviation of the mean.

About 95% lie within 2 standard deviations of the mean.

The mean, median and mode are all the same in a normal distribution.
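These proportions can be checked with `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are arbitrary choices for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)   # hypothetical mean and SD

within_1_sd = dist.cdf(115) - dist.cdf(85)   # roughly 0.68
within_2_sd = dist.cdf(130) - dist.cdf(70)   # roughly 0.95

# Exact multiplier that encloses the central 95%: about 1.96, usually rounded to 2.
z_95 = NormalDist().inv_cdf(0.975)

# Mean, median and mode coincide for a normal distribution.
same_centre = dist.mean == dist.median == dist.mode
```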

9
Q

if data is symmetrical on a bell shaped curve what values should be used to summarise the data

A

the mean and standard deviation should be used to summarise data

10
Q

If data is skewed what measurements should you use to summarise the findings

A

the median and interquartile range should be used.

11
Q

describe positive skew in terms of mode / median / mean

A

mode < median < mean

12
Q

describe negative skew in terms of mode / median / mean

A

mode > median > mean
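Both orderings can be seen on a small made-up data set and its mirror image. These orderings are a rule of thumb that holds for typical skewed data rather than a guarantee for every data set.

```python
from statistics import mean, median, mode

right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]   # long tail of high values
left_skewed = [20 - x for x in right_skewed]     # mirror image, long low tail

# Each tuple is (mode, median, mean).
pos = (mode(right_skewed), median(right_skewed), mean(right_skewed))   # 2, 3.0, 5
neg = (mode(left_skewed), median(left_skewed), mean(left_skewed))      # 18, 17.0, 15
```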

13
Q

what is the purpose of a reference range

A

A reference range gives limits within which we would expect the majority of data to fall.

14
Q

what is the usual reference range for normally distributed data

A

From 2 standard deviations below the mean to 2 standard deviations above it (about 95% of the data).
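A minimal sketch of the calculation. The function name `reference_range` and the mean of 140 with a standard deviation of 10 are hypothetical, and the data are assumed to be normally distributed.

```python
from statistics import NormalDist

def reference_range(mean, sd, k=2.0):
    """Return (lower, upper) limits k standard deviations either side of the mean."""
    return mean - k * sd, mean + k * sd

low, high = reference_range(mean=140.0, sd=10.0)   # 120.0 to 160.0

# Share of a normal distribution that falls inside those limits (about 95%).
dist = NormalDist(140.0, 10.0)
covered = dist.cdf(high) - dist.cdf(low)
```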

15
Q

what is the difference between a population and a sample?

A

The population is all the individuals in whom we are interested. The sample is a group within the population that we actually study.

16
Q

why are larger sample sizes better

A

They reduce the standard error of the mean, so the sample mean is a more precise estimate of the population mean.

17
Q

are random samples always representative

A

No. A random sample can be unrepresentative by chance, especially when it is small.

18
Q

what is stratified random sampling

A

The population is divided into groups (strata), and individuals are then randomly sampled within each group.
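One possible sketch, assuming the same fraction is drawn from every group. The function name, the age groups and the 10% fraction are all invented for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Randomly sample the same fraction of individuals from every stratum."""
    rng = random.Random(seed)   # fixed seed so the example is repeatable
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))   # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 people split into two age groups.
strata = {
    "under_40": [f"u{i}" for i in range(60)],
    "40_and_over": [f"o{i}" for i in range(40)],
}
picked = stratified_sample(strata, fraction=0.1)
```

Every group is guaranteed to appear in the sample in proportion to its size, which simple random sampling cannot promise.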

19
Q

what is cluster sampling

A

Rather than sampling individuals, whole groups (clusters) of individuals are sampled.
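A rough sketch in which whole clusters are chosen at random and everyone in a chosen cluster is included. The clinic names and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Pick whole clusters at random and keep every individual in them."""
    rng = random.Random(seed)   # fixed seed so the example is repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population: six clinics with five patients each.
clusters = {f"clinic_{c}": [f"{c}{i}" for i in range(5)] for c in "ABCDEF"}
picked = cluster_sample(clusters, n_clusters=2)
```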

20
Q

what is standard error

A

The standard deviation of the sample means that would be obtained from repeated samples. Standard error is a measure of precision: it indicates how far from the true value the sample estimate is likely to be.

21
Q

how do you calculate standard error

A

the standard deviation divided by the square root of the number in the sample

The larger the sample, the lower the standard error.
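The formula as a short sketch, with a made-up standard deviation of 10. Quadrupling the sample size halves the standard error.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: standard deviation over the square root of n."""
    return sd / sqrt(n)

se_small = standard_error(sd=10.0, n=25)    # 10 / 5 = 2.0
se_large = standard_error(sd=10.0, n=100)   # 10 / 10 = 1.0
```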

22
Q

what is a confidence interval

A

a range of values so defined that there is a specified probability that the value of a parameter lies within it.

The 95% confidence interval for a mean runs from about two (more precisely 1.96) standard errors below the sample mean to about two standard errors above it.
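A minimal sketch of the calculation, assuming a sample large enough for the normal approximation (small samples would normally use a t distribution instead). The function name and the figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    se = sd / sqrt(n)                            # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 when level is 0.95
    return sample_mean - z * se, sample_mean + z * se

# Hypothetical sample: mean 50, SD 10, n = 100, so the standard error is 1.
low, high = confidence_interval(sample_mean=50.0, sd=10.0, n=100)
```

With these figures the interval runs from roughly 48.0 to 52.0.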