Measures of location and spread Flashcards

Question 1

Q

measure of location

Answer

A

A measure of location is a single value describing a position in a data set.

Question 2

Q

measure of central tendency

Answer

A

A measure of central tendency (averages) is a single value that describes the centre of the data.

Question 3

Q

Measures of central tendency (averages) - Mean

Answer

A

The mean uses all the data points
The mean can be distorted by extreme values

Question 4

Q

Measures of central tendency (averages) - Median

Answer

A

The median is the middle value when the data is arranged in order (or the average of the middle two values when there is an even number of values).
The position of the median is given by (n+1) / 2 where n is the number of items of data.

Some points about the median:
* The median is not distorted by extreme values
* The median can still be calculated even if some of the data is missing, e.g. times taken for people to finish a race
* The median is the value with the property that half the values are higher than it and half the values are lower than it
* It can be tedious to have to order the data first
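
The points above can be sketched in Python using the standard library's statistics module. The race times are made-up values for illustration only, and median_position is a hypothetical helper name.

```python
import statistics

def median_position(n):
    """Position of the median in ordered data, (n + 1) / 2, counting from 1."""
    return (n + 1) / 2

# Hypothetical race times in minutes; the last value is an extreme one.
times = [12, 13, 13, 14, 15, 16, 60]

print(median_position(len(times)))  # 4.0, i.e. the 4th value in order
print(statistics.median(times))     # 14
print(statistics.mean(times))       # pulled upwards by the 60
```

For an even number of values the position is not a whole number (e.g. 4.5), which signals that the median is the average of the two values either side of it.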

Question 5

Q

Measures of central tendency (averages) - Mode:

Answer

A

Mode: most common value

Question 6

Q

Measures of central tendency (averages) - Modal class:

Answer

A

Modal class:
The class that occurs most often, i.e. has the highest frequency.

Some points about the mode:
* The mode is useless unless there are lots of repeated values
* It is used when the data set has either a single mode or two modes (bimodal)

Question 7

Q

Grouped Data - Mean:

Answer

A

Mean:
When the data is grouped into classes, you can obtain an estimate for the mean by using the midpoint of each class (the mid-interval value). This means that you assume that all the values in each class interval are equally spaced about the midpoint.
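
A minimal sketch of the midpoint estimate in Python. The frequency table is hypothetical (it does not come from these cards), and estimated_mean is an illustrative name.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency).
classes = [(0, 10, 3), (10, 20, 5), (20, 30, 4), (30, 40, 8)]

def estimated_mean(classes):
    """Estimate the mean of grouped data as sum(f * midpoint) / sum(f)."""
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    total_f = sum(f for _, _, f in classes)
    return total_fx / total_f

print(estimated_mean(classes))  # 23.5
```

The result is only an estimate, because the original values inside each class are unknown.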

Question 8

Q

Grouped Data - Modal class:

Answer

A

Modal class:
This is the class which has the highest frequency.

Question 9

Q

Grouped Data - Class containing the median:

Answer

A

Class containing the median:
This is the class that contains the middle data value.
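
The modal class and the class containing the median can both be read off a frequency table. The sketch below uses a hypothetical table and assumes the middle value sits at position (n + 1) / 2; some courses use n / 2 for grouped data instead.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency).
classes = [(0, 10, 3), (10, 20, 5), (20, 30, 4), (30, 40, 8)]

def modal_class(classes):
    """Bounds of the class with the highest frequency."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo, hi)

def median_class(classes):
    """Bounds of the first class whose cumulative frequency reaches (n + 1) / 2."""
    n = sum(f for _, _, f in classes)
    target = (n + 1) / 2          # assumed position of the middle value
    cumulative = 0
    for lo, hi, f in classes:
        cumulative += f
        if cumulative >= target:
            return (lo, hi)

print(modal_class(classes))   # (30, 40)
print(median_class(classes))  # (20, 30)
```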

Question 10

Q

Other measures of location

Answer

A

Other measures of location include quartiles and percentiles.

Question 11

Q

To find the lower quartile for discrete data containing n data values you need to use the following rules:

Answer

A

Lower quartile: Divide n by 4.
→ If this is a whole number, the lower quartile is halfway between this data point and the one above.
→ If this is not a whole number, round up and pick this data point.

Question 12

Q

To find the upper quartile for discrete data containing n data values you need to use the following rules:

Answer

A

Upper quartile: Find ¾ of n.
→ If this is a whole number, the upper quartile is halfway between this data point and the one above.
→ If this is not a whole number, round up and pick this data point.
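
The lower- and upper-quartile rules for discrete data can be sketched as one Python function. The function name and the two small data sets are illustrative assumptions; integer arithmetic is used so the whole-number test is exact.

```python
def quartile(data, quarters):
    """Quartile by the whole-number / round-up rule.

    quarters = 1 gives the lower quartile (n / 4),
    quarters = 3 gives the upper quartile (3n / 4).
    """
    ordered = sorted(data)
    n = len(ordered)
    k, remainder = divmod(quarters * n, 4)
    if remainder == 0:
        # Whole number k: halfway between the k-th value and the one above.
        return (ordered[k - 1] + ordered[k]) / 2
    # Not a whole number: round up to position k + 1 (list index k).
    return ordered[k]

eight = [2, 4, 5, 7, 8, 9, 11, 12]   # n = 8: n/4 = 2 and 3n/4 = 6 are whole
seven = [2, 4, 5, 7, 8, 9, 11]       # n = 7: n/4 = 1.75 and 3n/4 = 5.25 are not

print(quartile(eight, 1), quartile(eight, 3))  # 4.5 10.0
print(quartile(seven, 1), quartile(seven, 3))  # 4 9
```

Library routines often interpolate differently, so their quartiles may not match this rule exactly.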

Question 13

Q

Measures of spread / dispersion / variation - Range

Answer

A

Range = Largest value – Smallest value

This is simple to calculate but is highly sensitive to outliers.
Consider this set of marks for a maths test:

45, 50, 43, 49, 52, 58, 48, 10, 50, 82, 56, 40, 47, 39, 51

Range = 82 – 10 = 72 marks
This is not a good measure of spread as most of the marks are in the range 40 – 60.
Discounting the ‘10’ and ‘82’ as outliers gives a range of 58 – 39 = 19 marks, which is perhaps more representative of the data.
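
As a quick check of the range calculation, here is a short Python sketch using the marks listed above. The helper name value_range is an assumption, and dropping the single smallest and largest marks is just one simple way to discount the two outliers.

```python
marks = [45, 50, 43, 49, 52, 58, 48, 10, 50, 82, 56, 40, 47, 39, 51]

def value_range(data):
    """Range = largest value - smallest value."""
    return max(data) - min(data)

print(value_range(marks))    # 72

# Drop the smallest and largest marks (the two suspected outliers).
trimmed = sorted(marks)[1:-1]
print(value_range(trimmed))
```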

Question 14

Q

Measures of spread / dispersion / variation - Interquartile Range

Answer

A

One way of refining the range so that it does not rely completely on the most extreme items of data is to use the interquartile range. This gives the spread of the middle 50% of the data and therefore avoids extreme values.

Interquartile Range = Upper Quartile (Q3) – Lower Quartile (Q1)

i.e. IQR = Q3 – Q1

For a large data set, 25% of the data lie below the lower quartile, and 75% of the data lie below the upper quartile. The interquartile range measures the range of the middle 50% of the data.
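
A sketch of the interquartile range for the maths test marks used on the range card, assuming the discrete-data quartile rule (divide n by 4 or find ¾ of n, then average or round up). The quartile helper is an illustrative name.

```python
marks = [45, 50, 43, 49, 52, 58, 48, 10, 50, 82, 56, 40, 47, 39, 51]

def quartile(data, quarters):
    """quarters = 1 for Q1, 3 for Q3, using the whole-number / round-up rule."""
    ordered = sorted(data)
    k, remainder = divmod(quarters * len(ordered), 4)
    if remainder == 0:
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[k]

q1 = quartile(marks, 1)   # 15 / 4 = 3.75, so take the 4th value
q3 = quartile(marks, 3)   # 45 / 4 = 11.25, so take the 12th value
iqr = q3 - q1

print(q1, q3, iqr)        # 43 52 9
```

The extreme marks 10 and 82 have no effect on this result, unlike the range of 72.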

Question 15

Q

Measures of spread / dispersion / variation - Interpercentile Range

Answer

A

This is the difference between the values for two given percentiles.
This is still not affected by extreme values but allows more of the data to be considered.
E.g. the 20th to 80th interpercentile range considers the spread of the middle 60% of the data.
The 10th to 90th interpercentile range considers the spread of the middle 80% of the data.
The 10th to 90th interpercentile range is often used as it includes a lot of the data whilst not being affected by extreme values.
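
These cards do not say how percentiles should be calculated, so the sketch below assumes the interpolation used by Python's statistics.quantiles (its default "exclusive" method). Other conventions give slightly different values, especially for small data sets. The marks are those from the range card.

```python
import statistics

marks = [45, 50, 43, 49, 52, 58, 48, 10, 50, 82, 56, 40, 47, 39, 51]

def interpercentile_range(data, lower, upper):
    """Difference between two percentiles, given as whole numbers from 1 to 99."""
    cuts = statistics.quantiles(data, n=100)   # 99 cut points: P1 ... P99
    return cuts[upper - 1] - cuts[lower - 1]

print(interpercentile_range(marks, 20, 80))   # spread of the middle 60%
print(interpercentile_range(marks, 10, 90))   # spread of the middle 80%
```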

Question 16

Q

deviation

Answer

A

The deviation of an item of data from the mean is the difference between the data item and the mean, i.e. x − x̄

Consider a small set of data: {0, 1, 1, 3, 5}

The mean of this data is given by x̄ = (0 + 1 + 1 + 3 + 5) / 5 = 2

The set of deviations for this set of data is: {-2, -1, -1, 1, 3}
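
The deviations for this data set can be reproduced with a few lines of Python (variable names are illustrative). The sketch also shows that the deviations add up to zero, which is why their plain sum is no use as a measure of spread.

```python
data = [0, 1, 1, 3, 5]

mean = sum(data) / len(data)            # 2.0
deviations = [x - mean for x in data]   # each value minus the mean

print(deviations)        # [-2.0, -1.0, -1.0, 1.0, 3.0]
print(sum(deviations))   # 0.0
```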

Question 17

Q

Sum of squares

Answer

A

To compensate for differing signs, we square the differences or deviations.
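The sum of the squares of the deviations is known as the sum of squares and is denoted by S_xx, where S_xx = Σ(x − x̄)².
For the data set {0, 1, 1, 3, 5}, which has mean 2 and deviations {-2, -1, -1, 1, 3}:
S_xx = (-2)² + (-1)² + (-1)² + 1² + 3² = 4 + 1 + 1 + 1 + 9 = 16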

Question 18

Q

variance

Answer

A

To use S_xx as a comparable measure of spread, it is necessary to take into account the number of data items. This allows two data sets of different sizes to be compared.
Therefore, we divide S_xx by n to obtain the variance: variance = S_xx / n = Σ(x − x̄)² / n.
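
A short Python sketch of this calculation for the data set {0, 1, 1, 3, 5} from the deviation card. Variable names are illustrative, and statistics.pvariance (the population variance, which divides by n) is used only as a cross-check.

```python
import statistics

data = [0, 1, 1, 3, 5]
n = len(data)
mean = sum(data) / n

# Sum of squares: add up the squared deviations from the mean.
s_xx = sum((x - mean) ** 2 for x in data)

# Variance: divide the sum of squares by the number of data items.
variance = s_xx / n

print(s_xx)                         # 16.0
print(variance)                     # 3.2
print(statistics.pvariance(data))   # should agree with the line above
```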

Question 19

Q

standard deviation

Answer

A

The standard deviation is the square root of the variance and is given by σ = √(S_xx / n) = √(Σ(x − x̄)² / n)
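
As a sketch, the standard deviation of the data set {0, 1, 1, 3, 5} from the deviation card can be computed in Python as follows. It assumes the divide-by-n definition of variance; statistics.pstdev uses the same definition and serves as a cross-check.

```python
import math
import statistics

data = [0, 1, 1, 3, 5]
n = len(data)
mean = sum(data) / n

variance = sum((x - mean) ** 2 for x in data) / n   # 16 / 5
std_dev = math.sqrt(variance)

print(std_dev)                   # roughly 1.79
print(statistics.pstdev(data))   # should agree with the line above
```

Unlike the variance, the standard deviation is in the same units as the data.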

Question 20

Q

Coding

Answer

A

Coding
Find the mean and standard deviation of the following data sets:
1. 20, 26, 26, 27, 28
2. 21, 27, 27, 28, 29
3. 12, 18, 18, 19, 20
4. 18, 24, 24, 25, 26
5. 40, 46, 46, 47, 48
6. 40, 52, 52, 54, 56
7. 60, 78, 78, 81, 84
8. 2, 2.6, 2.6, 2.7, 2.8
9. 4, 5.2, 5.2, 5.4, 5.6
10. 41, 53, 53, 55, 57
11. 55, 73, 73, 76, 79
12. 11, 14, 14, 14.5, 15
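
The card does not state what the exercise is meant to show. One reading, assumed here, is that several of the data sets are coded versions of data set 1: if every value is transformed as y = ax + b, the mean becomes a × mean + b and the standard deviation becomes |a| × standard deviation. The Python sketch below checks this for data sets 1, 2, 6 and 10 using the divide-by-n standard deviation; the function name is illustrative.

```python
import statistics

def mean_and_sd(data):
    """Mean and (divide-by-n) standard deviation of a data set."""
    return statistics.mean(data), statistics.pstdev(data)

set_1 = [20, 26, 26, 27, 28]
set_2 = [21, 27, 27, 28, 29]    # each value of set 1, plus 1
set_6 = [40, 52, 52, 54, 56]    # each value of set 1, times 2
set_10 = [41, 53, 53, 55, 57]   # each value of set 1, times 2, plus 1

for name, data in [("1", set_1), ("2", set_2), ("6", set_6), ("10", set_10)]:
    mean, sd = mean_and_sd(data)
    print(name, round(mean, 2), round(sd, 2))
```

Adding a constant shifts the mean but leaves the standard deviation unchanged; multiplying by a constant scales both.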

(20 cards)