Location & Spread Flashcards

Question 1

Q

How do we represent the mean?

Answer

A

x̄ (x bar)

Question 2

Q

How do we calculate the mean?

Answer

A

The sum of the x values divided by n (x̄ = Σx / n) or, for a frequency table, the sum of fx divided by the sum of f (x̄ = Σfx / Σf).
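As a quick illustration of the two forms of the mean, here is a minimal Python sketch. The function names and the sample numbers are made up for this example and are not part of the flashcards.

```python
def mean_from_list(values):
    """Mean of listed data: sum of x divided by n."""
    return sum(values) / len(values)


def mean_from_frequencies(xs, fs):
    """Mean from a frequency table: sum of f*x divided by sum of f."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


# Listed data
print(mean_from_list([2, 4, 6, 8]))

# Frequency table: value 1 occurs twice, 2 three times, 3 five times
print(mean_from_frequencies([1, 2, 3], [2, 3, 5]))
```

For grouped data, the same second function can be used with class midpoints in place of the x values.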

Question 3

Q

How do you find the position of LQ (Q1) within listed/grouped data?

Question 4

Q

How do you find the position of the median within listed/grouped data?

Question 5

Q

How do you find the position of UQ (Q3) within listed/grouped data?

Question 6

Q

When we find the position of a quartile within listed data, what do we do when the answer is a decimal?

Answer

A

Round up to the next integer

Question 7

Q

When we find the position of a quartile within listed data, what do we do when the answer is whole?

Answer

A

Take the midpoint of the value at that position and the next value.
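The rounding rule from the last two cards can be sketched in Python as follows. The helper name `value_at_position` is hypothetical, and positions are taken to be 1-based counts into the sorted list, which is an assumption about how the cards intend positions to be read.

```python
import math


def value_at_position(sorted_values, position):
    """Return the data value for a 1-based quartile position.

    Decimal position: round up to the next integer and take that value.
    Whole position:   take the midpoint of that value and the next one.
    """
    if position != int(position):
        return sorted_values[math.ceil(position) - 1]
    k = int(position)
    return (sorted_values[k - 1] + sorted_values[k]) / 2


data = sorted([1, 3, 5, 7, 9, 11, 13, 15])

# Position 1.75 is a decimal, so round up to the 2nd value.
print(value_at_position(data, 1.75))

# Position 2 is whole, so take the midpoint of the 2nd and 3rd values.
print(value_at_position(data, 2))
```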

Question 8

Q

How do we find percentiles of grouped data, e.g. the 57th percentile?

Question 9

Q

What is a decile?

Answer

A

One of the values that split ordered data into ten equal parts, each containing 10% of the data.

Question 10

Q

How do we use linear interpolation to find the median?

Answer

A

Find the true class boundaries and the class width (the data has usually been rounded).
Find the total frequency and divide by 2 to get the position of the median.
Identify which class the median is in and work out how far into that class it lies, as a fraction of that class's frequency.
Multiply this fraction by the class width.
Add the result to the lower boundary of the class.
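The steps above can be sketched in Python. This is only an illustration: the function name, the tuple layout `(lower, upper, frequency)` and the sample table are assumptions, and the class boundaries passed in are expected to be the true boundaries already.

```python
def interpolated_median(classes):
    """Estimate the median of grouped data by linear interpolation.

    classes: list of (lower_boundary, upper_boundary, frequency),
             in ascending order, using true class boundaries.
    """
    total = sum(freq for _, _, freq in classes)
    target = total / 2  # position of the median
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # How far into this class the median lies, as a fraction
            fraction = (target - cumulative) / freq
            # Scale by the class width and add to the lower boundary
            return lower + fraction * (upper - lower)
        cumulative += freq
    raise ValueError("no data")


table = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]
print(interpolated_median(table))
```

In this sample table the total frequency is 20, so the median is at position 10, which is 5 places into the middle class of frequency 10.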

Question 11

Q

How do we find the interquartile range?

Answer

A

Upper quartile minus lower quartile (Q3 − Q1)

Question 12

Q

What is the advantage of using the interquartile range?

Answer

A

It ignores extremes

Question 13

Q

How do we find the interpercentile range?

Answer

A

The higher percentile minus the lower percentile, e.g. P90 − P10.

Question 14

Q

What is variance?

Answer

A

A measure of spread that takes all values into account. It is the average squared distance from the mean.

Question 15

Q

What are the 2 formulas for calculating variance? (Use first formula)

Answer

A

The sum of the squared values divided by n, minus the square of the mean: Σx²/n − x̄²

OR
‘The mean of the squares minus the square of the mean’ (MSMSM)
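To check that "the mean of the squares minus the square of the mean" agrees with the definition of variance as the average squared distance from the mean, here is a small Python sketch. The function names and the data set are invented for illustration, and both versions divide by n.

```python
def variance_from_definition(values):
    """Average squared distance from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_msmsm(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance_from_definition(data))
print(variance_msmsm(data))
```

The second form is usually quicker by hand because it only needs Σx and Σx².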

Question 16

Q

What is standard deviation?

Answer

A

The square root of the variance. It measures the typical distance of the values from the mean.

Question 17

Q

How do we calculate the standard deviation?

Answer

A

Take the square root of the variance.

Question 18

Q

What happens if you add a value to your data set that is within one standard deviation of the mean, i.e. the number lies in the range mean ± s.d.?

Answer

A

The standard deviation will decrease.
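A quick numerical check of this card, using Python's built-in `statistics.pstdev` (the standard deviation that divides by n). The data set and the added values are made up for illustration.

```python
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]
m, sd = mean(data), pstdev(data)
print(m, sd)

# 6 lies within one standard deviation of the mean
inside = data + [6]
print(pstdev(inside) < sd)

# 12 lies well outside one standard deviation of the mean
outside = data + [12]
print(pstdev(outside) > sd)
```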

Question 19

Q

What is Sxx?

Answer

A

Sxx = Σ(x − x̄)² = Σx² − (Σx)²/n
Therefore we can use the expression Sxx/n to find the variance.

Question 20

Q

What is coding?

Answer

A

Applying the same transformation to every data value so the data is easier to work with. This may or may not change the average, standard deviation, etc.

Question 21

Q

What happens to the mean if you have coded your data to be y=ax+b?

Answer

A

It is affected by both the a and b components: ȳ = a x̄ + b

Question 22

Q

What happens to the standard deviation if you have coded your data to be y=ax+b?

Answer

A

It is multiplied by the size of a: s.d. of y = |a| × s.d. of x
(It is not affected by the b component)

Question 23

Q

How are measures of location (e.g. mean) and spread (e.g. standard deviation) affected by coding?

Answer

A

Measures of location are affected by all parts of coding.
Measures of spread are only affected by multiplicative parts of coding.
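The effect of coding can be checked numerically. The sketch below uses y = ax + b with a = 3 and b = 10, chosen only for illustration, and relies on `statistics.mean` and `statistics.pstdev` from the Python standard library.

```python
from math import isclose
from statistics import mean, pstdev

a, b = 3, 10  # illustrative coding y = ax + b
x = [2, 4, 4, 4, 5, 5, 7, 9]
y = [a * value + b for value in x]

# Location: the mean is changed by both a and b
print(isclose(mean(y), a * mean(x) + b))

# Spread: the standard deviation is only scaled by a; b has no effect
print(isclose(pstdev(y), abs(a) * pstdev(x)))
```

Both comparisons print `True` for this data.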

Question 24

Q

What is assumed when we use midpoints to calculate the mean?

Answer

A

The data is uniformly distributed within each class.

Question 25

Q

What are the advantages and disadvantages of using the mode?

Answer

A

+Useful for non-numerical data
+Not usually affected by outliers
-Doesn’t use all the data
-May not be representative if it has a low frequency

Question 26

Q

What are the advantages and disadvantages of using the median?

Answer

A

+Not affected by outliers
+Not significantly affected by errors
-Doesn’t make use of all the data

Question 27

Q

What are the advantages and disadvantages of using the mean?

Answer

A

+When the data set is very large, a few extreme values have negligible impact
-When the data set is very small, a few extreme values have a large impact

Question 28

Q

What are the advantages and disadvantages of using the range?

Answer

A

+Reflects the full extent of the data set
-Distorted by outliers

Question 29

Q

What are the advantages and disadvantages of using the IQR?

Answer

A

+Not distorted by outliers
-Doesn’t reflect all the data

Question 30

Q

What are the advantages and disadvantages of using standard deviation?

Answer

A

+When the data set is very large, a few outliers have negligible impact
-When the data set is very small, a few outliers have a large impact