Location & Spread Flashcards

1
Q

How do we represent the mean?

A

x̄ (x bar)

2
Q

How do we calculate the mean?

A

The sum of the x values divided by n (x̄ = Σx/n) or, for a frequency table, the sum of fx divided by the sum of f (x̄ = Σfx/Σf).
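
As a rough illustration of the two versions of the mean, here is a minimal Python sketch. The function names and the sample data are made up for this example; the frequency table is assumed to be a list of (x, f) pairs.

```python
def mean_from_list(values):
    """Mean of raw data: the sum of the x values divided by n."""
    return sum(values) / len(values)


def mean_from_frequencies(table):
    """Mean from (x, f) pairs: the sum of f*x divided by the sum of f."""
    total_fx = sum(x * f for x, f in table)
    total_f = sum(f for _, f in table)
    return total_fx / total_f


# The same (invented) data written as a list and as a frequency table.
raw = [2, 2, 3, 5, 5, 5]
freq = [(2, 2), (3, 1), (5, 3)]

print(mean_from_list(raw))
print(mean_from_frequencies(freq))
```

Both calls work from the same totals (22 divided by 6), so they should print the same value.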

3
Q

How do you find the position of LQ (Q1) within listed/grouped data?

A

n/4

4
Q

How do you find the position of the median within listed/grouped data?

A

n/2

5
Q

How do you find the position of UQ (Q3) within listed/grouped data?

A

3n/4

6
Q

When we find the position of a quartile within listed data, what do we do when the answer is a decimal?

A

Round up to the next integer

7
Q

When we find the position of a quartile within listed data, what do we do when the answer is whole?

A

Find the midpoint of the value at that position and the next value
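
The two rules for listed data (round a decimal position up; for a whole position, take the midpoint with the next value) can be sketched in Python as below. The function name `quantile_listed` and the sample data are invented for this illustration, and the fraction is passed as a numerator and denominator so the "whole or decimal" check avoids floating-point rounding.

```python
def quantile_listed(data, numerator, denominator):
    """Value at position (numerator/denominator) * n in listed data.

    Decimal position: round up and take that value.
    Whole position: take the midpoint of that value and the next one.
    Intended for quartiles and the median (fractions strictly between 0 and 1).
    """
    ordered = sorted(data)
    n = len(ordered)
    whole, remainder = divmod(numerator * n, denominator)
    if remainder != 0:
        position = whole + 1              # round up to the next integer
        return ordered[position - 1]      # positions are counted from 1
    # Whole-number position: midpoint with the next value in the list.
    return (ordered[whole - 1] + ordered[whole]) / 2


data = [1, 3, 4, 6, 7, 9, 12, 15, 18, 20]   # n = 10 (invented values)
print(quantile_listed(data, 1, 4))   # position 2.5 -> 3rd value
print(quantile_listed(data, 1, 2))   # position 5   -> midpoint of 5th and 6th
print(quantile_listed(data, 3, 4))   # position 7.5 -> 8th value
```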

8
Q

How do we find percentiles of grouped data, e.g. the 57th percentile?

A

Find the value at position 0.57 × n

9
Q

What is a decile?

A

Deciles split the ordered data into ten equal parts, each containing 10% of the data

10
Q

How do we use linear interpolation to find the median?

A
  1. Find the true class limits and class width (since the data is most often rounded)
  2. Find the total frequency of the data and divide by 2 to find the position of the median
  3. Identify which group the median is in and calculate how far into the group it lies, as a fraction of that group's frequency
  4. Multiply this fraction by the class width
  5. Add the result to the lower bound of the class
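
The interpolation steps above can be sketched in Python as follows. This is only an illustration: the function names, the (lower bound, upper bound, frequency) tuple layout, and the class data are assumptions, and the bounds are taken to be the true class limits already.

```python
def value_at_position(classes, position):
    """Linear interpolation for the value at a given position in grouped data.

    classes: ordered list of (lower_bound, upper_bound, frequency) tuples.
    """
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            # How far into this class the position lies, out of its frequency.
            fraction = (position - cumulative) / freq
            # Scale by the class width and add to the lower bound.
            return lower + fraction * (upper - lower)
        cumulative += freq
    raise ValueError("position is larger than the total frequency")


def grouped_median(classes):
    """Median of grouped data: interpolate at position n/2."""
    total = sum(freq for _, _, freq in classes)
    return value_at_position(classes, total / 2)


classes = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]   # invented data, n = 20
print(grouped_median(classes))
```

The same helper could be called with 0.57 × n as the position to estimate the 57th percentile.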
10
Q

How do we find the interquartile range?

A

Q3-Q1

10
Q

What is the advantage of using the interquartile range?

A

It ignores extremes

11
Q

How do we find the interpercentile range?

A

The higher percentile minus the lower percentile

12
Q

What is variance?

A

A measure of spread that takes all values into account. It is the average squared distance from the mean.

13
Q

What are the 2 formulas for calculating variance? (Use first formula)

A

The sum of the squared values divided by n, minus the square of the mean: σ² = Σx²/n − x̄²

OR
‘The mean of the squares minus the square of the mean’ (MSMSM)
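
As a quick check that "the mean of the squares minus the square of the mean" agrees with the definition of variance as the average squared distance from the mean, here is a small Python sketch. The function names and data are invented for the example, and both functions divide by n.

```python
def variance_from_deviations(values):
    """Average squared distance from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_msmsm(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2


data = [2, 4, 4, 4, 5, 5, 7, 9]   # invented data with mean 5
print(variance_from_deviations(data))
print(variance_msmsm(data))
```

For this data both functions give a variance of 4, so the standard deviation is 2.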

14
Q

What is standard deviation?

A

The square root of the variance. It indicates the typical distance of the values from the mean.

15
Q

How do we calculate the standard deviation?

A

Take the square root of the variance.

16
Q

What happens if you add a value to your data set that is within one standard deviation of the mean, i.e. the number lies within the range mean ± s.d.?

A

The standard deviation will decrease.
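
A small numerical check of this claim, using Python's `statistics.pstdev` (the standard deviation that divides by n). The data set and the added values are invented for the illustration.

```python
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, standard deviation 2

base = pstdev(data)
with_near_value = pstdev(data + [6])    # 6 lies within mean ± s.d. (3 to 7)
with_far_value = pstdev(data + [20])    # 20 lies well outside that range

print(base, with_near_value, with_far_value)
```

Here the standard deviation falls when 6 is added and rises when 20 is added.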

17
Q

What is Sxx?

A

Sxx = Σ(x − x̄)² = Σx² − (Σx)²/n
Therefore we can use the expression Sxx/n to find the variance
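
A minimal Python sketch of using Sxx to get the variance, assuming the usual definition Sxx = Σx² − (Σx)²/n. The function name `s_xx` and the data are made up for this example.

```python
def s_xx(values):
    """Sxx = sum of x squared, minus (sum of x) squared divided by n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


data = [2, 4, 4, 4, 5, 5, 7, 9]   # invented data
sxx = s_xx(data)
variance = sxx / len(data)         # variance = Sxx / n
print(sxx, variance)
```

For this data Σx² = 232 and Σx = 40, so Sxx = 232 − 200 = 32 and the variance is 32/8 = 4.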

18
Q

What is coding?

A

Applying the same transformation to every data value so that the data is easier to process. This may or may not change the mean, standard deviation, etc.

19
Q

What happens to the mean if you have coded your data to be y=ax+b?

A

It’s affected by both the a and b components: mean of y = a × (mean of x) + b

20
Q

What happens to the standard deviation if you have coded your data to be y=ax+b?

A

It is multiplied by a (strictly by |a|, since a standard deviation cannot be negative). It is not affected by the b component.

21
Q

How are measures of location (e.g. mean) and spread (e.g. standard deviation) affected by coding?

A

Measures of location are affected by all parts of coding.
Measures of spread are only affected by multiplicative parts of coding.
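
To see the effect of coding y = ax + b on a measure of location and a measure of spread, here is a short Python sketch using `statistics.mean` and `statistics.pstdev`. The helper `code_values`, the data, and the values of a and b are invented for the example.

```python
from statistics import mean, pstdev


def code_values(values, a, b):
    """Apply the coding y = a*x + b to every value."""
    return [a * x + b for x in values]


x = [2, 4, 4, 4, 5, 5, 7, 9]   # invented data: mean 5, standard deviation 2
a, b = 3, -10
y = code_values(x, a, b)

# Location: the mean picks up both the multiplication and the shift.
print(mean(y), a * mean(x) + b)

# Spread: the standard deviation is only scaled; the shift b has no effect.
print(pstdev(y), abs(a) * pstdev(x))
```

Each printed pair should agree, showing that b changes the mean but not the standard deviation.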

22
Q

What is assumed when we use midpoints to calculate the mean?

A

The data is uniformly (evenly) distributed within each class.
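
A minimal Python sketch of estimating the mean of grouped data from class midpoints. The function name, the (lower bound, upper bound, frequency) layout, and the data are assumptions made for this illustration.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data, using each class midpoint as x.

    classes: list of (lower_bound, upper_bound, frequency) tuples.
    """
    total_fx = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    total_f = sum(freq for _, _, freq in classes)
    return total_fx / total_f


classes = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]   # invented data
print(estimated_mean(classes))
```

The result is only an estimate, because every value in a class is treated as if it sat at the midpoint.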

23
Q

What are the advantages and disadvantages of using the mode?

A

+Useful for non-numerical data
+Not usually affected by outliers
-Doesn’t use all the data
-May not be representative if it has a low frequency

24
Q

What are the advantages and disadvantages of using the median?

A

+Not affected by outliers
+Not significantly affected by errors
-Doesn’t make use of all the data

25
Q

What are the advantages and disadvantages of using the mean?

A

+When the data set is very large, a few extreme values have negligible impact
-When the data set is very small, a few extreme values have a large impact

26
Q

What are the advantages and disadvantages of using the range?

A

+Reflects the full data set
-Distorted by outliers

27
Q

What are the advantages and disadvantages of using the IQR?

A

+Not distorted by outliers
-Doesn’t reflect all the data

28
Q

What are the advantages and disadvantages of using standard deviation?

A

+When the data set is very large, a few outliers have negligible impact
-When the data set is very small, a few outliers have a large impact