Numerical Measures Flashcards

1
Q

What is the formula for the mean of a data set?

A

(Σx)/n where x represents the values of data and n is the number of values

2
Q

How is the median value found for non-grouped data?

A

The median is the ((n+1)/2)th value of the ordered data, where n is the number of values
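As an illustration, the position rule can be sketched in Python. The function name `median_ungrouped` is made up for this example, and the handling of a position ending in .5 (taking the mean of the two neighbouring values) is assumed to follow the same convention used elsewhere in this deck.

```python
def median_ungrouped(values):
    """Median of raw (non-grouped) values using the (n+1)/2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2               # 1-based position of the median
    if position.is_integer():
        return ordered[int(position) - 1]
    # Position ends in .5: take the mean of the values either side of it
    lower = ordered[int(position) - 1]   # value just before the position
    upper = ordered[int(position)]       # value just after the position
    return (lower + upper) / 2


print(median_ungrouped([7, 1, 4, 9, 3]))      # odd number of values
print(median_ungrouped([7, 1, 4, 9, 3, 8]))   # even number of values
```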

3
Q

How is the median value found for grouped data?

A

The median is the (n/2)th value, where n is the total frequency

(If the position ends in .5, take the mean of the values on either side of it to get the median)

4
Q

What is the mode?

A

The mode is the most common value in a data set.

Note: for grouped data there is a modal class, which is the class with the highest frequency (or the highest frequency density when the class widths differ).
Also note, not all samples have a mode.

5
Q

When working out the mean, median, mode or quartiles what information do you need?

A

You need a cumulative frequency column to locate the median and quartiles (the mean and mode are found from the frequencies themselves)

6
Q

What is the range?

A

The range is the difference between the highest and lowest value in a data set

7
Q

What is the lower quartile?

A

The data value at the 25th percentile of the sample.

For non-grouped data, the position of the lower quartile is 0.25(n+1), where n is the number of values (the final cumulative frequency)

For grouped data, the position of the lower quartile is 0.25(n), where n is the total frequency (the final cumulative frequency)

(If the position ends in .5, take the mean of the values on either side of it to get the lower quartile)

8
Q

What is the interquartile range?

A

The difference between the upper and lower quartiles

(Q3 - Q1)

(this is a value)

9
Q

What is the upper quartile?

A

The data value at the 75th percentile of the sample.

For non-grouped data, the position of the upper quartile is 0.75(n+1), where n is the number of values (the final cumulative frequency)

For grouped data, the position of the upper quartile is 0.75(n), where n is the total frequency (the final cumulative frequency)

(If the position ends in .5, take the mean of the values on either side of it to get the upper quartile)

10
Q

How are variance and standard deviation related?

A

Variance = (Standard Deviation)²

11
Q

What actually is variance?

A

The mean of the squared distances of the data points from the mean; it therefore represents the spread of the data

12
Q

How is variance found?

A
  • Find the mean of the data points
  • Calculate the difference between each data point and the mean value (write this as a new list of values)
  • Square each of these differences
  • Find the sum of the squared differences
  • Divide this sum by the number of data points, n
  • Write the final answer in the relevant unit squared
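
The steps can be sketched in Python as a rough illustration. The function name `variance` and the sample data are invented for this example, and the final division by n is assumed to match the Sxx/n form of the variance given elsewhere in this deck.

```python
def variance(values):
    """Variance of a list of values, following the steps one at a time."""
    n = len(values)
    mean = sum(values) / n                     # find the mean
    differences = [x - mean for x in values]   # difference from the mean
    squared = [d ** 2 for d in differences]    # square each difference
    total = sum(squared)                       # sum of the squared differences
    return total / n                           # divide by the number of values


data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample data
print(variance(data))
```

If the data were lengths in cm, the result would be in cm².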
13
Q

How do you find the variance, SD, median and quartiles with your calculator?

A
  • MENU (6)
  • 1-Variable (1)
  • Enter data and frequency
  • AC
  • OPTN
  • 1-Variable Calc (2)

For grouped data, find the midpoint of each class and enter it as the data value

Note: enter the frequency, not the cumulative frequency, into the calculator

14
Q

How do you deal with grouped data when inputting into the calculator to find numerical measures?

A

Use the midpoint of each class as the value to input

15
Q

What is grouped and non-grouped data?

A

Grouped data refers to data given in class intervals (e.g. 10-20)

Non-grouped data refers to individual pieces of data (e.g. 6, 24, 69, 420)

16
Q

How can you convert grouped data into non-grouped data?

A

Write out the value of each group as many times as its frequency states

(e.g. a group of 3 people with 4 cats each becomes 4, 4, 4)

Note: this also works in reverse

17
Q

When there are gaps in a continuous grouped data set (lengths 0-9, 10-19, 20-29), what do you always do first?

A

Adjust the class boundaries to the points at which a measurement would stop rounding into one class and start rounding into the next

(0-9, 10-19, 20-29) becomes (0-9.5, 9.5-19.5, 19.5-29.5)

Then find the midpoint column

18
Q

When there are gaps in discrete grouped data sets (ages 0-5, 6-10, 11-15), what do you always do first?

A

Adjust the class boundaries so that the upper boundary of each class is the lower boundary of the next

(0-5, 6-10, 11-15, …) becomes (0-6, 6-11, 11-16, etc.)

Then find the midpoint column

19
Q

What is continuous data?

A

Data which can take any value within a range (e.g. girth, length and height)

20
Q

What is discrete data?

A

Data which can only take certain separate values and can be counted (e.g. sausages, boys and pens)

21
Q

What is the ‘formula’ for linear interpolation?

A

(UB-LB)/(UF-LF) = (Q-LB)/(N-LF)

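This states that the ratio of the class width (UB - LB) to the frequency within the class (UF - LF) is the same as the ratio of (Q - LB) to (N - LF). Here Q is the median or quartile being found, LB and UB are the lower and upper class boundaries, LF and UF are the cumulative frequencies at those boundaries, and N is the position of Q.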
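
Rearranging the equation for Q gives Q = LB + (N - LF)(UB - LB)/(UF - LF). A minimal Python sketch of that rearrangement follows; the function name `interpolate` and the class values used are hypothetical and chosen only for illustration.

```python
def interpolate(lb, ub, lf, uf, n_pos):
    """Solve (UB-LB)/(UF-LF) = (Q-LB)/(N-LF) for Q."""
    return lb + (n_pos - lf) * (ub - lb) / (uf - lf)


# Hypothetical class 9.5-19.5, with cumulative frequency 8 at its lower
# boundary and 20 at its upper boundary; the median position is N = 14.
print(interpolate(lb=9.5, ub=19.5, lf=8, uf=20, n_pos=14))
```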

22
Q

State an assumption of linear interpolation

A

Data is evenly spread within the boundaries

23
Q

How do you find ‘N’ in linear interpolation?

A

For the median, N is (total frequency / 2), where the total frequency is the final cumulative frequency
For the LQ, N is (total frequency / 4)
For the UQ, N is 3 x (total frequency / 4)

24
Q

What are the steps of linear interpolation?

A

– Adjust the class boundaries of the grouped data to close any gaps
– Add a cumulative frequency column
– Take the final cumulative frequency as n and substitute it into the relevant expression for N (median / quartile)
– Find the class in which this value of N falls
– Draw the interpolation diagram
– Find UB and LB by reading the boundaries of that class
– Find UF and LF by reading the cumulative frequency at either end of the class
– Substitute these values, including N, into the equation and solve for Q
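
The whole procedure (apart from drawing the diagram) can be sketched in Python. This is only an illustration: the function name `grouped_quantile`, its parameters, and the example table are assumptions made for this sketch, and the class boundaries are assumed to have had any gaps closed already.

```python
from itertools import accumulate


def grouped_quantile(boundaries, frequencies, fraction):
    """Estimate a median or quartile from grouped data by linear interpolation.

    boundaries  -- list of (LB, UB) pairs, gaps already closed
    frequencies -- frequency of each class, in the same order
    fraction    -- 0.5 for the median, 0.25 for the LQ, 0.75 for the UQ
    """
    cumulative = list(accumulate(frequencies))   # cumulative frequency column
    n_pos = fraction * cumulative[-1]            # N from the total frequency
    for i, (lb, ub) in enumerate(boundaries):
        uf = cumulative[i]                       # cumulative frequency at UB
        lf = cumulative[i - 1] if i > 0 else 0   # cumulative frequency at LB
        if n_pos <= uf:                          # N falls in this class
            return lb + (n_pos - lf) * (ub - lb) / (uf - lf)
    raise ValueError("N lies beyond the total frequency")


# Made-up table: classes 0-9, 10-19, 20-29 after closing the gaps
classes = [(0, 9.5), (9.5, 19.5), (19.5, 29.5)]
freqs = [8, 12, 10]
print(grouped_quantile(classes, freqs, 0.5))    # median estimate
print(grouped_quantile(classes, freqs, 0.25))   # lower quartile estimate
```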

25
Q

What is the equation for standard deviation as coded data?

A

Sy = Sx / b

Where Sy is the standard deviation of the coded data, Sx is the standard deviation of the original data, and b is the constant that the data was divided by when coding

26
Q

What are the advantages and disadvantages of using the median as a measure of location?

A

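Advantages:
- Not affected by outliers
- Can be used for non-numerical data that can be ranked in order
Disadvantages:
- Does not use all data values
- Not always an observed data value (for an even number of values it is the mean of the middle two)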

27
Q

How do you draw a linear interpolation diagram?

A
  • Draw a horizontal straight line
  • Draw 3 vertical marks across it: one at each end and one in between
  • Write the lower and upper boundaries above the end marks, and Q above the middle mark
  • Write the lower and upper cumulative frequencies below the end marks, and the value of N below the middle mark (N is calculated first from the total frequency)
  • Solve for Q using the equation
28
Q

True or false: the data given in linear interpolation questions is always grouped

A

True. You will never be given non-grouped data

29
Q

What is the equation for variance?

A

Sxx/n

or

(Σx²)/n - x̄²
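
As a quick check that the two forms agree, here is a short Python sketch. It assumes the usual definition Sxx = Σ(x - x̄)², which this card does not spell out, and the data values are made up.

```python
data = [1, 2, 3, 4, 5]   # made-up sample data
n = len(data)
mean = sum(data) / n

# Sxx is assumed here to mean the sum of squared differences from the mean
sxx = sum((x - mean) ** 2 for x in data)

variance_from_sxx = sxx / n
variance_from_sums = sum(x ** 2 for x in data) / n - mean ** 2
standard_deviation = variance_from_sxx ** 0.5   # square root of the variance

print(variance_from_sxx, variance_from_sums, standard_deviation)
```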

30
Q

What is the equation for standard deviation?

A

√(Sxx/n)

or

√((Σx²)/n - x̄²)

31
Q

What is the general equation for coded data?

A

y = (x-a) / b

where y is the coded data value, x is the original data and a and b are constants

32
Q

What is the equation to find the coded mean?

A

ȳ = (x̄ - a)/b

Where ȳ is the coded mean, x̄ is the original mean, and a and b are constants
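
The coding relationships can be checked numerically. The sketch below uses Python's standard `statistics` module (`mean` and `pstdev`, the population standard deviation); the helper name `code` and the values of x, a and b are invented for this example, with b taken to be positive.

```python
from statistics import mean, pstdev


def code(values, a, b):
    """Apply the coding y = (x - a) / b to every value."""
    return [(x - a) / b for x in values]


x = [110, 120, 130, 140, 150]   # made-up original data
a, b = 100, 10                  # made-up coding constants
y = code(x, a, b)

print(mean(y), (mean(x) - a) / b)   # coded mean, two ways
print(pstdev(y), pstdev(x) / b)     # coded standard deviation, two ways
```

Subtracting a shifts the mean but leaves the standard deviation alone, while dividing by b scales both.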

33
Q

What are the advantages and disadvantages of using the mode as a measure of location?

A

Advantages:
- Not affected by an outlier
- Useful for non-numerical data

Disadvantages:
- Does not use all data
- May be multiple modes

34
Q

What are advantages and disadvantages of using the mean as a measure of location?

A

Advantages:
- Large data set makes outliers negligible
- Uses all data values

Disadvantages:
- Affected by outliers in small data sets

35
Q

When you have discrete data with gaps do you amend the gaps or not?

A

You do not amend the gaps. You only amend gaps in continuous grouped data

36
Q

What are the advantages and disadvantages of using the range as a measure of spread?

A

Advantages:
- Reflects the full data set

Disadvantages:
- Affected by outliers

37
Q

What are the advantages and disadvantages of using the interquartile range as a measure of spread?

A

Advantages:
- Not affected by outliers

Disadvantages:
- Does not reflect the full data set

38
Q

What are the advantages and disadvantages of using the standard deviation as a measure of spread?

A

Advantages:
- Outliers are negligible in large data sets

Disadvantages:
- Outliers have a big impact on small data sets

39
Q

What does sigma (Σ) notation express?

A

Sigma (Σ) refers to the ‘sum of’
For example, sigma x (Σx) means the sum of all the values of x

40
Q

What does standard deviation actually mean?

A

A measure of how far each data point is from the mean, and therefore represents the spread of the data

41
Q

Does addition/subtraction (when coding data) affect the mean and standard deviation? (skip this card)

A

Adding or subtracting will affect the mean of the data but not the standard deviation. This is because all data points have increased/decreased by the same value and so the distance from the mean is no different.

The mean will change by the same value as the addition or subtraction

42
Q

Does multiplication/division (when coding data) affect the mean and standard deviation?

A

Multiplying or dividing affects both the mean and the standard deviation
Both the mean and the standard deviation change by the same factor as the multiplication or division

43
Q

What is the mean?

A

The mean is the sum of all data divided by the number of pieces of data

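For grouped data the same idea is applied using the midpoint of each class as x, giving (Σfx)/(Σf). This is an estimate, because the individual values within each class are not known.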
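
For a grouped frequency table, one common approach is to let the class midpoint stand in for every value in the class. A minimal Python sketch under that assumption follows; the function name `grouped_mean` and the table are made up for illustration.

```python
def grouped_mean(boundaries, frequencies):
    """Estimate the mean of grouped data from class midpoints."""
    midpoints = [(lb + ub) / 2 for lb, ub in boundaries]
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))   # sum of f*x
    return total_fx / sum(frequencies)                              # divide by sum of f


classes = [(0, 10), (10, 20), (20, 30)]   # made-up class boundaries
freqs = [2, 5, 3]                         # made-up frequencies
print(grouped_mean(classes, freqs))
```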

44
Q

What is the equation for standard deviation?

A

√(Sxx/n)