Summarising Data Flashcards

1
Q

What is an average

A

A single value used to describe a data set

It is a measure of central tendency

The mode, median and mean are averages

2
Q

What is the mode

A

The value that occurs most often

3
Q

What is the median

A

The middle value

4
Q

When the data set has an odd number of values how do you find the median

A

It is the 1/2(n+1)th observation when the values are in order

Where n is the number of values

5
Q

How do you calculate the mean

A

Sum of x ÷ n
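
As a quick check of the three averages, here is a minimal Python sketch. The data values are made up for illustration, and the median step assumes an odd number of values so that 1/2(n+1) is a whole-number position.

```python
from statistics import mean, median, mode

# Hypothetical sample with an odd number of values.
data = [3, 5, 5, 6, 8, 9, 13]

n = len(data)
ordered = sorted(data)

# Median: the 1/2(n+1)th observation (positions count from 1).
median_position = (n + 1) // 2
median_value = ordered[median_position - 1]

# Mean: sum of x divided by n.
mean_value = sum(data) / n

# Mode: the value that occurs most often.
mode_value = max(set(data), key=data.count)

print(mode_value, median_value, mean_value)  # 5 6 7.0
```

The standard library functions statistics.mode, statistics.median and statistics.mean give the same results for this sample.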

6
Q

What is the mode in classed frequency data

A

The category / class with the highest frequency

7
Q

How do you find the median category in frequency table data

A

It is the class that contains the 1/2(n+1)th value

8
Q

What is the modal class

A

The class with the highest frequency

9
Q

How do you find the median for continuous data

A

It is the (1/2 × n)th value, where n is the total frequency

10
Q

What is linear interpolation

A

A method used to estimate the median value in grouped data

11
Q

How do you find the median / a quartile using linear interpolation

A

Find which group contains the median: work out its position, then use cumulative frequency to see which group that position falls in.

Find how far into the group the median is (median position - cumulative frequency before the group)

Then use the formula

Lower class boundary + ((median position - cumulative frequency before the group) ÷ frequency of the group) × class width

In other words, you find how far into the group your value is, express this as a proportion of the group's frequency, multiply by the class width and add the result to the lower class boundary
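
The interpolation step can be sketched as a small Python function. The function name, parameter names and sample figures are assumptions chosen for illustration; the same function works for a quartile if you pass the quartile's position instead.

```python
def interpolate(position, lower_boundary, cf_before, class_frequency, class_width):
    """Estimate the value at a given position inside a class.

    position        - position of the median or quartile, e.g. n / 2
    lower_boundary  - lower class boundary of the class containing it
    cf_before       - cumulative frequency before that class
    class_frequency - frequency of that class
    class_width     - width of that class
    """
    proportion_into_class = (position - cf_before) / class_frequency
    return lower_boundary + proportion_into_class * class_width


# Hypothetical class 10 < x <= 15 with frequency 10, 10 values before it,
# and a median position of 12.
estimate = interpolate(position=12, lower_boundary=10, cf_before=10,
                       class_frequency=10, class_width=5)
print(estimate)  # 11.0
```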

12
Q

How do you calculate an estimated mean from grouped data

A

Sum of (f×midpoint) ÷ sum of f
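
A minimal Python sketch of this estimate, using a made-up grouped table (the class boundaries and frequencies are hypothetical):

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 4), (10, 20, 6), (20, 30, 10)]

# Sum of (f × midpoint), where the midpoint is halfway between the boundaries.
total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)

# Sum of f.
total_f = sum(f for _, _, f in classes)

estimated_mean = total_fx / total_f
print(estimated_mean)  # 18.0
```

It is only an estimate because every value in a class is treated as if it sat at the midpoint.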

13
Q

What happens to the averages when every value is increased / decreased by a set percentage

A

The averages increase or decrease by the same percentage
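
A short Python sketch of this, assuming a hypothetical data set and a 10% increase applied to every value:

```python
from statistics import mean, median

# Hypothetical data; every value is increased by 10%.
data = [10, 20, 30, 40, 100]
factor = 1.10
scaled = [x * factor for x in data]

# Ratio of each new average to the old one; both should equal the factor.
mean_ratio = mean(scaled) / mean(data)
median_ratio = median(scaled) / median(data)

print(round(mean_ratio, 6), round(median_ratio, 6))  # 1.1 1.1
```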

14
Q

Why do we transform data

A

To make it easier to calculate the mean

15
Q

How do you transform data with decimals

A

Subtract the same number from each value

Then multiply by a power of 10 so that they are whole numbers

Now calculate your average

Then reverse the calculations to find the mean of the original data (divide by your power of 10 and add back the number you subtracted)
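
The steps can be sketched in Python as follows. The data, the subtracted number (5) and the multiplier (10) are assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical decimal data.
data = [5.1, 5.3, 5.4, 5.6]

subtract = 5     # the same number taken from every value
multiplier = 10  # power of 10 that turns the results into whole numbers

# Transform: subtract, then multiply. round() removes floating-point noise.
coded = [round((x - subtract) * multiplier) for x in data]

# Average of the easier, coded values.
coded_mean = mean(coded)

# Reverse the transformation: divide, then add back.
original_mean = coded_mean / multiplier + subtract

print(coded, coded_mean, original_mean)  # [1, 3, 4, 6] 3.5 5.35
```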

16
Q

What is a geometric mean

A

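An average found by multiplying all n values together and taking the nth root of the product

It is more appropriate than the arithmetic mean for values that combine by multiplication, such as percentage changes or growth rates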

17
Q

How do we calculate geometric mean

A

nth root of (v1 × v2 × v3 × ... × vn)

n is the number of values
Take the nth root of the product of the values
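
A minimal Python sketch of the calculation, with made-up values:

```python
import math

# Hypothetical values.
values = [2, 4, 8]
n = len(values)

# Multiply all the values together, then take the nth root.
product = math.prod(values)
geometric_mean = product ** (1 / n)

print(round(geometric_mean, 6))  # 4.0
```

From Python 3.8 the standard library also provides statistics.geometric_mean, which should agree with this up to rounding.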

18
Q

Why are weighted means used

A

For data where the values or groups carry different weightings (levels of importance)

19
Q

How are weighted means calculated

A

Sum of (value × weight) ÷ sum of weights

(Total wx) ÷ total w
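
A minimal Python sketch, assuming three hypothetical marks with weights 2, 3 and 5:

```python
# Hypothetical marks and their weights.
values = [70, 50, 90]
weights = [2, 3, 5]

# Total wx: sum of (value × weight).
total_wx = sum(w * x for w, x in zip(weights, values))

# Total w: sum of the weights.
total_w = sum(weights)

weighted_mean = total_wx / total_w
print(weighted_mean)  # 74.0
```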

20
Q

What is a range

A

Largest value - smallest value

21
Q

What is an interquartile range

A

Upper quartile - lower quartile

22
Q

What are the upper and lower quartiles

A

Upper = 3/4 th value

Lower = 1/4 th value

23
Q

How do you calculate quartiles in discrete data

A

The same way as the median but using the quartile's fraction rather than 1/2

E.g. the lower quartile is the 1/4(n+1)th value

24
Q

How do you calculate the range in a frequency table

A

Take the largest possible value and smallest possible value from the table

Subtract these values

25
Q

How do you calculate quartiles in continuous data

A

The same way you would calculate the median in continuous data

E.g. the lower quartile is the (1/4 × n)th value

26
Q

What are percentiles

A

The values that divide an ordered data set into 100 equal parts

27
Q

What are deciles

A

The values that divide an ordered data set into 10 equal parts

28
Q

What is an interdecile / interpercentile range

A

The difference between two percentiles or deciles

29
Q

What is standard deviation

A

A measure of how far the values deviate from the mean value

30
Q

How do you calculate standard deviation

A

√((sum of x^2 ÷ n) - (mean)^2)

Where n is the number of values
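
A minimal Python sketch of this formula (mean of the squares minus the square of the mean, then square-rooted), using made-up data:

```python
import math

# Hypothetical data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean_value = sum(data) / n
mean_of_squares = sum(x ** 2 for x in data) / n

# Square root of (mean of squares - square of mean).
standard_deviation = math.sqrt(mean_of_squares - mean_value ** 2)

print(standard_deviation)  # 2.0
```

This is the population form of standard deviation (dividing by n), which matches statistics.pstdev in the standard library.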

31
Q

How do you calculate standard deviation for grouped data

A

√((sum of f×x^2 ÷ sum of f) - (sum of fx ÷ sum of f)^2)

Where x is the midpoint of each class
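
For grouped data the calculation is the mean of the squares minus the square of the mean, each weighted by frequency, then square-rooted. A minimal Python sketch, assuming hypothetical class midpoints and frequencies:

```python
import math

# Hypothetical class midpoints (x) and frequencies (f).
midpoints = [2, 4, 6]
frequencies = [1, 6, 1]

total_f = sum(frequencies)
total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
total_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))

# Square root of (sum of fx^2 ÷ sum of f) minus (sum of fx ÷ sum of f) squared.
grouped_sd = math.sqrt(total_fx2 / total_f - (total_fx / total_f) ** 2)

print(grouped_sd)  # 1.0
```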

32
Q

What does a box plot show

A

Maximum and minimum values
Median
Upper and lower quartiles

33
Q

How do you find an outlier using quartiles

A

Small outlier < (lq - 1.5×iqr)

Large outlier > (uq + 1.5×iqr)
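
A minimal Python sketch of the quartile test. The data are made up, and n is chosen so that 1/4(n+1) and 3/4(n+1) are whole-number positions; other sizes would need interpolation between values.

```python
# Hypothetical data with one suspiciously large value.
data = [3, 5, 5, 6, 8, 9, 30]
ordered = sorted(data)
n = len(ordered)

# Quartiles by position (positions count from 1, list indexes from 0).
lq = ordered[(n + 1) // 4 - 1]       # 1/4(n+1) = 2nd value
uq = ordered[3 * (n + 1) // 4 - 1]   # 3/4(n+1) = 6th value
iqr = uq - lq

lower_limit = lq - 1.5 * iqr
upper_limit = uq + 1.5 * iqr

outliers = [x for x in ordered if x < lower_limit or x > upper_limit]
print(lq, uq, iqr, outliers)  # 5 9 4 [30]
```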

34
Q

How do you find an outlier using standard deviation

A

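Any value more than 3 standard deviations from the mean

Small outlier < mean - (3×standard deviation)

Large outlier > mean + (3×standard deviation)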
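
A minimal Python sketch of the standard deviation test, using made-up data and the population standard deviation (dividing by n):

```python
from statistics import mean, pstdev

# Hypothetical data with one extreme value.
data = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10, 10, 50]

centre = mean(data)
sd = pstdev(data)  # population standard deviation

lower_limit = centre - 3 * sd
upper_limit = centre + 3 * sd

# Values further than 3 standard deviations from the mean.
outliers = [x for x in data if x < lower_limit or x > upper_limit]
print(outliers)  # [50]
```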

35
Q

What are the 3 types of skew

A

Symmetrical distribution - median in the centre
Positive - median closer to lower quartile
Negative - median is closer to upper quartile

36
Q

What does mean>median>mode indicate

A

Positive skew

37
Q

What does mode>median>mean indicate

A

Negative skew

38
Q

How do you calculate skew

A

3(mean-median) / standard deviation
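
A minimal Python sketch of this skew calculation, using made-up data and the population standard deviation:

```python
from statistics import mean, median, pstdev

# Hypothetical data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# 3(mean - median) / standard deviation
skew = 3 * (mean(data) - median(data)) / pstdev(data)

print(skew)  # 0.75
```

A positive result suggests positive skew, a negative result suggests negative skew, and a result near 0 suggests a roughly symmetrical distribution.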

39
Q

Advantages and disadvantages of using the mode to show average

A

A:
Easy to find
Can be used with any data type
Unaffected by open-ended or extreme values
The mode is always an actual data value

D:
There may be no mode or multiple modes
Cannot be used to calculate a measure of spread

40
Q

Advantages and disadvantages of using the median to show average

A

A:
Easy to calculate
Unaffected by extreme values
Best to use when data is skewed
Can be used to calculate quartiles

D:
May not be a data value

41
Q

Advantages and disadvantages of using the mean to show average

A

A:
Uses all the data
Can be used to calculate standard deviation (statistical calculations)

D:
Affected by extreme values
Can be distorted by open ended classes

42
Q

What do you need to do when comparing data sets

A

You need to compare an average and a measure of spread

You can also compare the shape of the distribution (skew)

43
Q

What is the distribution in a box plot like

A

50% of data is less than the median
50% of the data is more than the median

25% of the data is less than the lq
25% of the data is greater than the uq

50% of the data is between the quartiles

44
Q

Linear interpolation example

Estimate the median amount of time spent watching tv

Median = 12th term
Median group = 10<x<=15

Cumulative frequency before the group is 10
Cumulative frequency at the end of the group is 20, so the group frequency is 20 - 10 = 10

A

The median is found 2 into the group (12 - 10)

LCB + (amount into group ÷ group frequency) × class width

10 + (2 ÷ 10) × 5 = 11

45
Q

What does it mean to transform data

A

To alter data in order to make calculations easier.

Once you have finished your calculation you must reverse the transformation to get the answer for the original data