Statistics - Location and spread Flashcards

1
Q

What is a measure of location?

A

Single value which describes a position in a data set

2
Q

What is a measure of central tendency?

A

Single value that describes the centre of the data

3
Q

What is the mean?

A

Sum of data values / number of data values

4
Q

What is the median?

A

The middle value when the data values are put in order

5
Q

What is the mode?

A

The value or class that occurs most often
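As a minimal sketch of the three definitions above, the following uses Python's standard `statistics` module on a small made-up data set (the values are purely illustrative, not from the deck):

```python
from statistics import mean, median, multimode

# Illustrative data, already in ascending order
data = [2, 3, 3, 5, 7, 10]

data_mean = mean(data)        # sum of data values / number of data values
data_median = median(data)    # middle value; averages the two middle values when n is even
data_modes = multimode(data)  # list of every value tied for most frequent

print(data_mean, data_median, data_modes)
```

`multimode` is used rather than `mode` so that a data set with two modes returns both of them.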

6
Q

When should the median be used?

A

Use with quantitative data

Preferred when there are extreme values (outliers), because they do not distort it

7
Q

When should the mean be used?

A

Use with quantitative data when every value should contribute to the average

But it is pulled towards extreme values (outliers)
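To see why extreme values favour the median, here is a small sketch comparing both averages before and after one outlier is added. The numbers are made up for illustration only:

```python
from statistics import mean, median

values = [4, 5, 6, 7, 8]
with_outlier = values + [100]  # one extreme value added

# Mean and median agree on the original data
mean_before, median_before = mean(values), median(values)

# The mean is dragged upwards by the outlier; the median barely moves
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)
print(round(mean_after, 2), median_after)
```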

8
Q

When should the mode be used?

A

Qualitative or quantitative
Used when there is a single mode or two modes (bimodal)
Not very informative if each value occurs once

9
Q

How can the mean of data in a frequency table be calculated?

A

Mean = Sum of products of data and their frequencies / sum of the frequencies
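The frequency-table formula can be sketched as follows. The table is a hypothetical example stored as a dictionary mapping each data value x to its frequency f:

```python
# Hypothetical frequency table: data value x -> frequency f
freq_table = {0: 3, 1: 5, 2: 8, 3: 4}

total_fx = sum(x * f for x, f in freq_table.items())  # sum of (value x frequency)
total_f = sum(freq_table.values())                    # sum of the frequencies

table_mean = total_fx / total_f
print(total_fx, total_f, table_mean)
```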

10
Q

What is the lower quartile?

A

Q1

One-quarter of the way through the data set

11
Q

What is the upper quartile?

A

Q3

Three-quarters of the way through the data set

12
Q

How is the data split by the 85th percentile?

A

85% of the data values are less than the 85th percentile

15% of the data values are greater than the 85th percentile

13
Q

How can you calculate the lower quartile for discrete data?

A

n/4
If whole number, Q1 is halfway between this point and the one above
If not whole number, round UP and pick this data point

14
Q

How can you calculate the upper quartile for discrete data?

A

3n/4
If whole number, Q3 is halfway between this point and the one above
If not whole number, round UP and pick this data point
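The quartile rule on these two cards can be sketched as one small function. The name `discrete_quartile` and the sample data are illustrative assumptions; the function follows the card's rule, which differs from the interpolation methods many statistics libraries use, so library results may not match:

```python
def discrete_quartile(sorted_data, quarter):
    """Return Q1, Q2 or Q3 (quarter = 1, 2 or 3) of data sorted in ascending order."""
    n = len(sorted_data)
    numerator = quarter * n  # position is numerator / 4
    if numerator % 4 == 0:
        # Whole number: halfway between this data point and the one above
        k = numerator // 4
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Not a whole number: round UP and pick that data point
    k = numerator // 4 + 1
    return sorted_data[k - 1]

eight = [1, 3, 4, 6, 7, 9, 12, 15]           # n = 8, so n/4 = 2 and 3n/4 = 6
ten = [2, 4, 5, 7, 8, 9, 11, 13, 14, 20]     # n = 10, so n/4 = 2.5 and 3n/4 = 7.5

print(discrete_quartile(eight, 1), discrete_quartile(eight, 3))
print(discrete_quartile(ten, 1), discrete_quartile(ten, 3))
```

Integer arithmetic is used for the whole-number check so that no floating-point comparison is needed.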

15
Q

How can you calculate Q1, Q2 and Q3 from a cumulative frequency table?

A

Q1 = (n/4)th data value
Q2 = (n/2)th data value
Q3 = (3n/4)th data value
Do NOT round these positions

16
Q

Define percentile

A

The value below which a given percentage of the data falls

17
Q

What is interpolation?

A

Technique to estimate quartiles (Q1 to Q3) and percentiles from grouped data

This assumes the data values are evenly distributed

18
Q

What is the equation for linear interpolation?

A

Lower class boundary + ((position of quartile - cumulative freq. below the class) / freq. of the class) x class width
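A rough sketch of linear interpolation on a grouped frequency table is shown below. The function name `interpolated_value`, the class boundaries and the frequencies are all made up for illustration; the sketch assumes the position is positive and that values are spread evenly within each class:

```python
def interpolated_value(classes, position):
    """Estimate the value at a given position (e.g. n/4 for Q1).

    classes: list of (lower_boundary, upper_boundary, frequency) in ascending order.
    """
    cum_below = 0  # cumulative frequency below the current class
    for lower, upper, freq in classes:
        if position <= cum_below + freq:
            # Fraction of the way through this class, scaled by the class width
            return lower + (position - cum_below) / freq * (upper - lower)
        cum_below += freq
    raise ValueError("position is beyond the total frequency")

# Hypothetical grouped data with n = 40
classes = [(0, 10, 5), (10, 20, 10), (20, 30, 15), (30, 40, 10)]
n = sum(freq for _, _, freq in classes)

q1 = interpolated_value(classes, n / 4)      # no rounding of the position
q2 = interpolated_value(classes, n / 2)
q3 = interpolated_value(classes, 3 * n / 4)
print(q1, round(q2, 2), q3)
```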

19
Q

What is the range?

A

Difference between largest and smallest values in the data set

20
Q

What is the interquartile range IQR?

A

Difference between upper and lower quartile

Q3 - Q1

21
Q

Why is IQR used?

A

It does not include extreme values

Only considers spread of middle 50% of the data

22
Q

What is the inter-percentile range?

A

Difference between the values for two given percentiles

23
Q

What is variance a measure of?

A

The spread of the data

24
Q

What is the equation for variance?

A

(Sum of x^2)/n - ((Sum of x)/n)^2

25
Q

What is the equation for standard deviation?

A
Sqrt of [(Sum of x^2)/n - ((Sum of x)/n)^2]
i.e. the square root of the variance
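The variance and standard deviation formulas on these cards can be checked with a short sketch. The data values are illustrative, and `statistics.pvariance` is used only as a cross-check on the hand calculation:

```python
import math
from statistics import pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values
n = len(data)

sum_x = sum(data)                   # Sum of x
sum_x2 = sum(x * x for x in data)   # Sum of x^2

# Mean of the squares minus the square of the mean
variance = sum_x2 / n - (sum_x / n) ** 2
std_dev = math.sqrt(variance)

print(variance, std_dev, pvariance(data))
```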
26
Q

How is variance/standard dev. different for grouped data in a frequency table?

A

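Each x and x^2 term is multiplied by its frequency f, and n becomes the sum of the frequencies (for grouped data, x is the class midpoint)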
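As a sketch of the frequency-table version, the same variance formula is applied below with every term weighted by its frequency. The midpoints and frequencies are hypothetical, and using class midpoints for grouped data is an assumption that gives an estimate rather than an exact value:

```python
import math

# Hypothetical grouped data: class midpoint x -> frequency f
grouped = {5: 5, 15: 10, 25: 15, 35: 10}

sum_f = sum(grouped.values())                        # replaces n
sum_fx = sum(f * x for x, f in grouped.items())      # Sum of fx
sum_fx2 = sum(f * x * x for x, f in grouped.items()) # Sum of fx^2

grouped_variance = sum_fx2 / sum_f - (sum_fx / sum_f) ** 2
grouped_std_dev = math.sqrt(grouped_variance)

print(grouped_variance, round(grouped_std_dev, 2))
```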

27
Q

What is coding?

A

A technique to simplify statistical calculations

Gives simpler numbers to work with

28
Q

What is the equation for coding data?

A

y = (x-a)/b

29
Q

What is the equation for the mean of coded data?

A

mean of y = (mean of x - a)/b

30
Q

What is the standard dev. of coded data?

A

Coded standard dev. = standard dev. / b
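The coding results above can be verified numerically. In this sketch the data and the constants `a` and `b` are illustrative choices, with `b` assumed positive:

```python
import math
from statistics import mean, pstdev

x = [1010, 1020, 1030, 1040, 1050]  # illustrative raw data
a, b = 1000, 10                     # illustrative coding constants, b > 0

y = [(xi - a) / b for xi in x]      # coded data: y = (x - a) / b

# Mean of coded data equals (mean of x - a) / b
coded_mean_direct = mean(y)
coded_mean_formula = (mean(x) - a) / b

# Standard deviation of coded data equals (standard deviation of x) / b
coded_sd_direct = pstdev(y)
coded_sd_formula = pstdev(x) / b

print(coded_mean_direct, coded_mean_formula)
print(round(coded_sd_direct, 4), round(coded_sd_formula, 4))
```

Subtracting `a` shifts the mean but leaves the spread alone, while dividing by `b` rescales both.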

31
Q

What affects measures of location and spread in coding?

A

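Adding or subtracting a constant changes the measures of location but not the measures of spread

Multiplying or dividing by a constant changes both the measures of location and the measures of spread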

32
Q

What is the formula for sample variance?

A

Sum of (x - mean of x)^2 / (n-1)

33
Q

What is the formula for sample standard deviation?

A

Square root of [Sum of (x - mean of x)^2 / (n-1)]
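The sample formulas, with their n-1 divisor, can be sketched and cross-checked against Python's `statistics.variance` and `statistics.stdev`, which also divide by n-1. The data values are illustrative:

```python
import math
from statistics import stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative sample
n = len(data)
x_bar = sum(data) / n            # mean of x

# Sum of squared deviations from the mean, divided by (n - 1)
sample_variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)
sample_std_dev = math.sqrt(sample_variance)

print(round(sample_variance, 4), round(sample_std_dev, 4))
print(round(variance(data), 4), round(stdev(data), 4))
```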

34
Q

What are the advantages and disadvantages of range?

A

Adv: Easiest measure of dispersion to calculate
Dis: Heavily affected by extreme values, no info on spread of the rest of the values

35
Q

What are the advantages and disadvantages of interquartile range?

A

Adv: Not affected by extreme values (used when outliers present)
Dis: Difficult to calculate for grouped data

36
Q

What are the advantages and disadvantages of variance?

A

Adv: Depends on all data values
Dis: Difficult to calculate, affected by outliers, different units from actual data values

37
Q

What are the advantages and disadvantages of standard deviation?

A

Adv: Depends on all data values, same units as data values
Dis: Difficult to calculate, affected by outliers