Topic 3 - Numerical Summaries Flashcards

1
Q

LO

A

LO3 Produce, interpret and compare graphical and numerical summaries, using base R and ggplot.

2
Q

Advantages of Numerical summaries

A
  • Numerical summaries reduce all the data to one simple number or statistic
  • This loses a lot of information, but is easy to communicate
3
Q

Major features used to create a numeric summary

A
  • Max
  • Min
  • Spread (stdev, range, IQR)
  • Centre (mean, median)
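
As a rough sketch, each feature above has a base R function. The vector x is made-up data for illustration, not from the course.

```r
# Hypothetical sample data, made up for illustration
x <- c(2, 4, 4, 5, 7, 9, 11)

min(x)          # smallest value
max(x)          # largest value
mean(x)         # centre: mean
median(x)       # centre: median
sd(x)           # spread: sample standard deviation
IQR(x)          # spread: interquartile range
diff(range(x))  # spread: range (max - min)
```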
4
Q

Mean

A

The average of the data
= sum of data / size of data
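
A minimal check of the formula in base R, assuming a small made-up vector x:

```r
x <- c(3, 5, 8, 10)              # hypothetical data
by_hand <- sum(x) / length(x)    # sum of data / size of data
by_hand                          # 6.5
mean(x)                          # built-in function gives the same value
```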

5
Q
A
6
Q

Median

A
  • The middle data point when the data are ordered from smallest to largest (with an even number of points, the average of the two middle values)
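
A short sketch in base R, using made-up vectors, for an odd and an even number of data points:

```r
odd  <- c(7, 1, 5)       # sorted: 1 5 7   -> middle value is 5
even <- c(7, 1, 5, 3)    # sorted: 1 3 5 7 -> average of 3 and 5 is 4

sort(odd)                # median() sorts internally; shown here for clarity
median(odd)              # 5
median(even)             # 4
```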
7
Q

Robustness

A
  • The median is said to be robust: it is a good summary for skewed data because it is barely affected by outliers
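
To illustrate with made-up numbers, replacing one value with an extreme outlier moves the mean a long way but leaves the median unchanged:

```r
x     <- c(1, 2, 3, 4, 5)      # hypothetical data
x_out <- c(1, 2, 3, 4, 500)    # same data with one extreme outlier

mean(x);     median(x)         # 3 and 3
mean(x_out); median(x_out)     # 102 and 3: only the mean moves
```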
8
Q

Comparing mean and median

A

Symmetrical data:
- Expect mean and median to be about the same

Left skewed data:
- Expect mean to be smaller than the median

Right skewed data:
- Expect mean to be larger than the median
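
A small sketch with made-up vectors (one per shape) shows the three patterns. These are hand-picked examples, not a proof of the rule.

```r
sym   <- c(1, 2, 3, 4, 5)     # symmetrical
left  <- c(1, 8, 9, 9, 10)    # long tail to the left
right <- c(1, 2, 2, 3, 10)    # long tail to the right

c(mean(sym),   median(sym))   # 3.0 and 3: the same
c(mean(left),  median(left))  # 7.4 and 9: mean is smaller
c(mean(right), median(right)) # 3.6 and 2: mean is larger
```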

9
Q

Limitations of mean and median

A

Both describe only the centre, so each needs to be paired with a measure of the spread of the data

10
Q

Standard deviation

A
  • First define the Root Mean Square (RMS)
  • Measures the average size of a set of numbers, regardless of their signs
  • Read the name backwards to get the steps:
    1) Square the numbers
    2) Take the mean of the results
    3) Take the square root of that mean
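
A minimal sketch of RMS in base R. The helper name rms is hypothetical (base R has no built-in with this name), and the data are made up. The operations apply in the order square, then mean, then root.

```r
# Hypothetical helper: square each value, take the mean, then the square root
rms <- function(v) sqrt(mean(v^2))

x <- c(-1, 1, -7, 7)   # made-up data whose signs cancel out
mean(x)                # 0: the plain mean hides the size of the values
rms(x)                 # 5: squares are 1, 1, 49, 49; their mean is 25; root is 5
```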
11
Q

Stdev in terms of RMS

A

Stdev measures the spread of data
SDpop = RMS of the gaps (deviations) from the mean
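
As a sketch of this definition in base R, with made-up data and illustrative variable names:

```r
x    <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical data, mean is 5
gaps <- x - mean(x)                 # gap of each point from the mean
sd_pop <- sqrt(mean(gaps^2))        # RMS of the gaps
sd_pop                              # 2
```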

12
Q

Population vs sample SD

A

SDpop = SDsample × sqrt((n - 1) / n)
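
This matters in R because the built-in sd() returns the sample SD (dividing by n - 1). A minimal sketch of the conversion, using made-up data:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)          # hypothetical data
n <- length(x)

sd_sample <- sd(x)                      # sample SD, divides by n - 1
sd_pop    <- sd_sample * sqrt((n - 1) / n)

sd_pop                                  # 2
sqrt(mean((x - mean(x))^2))             # same value, computed directly
```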

13
Q

SD rule of thumb

A

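For roughly normal (bell-shaped) data:
About 68% of the data lie within 1 SD of the mean
About 95% lie within 2 SDs of the mean
About 99.7% lie within 3 SDs of the mean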
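
These percentages can be checked against the normal curve with pnorm() in base R. This sketch assumes the data follow a normal distribution; real data will only match approximately.

```r
# Area under the standard normal curve within k SDs of the mean
within_1 <- pnorm(1) - pnorm(-1)   # about 0.68
within_2 <- pnorm(2) - pnorm(-2)   # about 0.95
within_3 <- pnorm(3) - pnorm(-3)   # about 0.997

round(c(within_1, within_2, within_3), 3)
```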

14
Q

Standard units

A
  • Also called the ‘z’ score
  • How many SDs a data point is above or below the mean

Standard units = (Data Point - Mean) / SD
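
A minimal sketch in base R with made-up data. It assumes the population SD is the one intended; the card does not say which SD to use.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)          # hypothetical data, mean is 5
sd_pop <- sqrt(mean((x - mean(x))^2))   # population SD (assumption), here 2

z <- (x - mean(x)) / sd_pop             # standard units for every point
z[x == 9]                               # 2: the value 9 is 2 SDs above the mean
z[x == 2]                               # -1.5: the value 2 is 1.5 SDs below
```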

15
Q

IQR

A
  • Another measure of spread
  • Range of the middle 50% of the data
    IQR = Q3 - Q1
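
A short sketch in base R with made-up data. Note that quantile() uses R's default quartile method; other conventions can give slightly different quartiles on small data sets.

```r
x <- 1:9                             # hypothetical data
q <- quantile(x, c(0.25, 0.75))      # Q1 and Q3 (R's default method)

q                                    # Q1 = 3, Q3 = 7
iqr_by_hand <- unname(q[2] - q[1])   # Q3 - Q1
iqr_by_hand                          # 4
IQR(x)                               # built-in function gives the same value
```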
16
Q

Coefficient of Variation

A
  • Combines the mean and SD into one summary
    CV = SD / Mean
  • The higher the CV, the greater the spread relative to the mean
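
As a rough sketch in base R, two made-up data sets with the same mean but different spread give different CVs. The helper name cv is hypothetical, and using the sample SD from sd() is an assumption; the card does not say which SD to use.

```r
# Hypothetical helper: SD divided by mean (sample SD assumed)
cv <- function(v) sd(v) / mean(v)

a <- c(9, 10, 11)    # mean 10, tightly clustered
b <- c(5, 10, 15)    # mean 10, more spread out

cv(a)                # 0.1
cv(b)                # 0.5: larger spread relative to the same mean
```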