Basic Summary Stats and Data Visualisation Flashcards

1
Q

What types of data are there?

A

Data can be discrete or continuous. Discrete data can be broken up into categorical (no order) or ordinal (there exists an explicit order).

2
Q

What descriptive stats exist?

A
  • Max
  • Min
  • Average
  • Frequency
  • Variance
  • Standard deviation
3
Q

What are examples of central tendency measures?

A
  • Mean
  • Mode
  • Median
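
As a small illustration, the three measures can be computed with Python's standard `statistics` module. The sample values below are made up for this sketch and are not from the deck.

```python
import statistics

# Hypothetical sample; 40 is an unusually large value.
values = [2, 3, 3, 5, 7, 40]

mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)
```

In this sample the single large value pulls the mean up to 10, while the median stays at 4 and the mode at 3.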
4
Q

What are examples of measures of spread?

A
  • Range
  • Variance
  • Standard deviation
  • Quartiles/Percentiles
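
A minimal sketch of these measures using Python's standard `statistics` module. The card does not say whether population or sample variance is meant, so the population versions are used here as an assumption. The sample data is hypothetical.

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # hypothetical sample

data_range = max(values) - min(values)         # range: largest minus smallest value
variance = statistics.pvariance(values)        # population variance
std_dev = statistics.pstdev(values)            # population standard deviation
quartiles = statistics.quantiles(values, n=4)  # three cut points: Q1, Q2, Q3

print(data_range, variance, std_dev, quartiles)
```

`statistics.variance` and `statistics.stdev` give the sample versions (dividing by n - 1) if those are wanted instead.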
5
Q

What five numbers are needed for a box plot?

A
  • Median: The middle value
  • Q1: The middle value of the data below the median
  • Q3: The middle value of the data above the median
  • Minimum: The lowest value
  • Maximum: The highest value
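
The five numbers can be sketched in Python as follows, taking Q1 and Q3 as the medians of the lower and upper halves, as the card describes. The function name `five_number_summary` and the sample data are hypothetical. Other quartile conventions exist and can give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                   # values below the median position
    upper = data[half + len(data) % 2:]   # values above the median position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

summary = five_number_summary([1, 3, 4, 6, 7, 8, 10, 12, 15])
print(summary)  # (1, 3.5, 7, 11.0, 15)
```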
6
Q

What is an important measure of spread in a boxplot?

A

IQR = Q3 - Q1

7
Q

What is an outlier?

A

An outlier is a data object that deviates from the normal objects as if it were generated by a different mechanism

8
Q

What makes a Tukey boxplot different to a normal boxplot?

A
  • The upper whisker ends at the largest value within 1.5 IQRs above Q3, rather than at the overall maximum
  • The lower whisker ends at the smallest value within 1.5 IQRs below Q1, rather than at the overall minimum
  • Suspected outliers are drawn as outlined (unfilled) circles and lie more than 1.5 IQRs above Q3 or below Q1
  • Outliers are drawn as filled black circles and lie more than 3 IQRs above Q3 or below Q1
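
The 1.5 IQR and 3 IQR rules can be sketched in Python as below. The function name `tukey_summary`, the sample data, and the choice of the "inclusive" quartile method are assumptions made for illustration. A plotting library may compute quartiles differently.

```python
import statistics

def tukey_summary(values, inner=1.5, outer=3.0):
    """Classify points using Tukey's fences (quartiles via the 'inclusive' method)."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_in, hi_in = q1 - inner * iqr, q3 + inner * iqr    # inner fences (1.5 IQRs)
    lo_out, hi_out = q1 - outer * iqr, q3 + outer * iqr  # outer fences (3 IQRs)
    inside = [x for x in data if lo_in <= x <= hi_in]
    return {
        # Whiskers end at the most extreme values still inside the inner fences.
        "whiskers": (inside[0], inside[-1]),
        # Beyond the inner fences but not beyond the outer fences.
        "suspected": [x for x in data
                      if (lo_out <= x < lo_in) or (hi_in < x <= hi_out)],
        # Beyond the outer fences.
        "outliers": [x for x in data if x < lo_out or x > hi_out],
    }

result = tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 30])
print(result)
```

For this sample Q1 = 3.5 and Q3 = 8.5, so the IQR is 5. The whiskers run from 1 to 9, 18 is a suspected outlier, and 30 is an outlier.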
9
Q

What is a histogram?

A
  • A data visualisation that shows how often the values of a numeric variable fall into each of a set of ranges (bins).
  • Values are grouped into bins that can all be the same width (common) or of varied width.
  • The x-axis shows the bins and the y-axis shows the frequency (count or relative frequency) of each bin
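
The binning step behind a histogram can be sketched in plain Python. This assumes equal-width bins and at least two distinct values. The function name `histogram_counts` and the sample data are hypothetical.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins  # assumes hi > lo
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts

values = [1, 2, 2, 3, 5, 6, 6, 6, 9, 10]  # hypothetical sample
counts = histogram_counts(values, num_bins=3)
relative = [c / len(values) for c in counts]  # relative frequencies per bin
print(counts, relative)
```

With three bins of width 3 the counts are 4, 4 and 2, giving relative frequencies of 0.4, 0.4 and 0.2.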
10
Q

What is a bar plot?

A
  • A data visualisation that shows, for each category of a categorical variable, some numeric quantity associated with it
  • The x-axis shows the categories and the y-axis shows the numeric quantity, such as the relative frequency
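
As a rough sketch, the relative frequency behind each bar can be computed with `collections.Counter`. The category data is made up, and the text chart is drawn sideways (categories down the page) only to keep the example free of plotting libraries.

```python
from collections import Counter

colours = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical data

counts = Counter(colours)     # occurrences of each category
total = sum(counts.values())
rel_freq = {category: n / total for category, n in counts.items()}

# One text "bar" per category, with length proportional to relative frequency.
for category, freq in rel_freq.items():
    print(f"{category:<6} {'#' * round(freq * 12)} {freq:.2f}")
```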
11
Q

What is a scatter plot?

A
  • A data visualisation that shows the relation between two numeric variables
  • The x-axis and y-axis both represent numeric variables
12
Q

What are elements of good visualisations?

A
  • Meaningful titles
  • Labeled axes
  • Is suitable for the data set
  • Can be interpreted on its own
  • Has no redundant information