Data visualisation flashcards

1
Q

Statistical visualization

A
  • A fancy way of saying graphs?
  • Sort of, but the visualisations covered here draw on statistical calculations (e.g. median, IQR)
  • So we are not just graphing raw data but visualising statistical summaries of the data
2
Q

Why might we use graphs rather than numeric summaries (tables, numbers in text)?

A
  • Visually appealing
  • Can convey a “story” without the reader needing to know statistics
  • Quickly identify trends or patterns in data
  • Descriptive statistics can hide patterns in data
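A classic illustration of the last point is Anscombe's quartet, which ships with R as the built-in `anscombe` data frame. The sketch below is an added example in base R (the choice of dataset is an assumption for illustration, not part of the original deck): the four x/y pairs have almost identical means, standard deviations and correlations, yet the scatterplots look completely different.

```r
# Anscombe's quartet: four small datasets with near-identical summaries.
# `anscombe` is a built-in data frame (columns x1..x4, y1..y4).
x_cols <- paste0("x", 1:4)
y_cols <- paste0("y", 1:4)

# Numeric summaries: these barely differ across the four sets
y_means <- sapply(anscombe[y_cols], mean)
y_sds   <- sapply(anscombe[y_cols], sd)
cors    <- mapply(function(x, y) cor(anscombe[[x]], anscombe[[y]]),
                  x_cols, y_cols)

print(round(y_means, 2))
print(round(y_sds, 2))
print(round(cors, 2))

# Graphs: the differences between the sets are obvious at a glance
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[x_cols[i]]], anscombe[[y_cols[i]]],
       xlab = x_cols[i], ylab = y_cols[i], main = paste("Set", i))
}
par(old_par)
```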
3
Q

Histogram

A
  • Used for continuous data
  • (Bar charts, by contrast, are used for categorical data)
  • Divide the range of possible values into intervals (“bins”), then count the number of observations that fall in each bin
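The binning step can be sketched in base R. This is an added illustration with made-up numbers; the vector `x` and the bin width of 2 are assumptions, not values from the course.

```r
# Hypothetical continuous measurements
x <- c(1.2, 2.5, 2.9, 3.1, 4.8, 5.0, 5.5, 7.3, 8.8, 9.9)

# Bin edges: 0-2, 2-4, 4-6, 6-8, 8-10 (each bin includes its upper edge)
breaks <- seq(0, 10, by = 2)

# Assign every observation to a bin, then count per bin
counts <- table(cut(x, breaks = breaks))
print(counts)

# hist() does the same binning and draws one bar per bin
hist(x, breaks = breaks, main = "Histogram of x")
```

Changing the bin width changes the counts, which is why the same data can look smoother or spikier depending on the number of bins.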
4
Q

Tukey Boxplots

A
  • Suited to continuous data
  • Shows 5 descriptive statistics in one plot
  • Minimum bound, Q1, median, Q3, maximum bound
  • Plus outliers
  • Allow you to say a lot about your data, as we will find out
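  • Each of the four sections (lower whisker, lower half of the box, upper half of the box, upper whisker) contains roughly 25% of the data
  • The size of each section tells us about the amount of variation in that quarter
  • If the lower “whisker” is short, the data in that quarter are quite similar
  • If the upper whisker is long, there is more variability in the data in that quarter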
  • Let’s see this in histogram form
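As a rough sketch of how the five statistics and the whisker lengths can be read off in base R, the example below uses `boxplot.stats()` on a small invented sample (the vector `x` is an assumption for illustration). Note that `boxplot.stats()` uses Tukey's hinges for the quartiles, which can differ slightly from `quantile()`.

```r
# Hypothetical right-skewed sample: values bunch up low, spread out high
x <- c(10, 11, 12, 13, 14, 16, 19, 23, 28, 34, 41)

s <- boxplot.stats(x)
# s$stats holds: minimum bound, Q1, median, Q3, maximum bound
# s$out holds any outliers (none in this sample)
print(s$stats)

lower_whisker <- s$stats[2] - s$stats[1]  # Q1 minus minimum bound
upper_whisker <- s$stats[5] - s$stats[4]  # maximum bound minus Q3

# A short lower whisker and a long upper whisker: more spread at the top
print(c(lower = lower_whisker, upper = upper_whisker))

boxplot(x, horizontal = TRUE)
```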
5
Q

Minimum and Maximum Bounds

A
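  • Not necessarily the true min and max
  • The minimum bound is the smallest data point at or above the lower threshold; the maximum bound is the largest data point at or below the upper threshold
  • These thresholds are Q1 - (1.5 x IQR) and Q3 + (1.5 x IQR)
  • Observations below the lower threshold or above the upper threshold are referred to as outliers, and are plotted as dots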
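The threshold rule can be worked through by hand in base R. This is an added sketch on invented data (the vector `x` is an assumption); it uses `quantile()` with its default method, so a plotting function that computes quartiles differently may give slightly different bounds.

```r
# Hypothetical sample with one unusually large value
x <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 30)

q1  <- unname(quantile(x, 0.25))  # 4.5
q3  <- unname(quantile(x, 0.75))  # 9.5
iqr <- q3 - q1                    # 5

# Thresholds: Q1 - 1.5 x IQR and Q3 + 1.5 x IQR
lower_threshold <- q1 - 1.5 * iqr  # -3
upper_threshold <- q3 + 1.5 * iqr  # 17

# Bounds are the most extreme data points still inside the thresholds
min_bound <- min(x[x >= lower_threshold])  # 2 (here also the true minimum)
max_bound <- max(x[x <= upper_threshold])  # 11 (not the true maximum, 30)

# Anything beyond the thresholds is an outlier
outliers <- x[x < lower_threshold | x > upper_threshold]  # 30
```

Here the upper whisker would stop at 11, and 30 would be drawn as a separate dot.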
6
Q

(RStudio) histogram

A

• histogram(data, x = dataset, bins = 16, by = dataset, position = 'dodge')

7
Q

(RStudio) boxplot

A

• tukeyboxplot(data = data, y = dataset, x = dataset) + labs(x = 'data', y = 'dataset, in seconds')

8
Q

(RStudio) scatterplot

A

• scatterplot(data = data, x = dataset, y = dataset)
