Lecture 16 - Data Visualisation Flashcards
Why do we need to visualise data?
- Makes it easier to understand datasets
- Get message across to the right people
- E.g. Florence Nightingale used data visualisation (1858) to highlight British soldiers’ living conditions during the Crimean War
- Got the message across to the government and Queen Victoria, resulting in legislation to improve sanitation in the military (which eventually led to fewer sanitation-related deaths in the civilian population)
What is Anscombe’s Quartet (1973)?
- Anscombe constructed four datasets
- Different datasets but the same summary statistics (means, SDs and correlations)
- From the numbers alone you can’t judge which tests are appropriate, e.g. a correlation is not appropriate for datasets 2, 3 and 4
- The same summary statistics don’t necessarily produce the same graphs
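A minimal sketch of this point in Python, using the quartet's values as they are commonly reproduced from Anscombe (1973). The helper names (`pearson_r`, `summarise`) are illustrative only and are not from the lecture. Each dataset gives the same rounded means, SDs and correlation, even though the four scatterplots look nothing alike.

```python
from math import sqrt
from statistics import mean, stdev

# Anscombe's four datasets (values as commonly reproduced from the 1973 paper).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares (illustrative helper)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def summarise(xs, ys):
    """Summary statistics rounded to two decimal places."""
    return {
        "mean_x": round(mean(xs), 2),
        "mean_y": round(mean(ys), 2),
        "sd_x": round(stdev(xs), 2),
        "sd_y": round(stdev(ys), 2),
        "r": round(pearson_r(xs, ys), 2),
    }


summaries = {k: summarise(xs, ys) for k, (xs, ys) in quartet.items()}
for k, s in summaries.items():
    print(k, s)
```

Plotting each pair as a scatterplot is what reveals the differences, for example the curved pattern in dataset 2 and the single extreme point in dataset 4.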
What did the Datasaurus study by Matejka and Fitzmaurice (2017) show?
- Cairo (2016) produced a dataset that created a plot in the shape of a dinosaur
- Matejka and Fitzmaurice (2017) generated 12 more datasets with near-identical descriptive statistics (the same to two decimal places), which produced 12 very different-looking plots
- Can’t rely on numbers alone
What is the purpose of data visualisation?
Data analysis process:
- Check that assumptions have been met
- Understand the relationships between variables before inferential analysis
Report writing and publication process:
- Show clear relationships between variables
- Help the reader interpret the data in the way you want them to
What types of graphs are used for checking data assumptions?
Graphs can be useful for checking data assumptions before running statistical tests.
Histograms = check for normality
Boxplots = check for outliers
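A rough sketch of the outlier check a boxplot performs, assuming the common 1.5 × IQR convention for the whiskers (the lecture does not state which rule is used, and software packages differ in how they compute quartiles). The scores are made up for illustration and `boxplot_outliers` is a hypothetical helper name.

```python
from statistics import quantiles


def boxplot_outliers(values):
    """Return values beyond 1.5 * IQR from the quartiles (assumed convention)."""
    # method="inclusive" treats the data as the whole population of interest;
    # other quartile definitions can give slightly different fences.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical questionnaire scores with one extreme value.
scores = [4, 5, 5, 6, 6, 7, 7, 8, 30]
flagged = boxplot_outliers(scores)
print(flagged)
```

On a boxplot these flagged values would be drawn as individual points beyond the whiskers.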
What types of graphs are used for summarising descriptive statistics?
Simple bar chart (summarise means in each subscale)
Clustered bar chart (compare groups on a similar category)
What types of graphs are used to graph relationships?
- We can use scatterplots to graph relationships between variables (and sometimes check assumptions)
- Benefits reader and researcher (e.g. check assumptions)
What are the basic properties of an APA formatted graph (‘figure’)?
- Don’t make graphs in SPSS! (not easily interpreted/APA friendly)
- Use e.g. Microsoft Excel instead
- Graphs are better with error bars
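The notes do not say what the error bars should represent. As one possible sketch, assuming they show the standard error of the mean (SEM), the bar heights and error-bar lengths for a bar chart could be computed like this. The group names and scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores for two groups (not from the lecture).
groups = {
    "control": [12, 15, 14, 10, 13],
    "treatment": [18, 17, 20, 16, 19],
}


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    return mean(values), stdev(values) / sqrt(len(values))


bars = {name: mean_and_sem(vals) for name, vals in groups.items()}
for name, (m, sem) in bars.items():
    print(f"{name}: bar height = {m:.2f}, error bar = +/-{sem:.2f}")
```

Whatever the error bars represent (SEM, SD or a confidence interval), the figure caption should say so.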
What makes a good graph?
- Tufte (2001) and the American Psychological Association (2021) suggest that:
- Images are clear
- Units of measurement are provided
- Axes are clearly labelled
- Elements in the figure are clearly labelled or explained
- Avoid distorting the data (e.g. truncated axes)
- Induce the reader to think about the underlying messages of the figure
- Avoid using chartjunk (the use of unnecessary or misleading elements in the design of a graph)
- “Regardless of its cause… non-data-ink or redundant data-ink… is often chartjunk.” (Tufte, 2001)
- Create the clearest graph possible with the least ink
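To see why a truncated axis distorts the data, here is a small arithmetic sketch with made-up values: two bars representing 50 and 55 differ by 10%, but if the y-axis starts at 45 the second bar is drawn twice as tall as the first. The function name `drawn_height_ratio` is illustrative only.

```python
def drawn_height_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the y-axis begins at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical values for two bars.
a, b = 50.0, 55.0

honest = drawn_height_ratio(a, b, axis_start=0.0)      # axis starts at zero
truncated = drawn_height_ratio(a, b, axis_start=45.0)  # axis starts at 45

print(f"Axis from 0:  second bar is {honest:.2f}x the height of the first")
print(f"Axis from 45: second bar is {truncated:.2f}x the height of the first")
```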
What is an example of a bad graph in public health data?
- Confirmed COVID-19 cases by county (Georgia Department of Public Health)
- No x- or y-axis labels
- Dates on the x-axis are not in chronological order (makes the data misleading)
- Colours of bars are not consistent across clusters or bars
What is an example of a bad graph in politics and social media?
- 2024 election – trying to show company donations to both campaigners: companies not in same order, bars not in proportion with each other
- Student loans graph – no axis labels
What are tables used for?
Useful for summarising lots of numerical information in one go
When do you use a figure or a table?
- What is the aim of your research, and how does your visualisation help communicate that?
- Do you want to just summarise data? = table
- Do you want to identify trends in a dataset? = figure
What are some other ways to plot data?
- There are lots of different ways, depending on the purpose of the chart you are producing e.g. visual vocabulary
Standard London Underground Map
- The London Underground map helps travellers identify underground stations and their associated lines
- Useful for planning a route
- However, geographical distance and the relationship to what is above ground aren’t considered
London Connections Underground Map
- Purpose was to help readers understand position of underground stations in relation to London at street level
- This was in response to consumer feedback
- However, this map can be difficult to read
Landmarks London Underground Map
- What if you just wanted to know where to get off for tourist landmarks?
- TfL also provide visualisations to help readers with this
- Here, the use of ‘chartjunk’ may actually be useful