L5 Data visualisation Flashcards
everything we measure carries a what?
element of uncertainty
how do we account for random error?
reporting results with a margin of error
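A minimal sketch of reporting a result with a margin of error, assuming a 95% interval from the normal approximation (mean ± 1.96 × standard error). The readings are made-up illustrative values, not data from the lecture.

```python
import math
import statistics

# Hypothetical repeated measurements of the same quantity (illustrative only).
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]

mean = statistics.mean(readings)
# Standard error = sample standard deviation / sqrt(number of readings).
standard_error = statistics.stdev(readings) / math.sqrt(len(readings))
# Assumption: 1.96 is the multiplier for a 95% interval under a normal approximation.
margin_of_error = 1.96 * standard_error

print(f"{mean:.2f} ± {margin_of_error:.2f}")
```

With so few readings a t-based multiplier would usually be a little larger; 1.96 is used here only to keep the sketch simple.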
what war was Florence Nightingale in?
Crimean
what did Florence Nightingale do from July 1854 to end of 1855?
document:
- how many soldiers died
- what month they died in
- cause of death
when was FN’s work published?
1858
what did FN create, and what did this do?
visualisations of her data to send back to London
- was effective as it allowed the British Army to see where deaths were preventable and where to allocate resources
standard bar charts display -
categorical data clearly
if a bar chart is not shown as a percentage, then what does it not need to do?
add up to 100
what are bar charts good for?
comparing the size of groups
what is the difference in a cluster bar chart?
each year has a cluster of bars
what is a cluster bar chart good at?
showing patterns
are 3D bar charts good / should we use them?
no
why are 3D bar charts so bad?
- look bad
- difficult to interpret
- difficult to directly compare the height of bars
when are graphs more beneficial than tables?
when there are lots of data points and information
- i.e. different factors (years / places / etc.)
what did Hans Rosling do?
helped develop statistics to do with health
- showed governments where to put money and resources
do statistics prove things? what do they do?
no
- they help us understand how uncertain we should be about a measurement
pie charts purpose = to
illustrate proportions
pie charts show -
relative size of categories that add up to 100%
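A small sketch of turning category counts into the percentages a pie chart would show. The categories and counts are hypothetical, chosen only to illustrate that the shares add up to 100%.

```python
# Hypothetical counts per category (illustrative only).
counts = {"walk": 12, "bus": 18, "car": 30}

total = sum(counts.values())
# Each slice of the pie is that category's share of the total.
percentages = {name: 100 * count / total for name, count in counts.items()}

for name, share in percentages.items():
    print(f"{name}: {share:.1f}%")
print("sum:", sum(percentages.values()))
```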
who dislikes pie charts?
data scientists
why do data scientists dislike pie charts?
as they take up a lot of space
what can sometimes just show info better than pie charts?
tables
histograms are normally used to display -
continuous numerical data
what do histograms give us quickly?
an overview of the spread of data
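A rough sketch of what a histogram does behind the scenes: it sorts continuous values into bins and counts how many fall in each. The heights and the bin width of 10 are assumptions picked for illustration.

```python
from collections import Counter

# Hypothetical heights in cm (illustrative only).
heights_cm = [151, 158, 162, 163, 165, 167, 168, 171, 174, 183]
bin_width = 10  # assumed bin width

# Map each value to the lower edge of its bin, then count values per bin.
bin_counts = Counter((h // bin_width) * bin_width for h in heights_cm)

# Print a simple text histogram, one row per bin.
for start in sorted(bin_counts):
    print(f"[{start}, {start + bin_width}): {'#' * bin_counts[start]}")
```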
with histograms, what are we interested in that’s different to bar charts?
the size (area) of each bin - not just its height
data is right skewed when the mean is….
bigger than the median
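A quick check of the mean versus median rule for right-skewed data. The incomes are hypothetical; one very large value creates the long right tail.

```python
import statistics

# Hypothetical incomes in thousands (illustrative only); 120 is the long right tail.
incomes = [18, 20, 22, 24, 25, 27, 30, 120]

mean = statistics.mean(incomes)
median = statistics.median(incomes)

# The mean is dragged towards the tail, so it ends up above the median.
print(mean, median, mean > median)
```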
what are density plots similar to?
histograms
draw what a density plot looks like:
a smooth curve that follows the outline of a histogram's bars
density plots are a what?
smoothed out version of a histogram
density plots look as if you did what?
drew a freehand line over a histogram
what do histograms show?
the distribution of a continuous variable
box plots = a
standardised way of displaying the distribution of data based on a five-number summary (minimum, lower quartile, median, upper quartile, maximum)
what are box plots sometimes known as?
whisker plots (box-and-whisker plots)
what are box plots best at telling us about?
outliers and what their values are
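A sketch of the five-number summary behind a box plot, plus one common way of flagging outliers. The data are made up, and the 1.5 × IQR rule is an assumption here (a widely used convention, not something stated in these notes).

```python
import statistics

# Hypothetical data (illustrative only).
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 30]

# Quartiles; the "inclusive" method treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number_summary = (min(data), q1, q2, q3, max(data))

# Assumed convention: points more than 1.5 x IQR beyond the box are outliers.
iqr = q3 - q1
outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(five_number_summary, outliers)
```

Different software uses slightly different quartile methods, so the quartile values can vary a little between tools.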
line graphs are used to represent -
2 continuous variables
different lines show…
different things (i.e. countries)
what are line charts often used to illustrate?
- and an example
financial data
- i.e. stock market performance over time
what is the x axis normally in line charts?
time
what axis is time normally in line charts?
x
what can line charts also sometimes show?
- and an example
demographic information
- i.e. how life expectancy changes over time
scatterplots are sometimes known as -
scattergraphs
scatterplots are used to show -
2 continuous variables
what does colour in scatterplots show?
different groups or categories
example of a scatterplot =
-GDP per capita (x axis)
-Life expectancy (y axis)
example of the use of different sized points in a scatterplot =
population size
logarithmic scales also known as -
log scales
what are charts with log scales similar to?
scatterplots
log scales PRO =
points are easier to tell apart than on a standard scatterplot, as they are not all squeezed together
log scales CON =
can be misleading + difficult to interpret
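A rough illustration of why a log scale stops points being squeezed together. The GDP per capita figures are hypothetical, and `positions` is a helper name invented for this sketch; it rescales values to 0..1 to show where each point would sit along an axis.

```python
import math

# Hypothetical GDP per capita values spanning several orders of magnitude.
gdp_per_capita = [500, 1_000, 5_000, 10_000, 50_000, 100_000]

def positions(values):
    """Rescale values to 0..1, i.e. where each point would sit along an axis."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# Linear axis: most points bunch up near the left-hand end.
linear_positions = positions(gdp_per_capita)
# Log axis: the same points are spread much more evenly.
log_positions = positions([math.log10(v) for v in gdp_per_capita])

for value, lin, log in zip(gdp_per_capita, linear_positions, log_positions):
    print(f"{value:>7}: linear {lin:.3f}, log {log:.3f}")
```

The trade-off is the one noted in the CON card: equal distances on a log axis mean equal ratios, not equal differences, which is easy to misread.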
do the media normally use log scales?
no
(categorical data) Ordinal =
data that has natural ordering and hierarchy
example of ordinal data =
satisfaction rating / level of agreement
(categorical data) Nominal =
data has no natural ordering nor hierarchies
example of nominal data =
gender / eye colour
(numerical data) Discrete =
can only take specific, separate values (there may still be infinitely many options)
example of discrete data =
shoe size / age in years
(numerical data) Interval =
infinite options, can take any value in a given interval
example of interval data =
weight / percentage
what do continuous variables measure?
things that vary continuously
examples of continuous variables =
height / weight / income / age / mass
continuous variables examples are most often seen in…..
nature
categorical variables measure?
things that fall into categories
categorical variable examples can often be seen in the….
social world
true or false - continuous variables can be transformed into categorical ones
true
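A minimal sketch of turning a continuous variable into a categorical one by binning. The cut-offs (18 and 65) and the group labels are assumptions chosen for illustration.

```python
def age_group(age_in_years):
    """Map a continuous age to one of three categories (assumed cut-offs)."""
    if age_in_years < 18:
        return "under 18"
    if age_in_years < 65:
        return "18 to 64"
    return "65 and over"

# Hypothetical ages (illustrative only).
ages = [4.5, 17.9, 18.0, 42.3, 64.9, 70.2]
groups = [age_group(a) for a in ages]
print(groups)
```

The transformation only goes one way: once ages are grouped, the exact values cannot be recovered from the categories.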
true or false - data visualisations can mislead us (sometimes intentionally)
true
truncating the axes =
starting the axis above zero, which shortens the height of the bars
what does truncating the axes make it look like?
makes it look like there is a much larger gap than there is
what is the problem with truncating the axes?
does not give a true representation of the difference between groups
what does truncating the axes defeat the point of?
plotting a chart
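A small sketch of how much a truncated axis can exaggerate a gap. The two group values and the axis start of 45 are hypothetical; `apparent_ratio` is a helper invented here to compare the drawn bar heights.

```python
def apparent_ratio(smaller, larger, axis_start=0):
    """Ratio of the drawn bar heights when the value axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical values for two groups (illustrative only).
group_a, group_b = 50, 55

honest = apparent_ratio(group_a, group_b)                    # axis starts at zero
truncated = apparent_ratio(group_a, group_b, axis_start=45)  # axis starts at 45

print(honest, truncated)
```

With the axis starting at zero the second bar looks about 10% taller; with the axis starting at 45 it looks twice as tall, although the data have not changed.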
Beware also of d… a…..
dual axes
dual axes can make things…
look more closely related than in reality they actually are
dual axes can sometimes try and…
force relationships
when it comes to dual axes, we as the audience should always make sure to -
double check
must also be aware of researchers being ………… about ……
selective about data
researchers being selective about data can be -
misleading
where can we often see researchers being selective about data?
in headlines
x axis =
line on a graph that runs horizontally (left right)
y axis =
line on a graph that runs vertically (up down)
all axes run through what?
zero
what are the most common types of data visualisation?
bar charts
line charts
scatterplots
…….. are an important part of data in the media
graphs