8 - Data Visualization Flashcards
Why visualize?
Explorative Analysis
- visualization supports analyst intuition, e.g. finding outliers
- may help you look for connections you didn’t think of before
- manual data mining
- > requires both good visualization and flexible adaptation
- > you visualize for yourself
Why visualize?
Decision support & management information
- visualization can provide a quick overview of relevant trends and patterns, e.g. data from month to month
- > requires both good visualization and simple adaptation
- > you build visualization tools
Why visualize?
Presentation and argumentation
- Visualization can help you to underline your arguments with quantitative data in a way that is easy to communicate
- careful: ethics of visualization, e.g. What do I decide to show? How do I show it? Manipulation is possible without lying
- > requires polished visualization, no adaptation
- > you visualize for others
Good visualization …
- shows all relevant data
- makes the audience think about the content rather than the representations
- does not distort the data
- makes large data sets understandable
- enables comparisons
- layers details from overview to finer points
- has a clear purpose
- is integrated with the context of the representation
= substance + data analysis + design
Three steps for effective visualization
- formulate the question
- gather (and analyze) the data
- apply a visual representation
Questions, concepts and visuals
What … is the best and the worst?
Concept:
- maximums and minimums
Visuals:
- bar graph
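
A minimal sketch of the "best and worst" question, assuming made-up scores for three hypothetical items (A, B, C). It ranks the items, picks the extremes, and prints a rough text bar graph; none of the names or numbers come from the flashcards.

```python
# Hypothetical scores for three items (illustrative values only).
scores = {"A": 7, "B": 3, "C": 9}

# Rank items from highest to lowest value.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
best, worst = ranked[0], ranked[-1]

# A text-only bar graph: one '#' per unit, longest bar first.
for name, value in ranked:
    print(f"{name} {'#' * value}")

print("best:", best, "worst:", worst)
```

Sorting the bars by value is a design choice here: it puts the maximum and minimum at the two ends, where they are easiest to spot.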
Questions, concepts and visuals
How has … changed over time?
Concept:
- temporal patterns (trend, seasonality)
Visuals:
- line graph
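
One possible way to make a trend easier to see before drawing a line graph is a simple moving average. The sketch below assumes hypothetical monthly values and a window of 2; both the helper `moving_average` and the data are illustrative, not part of the flashcards.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical monthly values that zigzag while rising overall.
monthly = [10, 14, 12, 16, 14, 18]

# Smoothing over 2 months removes the zigzag and leaves the upward trend.
trend = moving_average(monthly, 2)
print(trend)
```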
Questions, concepts and visuals
What … stand out from the rest?
Concept:
- outliers
Visuals:
?
Questions, concepts and visuals
What makes … different from …?
Concept:
- clustering
Visuals:
?
Questions, concepts and visuals
How are … and … related?
Concept:
- correlation
Visuals:
- scatter plot
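
A scatter plot shows a relationship visually; a correlation coefficient summarizes it as one number. The sketch below computes the Pearson coefficient by hand, which is an assumption about the measure meant here, and the data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: one series rises with x, the other falls.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(pearson(xs, rising))   # close to +1: points lie on a rising line
print(pearson(xs, falling))  # close to -1: points lie on a falling line
```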
Questions, concepts and visuals
What is the breakdown for …?
Concept:
- distribution
Visuals:
- stacked bar graph
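
A stacked bar graph typically shows each part as a share of the whole. As a rough sketch, assuming hypothetical sales figures per region, the shares could be computed like this:

```python
def shares(counts):
    """Convert absolute counts into fractions of the total."""
    total = sum(counts.values())
    return {name: value / total for name, value in counts.items()}

# Hypothetical sales per region (illustrative values only).
sales = {"north": 50, "south": 30, "west": 20}

breakdown = shares(sales)
for name, share in breakdown.items():
    print(f"{name}: {share:.0%}")
```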
Warning: Visualizations can distort data
Data can be distorted by …
- changing the scale of the y-axis between diagrams
- modifying the baseline
- switching the aggregation level
- using areas to show one-dimensional data
- using advantageous visual effects (shadows, highlights, …)
- > distortion can be the consequence of errors, misleading decoration, or intentional deception
- > data visualization is also a matter of ethics
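
Two of the distortions listed above can be checked with simple arithmetic. The sketch below uses hypothetical numbers and assumes (a) circles whose radius is proportional to the value and (b) bars drawn from a chosen baseline; the helper names are illustrative.

```python
import math

def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2

# (a) Areas for one-dimensional data: b is twice a, but if the radius
# scales with the value, the drawn area grows by roughly a factor of four.
a, b = 10, 20
value_ratio = b / a
area_ratio = circle_area(b) / circle_area(a)

def visible_ratio(low, high, baseline):
    """Ratio of the drawn bar heights when bars start at `baseline`."""
    return (high - baseline) / (low - baseline)

# (b) Modified baseline: 55 is 10% more than 50, yet with a baseline
# of 45 the taller bar is drawn twice as high as the shorter one.
honest = visible_ratio(50, 55, baseline=0)
truncated = visible_ratio(50, 55, baseline=45)

print(value_ratio, area_ratio)
print(honest, truncated)
```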
Types of representation
- size: represent by area
- color: e.g. coloring values differently
- location: e.g. on a map
- network: e.g. identifying different groups
- time: e.g. line graph across different years
Different ways of visualizing distributions
Sorted
- you can show the median
Unsorted
- distribution according to the time of sample
- the median cannot be shown
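
A small sketch of why sorting matters for the median, assuming five hypothetical measurements. In sample order the middle position says nothing about the median; after sorting, the middle position is the median.

```python
import statistics

# Hypothetical measurements in the order they were sampled.
by_time = [7, 3, 9, 4, 5]

# In sample order, the middle element is just whatever was measured third.
middle_unsorted = by_time[len(by_time) // 2]

# After sorting, the middle element is the median (odd number of values).
by_value = sorted(by_time)
middle_sorted = by_value[len(by_value) // 2]

print(middle_unsorted)
print(middle_sorted, statistics.median(by_time))
```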
Histograms: Beware the power of bins
- small bins show variations at higher granularity
- the larger the bins the less variation is visible
- the more we aggregate, the more obvious the median becomes (however, further aggregation leads to loss of information)
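
The effect of bin size can be illustrated without any plotting library. The sketch below counts a hypothetical sample into bins of width 2, 10, and 50; `bin_counts` is an illustrative helper, and empty bins are simply omitted from its result.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical sample with two groups: 10 to 14 and 30 to 34.
sample = [10, 11, 11, 12, 14, 30, 31, 31, 33, 34]

fine = bin_counts(sample, 2)     # variation inside each group is visible
medium = bin_counts(sample, 10)  # only the two groups remain
coarse = bin_counts(sample, 50)  # everything collapses into one bin

print(fine)
print(medium)
print(coarse)
```

With width 2 the shape inside each group is still visible, with width 10 only the two groups remain, and with width 50 the histogram shows nothing beyond the sample size.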