Exam Flashcards
What is Data Visualization?
The use of human visual perception to help communicate the results of data analytics.
What are the goals of DV?
Exploratory Analysis
- Starting point: We intend to discover new knowledge from the input data.
- Process: Explore the obtained visual representation and look for signs of particular tendencies and relations.
- Results: Visualization of data that can form the basis of a hypothesis.
Confirmatory Analysis
- Starting point: We already have a hypothesis and an objective for the data.
- Process: Goal-oriented visual examination of the hypothesis.
- Results: Determine evidence for the acceptance or rejection of the pre-formulated hypothesis.
Presentation
- Starting point: The facts to be presented in the graphical display are fixed a priori.
- Process: Choice of appropriate presentation techniques.
- Results: High-quality visualization of the facts.
What are the take-aways for DV?
Perception
- Visual perception is subjective.
- How we represent our data is not independent of how others will understand it.
- Not all visual features are alike (e.g. color is different from length).
Data Analytics
- We need to show our data to others for them to believe our findings.
- A summary metric on its own can hide strong bias and outliers.
- Visual vocabulary can be used to encode information in a qualitative way that makes it easier to detect patterns and bias.
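As a rough illustration of how a single metric can hide outliers, the sketch below compares two made-up samples (the values are hypothetical, chosen only for this example). Both have the same mean, yet one contains an extreme value that would be obvious as soon as the points were plotted.

```python
from statistics import mean, median

# Hypothetical response times in milliseconds (illustrative values only).
typical = [102, 98, 101, 99, 100]
with_outlier = [50, 50, 50, 50, 300]  # the last value is an outlier

# The mean alone cannot tell the two samples apart.
print(mean(typical), mean(with_outlier))

# Looking at more of the data (here: median and maximum) exposes the difference.
print(median(typical), median(with_outlier))
print(max(typical), max(with_outlier))
```

A dot plot or box plot of the two samples would make the same point visually, which is the argument for showing the data rather than only the metric.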
Communication
- Representation of data through visual cues can be used in different tasks.
- Not all tasks have the same requirements.
- Context is important!
What is context in DV?
Who
Identify your decision-maker and audience.
What
Focus on the actions you expect from your audience and adapt your communication to the mechanism used.
How
How your data will support the "What".
What are the different levels of detail?
Trying to get funding (Investors/Board) < Brainstorming new ideas for a project (Colleagues) < Designing a KPI dashboard (Colleagues)
What are Tufte’s Five Laws of Data-Ink?
- Above all else, show the data: focus on the data itself and presenting it clearly
- Maximize the data-ink ratio: maximize the proportion of ink (or pixels) used to represent the data compared to the total ink used in the graphic
- Erase non-data-ink: gridlines, background colors, and other elements that do not directly contribute to conveying the information should be minimized.
- Erase redundant data-ink: elements that repeat information already present in the data should be erased.
- Revise and edit: review and refine the visualizations to improve clarity and effectiveness.
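The data-ink ratio can be written as data-ink divided by the total ink used in the graphic. The sketch below is only a toy calculation: the function name `data_ink_ratio` and the pixel counts are assumptions made for illustration, not measurements of a real chart.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink (or pixels) that encodes data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical pixel counts for the same bar chart before and after clean-up.
before = data_ink_ratio(data_ink=12_000, total_ink=40_000)  # gridlines, background fill
after = data_ink_ratio(data_ink=12_000, total_ink=15_000)   # non-data-ink erased

print(f"{before:.2f} -> {after:.2f}")
```

The data-ink stays the same in both calls; only the non-data-ink shrinks, so the ratio moves closer to 1.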
What are the Gestalt Principles of Visual Perception?
Proximity: Objects that are close to each other are perceived as forming a group.
Similarity: Objects that are of similar color, shape and size are perceived as being part of the same group.
Enclosure: If elements are enclosed together we see them as part of the same group.
Closure: When presented with an incomplete or partially obscured image, people tend to mentally fill in the missing information to perceive the whole.
Continuity: Lines or patterns that follow a smooth, continuous flow are perceived as more related and are grouped together.
Connection: We identify objects that are physically connected as part of a group. This is typically stronger than proximity or similarity, though usually not stronger than enclosure.
What is Figure and Ground?
Figures are perceived to be in the foreground, while Ground is whatever lies behind the figure.
The figure is distinguished from the background by Gestalt laws.
What are Preattentive Attributes?
Visual properties of an object or stimulus that the human brain can detect and process rapidly, effortlessly, and in parallel, without the need for focused attention.
These attributes are processed in the early stages of visual perception, often before conscious awareness kicks in.
E.g. Orientation, shape, line length, line width, size, curvature, added marks, enclosure, hue, intensity, spatial position, motion
What are the meanings of colors?
Earthtones: Calming, sinks into the page.
Cool: Soothing, restful, calm.
Unnatural Colors: Alarming, unnerving, draws attention.
Warm: Optimistic, active, vivid.
Increasing Color Intensity: Draws the eye and means the point is more important.
Color-Blind Guide
Some people see colors in different ways.
Blue is the safest color.
Green/Red are not easy to distinguish.
We can use Blue/Orange or Blue/Red instead.
Data Models vs Conceptual Models
Data Models: formal descriptions using mathematical operations.
Conceptual Models: mental constructions that include semantics to support reasoning.
Examples:
1D Float Number vs Temperature
3D Vector of Float Numbers vs Spatial Location (Coordinates)
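The first example can be sketched in code. This is one possible reading, not a standard definition: the class name `Temperature` and its methods are hypothetical and exist only to show the difference between a bare number and a number with meaning attached.

```python
from dataclasses import dataclass

# Data model: a plain float. Nothing says what the number means.
raw_value: float = 21.5


# Conceptual model: the same float plus semantics (quantity and unit),
# which lets us reason about it.
@dataclass(frozen=True)
class Temperature:
    celsius: float

    def to_fahrenheit(self) -> float:
        return self.celsius * 9 / 5 + 32

    def is_below_freezing(self) -> bool:
        # Freezing point of water at 0 degrees Celsius.
        return self.celsius < 0.0


reading = Temperature(celsius=raw_value)
print(reading.to_fahrenheit(), reading.is_below_freezing())
```

The stored value is identical in both cases; only the conceptual model supports questions such as "is it freezing?".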
What are the different data types?
1. Qualitative (Categorical):
a) Nominal:
- No quantitative relationship between categories
- Classification without ordering
- Example: Gender, nationality, type of animal
b) Ordinal:
- Attributes can be rank-ordered
- Distances between values do not have any meaning
- Example: Education, health, customer satisfaction ratings
2. Quantitative (Numerical):
- Attributes can be rank-ordered
- Distances between values have a meaning
- Mathematical operations are possible
- Example: age, temperature, and salary
a) Discrete: Product of counting (e.g. heart rate, number of siblings)
b) Continuous: Product of measuring; can take any value within a range (e.g. height and weight)
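One way to remember the data types is by which summary makes sense for each. The sketch below uses small made-up samples (all values are hypothetical): counting categories for nominal data, ranking for ordinal data, and arithmetic for quantitative data.

```python
from statistics import mean, mode

# Hypothetical samples, one per data type.
nationality = ["PT", "ES", "PT", "FR"]                    # nominal
satisfaction = ["low", "high", "medium", "high", "low"]   # ordinal
siblings = [0, 2, 1, 3, 1]                                # quantitative, discrete

# Nominal: categories can only be counted, so the mode is the usual summary.
most_common_nationality = mode(nationality)

# Ordinal: values can be ranked, so the median is meaningful.
# Distances between ranks are not, so no mean is taken.
order = ["low", "medium", "high"]
ranks = sorted(order.index(s) for s in satisfaction)
median_satisfaction = order[ranks[len(ranks) // 2]]

# Quantitative: distances are meaningful, so the mean is allowed.
mean_siblings = mean(siblings)

print(most_common_nationality, median_satisfaction, mean_siblings)
```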
Priorities Table in Relation to Perceivable Visual Attributes
From most to least accurately perceived:
Quantitative Data: Position, Length, Angle, Slope, Area, Volume, Density, Color Saturation, Color Hue, Texture
Ordinal Data: Position, Density, Color Saturation, Color Hue, Texture, Connections, Containment, Length, Angle, Slope, Area, Volume
Nominal Data: Position, Color Hue, Texture, Connections, Containment, Density, Color Saturation, Shape, Length, Angle, Slope, Area, Volume
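The priorities table can be used as a lookup when deciding how to encode an attribute. The sketch below copies the three rankings into a dictionary; the helper name `best_channel` and its `taken` parameter are assumptions for illustration, not part of any library.

```python
# Visual attributes per data type, most effective first (copied from the table).
RANKING = {
    "quantitative": ["position", "length", "angle", "slope", "area", "volume",
                     "density", "color saturation", "color hue", "texture"],
    "ordinal": ["position", "density", "color saturation", "color hue", "texture",
                "connections", "containment", "length", "angle", "slope",
                "area", "volume"],
    "nominal": ["position", "color hue", "texture", "connections", "containment",
                "density", "color saturation", "shape", "length", "angle",
                "slope", "area", "volume"],
}


def best_channel(data_type, taken=()):
    """Return the highest-ranked attribute not already used by another variable."""
    for channel in RANKING[data_type]:
        if channel not in taken:
            return channel
    raise ValueError(f"no free visual attribute left for {data_type!r}")


print(best_channel("quantitative"))
print(best_channel("nominal", taken={"position"}))
print(best_channel("ordinal", taken={"position"}))
```

With position already taken (for example by the axes of a scatter plot), the lookup falls back to color hue for a nominal attribute and to density for an ordinal one.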
What are dashboards?
A visual display of the most important information needed to achieve one or more objectives, consolidated and arranged on a single screen so that the information can be monitored at a glance.
An information display designed to help people maintain situational awareness.
A set of interactive charts (primarily graphs and tables) that simultaneously reside on a single screen, each of which presents a somewhat different view of a common dataset and is used to analyze that information.