Quiz 1 Flashcards
Interactive visualization
- used for discovery
- intended for a single investigator or collaborators
- rerenders based on input
- prototype quality
Presentation Visualization
- used for communication
- intended for larger group or mass audience
- does not support user input
- highly polished
Interactive storytelling
Presentations via interactive webpages
Modes of visualization
Interactive Visualization
- User Interaction: user controls everything, including dataset
- Graphics Rendering: real-time rendering
- Target: individual or collaborators
- Medium: software or internet
Modes of visualization
Presentation Visualization
- User Interaction: user only observes
- Graphics Rendering: precomputed rendering
- Target: colleagues, mass audience
- Medium: slide shows, videos
Modes of visualization
Interactive Storytelling
- User Interaction: user can filter or inspect details of preset datasets
- Graphics Rendering: real-time rendering
- Target: mass audience
- Medium: internet or kiosk
Ultimate goal of data visualization
not just about seeing data, it is about understanding data and being able to make decisions based on the data
Why is data visualization important?
- generating a lot of data and information
- need to process such information
- need to communicate increasing levels of information
- statistics alone are not enough
Visual variables
- position
- shape
- size
- brightness
- color
- orientation
- texture
- motion
most important:
- position
- mark/shape
Data Types/Attribute Types
Attribute types
- Categorical
- Ordered
-> Ordinal
-> Quantitative
Ordering Direction
- Sequential
- Diverging
- Cyclic
Nominal scale of measurement
- Only satisfies the identity property of measurement
- Categorical and Arbitrary
Ordinal scale of measurement
- Has the property of both identity and magnitude
- Ranked (and all the numeric)
Interval scale of measurement
- Has the properties of identity, magnitude, and equal intervals.
- Discrete. e.g., Fahrenheit (or centigrade) scale to measure temperature
Ratio scale of measurement
- Satisfies identity, magnitude, equal intervals, and a minimum value of zero.
- Continuous. e.g., weight, distance, etc. Can apply the operations of division (/) and multiplication (*)
Stevens' law
- change in these parameters (area, loudness and brightness) is in some way underestimated by human perception
- when marks are represented with graphics that contain sufficient area, the quantitative aspects of size fall, and the differences between marks become more qualitative
Size
- easily maps to interval and continuous data variables
- more difficult to distinguish between marks of nearly the same size -> size can only support categories with very small cardinality
brightness
- human perception cannot distinguish between all pairs of brightness values
- used to provide relative difference for large interval and continuous data variables
- mark distinction for marks drawn using a reduced sampled brightness scale
color map
continuous range of hue and saturation values
Best marks for orientation
those with natural single axis
Texture
considered as a combination of many of the other visual variables:
- marks (texture elements),
- color (associated with each pixel in a texture region)
- orientation (conveyed by changes in the local color)
most commonly associated with a polygon, region or surface
Motion
- associated with any of the other visual variables, since the way a variable changes over time can convey more information
- common use is in varying the speed at which a change is occurring
- other aspect is in the direction: for position, this can be up, down, left, right, diagonal, or basically any slope
- for other variables it can be larger/smaller, brighter/dimmer, steeper/shallower angles, and so on
Issues with color
- relationship between the light we see and the colors we perceive is not straightforward
- multiple types of data, each suited to a different color scheme
- a significant number of people are color blind
- arbitrary color choices can be confusing
- light color on dark field and dark color on light field are perceived differently, which complicates visualization tasks
Munsell’s model
properties of color:
- lightness (black to white)
- hue (red, orange, yellow, green, blue, indigo, violet)
- saturation (dull to bright)
Most common data types
- sequential data
- divergent data
- qualitative data
Selecting a color map
Properties of the attribute:
- spatial frequency
- continuous or discrete nature
- type of analysis to be performed
divergent/bipolar data
data that varies from a central value
- profits and losses
- differences from the norm (e.g. daily temperature vs. monthly average)
- change over time
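A minimal sketch of how a diverging color scheme could encode such data. The function name and the blue-white-red ramp are illustrative assumptions, not taken from the course material; it also assumes low < center < high.

```python
def diverging_color(value, low, center, high):
    """Map a value to an (r, g, b) tuple with components in [0, 1].

    Hypothetical helper: white at the central value, fading to blue
    toward `low` and to red toward `high`.
    """
    # Clamp so out-of-range values take the color of the nearest end.
    value = min(max(value, low), high)
    if value >= center:
        t = (value - center) / (high - center)  # 0 at center, 1 at high
        return (1.0, 1.0 - t, 1.0 - t)
    t = (center - value) / (center - low)       # 0 at center, 1 at low
    return (1.0 - t, 1.0 - t, 1.0)


# Example: profit/loss around a break-even point of 0.
break_even = diverging_color(0, -10, 0, 10)   # white
max_profit = diverging_color(10, -10, 0, 10)  # full red
max_loss = diverging_color(-10, -10, 0, 10)   # full blue
```

The central value gets the neutral color, so the sign and size of the deviation are what the eye picks up.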
Qualitative data
- also categorical or thematic data
- color is used to separate areas into distinct categories
Perception
Process of:
- recognizing (being aware of)
- organizing (gathering and storing)
- interpreting (binding to knowledge) sensory information
- interpreting the world around us
- brain makes assumptions about the world to overcome the inherent ambiguity in all sensory data, and in response to the task at hand
“preattentive” properties
- limited set of visual properties that are detected very rapidly and accurately by the low-level visual system
- tasks that can be performed on large multi-element displays in less than 200 to 250 milliseconds
- certain information in the display is processed in parallel by the low-level visual system
conjunction target
target made up of a combination of non-unique features
Target detection
- Users rapidly and accurately detect the presence or absence of a “target” element with a unique visual feature within a field of distractor elements
Boundary detection
- Users rapidly and accurately detect a texture boundary between two groups of elements, where all of the elements in each group have a common visual property
Region tracking
users track one or more elements with a unique visual feature as they move in time and space
Counting and estimation
users count and estimate the number of elements with a unique visual feature
Feature Integration Theory (Anne Treisman)
when perceiving a stimulus, features are “registered early, automatically, and in parallel, while objects are identified separately” and at a later stage in processing
response time:
- if task completion time is relatively constant and below a chosen threshold, independent of the number of distractors, the task is said to be preattentive
accuracy:
- If viewers can complete task accurately, regardless of the number of distractors, the feature used to define the target is assumed to be preattentive
Change blindness
an interruption in what is being seen renders us “blind” to significant changes that occur in the scene during the interruption
change blindness explanation
- Overwriting: information that was not abstracted from the first image is lost.
- First Impression: hypothesis that only the initial view of a scene is abstracted
- Nothing Is Stored: after a scene has been viewed and information has been abstracted, no details are represented internally
- Everything Is Stored, Nothing Is Compared: details are stored but are compared only when a comparison is requested
- Feature Combination: details from an initial view might be combined with new features from a second view
Absolute Judgment of Multidimensional Stimuli
- Combining different stimuli does enable us to increase the amount of information being communicated, but not at the levels we might hope
- added stimuli resulted in the reduction of the discernibility of the individual attributes
- having a little information about a large number of parameters seems to be the way we do things
Weber’s Law
The likelihood of detecting a change is proportional to the relative change, not the absolute change, of a graphical attribute
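A small sketch of the quantity Weber's Law refers to. The function name and the bar lengths are illustrative assumptions; the law itself says nothing about a specific threshold.

```python
def relative_change(old: float, new: float) -> float:
    """Return the size of the change from old to new as a fraction of old."""
    if old == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return abs(new - old) / abs(old)


# The same absolute change (+1 unit) on a short bar and on a long bar.
short_bar = relative_change(10, 11)    # 10% change
long_bar = relative_change(100, 101)   # 1% change
```

Both bars grow by one unit, but the change on the short bar is ten times larger in relative terms, so it is the one more likely to be noticed.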
Stevens' Law
perceived scale in absolute measurements is the actual scale raised to a power
- For linear features, this power is between 0.9 and 1.1
- for area features, it is between 0.6 and 0.9
- for volume features it is between 0.5 and 0.8
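The power relationship can be sketched as follows. The function names are hypothetical, and the exponent 0.7 is just one assumed value from inside the range quoted for area features.

```python
def perceived_magnitude(actual: float, exponent: float) -> float:
    """Stevens' power law: perceived scale = actual scale ** exponent."""
    if actual < 0:
        raise ValueError("magnitude must be non-negative")
    return actual ** exponent


def compensated_magnitude(intended: float, exponent: float) -> float:
    """Actual scale needed so that the perceived scale equals `intended`."""
    return intended ** (1.0 / exponent)


AREA_EXPONENT = 0.7  # assumed value inside the 0.6-0.9 range for area

# A mark with twice the area is perceived as less than twice as large.
doubled_area = perceived_magnitude(2.0, AREA_EXPONENT)

# Area ratio needed for a mark to look twice as large.
needed_area = compensated_magnitude(2.0, AREA_EXPONENT)
```

With an exponent below 1, differences are underestimated, which is why area and volume rank below position and length for quantitative data.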
Expanding Capabilities
- reconfigure the communication task to require relative, rather than absolute, judgment (adding grid lines and axis tick marks)
- increasing the dimensionality with caution and in a limited way
- reconfigure the problem to be a sequence of different absolute judgments, rather than simultaneous stimuli
Channel Rankings by Tamara Munzner
Magnitude Channels: Ordered Attributes
- Position on common scale
- Position on unaligned scale
- Length (1D size)
- Tilt/angle
…
worst:
- Volume (3D size)
Channel Rankings by Tamara Munzner
Identity Channels: Categorical Attributes
- Spatial Region
- Color hue
- Motion
- Shape
Separability
- Fully separable (Position + Hue)
- Some interference (Size + Hue)
- Significant interference (Width + Height)
- Major interference (Red + Green)
Core idea of data visualization
The mapping from data variables to visual variables
Tableau’s Key features
- VizQL - Visual Query Language that translates drag-and-drop actions into data queries and then expresses that data visually
- Live Query Engine - A technology that lets people query databases, cubes, warehouses, cloud sources, spreadsheets, etc. without any programming knowledge
- In-Memory Data Engine - uses the complete memory hierarchy (Disk-RAM-L1 Cache) on ordinary computers to speed up access to slow databases
Tableau data types
- Text (string) values
- Date values
- Date & Time values
- Numerical values
- Boolean values (relational only)
- Geographic values
Dimensions and Measures
- terms from Data Warehousing and Multidimensional Models
Dimensions allow data analysis from various perspectives:
- Time, Product, Supplier
–> categorical data
Measures are numeric representations of facts that occurred:
- Sales amount, Store percentage of profit, number of returned products
–> numeric values
Independent/dependent variables
Independent: Answer questions like Who?, When?, What?
Dependent: Aggregated values
Drill up
Decrease the Level of detail of VIS by removing dimensions from the VIS
Level of detail of VIS
= {dimensions that are present in the VIS}
Drill down
Increase the Level of detail of VIS by adding more dimensions to the VIS
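Drilling up and down can be sketched as re-aggregating a measure over a smaller or larger set of dimensions. The sample rows, the field names, and the choice of SUM as the aggregation are assumptions for illustration; this is not how Tableau implements it.

```python
from collections import defaultdict


def aggregate(rows, dimensions, measure):
    """Sum `measure` for each distinct combination of `dimensions`."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)  # one mark per key
        totals[key] += row[measure]
    return dict(totals)


# Hypothetical fact table: two dimensions (year, region), one measure (sales).
ROWS = [
    {"year": 2023, "region": "East", "sales": 10.0},
    {"year": 2023, "region": "West", "sales": 20.0},
    {"year": 2024, "region": "East", "sales": 5.0},
    {"year": 2024, "region": "West", "sales": 7.0},
]

# Level of detail = {year}: two marks.
by_year = aggregate(ROWS, ["year"], "sales")

# Drill down: level of detail = {year, region}: four marks.
by_year_region = aggregate(ROWS, ["year", "region"], "sales")
```

Going from `by_year_region` back to `by_year` is the drill up: removing a dimension merges marks and aggregates their measure values.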
Dimensionality choices for visual analysis
- Dimension subsetting (choose 2 dimensions)
- Dimension embedding (mapping dimensions to color, size etc.)
- Multiple displays (e.g. matrix)
- Dimension reduction (transform to lower dimension)
Dimension reduction
Principal component analysis (PCA)
- computes new dimensions/attributes which are linear combinations of the original data attributes
- new dimensions can be sorted according to their contribution in explaining the variance of the data
- most relevant new dimensions: minimized average error of lost information
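A rough sketch of the first PCA step in plain Python. It uses power iteration on the covariance matrix to find only the first principal component, which is a simplification chosen for brevity (libraries normally use an eigendecomposition or SVD). The function name and sample points are assumptions.

```python
import math


def first_principal_component(points, iterations=200):
    """Return (direction, variance) of the first principal component.

    `direction` is a unit vector: a linear combination of the original
    attributes. `variance` is the variance of the data along it.
    Assumes the data is not constant and that the start vector is not
    orthogonal to the component.
    """
    n, dims = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(dims)]
    centered = [[p[j] - means[j] for j in range(dims)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1)
            for b in range(dims)] for a in range(dims)]

    def apply(vec):
        return [sum(cov[a][b] * vec[b] for b in range(dims))
                for a in range(dims)]

    vec = [1.0] * dims
    for _ in range(iterations):
        nxt = apply(vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    variance = sum(v * w for v, w in zip(vec, apply(vec)))
    return vec, variance


# Points lying exactly on the line y = 2x.
direction, variance = first_principal_component(
    [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

For this data the direction is proportional to (1, 2) and carries all of the variance, so a single new dimension describes the points without loss.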
Multidimensional scaling (MDS)
- projecting M points in N dimensions into L dimensions (L=2 or 3)
- key goal: maintain the N-dimensional features and characteristics through projection process -> keep relationships
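One way to check how well a projection keeps relationships is to compare pairwise distances before and after. The sketch below computes a normalized stress value; this is one of several stress variants used with MDS, and the function name and sample points are assumptions. It measures a given projection rather than performing the MDS optimization itself.

```python
import math
from itertools import combinations


def stress(high_dim_points, low_dim_points):
    """Normalized mismatch between pairwise distances in the two spaces.

    0.0 means every pairwise distance is preserved exactly; larger
    values mean the projection distorts the relationships more.
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(high_dim_points)), 2):
        d_high = math.dist(high_dim_points[i], high_dim_points[j])
        d_low = math.dist(low_dim_points[i], low_dim_points[j])
        numerator += (d_high - d_low) ** 2
        denominator += d_high ** 2
    return math.sqrt(numerator / denominator)


# Four 3D points that all lie in the plane z = 0.
POINTS_3D = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]

drop_z = [(x, y) for x, y, z in POINTS_3D]  # keeps all distances
drop_y = [(x, z) for x, y, z in POINTS_3D]  # collapses points together

good = stress(POINTS_3D, drop_z)
bad = stress(POINTS_3D, drop_y)
```

An MDS algorithm searches for the low-dimensional positions that make a value like this as small as possible.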
RadViz
- force-driven point layout technique
- N-dimensional data set: N anchor points are placed on the circumference of a circle
- different placement will give different result
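A minimal sketch of the usual RadViz placement rule, assuming each attribute has already been normalized to [0, 1] and anchors sit on a unit circle. Each anchor pulls the point with a force proportional to its attribute value, so the point settles at the weighted average of the anchors. The function name and sample record are assumptions.

```python
import math


def radviz_point(values, anchor_angles=None):
    """Return the (x, y) position of one record inside the unit circle."""
    n = len(values)
    if anchor_angles is None:
        # Default: anchors evenly spaced in attribute order.
        anchor_angles = [2 * math.pi * k / n for k in range(n)]
    total = sum(values)
    if total == 0:
        return (0.0, 0.0)  # no anchor pulls, so the point stays at the center
    x = sum(v * math.cos(a) for v, a in zip(values, anchor_angles)) / total
    y = sum(v * math.sin(a) for v, a in zip(values, anchor_angles)) / total
    return (x, y)


record = [1.0, 0.5, 0.0]

default_layout = radviz_point(record)
# Same record, but the second and third anchors swap places.
swapped_layout = radviz_point(
    record, anchor_angles=[0.0, 4 * math.pi / 3, 2 * math.pi / 3])
```

The two layouts place the same record at different positions, which is the anchor-placement sensitivity noted above.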
Superimposed
One line put over another line
Parallel Coordinates
Each dimension is one axis, one line is one data point in dataset
- shows relationships between axes
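A sketch of the data-to-geometry step behind parallel coordinates: each record becomes a polyline with one vertex per axis. Min-max scaling per dimension is an assumed (though common) choice, and the function names and sample records are illustrative.

```python
def to_polyline(record, mins, maxs):
    """Return [(axis_index, height)] with height scaled to [0, 1] per axis."""
    vertices = []
    for axis_index, (value, lo, hi) in enumerate(zip(record, mins, maxs)):
        # A constant dimension has no range; draw it at mid-height.
        height = 0.5 if hi == lo else (value - lo) / (hi - lo)
        vertices.append((axis_index, height))
    return vertices


def parallel_coordinates(records):
    """Convert every record into one polyline across all axes."""
    columns = list(zip(*records))  # one tuple of values per dimension
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    return [to_polyline(r, mins, maxs) for r in records]


RECORDS = [
    (1.0, 10.0, 5.0),
    (3.0, 20.0, 5.0),
    (2.0, 30.0, 5.0),
]
lines = parallel_coordinates(RECORDS)
```

Drawing each list of vertices as a connected line, with axis index on the horizontal and height on the vertical, gives the plot; lines that cross between two neighboring axes hint at an inverse relationship between those dimensions.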
Radial axis techniques
- circular line graph
- polar graphs
- circular bar charts
- circular area graphs
- circular bar graphs
Region-based techniques
- bar charts/histograms (stacked, clustered, simple)
- area (stacked, aligned)
- heat tables
- tree maps
- stacked bubbles