Lecture 9: Quantitative and Qualitative Evaluation Flashcards
Why do a user study?
To test and compare interfaces, technologies, visualizations, interaction techniques
‣ Test usability (learnability, efficiency, satisfaction,…)
‣ Get user feedback
‣ Refine visual design
A user study can be carried out to answer a specific research question.
Explain Quantitative Methods
‣ Objective metrics -> Measurements
‣ Use numbers for interpreting data
Explain Qualitative Methods
‣ Subjective metrics
‣ Description of situations, events, people, interactions, and observed behaviors; the use of direct quotations from people about their experiences, attitudes, beliefs, and thoughts (Upcraft and Schuh 1996)
‣ Focused on understanding how people make meaning of and experience their environment or world (Patton 2002)
What evaluation methods do we have?
Controlled experiment
Interviews / questionnaires
‣ Unstructured, structured, semi-structured
Field observation, lab observation
‣ Video / audio analysis
‣ Coding / classification of user behavior
Usability testing
Algorithmic performance measurement
Log analysis
Crowdsourcing study
‣ e.g. Amazon Mechanical Turk
What is the Scope of evaluation?
Pre-design
‣ e.g., to understand potential users’ work environment and workflow
Design
‣ e.g., to scope a visual encoding and interaction design based on perception and cognition
Prototype
‣ e.g., to see if a visualization has achieved its design goals, to see how a prototype compares with the current state-of-the-art systems or techniques
Deployment
‣ e.g., to see how a visualization influences workflow and work processes, to assess the visualization’s effectiveness and uses in the field
Re-design
‣ e.g., to improve a current design by identifying usability problems
What is Internal Validity?
‣ High when tested under controlled lab conditions
‣ Observed effects are due to the test conditions (and not random variables)
What is External Validity?
‣ High when the interface is tested in the field, e.g., a handheld device tested in a museum
‣ Results are more generalizable to other people or situations
What is the Internal vs. External Validity Trade-off?
‣ The closer an experiment is to real-world situations, the more susceptible it is to uncontrolled sources of variation
What are some valid sampling strategies?
‣ Random sampling (same chance for every member of population)
‣ Representative sampling
‣ Convenience sampling
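Random sampling, as described above, can be sketched in a few lines. This is only an illustration: the participant IDs, the population size, and the helper name `simple_random_sample` are assumptions made up for the example, not part of the lecture.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    return rng.sample(population, n)

# Hypothetical population of 100 participant IDs (assumed for illustration).
population = [f"P{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

A convenience sample, by contrast, would simply take whoever is easiest to recruit (for example, the first ten people on the list), so not every member of the population has the same chance of being picked.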
What are Independent Variables?
‣ Manipulated through the design of the experiment
‣ It is “independent” of participant behavior
‣ e.g., gender, age, visualization technique, interface
What are Test Conditions?
‣ Levels, values, or settings for an independent variable
‣ Example:
* Independent variable: gender
* Conditions: male, female
What are Dependent Variables?
‣ Measurements or observations
‣ e.g., task completion time, error rate, …
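As a rough sketch of how the two example dependent variables might be computed from a trial log: the record layout (`participant`, `seconds`, `error`) and the numbers are invented for illustration.

```python
from statistics import mean

# Hypothetical trial log: one record per task attempt (values are made up).
trials = [
    {"participant": "P1", "seconds": 12.0, "error": False},
    {"participant": "P1", "seconds": 15.0, "error": True},
    {"participant": "P2", "seconds": 10.0, "error": False},
    {"participant": "P2", "seconds": 11.0, "error": False},
]

def summarize(trials):
    """Return (mean task completion time, error rate) over all trials."""
    mean_time = mean(t["seconds"] for t in trials)
    # True counts as 1, so the sum is the number of erroneous trials.
    error_rate = sum(t["error"] for t in trials) / len(trials)
    return mean_time, error_rate

mean_time, error_rate = summarize(trials)
print(mean_time, error_rate)
```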
What are Control Variables?
‣ e.g., room light, noise, …
‣ If controlled -> more internal and less external validity
What are Random Variables?
‣ Not controlled
‣ e.g., fatigue
‣ More influence of random variables -> less internal validity
What is Between Subjects Design?
‣ Each participant is tested on only one level/condition
‣ A separate group of participants for each condition
* e.g., one group uses Tableau, one Spotfire, one R
Important: randomized assignment of participants to groups
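One possible way to do the randomized assignment is to shuffle the participant list and deal it out round-robin, which also keeps group sizes balanced. The function name `assign_to_groups` and the participant IDs are assumptions for this sketch; the condition names come from the example above.

```python
import random

def assign_to_groups(participants, conditions, seed=None):
    """Randomly assign each participant to exactly one condition."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    # Deal shuffled participants out round-robin to keep group sizes equal.
    for i, p in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(p)
    return groups

# Hypothetical participant IDs (assumed for illustration).
participants = [f"P{i}" for i in range(1, 13)]
groups = assign_to_groups(participants, ["Tableau", "Spotfire", "R"], seed=1)
for condition, members in groups.items():
    print(condition, members)
```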
What is Within Subjects Design?
‣ Each participant is tested on every level/condition
* e.g., participants use all interfaces
* Repeated measurement
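Because every participant contributes one measurement per condition, the data can be compared within each person. The sketch below assumes two hypothetical interfaces "A" and "B" and made-up completion times.

```python
from statistics import mean

# Hypothetical completion times in seconds: every participant used both interfaces.
times = {
    "P1": {"A": 20.0, "B": 16.0},
    "P2": {"A": 25.0, "B": 22.0},
    "P3": {"A": 18.0, "B": 19.0},
}

def paired_differences(times, first, second):
    """Per-participant difference between two conditions (first minus second)."""
    return {p: m[first] - m[second] for p, m in times.items()}

diffs = paired_differences(times, "A", "B")
mean_diff = mean(list(diffs.values()))
print(diffs, mean_diff)
```

In a between-subjects design this per-participant pairing is not available, since each person is measured under only one condition.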
What are Inferential Statistics?
‣ Using data to reach some conclusion
‣ Make some inference about the characteristics of the larger population (generalization)
* e.g., Interface A is significantly easier to use than Interface B
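To make "significantly" concrete, here is a minimal sketch of one inferential procedure, a two-sided permutation test on the difference of group means. The flashcards do not name a specific test, so this choice is an assumption; the ease-of-use ratings and the function name `permutation_test` are likewise invented for illustration.

```python
import random

def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Approximate two-sided p-value for the difference of two group means."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical ease-of-use ratings (1 to 7) from two separate groups.
ratings_a = [6, 7, 6, 7, 5, 6, 7, 6]
ratings_b = [3, 4, 2, 4, 3, 5, 3, 4]
p_value = permutation_test(ratings_a, ratings_b)
print(p_value)
```

A small p-value (conventionally below 0.05) suggests that the observed difference would be unlikely if the two interfaces were equally easy to use, which is the kind of claim that lets a result be generalized beyond the sample.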
What are Descriptive Statistics?
‣ Describing the characteristics of a sample
* e.g., 50% of the participants are female
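A short sketch of descriptive statistics for a sample, using Python's standard `statistics` module. The participant records are made up for illustration.

```python
from statistics import mean, median, stdev

# Hypothetical sample of four participants (values are made up).
participants = [
    {"gender": "female", "age": 24},
    {"gender": "male", "age": 31},
    {"gender": "female", "age": 28},
    {"gender": "male", "age": 45},
]

ages = [p["age"] for p in participants]
# Proportion of the sample that is female.
share_female = sum(p["gender"] == "female" for p in participants) / len(participants)

print(f"female: {share_female:.0%}")
print(f"age: mean={mean(ages)}, median={median(ages)}, sd={stdev(ages):.1f}")
```

These numbers describe only the sample itself; they make no claim about the larger population.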