Analytical Evaluation Flashcards
Evaluation 1
- Evaluation is the process of systematically assessing a system, guided by questions about the system and by defined criteria
- Does the system work as it is supposed to work?
- Does it meet certain standards?
- Is the performance improved over a previous version?
Evaluation 2
In HCI, evaluations are concerned with how well users can learn and use a system to achieve their goals, and how satisfying the system is to use.
- Can users complete the tasks the system is meant to support?
- Can users achieve their goals faster or with less effort, compared to earlier versions, or competing products?
- Would a first-time user be able to figure out how to use the system?
- Has a change made to the interface had the desired impact?
Why Evaluation is important
- Motivation
- Developers are often not aware of the effect their system has on users
- People are often vague about their goals and expectations
- Developers often see no need for evaluation when the system “works”, but there are always ways to improve it!
- Getting the right design / getting the design right
- Comparing different ideas against each other, for example by making prototypes and simulating functionality
- Identifying usability problems in the design of an interface, to improve it
- Economical importance
- Evaluating usability as a differentiating factor / Assessing return on investment
Objectives & Criteria
Evaluations are a goal-directed process for which objectives and criteria need to be clear
- What aspects of the system and user experience are being evaluated? For example:
- Accessibility of the system
- Design alternatives for the interface
- What the user thinks and feels about the system
- Impact of small changes
- Criteria (for example)
- Compliance with rules, guidelines, standards
- Performance against metrics
- Identification of problems
- Insight into how users use a system and reason about it
Forms of Evaluation #1
- Reviews
- Informal or formal, for example using scenarios, guidelines or checklists
- By the design team and/or external reviewers (usability experts)
- Using models
- Formal models of interactive systems, for example to analyse complexity of tasks, and whether goals are reachable
- Models that predict joint performance of human and system
- Evaluation with experts
- Having usability experts try out and critique the system
- Expert walkthrough, putting themselves in the shoes of a user
- Evaluation with users
- Usability tests, interventions, experiments
Forms of Evaluation #2
- Formative: conducted during development, to inform and improve the design
- Summative: conducted on a finished system, to assess whether it meets its targets
Evaluating an Axe
- Analytical evaluation identifies the object properties
- “If the axe does not cut well, what do we have to change?”
- Empirical evaluation helps to understand the tool and its properties in context
- “Why does the axe have a special-shaped handle?”
Evaluation - Key Points
- Evaluation is fundamental in HCI research
- Can be approached analytically or empirically
- with users, with experts, or model-based
- Can have a focus on qualitative and/or quantitative data
- Allows us to assess how users perform with a system and how the system affects users
- It enables improvement of a system at all stages of development
- … must be carefully considered to be able to draw the right conclusions!
Models
- Models are useful for understanding and analysing behaviour of interactive systems
- Models are not perfect: as simple as possible, as complex as necessary
Descriptive models
- Describing a system or phenomenon
- Providing a structure for understanding, designing, specifying, and analysing (classification, taxonomy, process model, …)
- e.g. the 7 Stages of Action, the Layered Model of Input
Predictive Models
- Mathematical model of system behaviour
- Makes it possible to simulate and predict future behaviour
Cognitive Modelling
- Cognitive models in HCI are based on viewing humans as rational information processors
- Modelling goal-directed interaction
- Applies to routine tasks with clear goals
- Cognitive modelling supports analysis of tasks
- How can the user accomplish a goal?
- What steps are involved?
- How complex is the task?
- Are there different ways of doing it?
- Some models extend to the prediction of performance
- How long does it take to complete a task?
Hierarchical-Sequential Organisation
Goal-directed interaction is organised hierarchically and sequentially: goals decompose into subgoals, which are accomplished by ordered sequences of actions.
GOMS
GOMS is a formal model that describes interactions in terms of Goals, Operators, Methods and Selection rules.
Goals in GOMS
Description of what the user aims to accomplish
Action-object pair, e.g. print-document, delete-word
Operators in GOMS
Actions that cause change in the system
- Elementary operations on the interface
- Action-object pair, e.g. voice-command, click-button
- Defined by the basic operations an interface supports
Methods in GOMS
Sequence of operators to achieve a goal
- Well-learned sequence of steps to accomplish a goal
- Specific way to complete a (sub-) task with the system
Selection Rules in GOMS
Rules that define the choice of method
- Only when there are different methods for achieving the same goal
GOMS Model
Example: ATM
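As a minimal sketch (the concrete goal, method, and operator names below are illustrative assumptions, not taken from the ATM example itself), a GOMS model can be written down as a small Python structure, with a selection rule choosing between alternative methods for the same goal:

```python
# Hypothetical GOMS model of withdrawing cash at an ATM.
# Goals decompose into methods; methods are sequences of operators
# (strings) and subgoals (nested dicts); a selection rule picks a method.

GOMS_ATM = {
    "goal": "withdraw-cash",
    "methods": {
        "card-method": [
            "insert-card",
            {"goal": "authenticate",
             "methods": {"pin-method": ["type-pin", "press-ok"]}},
            "select-withdrawal",
            "enter-amount",
            "take-card",
            "take-cash",
        ],
        "contactless-method": [
            "tap-card",
            {"goal": "authenticate",
             "methods": {"pin-method": ["type-pin", "press-ok"]}},
            "select-withdrawal",
            "enter-amount",
            "take-cash",
        ],
    },
    # Selection rule: only needed because two methods achieve the same goal.
    "selection_rule": lambda ctx: (
        "contactless-method" if ctx.get("reader_available") else "card-method"
    ),
}

def count_operators(node, ctx):
    """Apply selection rules and count the elementary operators of a task."""
    if isinstance(node, str):  # an operator
        return 1
    method = node["selection_rule"](ctx) if "selection_rule" in node \
        else next(iter(node["methods"]))
    return sum(count_operators(step, ctx) for step in node["methods"][method])

print(count_operators(GOMS_ATM, {"reader_available": False}))  # -> 7
```

Counting the operators a method expands to gives a first indication of the effort a task requires; the nesting depth of subgoals indicates its complexity.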
Cognitive modelling - Key Points
- Cognitive modelling is useful for analysing and predicting human performance with interactive systems, without actually having to observe users
- Using models of human performance as approximation
- GOMS models task structures and supports task analysis:
- Capturing procedural knowledge -> which steps in which order
- More steps to accomplish a goal -> indicates more effort
- More depth in the goal structure -> indicates higher complexity and demand on short-term memory
Evaluation using GOMS
- Analytical
- Analysing procedures using formal models
- Formative
- Generally motivated by the aim of improving the design
- Finding problems in task structures
- Analyse performance of alternative solutions to inform choices
- Possible early in design but also on (parts of) complete system
- Qualitative
- GOMS provides a qualitative description of the procedural knowledge required to operate a system (KLM, in comparison, is quantitative; see the sketch after this list)
- Objective
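To make the contrast concrete, here is a minimal Keystroke-Level Model (KLM) sketch. The operator durations are the classic estimates from Card, Moran & Newell; the task and its operator encoding are invented for illustration:

```python
# Keystroke-Level Model: predict expert task time by summing operator times.
# Durations are the classic Card, Moran & Newell estimates (in seconds).
KLM_TIMES = {
    "K": 0.28,  # keystroke (average skilled typist)
    "P": 1.10,  # point with the mouse at a target
    "B": 0.10,  # press or release a mouse button
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental act of preparation
}

def predict_time(operators: str) -> float:
    """Sum the durations of a sequence of KLM operators."""
    return sum(KLM_TIMES[op] for op in operators)

# Hypothetical encoding of "delete a word via the Edit menu":
# mentally prepare, point at menu, click (press + release),
# point at menu item, click (press + release).
print(round(predict_time("MPBBPBB"), 2))  # -> 3.95 s
```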
Expert Evaluation #1
- Expert evaluations involve somebody with experience in interaction design and usability looking over the system
- Expert evaluations utilize the knowledge of specialists, such as usability professionals, to find usability problems in a formative process
- Expert evaluations are not a replacement for evaluation with users, but they can be easier to arrange and help address basic issues before evaluation with users
- Experts will find problems based on their experience, and will suggest improvements and make recommendations
- Depending on the method, expert reviews might also include reviewing other materials, such as product specifications or system logs
Expert Evaluation #2
Expert evaluations are inspection methods in which experts walk through the system
- Using representative tasks
- Concrete examples of what a user would want to accomplish with the system
- Considering the context of use
- e.g. based on scenarios about people using the system, conveying people’s motivation for using the system, and the kinds of situations in which this would happen
- Adopting the perspective of a user
- e.g. based on personas that are concrete representations of the type of person the system is designed for
Heuristic Evaluation #1
- Heuristic evaluation refers to the examination of a user interface by experts against a list of principles or guidelines
- Originally developed by Jakob Nielsen as discount usability engineering
- In the early days of HCI, businesses were reluctant to invest in usability and concerned about the costs of studies with actual users
- Instead, a small number of experts examine the interface and judge its compliance with “heuristics” for good design
- Relatively quick and effective for finding most of the problems
Heuristic Evaluation #2
- Now used for any “quick and dirty” evaluation where the aim is to get useful and informed feedback as soon as possible
- Quick review against usability principles by a single reviewer
- Rigorous process with multiple evaluators, and rating of problems
- Not really effective for evaluating one’s own design
Heuristic Evaluation #3
- A small group of experts examine the interface and judge its compliance with recognized usability principles (the “heuristics”)
- Either just by inspection or by scenario-based walkthrough
- Evaluators independently produce a list of critical issues, weighted by severity grade
- Evaluators only communicate afterwards
- Opinions are consolidated into one report
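A minimal sketch of that consolidation step, assuming the evaluators rate issues on Nielsen’s 0-4 severity scale (the issues and ratings below are invented examples):

```python
# Consolidating independent heuristic-evaluation findings into one report.
from collections import defaultdict
from statistics import mean

# Each evaluator independently lists (issue, severity 0-4) pairs.
evaluator_reports = [
    [("no undo after delete", 4), ("inconsistent button labels", 2)],
    [("no undo after delete", 3), ("status not visible during upload", 3)],
    [("inconsistent button labels", 1), ("no undo after delete", 4)],
]

ratings = defaultdict(list)
for report in evaluator_reports:
    for issue, severity in report:
        ratings[issue].append(severity)

# Consolidated report: mean severity and how many evaluators found each issue.
for issue, sevs in sorted(ratings.items(), key=lambda kv: -mean(kv[1])):
    print(f"{issue}: severity {mean(sevs):.1f}, found by {len(sevs)}/3 evaluators")
```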
Heuristic Evaluation #4
- Implicit assumptions
- There is a fixed list of desirable properties of a user interface (the “heuristics”) that can be checked by experts with a clear result
- The issues that experts identify are a good prediction of the problems that users would have
- If the number of evaluators is sufficiently large, most problems will be found
Nielsen’s 10 Heuristics
- Visibility of system status
- Match between system and the real world
- User control and freedom
- Consistency and standards
- Error prevention
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetic and minimalist design
- Help users recognize, diagnose, and recover from errors
- Help and documentation
Other Heuristics
- Heuristic evaluation can be done against any list of heuristics, and designers and evaluators can adapt these
- For example, the Cognitive Design Principles by Jill Gerhardt-Powals
1. Automate unwanted workload
2. Reduce uncertainty
3. Fuse data
4. Present new information with meaningful aids to interpretation
5. Use names that are conceptually related to function
6. Group data in consistently meaningful ways
7. Limit data-driven tasks
8. Include in the displays only that information needed by the user at a given time
9. Provide multiple coding of data when appropriate
10. Practice judicious redundancy
How many Evaluators?
Nielsen & Landauer’s model of the rate at which usability problems are found
- Suggests the use of 3-5 experts for a Heuristic Evaluation
- Widely used to argue that it is sufficient to evaluate a user interface with 5 people
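The model predicts the proportion of problems found by i evaluators as 1 − (1 − L)^i, where L is the probability that a single evaluator finds a given problem. A short sketch using the commonly cited average of L ≈ 0.31:

```python
# Nielsen & Landauer: the proportion of usability problems found by
# i evaluators is 1 - (1 - L)**i, where L is the chance that one evaluator
# finds a given problem (L = 0.31 is the commonly cited average).
L = 0.31

for i in range(1, 11):
    print(f"{i:2d} evaluators: {1 - (1 - L) ** i:.0%} of problems found")
# With L = 0.31, five evaluators already find roughly 84% of the problems,
# which is where the 3-5 experts recommendation comes from.
```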
Limitations of Heuristic Evaluation
Woolrych and Cockton (2000) conducted a large-scale trial of Heuristic Evaluation, showing its limitations in predicting the problems users actually experience.
- Evaluators were trained in using HE, and then evaluated the interface of a drawing editor. The interface was then trialled with actual users
- Many of the issues the experts identified were not experienced by users (false positives)
- Many false positives stemmed from a tendency of the experts to assume that users lacked intelligence, or even common sense
- Some more severe problems were missed by the evaluators
- Problems users experienced as a result of a series of mistakes and misconceptions rather than isolated misunderstandings
Heuristic Evaluation - Key Points
- Heuristic evaluation is quick, cheap and easy to do
- Effective as formative method
- Identifying issues for improvement early in development
- HE is subjective and depends on experience
- Relies very much on the knowledge, creativity and experience of the evaluator
- HE has limitations in predicting problems
- “Usability checklists and inspections can produce rapid feedback, but may call attention to problems that are infrequent or atypical in real-world use.” (Rosson, Carroll & Hill, 2002)
- HE is not a replacement for evaluation with users
Cognitive Walkthrough
Cognitive walkthrough is a formal method for checking through the detailed design of steps in an interaction
- Focuses on first-time use
- What happens when somebody uses a system for the first time
- Evaluates learnability of a system
- How well does the system support learning through exploration
- Task-oriented
- Requires tasks and walkthrough scenarios
- Fundamental question: “Will users be able to follow this scenario?”
- Can you tell a believable story that users would be doing this?
- Requires that evaluators have good understanding of user capabilities
Input to the Process
Cognitive walkthrough follows a rigorous procedure, starting with gathering input to the process, followed by the walkthrough
- Input to process are:
- A clear understanding of the people who are expected to use the system (can be based on personas)
- A set of concrete scenarios representing both common and uncommon but critical sequences of activities
- A complete description of the interface
- Can be a paper prototype
- Must have a complete representation of every step in the scenarios
Walkthrough Procedure
For each scenario, and each individual step (=action) in the scenario, the analyst asks the following four questions:
- Will the people using the system try to achieve whatever effect the action has?
- Does the user understand that this step is needed to reach their goal? (Mental model)
- Will they be able to notice that the correct action is available?
- Visibility
- Will they associate the correct action with the effect they are trying to achieve? (or might they select a different action instead?)
- Labels and signifiers
- If the correct action is performed, will people be able to tell that progress is being made towards the goal of their activity?
- Feedback
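A minimal sketch of recording a walkthrough with these four questions (the scenario, steps, and notes below are invented examples):

```python
# Recording cognitive-walkthrough answers for each step of a scenario.
QUESTIONS = (
    "Will users try to achieve the effect of this action?",
    "Will users notice that the correct action is available?",
    "Will users associate the action with the effect they want?",
    "Will users see that progress is being made after the action?",
)

def walk_step(step: str, answers: list[bool], notes: str = "") -> None:
    """Print a success story or a failure story for one step."""
    if all(answers):
        print(f"OK   {step}")
    else:
        failed = [q for q, ok in zip(QUESTIONS, answers) if not ok]
        print(f"FAIL {step}: {'; '.join(failed)} ({notes})")

# Hypothetical scenario: a first-time user pays a bill in a banking app.
walk_step("open 'Payments' tab", [True, True, True, True])
walk_step("tap the '+' icon to add a payee",
          [True, False, False, True],
          notes="icon has no label; users may not connect '+' with 'new payee'")
```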
Outcome
Learning about initial user experience and how to improve it
- Do users understand how to carry out tasks?
- Will users really see the right control?
- Will they understand this is the right control?
- Will they understand labels and signifiers?
- Will they understand the feedback?
By the end of this simple procedure, the evaluators will have found some number of missing goals, missing affordances, gulfs of execution, and gulfs of evaluation
Research & Ethics
- By conducting research ethics reviews and adhering to guidelines we ensure that the work we are going to do is fair, safe, and does no harm
- (BAD!) Examples:
- Milgram (1963) – Electric Shocks
- Harlow (1958) Rhesus Monkeys – Maternal Deprivation
- Zimbardo (1971) Stanford Prison Experiment – Situational Variables
Introduction to Ethics #1
- By conducting research ethics reviews and adhering to guidelines we ensure that the work we are going to do is fair, safe, and does no harm
- Promotes moral and social values
- Promotes the principal aims of research (truth, knowledge, avoiding error)
- Promotes value vital for collaborative work (trust, fairness, respect)
- Ensures researchers are held accountable (often research uses public money)
- Builds public support for research work
Introduction to Ethics #2
Honesty: Be honest when stating research methods, procedures, or reporting data or results. Avoid any urge to falsify data or methodology. It could come back to haunt you.
Integrity: Keep to your agreements and promises. Be consistent.
Openness: You have to be open to suggestions, new ideas, and criticisms (constructive or not). Share data, resources, tools, and ideas.
Confidentiality: Do not expose confidential records.
Respect intellectual property: Honor copyrights and other intellectual property forms. Seek permission before using unpublished results, data or procedures.
Respect each other!
Consent
These principles make sure that participation in studies is voluntary, informed, and safe
- Voluntary: Do not coerce people into taking part, or offer substantial rewards
- Informed: Provide accurate information about what the study entails
- Safe: Do not expose participants to unsafe situations or practices.
We also need to gain consent when using human participants
Consent Example
- Participants must be able to make an informed decision
- Understand what the study involves, what data it will collect and how the data will be used, and any risks
- This is often a formalized process