User Studies and UX Metrics Flashcards
Evaluation
- Evaluation is a process of systematically assessing a system, guided by questions about the system and evaluation criteria
- Does the system work as it is supposed to work?
- Does it comply with relevant standards?
- Is the performance improved over a previous version?
- You need to have clear objectives in order to plan your evaluation
- What questions should the evaluation answer?
- If you have developed a system, what claims would you want to be able to make about it?
Evaluation of Usability and UX
- For systems that are used interactively, a main concern is their usability and user experience
- How well can users learn and use a system to achieve their goals
- How satisfying is the use of the system
- Any evaluation should be guided by clear objectives and questions
- Can users complete the tasks the system is meant to support?
- Would a first-time user be able to figure out how to use the system?
- Has a change made in the interface had the desired impact? …
- Can users achieve their goals faster or with less effort, compared to earlier versions or competing products?
Forms of usability evaluation
- Analytical evaluation
- Informal or formal review, for example using scenarios, guidelines, checklists or models
- By the design team and/or external reviewers (usability experts)
- Empirical evaluation
- Evaluation with users (“User studies”)
- Assessment based on observation
User Studies
- Why evaluate with users?
- Designers are experts in using their own systems
- That does not make them experts in usability and UX
- Analytical evaluation is limited by the ability of the reviewer to test a system from a user perspective
- Analytical evaluation can answer some questions but not others
- Why evaluate without users, first?
- Many problems can be found analytically
- Rigorous testing of interactive workflows by the design team
- Respect users and their time
Types of User Studies
- Usability tests
- Focus on identifying usability issues
- Problems that users encounter when they use a system
- Lab Studies
- Focus on user performance and user experience
- Controlled experiments, often to compare interfaces or systems
- Field studies
- Focus on use in the real world
- Little or no control over the interaction, but observing use in context
Usability Tests
- Users are given typical tasks to perform with a prototype, part of a system, or finished product
- Identifying usability issues for improvement (formative evaluation)
- Validating the design against project goals (summative evaluation)
- Qualitative focus on issues, i.e. problems users encounter when they try to complete tasks
- But usability issues can also be quantified by measuring frequency of issues
Usability issues
- Usability issues are problems users encounter toward achieving their goals
- Something they are not able to do, or find difficult to do
- Something they do that leads to problems
- Examples
- User actions that prevent task completion
- Performing an action that leads away from task success
- Not seeing something that should be noticed
- Participant says the task is completed when it isn't
- User misinterprets some information presented to them
Issue-based Metrics
- Using metrics to prioritise improvements
- Pareto Principle (80/20 rule): 20% of the effort will generate 80% of the results
- Example: problem frequency in a usability study
“What one thing would you improve?”
- Asking users at the end of the usability test, what one problem to fix
- Coding responses to identify categories
- Example: top five cover 75% of suggested improvements
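This kind of Pareto analysis is a simple frequency count. A minimal sketch, assuming a hypothetical issue log (the issue labels and counts are invented):

```python
from collections import Counter

# Hypothetical issue log from a usability test: one label per observed problem
observed_issues = [
    "search not found", "label unclear", "search not found", "no feedback",
    "label unclear", "search not found", "date format", "label unclear",
]

counts = Counter(observed_issues)
total = sum(counts.values())

# Rank issues by frequency and report cumulative coverage:
# the top few issues typically account for most observed problems (80/20 rule)
cumulative = 0
for issue, n in counts.most_common():
    cumulative += n
    print(f"{issue:<16} {n} occurrences, cumulative {cumulative / total:.0%}")
```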
How many users for a test?
“Five Participants is Enough”
- It is widely believed that >75% of issues are found by the first five users (Nielsen’s model)
- “Testing one user is 100 percent better than testing none”
- “You can find more problems in half a day than you can fix in a month” (Steve Krug)
- Do not expect to find and fix all issues
- Some issues can only be discovered after other issues have been fixed
- What works for most people might remain an issue for some people
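The “>75% with five users” figure comes from Nielsen and Landauer’s model, which estimates the proportion of problems found with n users as 1 - (1 - p)^n, where p is the probability that a single user uncovers a given problem (Nielsen’s often-cited average is p ≈ 0.31; your own study may differ). A small sketch of the calculation:

```python
# Nielsen & Landauer's model of problem discovery in usability testing.
# p is the probability that one user reveals a given problem; 0.31 is the
# commonly cited average, not a property of any particular product.
def proportion_found(n: int, p: float = 0.31) -> float:
    return 1 - (1 - p) ** n

for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} users -> {proportion_found(n):.0%} of problems found")
# With p = 0.31, five users find roughly 84% of the problems,
# which is where the "five participants is enough" heuristic comes from.
```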
Lab Studies
- Lab studies focus on performance and user experience
- The purpose of design is to achieve an improvement of some kind
- Develop a prototype, system, app or product that in some respect is better than what we had before
- e.g., more efficient, easier to use, faster to learn, less error-prone, …
- Users are given tasks to perform under controlled conditions
- Observing the effect of specific designs on performance (e.g., completion time, error rate) and/or user experience (user-reported ratings)
Comparative Evaluation
- Lab studies are usually comparative
- Comparing a new user interface with a previous version
- Is there an improvement?
- Benchmarking of a new interactive system against the best existing solution (comparison against a “baseline”)
- Important in research and innovation
- Comparing alternative designs to see which one works best
- Formative studies
Controlled Experiments
- Lab studies are conducted as controlled experiments
- Experiments are an empirical research method for answering questions of comparison and causality
- “Does the new feature added to the UI cause a lower error rate?”
- “Is search engine A more effective in finding what users are looking for than search engine B?”
- The aim of an experiment is to determine cause-effect relationship between variables
Principles of Experiment Design
- Reduction to observation of specific variables
- Reducing a question about cause and effect to specific variables that can be manipulated and specific variables that can be observed
- Repetition: repeated runs/trials to gain sufficient evidence
- Experiments study a relationship between variables; Repetition is necessary to build up evidence of the relationship
- Control to limit confounding factors
- Experiments are controlled to minimize the influence of other variables on the observed effect
Variables in Experiments
- Independent variables
- Something that is manipulated or systematically controlled
- In HCI experiments, we call an independent variable a factor
- Factors are manipulated across multiple levels (at least two)
- Each combination of factor and level defines a test condition
- e.g. factor Search Engine with levels [Google, Bing]
- Dependent variables
- Something we measure in the experiment, as an effect
- In HCI lab studies: a human behaviour or response
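As a small illustration (names and units are made up), the Search Engine example could be written out as one factor, its levels/conditions, and the dependent variables to be measured:

```python
# One factor ("Search Engine") with two levels; with a single factor,
# each level is also a test condition.
factor_name = "Search Engine"
levels = ["Google", "Bing"]
conditions = [{factor_name: level} for level in levels]

# Dependent variables: the behaviours/responses we measure, with units stated
dependent_variables = {"task_success": "completed (yes/no)", "time_on_task": "seconds"}

print(conditions)          # [{'Search Engine': 'Google'}, {'Search Engine': 'Bing'}]
print(dependent_variables)
```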
Example
- Webcomic xkcd ran a study to see what men and women call different colours
- Factors:
- Gender
- Colour they were shown (RGB)
- Gender is controlled
- Colour is manipulated
- Dependent variable
- The colour name they typed in
Planning User Studies
- Five practical steps to follow for a lab study / experiment
- Step 1: Define your study objectives
- Step 2: Identify your variables
- Step 3: Design the experiment: tasks, procedure, setup
- Step 4: Recruit participants and run the study
- Step 5: Evaluate and report the outcome
Define your study objective
- A clear objective is essential for deciding on your study approach
- Is the study formative or summative?
- What question(s) should the study answer?
- What will the results be used for after the study?
- If you conduct an evaluation of something you designed …
- What do you want to be able to say/claim about your design?
- What defines “better performance” or “better user experience” for your design?
- What should it be compared against?
Reflect user goals in your study objectives
- What are the assumptions about the users’ goals?
- Are the users required to use the system regularly? Or will they only use it occasionally?
- What alternatives do they have to using the system?
- In what kind of situations will they use the system?
- When they are busy? When they are bored? When they are under extreme stress?
- What matters most to the user?
- Complete tasks as quickly as possible? Feel in control? Not making mistakes? Have fun interacting? Feeling immersed?
Identify your variables - Factors
- What are the factors and conditions that you want to study and compare?
- Examples:
- Comparing three products – one factor with three levels (1x3)
- Interface with new feature v. prior version – one factor, two levels (1x2)
- Two calendar apps, on small v. large screen – four conditions (2x2)
- Two input devices, left- v. right-handed people – four conditions (2x2)
- Focus on one factor if possible (keep it simple)
- More factors make it harder to determine cause-effect relationships
- Aim for small number of conditions, large number of repetitions
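To see how conditions follow from factors and levels, here is a sketch (factor and level names are illustrative) that enumerates the four conditions of the 2x2 calendar-app example above:

```python
from itertools import product

# Two factors, each with two levels
factors = {
    "App": ["Calendar A", "Calendar B"],
    "Screen size": ["small", "large"],
}

# Crossing the levels of all factors yields the 2 x 2 = 4 test conditions
for combination in product(*factors.values()):
    print(dict(zip(factors.keys(), combination)))
```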
Identify your variables - Data collection
- What measurements do you take? What data do you collect?
- What aspect of usability or user experience do you want to evaluate?
- Effectiveness: ability to complete a task accurately
- Efficiency: the amount of effort required to complete a task successfully
- Satisfaction and other aspects of user experience
- What type of measurement? What metrics?
- Performance measurement: task success, time, error rate, …
- Self-reported metrics: user ratings / questionnaire scores
Types of measurement in user studies
- Performance measures
- Measuring user performance on tasks they are given
- Observation or automated logging of performance data
- Self-reported metrics
- Measuring user experience and their perception of the interaction
- Using rating scales and questionnaires as instrument
- Behavioural and Physiological metrics
- Measuring the response of the body during interaction with a system
- e.g. eye-tracking to measure what users look at
Example: Usability metrics in ISO 9241-11:1998
Performance Measures
- Performance measures assess
- Effectiveness: ability to complete a task accurately
- Efficiency: the amount of effort required to complete a task successfully
- Measuring task success, time, errors
- Performance evaluation relies on clearly defined tasks and goals
- Users are given tasks to accomplish
- Task success has to be clearly defined
- Performance evaluation can focus on different usability aspects
- e.g. learnability: how long it takes to reach proficiency
Task Success
- Task success is a fundamental measure of effectiveness
- Task success rate: percentage of users who succeed on a task
- Requires clear definition of a task and of an end state to reach
- Requires clear criteria for pass/fail
- Giving up – users indicate they would give up if they were doing this for real
- Moderator calls it – when the user makes no progress, or becomes too frustrated
- Too long – certain tasks are only considered successful if done within a time limit
- Wrong – user thinks they completed successfully but they did not
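A minimal sketch of computing a task success rate from pass/fail outcomes (the outcomes are invented; “Wrong” outcomes are coded as failures):

```python
# Binary outcomes for one task across participants (True = success)
outcomes = [True, True, False, True, True, False, True, True]

success_rate = sum(outcomes) / len(outcomes)
print(f"Task success rate: {success_rate:.0%}")  # 75% in this made-up example
```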
Example: AED
- Usability evaluation of Automated External Defibrillators (AED)
- Are lay people able to use defibrillators successfully?
- Comparison of 4 Devices
- 64 participants, 35-55 years, none from a medical background
- Each device tested by a subgroup of 16 (“between-subject”)
- Task: rush into a room where they find a fully-dressed manikin and an AED nearby
- Task success: successfully deliver a shock
Example: AED #2
Time on Task (Task completion time)
- Time on task is a basic measure for efficiency
- Requires that there is clearly defined start and end of a task, for starting and stopping the clock
- Great for comparative evaluation
- The more often the same task is performed by the same user, the more important efficiency becomes
- e.g. frequent data-entry, or information look-up
- reduced time on task saves costs
- Faster is not always most important for user experience
Time on Task (Task completion time) #2
- Time on task can vary, and improves with repetition
- Repeated measures
- Variance in performance has a larger effect with shorter tasks
- Using multiple trials of the same (type of) task to determine mean performance
- Training effects
- Is the goal to determine time of a first-time user or trained user?
- How much training are users given before the evaluation starts?
- Using blocks of trials to measure learning effects
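A sketch, with invented timings, of how mean time on task per block of trials can show a training effect:

```python
from statistics import mean

# Completion times (seconds) for one participant, grouped into blocks of trials
blocks = {
    "block 1": [48.2, 41.5, 39.9, 37.0],
    "block 2": [33.4, 31.8, 30.1, 29.6],
    "block 3": [28.9, 28.4, 28.7, 28.2],
}

# Decreasing block means suggest learning; a plateau suggests the participant
# has reached stable (trained) performance
for name, times in blocks.items():
    print(f"{name}: mean time on task = {mean(times):.1f} s")
```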
Error rate
- The rate at which errors occur affects both effectiveness and efficiency
- Speed-accuracy trade-off
- Errors also affect user experience / satisfaction
- Issues are the cause of a problem, errors are the outcome
- Error rate: average number of errors per task
- Requires clear definition of what counts as an error
- Based on what users do (actions) or fail to do
- e.g. data-entry errors; wrong choices; key actions not taken
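A minimal sketch of an error-rate calculation (the counts are hypothetical; what counts as an error must be defined for the task beforehand):

```python
# Number of errors each participant made on the same task
errors_per_participant = [0, 2, 1, 0, 3, 1, 0, 1]

error_rate = sum(errors_per_participant) / len(errors_per_participant)
print(f"Error rate: {error_rate:.2f} errors per task attempt")  # 1.00 here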
Efficiency
- Efficiency is the amount of effort required to complete a task successfully
- Time is a good indicator but does not show whether a task was completed with the least effort required (Users don’t always take the shortest path)
- Can also measure number of actions the user performs to complete a task, relative to optimum number of actions
- e.g. number of clicks, menus opened, pages visited
- Relevant for assessing the usability of transactions, navigation and information architecture
Example: Lostness
- How lost do users get on web sites?
- N: Number of different web pages visited while performing a task
- S: Total number of pages visited, counting revisits to same page
- R: Minimum (optimum) number of pages that must be visited to accomplish the task
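These three quantities are usually combined into the lostness measure L = sqrt((N/S - 1)^2 + (R/N - 1)^2), where 0 means a perfectly efficient path and values above roughly 0.5 indicate users who appear clearly lost. A sketch with made-up page counts:

```python
from math import sqrt

def lostness(N: int, S: int, R: int) -> float:
    """Lostness: 0 = optimal navigation; higher values = more lost."""
    return sqrt((N / S - 1) ** 2 + (R / N - 1) ** 2)

# Hypothetical navigation log: 12 distinct pages visited, 18 page visits in total,
# while the optimal path for the task needs only 5 pages
print(f"Lostness = {lostness(N=12, S=18, R=5):.2f}")  # ~0.67, user appears lost
```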
Choice of metrics
- Choose metrics and collect data that reflect the study objectives
- For example
- Study completion of transactions (bookings etc) – task success, user satisfaction (perceived usability)
- Study frequent use of the same product – ease of use, efficiency
- Usability for a critical product – fast learnability, no errors
- Comparing products that offer the same service – can be different criteria but satisfaction is important
User Studies and Metrics - Key Points
- Choose appropriate metrics based on your study goals
- Clearly identify your variables
- Assign a name to the factors you study and to the levels or test conditions and use these consistently
- Assign a name to the dependent variables, and report the units in which they are measured
- Pilot test how measurements are taken to ensure that data is recorded consistently and correctly (also when the data collection is manual or by questionnaire)