Block 4 - Unit 1: An evaluation framework Flashcards

1
Q

Key points for evaluation. (4)

A

Evaluation is a key activity in ID lifecycle.

An essential requirement for any interactive product is understanding users’ needs.

Needs of users can be usefully expressed as goals for the product - both usability and UX goals.

Purpose of evaluation - check users can use the product and like it. Assess how well goals have been satisfied in a design.

2
Q

3 main approaches to evaluation?

A

Usability testing.

Field studies.

Analytical evaluation.

(Each differs according to its theories, philosophies (beliefs) and practices for evaluation.)

3
Q

Methods (def and 5 examples)

A

Practical techniques to answer the questions set in relation to an evaluation goal; they include:

Observing users.

Asking users their opinions.

Asking experts their opinions.

Testing users’ performance.

Modelling users’ task performance.

4
Q

Opportunistic evaluation.

A

Designers informally and quickly get feedback from users or consultants to confirm that their ideas are in line with users’ needs and are liked.

Generally used early in design and requires few resources.

‘Quick and dirty’.

5
Q

Evaluation and accessibility. (Include example).

A

If a system should be usable by disabled people, you must evaluate both ‘technical accessibility’ (can the user physically use it?) and usability.

Eg. blind user - a screen reader might technically access the data in a table, but the user also needs to read cells in a meaningful and useful way, eg. access contextual info about cells - relating cells to rows / columns.

6
Q

6 evaluation case studies (SB)

A

Early design ideas for a mobile device for rural Indian nurses.

Cell phones for different world markets.

Affective issues - collaborative immersive game.

Improving a design - HutchWorld patient support system.

Multiple methods help ensure good usability - the Olympic Messaging System.

Evaluating a new kind of interaction - an ambient system.

7
Q

DECIDE intro.

A

Well-planned evaluations are driven by ‘goals’ which aim to seek answers to clear ‘questions’, whether these are stated up front or emerge as the evaluation proceeds.

Questions help determine the kind of ‘evaluation approach’ and ‘methods’ used.

‘Practical issues’ also impact decisions.

‘Ethical issues’ must also be considered.

Evaluators must have enough time and expertise to evaluate, analyse, interpret and present the ‘data’ they collect.

8
Q

DECIDE framework checklist.

A

Determine the ‘goals’.

Explore the ‘questions’.

Choose the ‘evaluation approach and methods’.

Identify the ‘practical issues’.

Decide how to deal with the ‘ethical issues’.

Evaluate, analyse, interpret and present the ‘data’.

(Common to think about and deal with items iteratively, moving backwards and forwards between them. Each is related to the others).

9
Q

Determine the goals and Explore the questions. (3 points)

A

Determine ‘why’ you are evaluating - high-level goals.

If evaluating a prototype the focus should match the purpose of the prototype.

Goals identify the scope of the evaluation and need to be specific rather than general; identifying questions based on these goals clarifies the intention of the evaluation further.

10
Q

Example of general -> specific goal.

A

‘Help clarify whether users’ needs have been met in an early design sketch.’

More specific goal statement:

‘Identify the best representation of the metaphor on which the design will be based.’

11
Q

How to make goals operational (DECIDE)

A

We must clearly articulate questions to be answered.
Eg. what are customers’ attitudes to e-tickets (over paper)?

Questions can be broken down into very specific sub-questions to make the evaluation more fine-grained.
Eg. ‘Is the interface poor?’ to ‘… difficult to navigate?’, ‘… terminology inconsistent?’, ‘… response slow?’, etc.

12
Q

What will an evaluation be focused on?

A

Guided by key questions, and any other questions based on the usability criteria, to see how well the usability goals have been satisfied.

Usability criteria - specific quantified objectives to assess if goal is met.

Also, how well UX goals have been satisfied - how interaction / experience feels to the user (subjective).

UX is usually evaluated qualitatively, eg. ‘users shopping online should be able to order an item easily without assistance’.
It is also possible to use specific quantified objectives for UX goals, eg. ‘85%+ of users should be able to order without assistance.’

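A quantified objective like the 85% figure above can be checked directly against session results. Below is a minimal sketch in Python; the threshold comes from the card, but the session data and output wording are invented for illustration:

  # Hypothetical results: True = the user ordered an item without assistance.
  sessions = [True, True, False, True, True, True, True, False, True, True]

  success_rate = sum(sessions) / len(sessions)  # booleans count as 1 / 0
  target = 0.85                                 # the quantified UX objective

  print(f"Unassisted success rate: {success_rate:.0%} (target {target:.0%})")
  print("UX goal met" if success_rate >= target else "UX goal not met")
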
13
Q

What affects the choice of approaches / methods of evaluation?

A

Approach influences the kinds of methods used.
Eg. analytical evaluation - methods directly involving users won’t be used.

Choice of methods:
Where you are in the lifecycle.
Goals being assessed.
Practical issues - time, money, technology, appropriate participants.

WHAT you are evaluating and type of data being collected.
Eg. low-fi prototypes - any time in the lifecycle, but predominantly useful for qualitative data, or for assessing certain UX goals or interface features (eg. the underlying metaphor).

14
Q

Why use more than one evaluation approach / method?

A

Often choosing just one approach is too restrictive for evaluation.
Take a broader view - mix and match approaches / methods according to goals, questions and practical / ethical issues.
Eg. methods used in field studies tend to involve observation, interviews or informal discussions.

Combining methods for evaluation study, especially if complementary, can give different perspectives for evaluation, and may help to find more usability problems than a single method might.

15
Q

Usability defect (problem)

A

A difficulty in using an interactive product that affects the users’ satisfaction and the system’s effectiveness and efficiency.
Usability defects can lead to confusion, error, delay or outright failure to complete a task on the part of the user. They make the product less usable for the target users.

16
Q

Identify practical issues.

A

Many possible issues - important to identify as many as possible before the study.
A pilot study is useful for discovering surprises in advance.

Issues include:

  • users
  • facilities and equipment
  • schedule and budget constraints
  • evaluators’ expertise

May need to compromise, eg. fewer users for a shorter period (budget).

17
Q

Users (practical issues) (3 points)

A

Where possible, a sample of real (prospective / current) users should be used, but sometimes representatives of the user group are necessary (identified in requirements).
Eg. experience level, age / gender, culture, education, personality.

How will users be involved?
Tasks in the lab should represent those for which the product is designed.
Users should get frequent breaks (every 20 mins) and feel at ease - it is the product being tested, not them.

Field studies - onus is on evaluators to fit in with users and cause as little disturbance as possible.

18
Q

Facilities and equipment (practical issues) (3 points)

A

Video - where to place cameras? Their presence can change behaviour.

If you’re not confident of your observation skills, ask participants to use a ‘think-aloud’ protocol.
Or ask them to pause when you need to write notes.

Audio / video takes time to analyse - approximately 6 hours for 1 hour of video.

19
Q

Schedule and budget constraints (practical issues). (1 point)

A

Usually have to compromise according to resources and time available.

20
Q

Expertise (practical issues).

A

Different requirements for different evaluation methods.

Eg. user tests - knowledge of experimental design and video recording.
If there’s a need to analyse results using statistical measures, consult a statistician before the study and again during data collection / analysis.

21
Q

Accessibility and evaluation methods - practical issues. (2)

A

Asking users:

  • interviews for deaf or speech-impaired users - written questions / answers, or sign language.
  • questionnaires for blind users - describe designs, emulate a screen reader.

Asking experts:
- may need to provide information not easily obtained by users.
Eg. descriptions of visual images - expert may be best judge of accuracy of description.

22
Q

Why are ethical issues important to evaluation?

A

Users are in unfamiliar situations.

Privacy should be protected - name not associated with data.

23
Q

4 principles for ethical issues. (3 ACM + 1)

A

Ensure users and those who will be affected by a system have their needs clearly articulated during the assessment of requirements.

Articulate and support policies that protect the dignity of users and others affected by a computing system.

Honour confidentiality.

Ask users’ permission in advance to quote them, promise anonymity and offer to show the report before it’s disclosed.

24
Q

Some other ethical issues.

A

Web use - activity can be logged, possibly without knowing.
Privacy, confidentiality, informed consent.

Children - legal issues; may need parent / teacher present.

English not first language - may miss nuances, possibly causing bias.

Speech impairment, learning difficulty, etc. - may need helper / interpreter.
Should address remarks to participant, not intermediary.

Cultural constraints may make criticism difficult - eg. Japanese participants may be too ‘polite’ to criticise.

25
Q

Evaluate, analyse, interpret and present the data (overview).

A

Need to decide what data needed to answer study questions, how it will be analysed and how findings are presented.

Method used often determines the type of data collected, but there are still choices.
Eg. should data be treated statistically?

26
Q

General question areas for final ‘E’. (5)

A

Reliability. (Consistency)

Validity.

Biases.

Scope.

Ecological validity.

27
Q

Reliability (consistency) - final ‘E’.

A

How well the method produces the same (or similar) results under the same circumstances, even if done by a different evaluator.

Different methods have different degrees of reliability.
Eg. carefully controlled experiment - high;
observation in natural settings - variable;
unstructured interview - low.

28
Q

Validity - final ‘E’.

A

Does the evaluation method measure what it is intended to measure?

Covers method and how performed.
Eg. goal to find how product is used in homes - don’t plan a lab session.

29
Q

Biases - final ‘E’.

A

Occurs when results are distorted.
Eg:
experts more sensitive to design flaws than others;

observers may not notice certain behaviours they don’t deem important, ie. selectively gather data;

interviewers may unconsciously influence responses - tone, expression, phrasing of questions.

30
Q

Scope - final ‘E’.

A

How much can evaluation’s findings be generalised?

Eg. some modelling methods, such as the keystroke-level model, have a narrow, precise scope. It predicts expert, error-free behaviour, so results can’t be used to describe novices learning to use the system.

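The keystroke-level model mentioned above predicts times for routine tasks by summing standard operator times for expert, error-free performance. A rough sketch in Python using the commonly cited average operator estimates; the task breakdown is invented for illustration:

  # Commonly cited keystroke-level model operator times, in seconds (averages only).
  KLM = {
      "K": 0.2,   # press a key or mouse button
      "P": 1.1,   # point at a target with a mouse
      "H": 0.4,   # move hands between keyboard and mouse
      "M": 1.35,  # mental preparation before an action
  }

  # Hypothetical task: click a menu item, then type a 4-character code and press Enter.
  operators = ["M", "P", "K",             # prepare, point at the menu item, click
               "H", "M",                  # home hands on the keyboard, prepare
               "K", "K", "K", "K", "K"]   # 4 characters plus Enter

  predicted = sum(KLM[op] for op in operators)
  print(f"Predicted expert, error-free time: {predicted:.2f} s")  # 5.40 s

Because the model only covers this kind of routine expert behaviour, the prediction says nothing about novices learning the system - exactly the scope limitation described above.
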
31
Q

Ecological validity - final ‘E’.

A

How environment influences or distorts results.

Eg. lab experiments have low ecological validity as results are unlikely to represent what happens in the real world;

ethnographic studies don’t impact the environment as much, so have high ecological validity.

Ecological validity is also affected when participants are aware they’re being studied.
This is the ‘Hawthorne effect’; the placebo effect is the analogous phenomenon in medical trials.

32
Q

Evaluating (UB - paragraph)

A

If it’s important that the findings should be generalisable to a wider group than the sample used, you should design the study so that it has the qualities of reliability and validity, avoids biases and uses a representative sample.

33
Q

Analysis (UB - sentence and 3 steps)

A

Evaluation can generate lots of data - needs turning into info on which you can make decisions.

3 steps:

  1. Collating the data - gathering all data collected (of all types) and organising it for processing.
  2. Summarising the data - extracting key comments from the collated data. Apply statistical tests if applicable.
  3. Reviewing the data - assess if usability and UX goals met. If not, outcome should be usability problems to be dealt with.
34
Q

Interpretation (UB - paragraph and 3 steps).

A

Process of actively considering what caused the problems that have been identified, and what to do about them - ie. implications of the evaluation data and findings for the design of the interactive product.

3 steps:

  1. Finding causes for the usability problems that have been identified during analysis and rating the seriousness of each.
  2. Proposing recommendations for changes to the design, to address these problems.
  3. Reporting on your findings and recommendations.
35
Q

Presenting data (UB)

A

Consider how best to present the results, how the data has been analysed and to whom the results are being presented.
Eg. for a development team - a video of edited highlights of the difficulties users had can be a powerful communication.

Alternatively, as most usability tests have few people, tabulations, charts and rankings that provide a visual representation of data may be sufficient, as would descriptive stats, eg. mean, median and mode.

Usually need 50 - 100 users for the results to be statistically representative of the wider user population.

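For the small samples typical of usability tests, the descriptive statistics mentioned above are easy to produce. A minimal sketch with invented task-completion times, using Python’s standard statistics module:

  import statistics

  # Hypothetical task-completion times (in seconds) from eight participants.
  times = [42, 55, 48, 61, 48, 39, 72, 48]

  print(f"mean   = {statistics.mean(times):.1f} s")    # 51.6 s
  print(f"median = {statistics.median(times):.1f} s")  # 48.0 s
  print(f"mode   = {statistics.mode(times)} s")        # 48 s
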
36
Q

Problem with numbers.

A

People who are not comfortable with numbers sometimes tend to invest unwarranted authority in them, even when the numbers are meaningless.