Observation Flashcards

1
Q

two basic classes / times for studies and evaluation

A
  • formative:
    at the beginning to inform about context and to study possible options
  • summative:
    to judge the impact of an HCI design
    (a summative evaluation of a design might be a formative one for the next step)
2
Q

why, what, where and when to evaluate

A
why: 
study question (check users' requirements, and that users can use the product and like it)

what:
a conceptual model, early prototypes of a new system and later, more complete prototypes, human behaviour…

where:
in natural and laboratory settings

when:
* formative: throughout design;
* summative: finished products can be evaluated to collect information to inform new products

3
Q

three classes of measures

A

user effectiveness

user efficiency

user satisfaction

4
Q

evaluation classes

A
  • setting
  • evaluation time
  • evaluation partner
  • result type
5
Q

controlled settings

A
  • setting conditions are controlled
  • non-controllable conditions are measured
  • e.g. lab experiments, living labs
6
Q

natural settings

A
  • study in ‘everyday’ and natural conditions that cannot be controlled
  • some, but not all non-controllable conditions can be measured
  • e.g. field studies, in-the-wild studies
7
Q

types of evaluation time

A

inspective:
* inspection / evaluation during the run of an experiment or during use

retrospective:
* evaluation after the run of the experiment or after use

short term: short session
long term: long session

8
Q

evaluation partners

A

the user:

  • gives direct feedback, e.g. on use
  • best for gaining new insights into context
  • if it's an experiment: called a “subject”

the expert:

  • allows for best practice information
  • the experience an expert reports might otherwise require many users / test subjects to collect
9
Q

Result types

A

subjective:
* results cannot be directly compared between subjects

objective:
* results can be directly compared between subjects

quantitative:
* results are numbers

qualitative:
* results are text

10
Q

Interviews - Five key issues

A
  1. setting goals
    decide how to analyze data once collected
  2. Identifying participants
    decide who to gather data from
  3. relationship with participants
    clear and professional, informed consent when appropriate
  4. Triangulation
    look at data from more than one perspective
    collect more than one type of data, e.g. quantitative from experiments and qualitative from interviews
  5. Pilot studies
    small trial of main study
11
Q

Data recording

A
  • notes, audio, video, photographs can be used individually or in combination
  • always use a visual impression
  • different challenges and advantages with each combination
12
Q

three types of interviews

A

structured interviews

  • pre-developed questions
  • strictly following the wording
  • easy to carry out - but limited to the question set
  • more precise to evaluate

semi-structured interviews
* structured part + ‘open’ questions

unstructured interviews

  • used when little background information available
  • minimizes the influence of the questioner
13
Q

Running the interview - structure

A

Introduction - introduce yourself, explain the goals of the interview, reassure about the ethical issues, ask to record, present the informed consent form

warm-up - make first questions easy and non-threatening

main body - present questions in a logical order

a cool-off period - include a few easy questions to defuse tension at the end

closure - thank interviewee, signal the end, e.g. switch off the recorder

14
Q

encouraging a good response

A
  • make sure purpose of study is clear
  • promise anonymity
  • ensure questionnaire is well designed
  • follow-up with emails, phone calls, letters
  • provide an incentive
  • 40% response rate is good, 20% is often acceptable
15
Q

Standard questionnaires used in HCI

A

SUS - system usability scale

TLX - NASA task load index

QUIS - Questionnaire for User interface satisfaction

CSUQ - Computer system usability questionnaire

16
Q

SUS - benefits and restrictions

A

+ very easy to scale (Likert)
+ usable with small sample sizes, with acceptable results
+ acceptable validity (you see differences between bad and good designs)

  • score range 0-100 invites misreading as a percentage
  • not diagnostic, only classifies
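
A minimal sketch of how the 0-100 score is commonly derived, assuming the standard ten-item SUS form with responses from 1 (strongly disagree) to 5 (strongly agree). Odd-numbered items are positively worded and even-numbered items negatively worded. The function name `sus_score` is an illustrative choice, not part of any library.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for ten Likert responses, item 1 first."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9: agreement is good, contributes response - 1.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10: agreement is bad, contributes 5 - response.
            total += 5 - response
    # The summed contributions range from 0 to 40; scale to 0-100.
    return total * 2.5


example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))  # prints 85.0
```

The result is a position on a scale, not a percentage of "usability achieved", which is why the 0-100 range is listed as a restriction.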
17
Q

problems with online questionnaires

A
  • sampling is problematic if population size is unknown
  • preventing individuals from responding more than once can be a problem
  • individuals have also been known to change questions in email questionnaires
18
Q

Types of observation

A

direct observation in the field

  • structuring frameworks
  • degree of participation
  • ethnography

direct observation in controlled environments

indirect observation: tracking users’ activities

  • diaries, experience sampling method
  • interaction logging
  • video and photographs collected remotely by drones or other equipment
19
Q

Planning and conducting observation in the field

A
  • decide on how involved you will be: passive observer to active participant
  • how to gain acceptance
  • how to handle sensitive topics, e.g. culture, private spaces, etc.
  • how to collect the data:
    • what data to collect
    • what equipment to use
    • when to stop observing
20
Q

Ethnography

A

Goal: to experience the participants and their context

Ethnographers immerse themselves in the culture that they study

analyzing video and data logs can be time-consuming

collections of comments, incidents and artifacts are made

co-operation of people being observed is required

informants are useful

data analysis is continuous

interpretivist technique

questions get refined as understanding grows

reports usually contain examples

21
Q

online ethnography

A

online interaction differs from face-to-face interaction

virtual worlds have persistence that physical worlds do not have

ethical considerations and presentations of results are different

22
Q

observations and materials that might be collected

A
  • activity or job descriptions
  • rules and procedures
  • descriptions of activities
  • recordings
  • informal interviews
  • diagrams (of the physical layout,…)
  • photographs, videos, workflow diagrams, process maps, …
23
Q

observation in a controlled environment

A

direct observation

  • think aloud techniques
  • also used in conjunction with other interview and questionnaire techniques

indirect observation

  • diaries
  • interaction logs
  • web analytics

video, audio, photos, notes are used to capture data in both types of observation

24
Q

Think Aloud

A

While using an application, the user continuously explains what they are thinking and what they are doing

Quality of the evaluation depends on

  • selection of test candidates
  • appropriate preparation of the candidates
  • appropriate setting so that a natural usage can be guaranteed
25
Q

Think aloud preparation

A
  • explain the system
  • explain the setting
  • explain expectations
  1. using the scenarios prepared earlier, write a draft list of tasks
  2. try out the tasks and estimate how long they will take a participant to complete
  3. prepare a task sheet for the participants
  4. get ready for the test session
  5. tell the participants that it is the system that is under test, not them; explain and introduce tasks
  6. participants start the tasks. Have them give you running commentary on what they are doing, why they are doing it and difficulties or uncertainties they encounter
  7. encourage participants to keep talking
  8. When the participants have finished, interview them briefly about the usability of the prototype and the session itself. Thank them
  9. write up your notes as soon as possible and incorporate into a usability report
26
Q

Think aloud evaluation

A
  • qualitative, subjective mostly
  • ethnographic; delivers to-the-point experience of specific issues / problems
  • generalisations are very difficult and require a high level of experience
  • interpretations can be based on various psychological theories and models
27
Q

Living labs

A
  • People’s use of technology in their everyday lives can be evaluated in living labs
  • such evaluations are too difficult to do in a usability lab
28
Q

Ubicomp Studies

A
  • Are field studies, not lab studies
  • In situ: results include measurements of the context
  • context and situation is not controlled
  • such studies are more expensive
  • more likely to find novel insight and experience
  • Ubicomp studies require additional effort
  • Ubicomp studies normally also require, e.g., control conditions, prestudies, calculation of the number of participants, selection of participants, data selection and statistics
29
Q

3 main types of ubicomp field studies

A

study current behavior:
* what are people doing now

proof-of-concept studies:
* does my technology function in the real world

experience studies:
* how does using my prototype change people’s behaviour or allow them to do new things

30
Q

Wizard of Oz studies

A

good for proof of concept

person simulates and controls system from behind the scenes

  • use mock interface and interact with users
  • good for simulating system that would be difficult to build
31
Q

Experience Studies

A

Surveys

  • often used as prestudy
  • carried out after each change of condition in a within-subject study
  • regular interim surveys during a study to measure changes in participants' reactions

Logging
* use the mobile device to also collect data about usage

32
Q

Logging - design considerations

A

how will you use the logged data?
* select appropriate data to log (at the right frequency)

make a list of specific questions that you expect to answer from the log data

will your logging help you know if the study is going smoothly?
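
The considerations above can be sketched as a small logger. This is an illustrative design under stated assumptions, not a prescribed tool: the class `InteractionLog`, its field names, and the `silent_participants` check are all hypothetical. It logs only a fixed set of fields chosen up front and offers a check for participants who have gone quiet, one possible signal that the study is not running smoothly.

```python
import csv
import io
import time

# Decide up front which fields answer your study questions (assumed set).
FIELDS = ["timestamp", "participant", "event", "detail"]


class InteractionLog:
    """Append-only CSV log of interaction events (hypothetical helper)."""

    def __init__(self, stream, clock=time.time):
        self._writer = csv.DictWriter(stream, fieldnames=FIELDS)
        self._writer.writeheader()
        self._clock = clock
        self.last_seen = {}  # participant -> timestamp of latest event

    def record(self, participant, event, detail=""):
        now = self._clock()
        self._writer.writerow(
            {"timestamp": now, "participant": participant,
             "event": event, "detail": detail}
        )
        self.last_seen[participant] = now

    def silent_participants(self, max_gap_seconds):
        """Participants with no logged event for longer than the given gap."""
        now = self._clock()
        return sorted(
            p for p, seen in self.last_seen.items()
            if now - seen > max_gap_seconds
        )


log = InteractionLog(io.StringIO())
log.record("P1", "app_open")
print(log.silent_participants(max_gap_seconds=3600))
```

Writing the list of questions first and deriving `FIELDS` from it keeps the log small and makes the later analysis straightforward.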

33
Q

Logging - web analytics

A

A system of tools and techniques for optimizing web usage by measuring, collecting, analyzing and reporting web data

typically focuses on the number of web visitors and page views.
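
A minimal sketch of the two typical measures, assuming the web data is already available as `(visitor_id, page)` pairs. That record shape and the function name `summarize` are assumptions made for illustration; real analytics tools collect far richer data.

```python
from collections import Counter


def summarize(hits):
    """Count page views and unique visitors from (visitor_id, page) pairs."""
    hits = list(hits)
    views_per_page = Counter(page for _, page in hits)
    visitors = {visitor for visitor, _ in hits}
    return {
        "page_views": len(hits),
        "unique_visitors": len(visitors),
        "views_per_page": dict(views_per_page),
    }


hits = [("v1", "/home"), ("v1", "/help"), ("v2", "/home"), ("v1", "/home")]
print(summarize(hits))
```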

34
Q

Experience Sampling Methodology (ESM)

A

ESM is a study method using questionnaires

Participants are asked to fill out short questionnaires at various points throughout the day

You get a different picture than when participants are asked to recall later

Considerations:

  • how often to ask the participant
  • how many questions
  • collect experience or sensor information
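
One way to decide when to ask is to split the day into equal blocks and prompt at a random moment inside each block. This is a sketch of that single scheduling choice, not the only valid ESM schedule; the function name `prompt_times` and its parameters are hypothetical.

```python
import random
from datetime import datetime, timedelta


def prompt_times(day_start, day_end, prompts_per_day, rng=None):
    """Return one random prompt time inside each equal block of the day."""
    rng = rng or random.Random()
    block = (day_end - day_start) / prompts_per_day
    times = []
    for i in range(prompts_per_day):
        # rng.random() is in [0, 1), so the prompt stays inside block i.
        offset = timedelta(seconds=rng.random() * block.total_seconds())
        times.append(day_start + i * block + offset)
    return times


start = datetime(2024, 1, 1, 9, 0)
end = datetime(2024, 1, 1, 21, 0)
for t in prompt_times(start, end, prompts_per_day=4, rng=random.Random(1)):
    print(t.strftime("%H:%M"))
```

Raising `prompts_per_day` gives a finer picture but increases participant burden, which is the trade-off behind "how often to ask".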
35
Q

Study Design

A

For any study:

A: start with a concrete research question

B: answer the following questions:

  • what will your participants do during the study
  • what data will you collect
  • how long will the study be
36
Q

steps to a successful study

A
  1. Have a clear research goal and question
  2. Create a study design document containing
    * 1. Research question / Hypothesis
    * 2. Detailed participant profile
    * 3. Detailed method description (what participants will do)
    * 4. Detailed timeline description
    * 5. Types of data you collect
    * 6. Analysis method
    * 7. How you draw conclusions / validate the hypothesis
37
Q

How long should your study be

A

Depends on type of study

  • experience studies (several weeks) are longer than proof of concept studies (several days)
  • studies of current behaviour may range from hours to weeks

Depends on novelty
* usage of novel systems is often very different at the start (enthusiasm or scepticism) and after a longer period of use

Practical considerations

  • If it requires much effort from the participants you have to restrict measurement time
  • Frequency of need for interaction with participants: High frequency means shorter measurement time

Frequency of use
* High frequency of use reduces the measurement time required

38
Q

Things to consider when interpreting data

A

  • Reliability
    does the method produce the same results on separate occasions?

  • Validity
    does the method measure what it is intended to measure
    internal validity - external validity
  • Ecological validity
    does the environment of the evaluation distort the results? Is the result transferable to a general environment?
  • Biases: Are there biases that distort the results?
  • Scope: How generalizable are the results
39
Q

selecting participants

A

Before you start, you have to answer 3 questions:

  • how well do the participants represent the intended user group?
  • how are the participants grouped?
  • which data sampling strategy is used?
40
Q

Representation of study participants

A
  • Representative Participant Set
  • Non-Representative Participant Set
  • Be careful: Many statistics assume a representative set
41
Q

Grouping Participants

A

one group only or multiple groups

group selection based on

  • self-reported experience
  • frequency of use
  • amount of experience
  • demographics
  • different activities the participants have to perform
42
Q

Sampling Strategy

A

Random sampling
* everyone on a list has an equal probability of being selected as a participant

Systematic Sampling
* Based on predefined criteria, e.g. every 10th person entering the ECE Center

Stratified sampling
* Additionally, select people so that the sample reflects the distribution in your intended user group, e.g. ensure that your final set contains 50% male and 50% female participants

Samples of convenience
* Volunteer based. Must be adjusted to the intended user group
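
The first three strategies can be sketched with Python's standard `random` module. The helper names and the `(id, gender)` record shape are assumptions made for illustration. The stratified helper assumes each stratum has enough members to fill its share.

```python
import random


def random_sample(population, n, rng):
    """Random sampling: every listed person is equally likely to be picked."""
    return rng.sample(population, n)


def systematic_sample(population, step):
    """Systematic sampling: take every step-th person, e.g. every 10th."""
    return population[step - 1::step]


def stratified_sample(population, key, shares, n, rng):
    """Stratified sampling: fill each stratum to its target share of n."""
    sample = []
    for stratum, share in shares.items():
        members = [person for person in population if key(person) == stratum]
        sample.extend(rng.sample(members, round(n * share)))
    return sample


rng = random.Random(0)
people = [("p%d" % i, "f" if i < 30 else "m") for i in range(100)]
chosen = stratified_sample(
    people, key=lambda p: p[1], shares={"f": 0.5, "m": 0.5}, n=10, rng=rng
)
print(sorted(gender for _, gender in chosen))
```

In the example the list is 30% female, yet the stratified sample is forced to the 50/50 split of the intended user group.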

43
Q

Sample Size

A

Depends on acceptable error!

  • Major problems can be identified by 3-4 people.
  • Early-stage designs require fewer participants
  • But this is an oversimplification

–> Perform a pre-test in which participants first have to detect known usability issues,
calculate the percentage of known issues each participant found and average it over all participants.
This gives you the average percentage of issues found per participant
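
A sketch of the pre-test calculation. The first function follows the card directly. The step from the average detection rate p to a participant count uses the commonly cited problem-discovery model 1 - (1 - p)^n, which is an assumption added here and not stated on the card; it treats participants as independent and all issues as equally detectable. All function names are illustrative.

```python
def mean_detection_rate(found_per_participant, known_issues):
    """Average share of the known issues that a single participant found."""
    rates = [
        len(found & known_issues) / len(known_issues)
        for found in found_per_participant
    ]
    return sum(rates) / len(rates)


def expected_coverage(p, n):
    """Assumed model: share of issues found by n participants, 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n


def participants_needed(p, target):
    """Smallest n whose expected coverage reaches the target share."""
    if not 0 < p <= 1 or not 0 <= target < 1:
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    n = 1
    while expected_coverage(p, n) < target:
        n += 1
    return n


known = {"A", "B", "C", "D"}
found = [{"A", "B"}, {"A"}, {"A", "B", "C"}]
p = mean_detection_rate(found, known)
print(p, participants_needed(p, target=0.85))  # prints 0.5 3
```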

44
Q

Test order

A

Participants learn fast - test order may have a significant influence on the outcome of the experiment
–> Vary the order of tasks for each participant

This is not necessary with unrelated tasks.
Sometimes it is impossible because tasks depend on each other
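
One simple way to vary the order is a cyclic rotation, so that across participants every task appears in every position equally often. This is a sketch of that one scheme with hypothetical function names. It balances position only and does not balance which task follows which.

```python
def rotated_orders(tasks):
    """Cyclic Latin square: row i is the task list rotated left by i."""
    return [tasks[i:] + tasks[:i] for i in range(len(tasks))]


def order_for(participant_index, tasks):
    """Assign participants to the rotated orders in turn."""
    orders = rotated_orders(tasks)
    return orders[participant_index % len(orders)]


tasks = ["A", "B", "C"]
for participant in range(4):
    print(participant, order_for(participant, tasks))
```

For tasks that depend on each other, only the independent groups of tasks should be rotated.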

45
Q

Types of Evaluation without users

A
  • Experts use their knowledge of users & technology to review software usability
  • Expert critiques can be formal or informal
  • Heuristic evaluation is a review guided by a set of heuristics
  • Walkthroughs involve stepping through a pre-planned scenario noting potential problems.
46
Q

Revised version of Nielsen’s original heuristics

A
  • Visibility of system status
  • Match between system and real world
  • user control and freedom
  • consistency and standards
  • error prevention
  • recognition rather than recall
  • flexibility and efficiency of use
  • aesthetic and minimalist design
  • help users recognize, diagnose, recover from errors
  • help and documentation
47
Q

3 stages for doing heuristic evaluation

A
  1. briefing session to tell experts what to do
  2. Evaluation period of 1-2 hours in which
    * each expert works separately
    * take one pass to get a feel for the product
    * take a second pass to focus on specific features
  3. Debriefing session in which experts work together to prioritize problems.
48
Q

advantages & disadvantages of heuristic evaluation

A

+ few ethical problems - no users involved
+ few practical problems - no users involved

  • can be difficult to find experts
  • important problems may get missed
  • many trivial problems are often identified
  • experts have biases
49
Q

Cognitive Walkthroughs

A
  • focus on ease of learning and/or usage
  • designer presents an aspect of the design & usage scenarios
  • Expert is told the assumptions about user population, context of use, task details
  • one or more experts walk through the design prototype with the scenario
  • experts are guided by questions
50
Q

cognitive walkthrough questions

A
  1. Will the correct action be sufficiently evident to the user?
  2. Will the user notice that the correct action is available?
  3. Will the user associate and interpret the response from the action correctly?
  4. If the correct action is performed, will the user see that progress is being made towards their goals?
51
Q

Pluralistic walkthrough

A

variation on the cognitive walkthrough, performed by a carefully managed team

The panel of experts begins by working separately

Then there is managed discussion that leads to agreed decisions

The approach lends itself well to participatory design

52
Q

Criteria for Creating a Measure of Mental Workload

A

Sensitivity
* index must be sensitive to changes in task difficulty or resource demand

Selectivity
* index should NOT be sensitive to changes unrelated to resource demands

Diagnosticity
* index should indicate not just that workload is varying but the cause of variation

(Un)obtrusiveness
* an index should not interfere with or contaminate the primary task being assessed

Reliability (Reproducibility)
* index should produce the same estimate for a given task and operator

Bandwidth
* the index should respond to high-frequency changes in workload

53
Q

4 primary approaches to workload assessment

A

1) primary task: direct
2) secondary task: indirect
3) physiological correlates
4) subjective ratings: does not interfere with task, but subjective

54
Q

workload assessment - primary task

A

Measure performance metrics:
* time
* speed
* strength

Derived workload metrics
* no absolute value
* difference in performance may indicate difference in workload

55
Q

workload assessment - secondary task

A

popular types:

  • rhythmic tapping task
  • random number generation
  • probe reaction time task
  • time estimation
  • time production
56
Q

workload assessment - physiological measurements

A
  • Heart rate (ECG), Muscle Activity (EMG), Brain Activity (EEG)
  • Respiration, Skin Conductance (GSR)
  • Oxygen uptake
  • Eye-Tracking

In principle precise, but

  • difficult to set-up
  • needs extensive physiological conditioning to bring subjects to the same conditioning level
  • difficult to compare between subjects due to high variation in physiological condition

Conditioning:

  • Baseline Phase
  • Interaction Phase
  • Recovery Phase
57
Q

workload assessment - subjective ratings - NASA TLX - 6 dimensions

A

Mental demand
* how mentally demanding was the task?

Physical demand
* how physically demanding was the task?

Temporal demand
* How hurried or rushed was the pace of the task?

Performance
* How successful were you in accomplishing what you were asked to do?

Effort
* How hard did you have to work to accomplish your level of performance?

Frustration
* How insecure, discouraged, irritated, stressed and annoyed were you?
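
A minimal sketch of the unweighted ("raw") TLX score, assuming each of the six dimensions is rated on a 0-100 scale and the overall workload is their plain average. The weighted variant based on pairwise comparisons of the dimensions is not shown. The dictionary keys and the function name `raw_tlx` are illustrative.

```python
DIMENSIONS = (
    "mental", "physical", "temporal", "performance", "effort", "frustration"
)


def raw_tlx(ratings):
    """Average the six subscale ratings (each assumed to be 0-100)."""
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError("missing ratings for: " + ", ".join(missing))
    if any(not 0 <= ratings[d] <= 100 for d in DIMENSIONS):
        raise ValueError("ratings must lie between 0 and 100")
    return sum(ratings[d] for d in DIMENSIONS) / len(DIMENSIONS)


ratings = {
    "mental": 60, "physical": 20, "temporal": 50,
    "performance": 30, "effort": 70, "frustration": 40,
}
print(raw_tlx(ratings))  # prints 45.0
```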