10 Evaluation Flashcards

1
Q

Why Evaluate?

A

Well-designed products sell
To ensure that the system matches the users' needs
To discover unforeseen problems
To compare your solution against competitors ("We are x% better than…")

2
Q

Where to Evaluate?

A

Naturalistic Approach: Field Studies

Usability Lab

3
Q

When to Evaluate, and Who Evaluates When?

A

Evaluation should happen throughout the entire software development process

Early designs: evaluated by the design team, analytically and informally
Later implementations: evaluated by users, experimentally and formally

4
Q

Evaluation Methods

A
  1. Determine the Goals
  2. Explore the Questions
  3. Choose the Approach and Methods
  4. Evaluate, Interpret & Present Data
5
Q

Important aspects in creating an evaluation process?

A

Reliability: can the study be replicated?
Validity: is it measuring what you expected?
Biases: is the process creating biases?
Scope: can the findings be generalized?
Ethics: are ethics standards met?

6
Q

External vs Internal Validity

A

External validity
-> confidence that results apply to real situations
-> usually good in natural settings
Internal validity
-> confidence in our explanation of experimental results
-> usually good in experimental settings

7
Q

Ethics Approval

A

Researchers must respect the safety, welfare, and dignity of human participants in their research and treat them equally and fairly

Criteria for approval:

  • research methodology
  • risks or benefits
  • the right not to participate, to terminate participation, etc.
  • the right to anonymity and confidentiality
8
Q

Ethics - Before the test (5 things)

A
Only use volunteers
Inform the user
Maintain privacy
Make users feel comfortable
Don’t waste the user’s time
9
Q

Ethics - During the test (4 things)

A

Maintain privacy
Make users feel comfortable
Don’t waste the user’s time
Ensure participant health and safety

10
Q

Ethics - After the test

A

Inform the user
Maintain privacy
Make users feel comfortable

11
Q

Usability Testing

A

Focus on: how well users perform tasks with the product (time to complete a task and number & type of errors)

-> Controlled environmental settings

12
Q

Signal & Noise Metaphor

A

Experiment design seeks to enhance the signal (the variable of interest)
while minimizing the noise (everything else, i.e., random influences)

13
Q

Controlled Experiment: Steps

A
  1. Determine the goals, explore the questions, then formulate hypothesis
  2. Design experiment, define experimental variables
  3. Choose subjects
  4. Run pilot experiment
  5. Iteratively improve experiment design
  6. Run experiment
  7. Interpret results to accept or reject hypothesis
14
Q

Experimental Variables

A
  • Independent Variables
  • Dependent Variables
  • Control Variables
  • Random Variables
  • Confounding Variables
15
Q

Independent Variable - Definition & Examples

A

An independent variable is under your control
Independent because it is independent of participant behavior

Interface, device, button layout, visual layout, feedback mode, age, gender, background noise, expertise, etc.

Must have at least two levels (values/settings) -> test conditions

16
Q

Dependent Variable - Definition & Examples

A

measured human behavior, depends on what the participant does
is measured during the experiment

Task completion time, speed, accuracy, error rate, throughput, target re-entries, task retries, presses of backspace, etc.

17
Q

Control Variable - Definition & Examples

A

a circumstance that is kept constant

more control -> less variability, less generalizable

18
Q

Random Variable - Definition & Examples

A

circumstance that is allowed to vary randomly -> more variability (bad), but more generalizable

19
Q

Confounding Variable - Definition & Examples

A

circumstance that varies systematically with an independent variable

20
Q

Experiment Task - Good Task Qualities:

A

Represent activities people do with the interface

Discriminate among the test conditions

21
Q

Hypothesis vs Claim

A

A claim predicts the outcome of an experiment
Example: Reading a text in upper case takes longer than reading it in sentence case
A hypothesis claims that changing independent variables influences dependent variables
Example: Changing the case (independent variable) influences reading time (dependent variable)

-> Experiment goal: confirm the hypothesis
-> Statistical approach: reject the null hypothesis
22
Q

Statistical Tests - 2 Types

A

Parametric
-> Data are assumed to come from a distribution, such as the normal distribution, t-distribution, etc.
Non-parametric
-> Data are not assumed to come from a distribution

23
Q

Statistical Tests - Which test for nominal and ordinal (gender, age groups, …)

A

Non-parametric tests (e.g., Chi-square test)
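As a sketch of what such a test computes, the chi-square statistic for a contingency table can be worked out by hand; the counts below are invented for illustration, and in practice a library routine such as scipy.stats.chi2_contingency would be used:

```python
# Chi-square test of independence for a 2x2 contingency table,
# implemented from scratch for illustration. Counts are made up.
observed = [
    [30, 10],  # e.g., group A: success / failure
    [20, 20],  # e.g., group B: success / failure
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

print(round(chi2, 3))  # compare against the chi-square distribution
```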

24
Q

Statistical Tests - Which test for Interval and Ratio (temperature in C or K, …)

A

Parametric tests (e.g., t-test, ANOVA), or Non-parametric tests
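For interval/ratio data, the core of a t-test can be sketched with the standard library alone (Welch's two-sample formulation; the task-completion times below are made up, and a real analysis would also compute degrees of freedom and a p-value):

```python
# Two-sample t statistic (Welch's), standard library only.
# Hypothetical task-completion times in seconds for two interfaces.
from statistics import mean, variance

a = [12.1, 11.8, 13.0, 12.5, 11.9]  # interface A
b = [13.4, 13.1, 14.0, 13.6, 13.2]  # interface B

# t = difference of means over the standard error of that difference.
t = (mean(a) - mean(b)) / ((variance(a) / len(a) + variance(b) / len(b)) ** 0.5)
print(round(t, 2))
```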

25
Q

too few vs too many participants?

A

Too few: experimental effects fail to achieve statistical significance
Too many: statistical significance even for very small effect sizes

26
Q

Within-subjects, Between-subjects

A

Within-subjects: each participant is tested on each condition
Between-subjects: each participant is tested on one condition only

27
Q

Order Effects and how to avoid them

A

Order effects / learning effects can occur when the same participant is doing a similar task multiple times

-> only relevant for within-subject factors

Avoid them by:

  • participants are divided into groups, with different orders for the test conditions (Latin square)
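The Latin-square idea can be sketched in code; the construction below is the standard balanced Latin square for an even number of conditions (the condition labels A-D are hypothetical):

```python
# Balanced Latin square for counterbalancing condition order across
# participant groups (standard construction for an even number of
# conditions: each condition appears once per row and column, and
# precedes every other condition equally often).
def balanced_latin_square(conditions):
    n = len(conditions)
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first, lo, hi = [0], 1, n - 1
    for i in range(1, n):
        if i % 2 == 1:
            first.append(lo)
            lo += 1
        else:
            first.append(hi)
            hi -= 1
    # Each later row shifts every index by the group number (mod n).
    return [[conditions[(x + r) % n] for x in first] for r in range(n)]

for order in balanced_latin_square(["A", "B", "C", "D"]):
    print(order)
```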
28
Q

Longitudinal Studies

A

research that seeks to promote and investigate learning

-> practice is the independent variable

29
Q

Analytical Evaluation Methods (2)

A

heuristic evaluation

cognitive walkthrough

30
Q

Golden rules of UI design

A
  1. Keep the interface simple
  2. Speak the user's language
  3. Be consistent and predictable
  4. Make things visible and provide feedback
  5. Minimize the user's memory load
  6. Design for error: Avoid errors, help to recover from errors, offer undo
  7. Design clear exits and closed dialogs
  8. Include help and documentation
  9. Offer shortcuts for experts
  10. Make the system responsive
31
Q

Heuristic Evaluation- How many evaluators?

A

3-5 evaluators

32
Q

Cognitive Walkthrough

A

Experts “walk” through the design prototype with usage scenario(s)

Experts analyze each task following 3 questions:

  1. Will the correct action be sufficiently evident to the user?
  2. Will the user notice that the correct action is available?
  3. Will the user associate and interpret the response from the action correctly?
33
Q

Model-Based Evaluation - 3 Examples

A

GOMS
Keystroke Level Model ("daughter" model of GOMS)
Fitts' Law
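Fitts' law is simple enough to show directly; this sketch uses the Shannon formulation MT = a + b · log2(D/W + 1), with made-up values for the regression coefficients a and b:

```python
from math import log2

def movement_time(d, w, a=0.1, b=0.15):
    """Predicted movement time (s) for distance d and target width w.
    a and b are illustrative values normally fit by linear regression."""
    return a + b * log2(d / w + 1)

# A far, small target takes longer to acquire than a near, large one.
print(round(movement_time(256, 16), 3))  # index of difficulty log2(17)
print(round(movement_time(64, 32), 3))   # index of difficulty log2(3)
```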

34
Q

GOMS - Name and Main Principle

A

uses a model of execution times for basic tasks to predict how long a sequence of actions takes

GOMS = Goals, Operators, Methods, Selection rules

(Selection rules decide which method to select when there is more than one)

35
Q

Keystroke Level Model

A

refinement of GOMS that provides a quantitative model of execution times
assigns each operator a context-independent average duration
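A minimal sketch of a KLM prediction, using the classic operator durations from Card, Moran & Newell; the task encoding at the end is invented for illustration:

```python
# Keystroke-Level Model: sum context-independent average operator
# durations to predict execution time for an error-free expert task.
KLM = {
    "K": 0.2,   # keystroke (skilled typist)
    "P": 1.1,   # point at a target with the mouse
    "H": 0.4,   # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
    "B": 0.1,   # press or release the mouse button
}

def predict_time(operators):
    """Total predicted time (s) for a string of KLM operator codes."""
    return sum(KLM[op] for op in operators)

# e.g., move hand to mouse, think, point at a menu item, click (press+release)
print(round(predict_time("HMPBB"), 2))
```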