11: A/B Testing Flashcards

1
Q

What is A/B Testing

A
  1. simple and controlled experiment
  2. randomly split traffic between two (or more) versions
    - A. control, existing system
    - B. treatment, new version
  3. collect metrics of interest (dependent variable)
  4. analyse data
    - run a statistical test to confirm the difference is not due to chance
    - best scientific way to prove causality
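
A minimal sketch of step 4, assuming the metric of interest is a conversion rate: a two-proportion z-test on control vs. treatment counts. The function name, variable names and numbers are illustrative, not from the cards.

```python
# Minimal sketch: two-proportion z-test to check that the observed difference
# in conversion rate between control (A) and treatment (B) is not due to chance.
from scipy.stats import norm

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, p_value) for H0: the two conversion rates are equal."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - norm.cdf(abs(z)))               # two-sided test
    return z, p_value

# Illustrative counts: control converts 1,000/20,000, treatment 1,120/20,000.
z, p = two_proportion_z_test(1000, 20000, 1120, 20000)
print(f"z = {z:.2f}, p = {p:.4f}")   # p < 0.05 -> difference unlikely to be chance
```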
2
Q

Give two examples of A/B testing and explain the variables

A

Facebook
Goal: encourage people to share the information they are comfortable sharing
independent variables:
- version 1: broad set of choices with preset ‘recommended’
- version 2: settings grouped into smaller set of features, preset recommended
- version 3: similar to version 1 but with skip for now
- version 4: similar to version 2 but with skip for now
- version 5: similar to version 2 but without any presets
dependent variables: user preferences, number of users who share information
result: users preferred version 5; the lesson is to give control to the users

Amazon
Goal: get users to buy more items
independent variables: 
 - version 1: old webpage 
 - version 2: after adding items to the basket, the user is presented with "users who bought xx also bought xx"
dependent variable: number of sales
result: version 2 was very successful
3
Q

Explain the concept of user assignment in A/B testing

A
  1. good randomisation
  2. consistent assignment
  3. independent assignment
  4. monotonic ramp-up
    - as the experiment is ramped up, users who are exposed to a treatment must stay in that treatment
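
A minimal sketch of how properties 1-4 can be met with hash-based bucketing; the function name, IDs and default 50% split are illustrative assumptions. Hashing (experiment_id, user_id) gives the same bucket on every visit (consistent) and uncorrelated buckets across experiments (independent); raising treatment_pct only adds buckets to treatment, so users already in the treatment stay there (monotonic ramp-up).

```python
# Minimal sketch: deterministic hash-based user assignment.
import hashlib

def assign(user_id: str, experiment_id: str, treatment_pct: float = 50.0) -> str:
    key = f"{experiment_id}:{user_id}".encode()
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 10_000   # 0..9999
    return "treatment" if bucket < round(treatment_pct * 100) else "control"

print(assign("user-42", "checkout-redesign"))         # stable across calls
print(assign("user-42", "checkout-redesign", 0.1))    # 0.1% ramp-up stage
```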
4
Q

What is the Overall Evaluation Criterion (OEC)

A
  1. a long-term metric that the company really cares about
    - time on site
    - visit frequency
  2. using short term metrics that predict long term value
  3. optimise for customer lifetime value
  4. determines whether to launch treatment
    - if the experiment is negative, re-examine the metrics
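
A hypothetical illustration of points 2 and 3: an OEC built as a weighted combination of short-term proxy metrics believed to predict lifetime value. The metric names and weights are assumptions for the sketch, not a standard formula; the launch decision (point 4) then compares the mean OEC of treatment and control like any other metric.

```python
# Hypothetical OEC: a weighted sum of short-term proxies for long-term value.
# Metric names and weights are illustrative assumptions.
OEC_WEIGHTS = {"sessions_per_week": 0.3, "minutes_per_session": 0.2, "purchases": 0.5}

def oec(user_metrics: dict) -> float:
    """Per-user OEC score; the experiment compares its mean across variants."""
    return sum(w * user_metrics.get(name, 0.0) for name, w in OEC_WEIGHTS.items())

print(oec({"sessions_per_week": 4, "minutes_per_session": 12, "purchases": 1}))  # 4.1
```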
5
Q

What are ramp-up and auto-abort

A
  1. start experiment at 0.1%
  2. run a simple analysis to make sure there are no problems
  3. ramp up to a higher % and repeat until reaching 50%
  4. detecting big difference is easy
    - detecting a 10% difference requires only a small sample
    - detecting a 0.1% difference is hard; run 50/50 for a longer time
  5. abort the experiment if treatment is significantly worse
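
A minimal sketch of the ramp-up loop with an auto-abort check, assuming a conversion-rate metric, plus a rough per-arm sample-size calculation that backs point 4. The schedule, significance level and the get_counts() data hook are hypothetical; the z-test is statsmodels' two-proportion test.

```python
# Ramp-up with auto-abort: advance through traffic percentages, aborting if
# the treatment's conversion rate is significantly worse than control's.
from scipy.stats import norm
from statsmodels.stats.proportion import proportions_ztest

RAMP_SCHEDULE = [0.1, 1.0, 5.0, 20.0, 50.0]   # percent of traffic in treatment

def ramp_up(get_counts, alpha=0.05):
    # get_counts(pct) -> ((conversions, users) for control, same for treatment);
    # a hypothetical hook into the experiment's logging pipeline.
    for pct in RAMP_SCHEDULE:
        (conv_c, n_c), (conv_t, n_t) = get_counts(pct)
        _, p = proportions_ztest([conv_t, conv_c], [n_t, n_c])
        if p < alpha and conv_t / n_t < conv_c / n_c:
            return f"abort at {pct}%: treatment significantly worse (p={p:.4f})"
    return "reached 50/50: keep running until the required sample size is met"

# Why small effects need long 50/50 runs: rough per-arm sample size to detect
# a relative lift over a 5% baseline conversion rate (alpha=0.05, power=0.8).
def n_per_arm(p, lift, alpha=0.05, power=0.8):
    p2 = p * (1 + lift)
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z**2 * (p * (1 - p) + p2 * (1 - p2)) / (p2 - p)**2

print(round(n_per_arm(0.05, 0.10)))    # ~10% lift: roughly 31,000 users per arm
print(round(n_per_arm(0.05, 0.001)))   # ~0.1% lift: on the order of 3e8 per arm
```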
6
Q

A/B testing advantages

A
  1. test of causal relationships, not just correlation
  2. reduces the effect of external factors
  3. ease of test design and scalability
    - decide on number of versions
    - split available traffic among versions
    - test a range of alternatives
  4. measures users' actual behaviour
  5. ease of implementation
7
Q

A/B testing disadvantages

A
  1. need to agree on an OEC
    - requires a clear goal
    - need to define independent and dependent variables
  2. problems with quantitative metrics
    - do not tell why A is better than B
    - need to be complemented with some subjective measures
  3. primacy effect
    - changing the app may degrade the user experience at first, regardless of which version is better
    - takes time to get used to
  4. consistency contamination
    - assignment is cookie-based
    - users may erase cookies or use a different machine
  5. multiple experiments
    - statistical variance increases, making it harder to get a statistically significant result
  6. outlier detection
    - 5-40% of traffic are bots
    - bots skew the results
  7. ethics
    - emotional manipulation