Kohavi and Thomke (2017): The Surprising Power of Online Experiments Flashcards

1
Q

Key Lessons from Online Experiments
The Value of Controlled Experiments:

A
  • Definition: Online experiments (like A/B testing) allow businesses to assess ideas by
    comparing a control (current state) with a treatment (proposed change). This scientific method
    ensures decisions are evidence-based rather than intuitive.
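A minimal sketch of how a control/treatment comparison might be evaluated, assuming the metric is a conversion rate and using a two-proportion z-test. The function name and the counts are hypothetical and are not taken from the article.

```python
import math

def two_proportion_z_test(conv_c, n_c, conv_t, n_t):
    """Return (lift, z, two-sided p-value) for treatment vs. control conversion rates."""
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_t - p_c, z, p_value

# Hypothetical counts, invented for illustration only.
lift, z, p_value = two_proportion_z_test(conv_c=1000, n_c=20000, conv_t=1100, n_t=20000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference between treatment and control is unlikely to be chance alone. Whether the lift matters to the business is a separate question.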
2
Q

A/B testing for large companies:

A

Allows them to test many ideas concurrently at a low cost per test.

3
Q

Tiny Changes Can Have a Big Impact:

A
  • Contrary to popular belief, progress often comes from implementing numerous small
    improvements rather than disruptive changes.
4
Q

The Role of Infrastructure
- Large-scale experimentation requires:

A
  • Instrumentation: Collecting data on clicks, interactions, and behaviors.
  • Data pipelines: For real-time and batch analysis.
  • Teams of data scientists: To ensure rigor and reliability.
5
Q

Challenges with Experimentation:

A
  • Failure Rates: At companies like Google and Bing, only 10%-20% of experiments yield
    positive results. This underscores the need for numerous tests to identify breakthroughs.
  • Complexity and Bugs: Introducing multiple features simultaneously increases the likelihood of
    errors. Example: If each new feature independently has a 10% chance of failing, a release
    with 7 features has a >50% probability that at least one fails (1 - 0.9^7 ≈ 0.52).
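The arithmetic behind the 7-feature example can be checked with a short sketch. It assumes failures are independent, which the card's numbers imply. The function name is hypothetical.

```python
def prob_at_least_one_failure(p_fail, n_features):
    """Probability that at least one of n independent features fails."""
    return 1 - (1 - p_fail) ** n_features

for n in (1, 3, 7):
    print(n, round(prob_at_least_one_failure(0.10, n), 3))
# With 7 features the probability is about 0.522, which is just over 50%.
```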
6
Q

Importance of Data Quality:
- Rigorous Validation:

A
  • A/A Tests: Testing a feature against itself checks that the system reports no significant
    difference when none exists.
  • Outliers: Identify and exclude outliers (e.g., bots, or atypical accounts such as libraries
    on Amazon).
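A rough simulation of the A/A idea, assuming a conversion-rate metric, a two-proportion z-test, and a 5% significance threshold. All names and parameters are hypothetical. Both arms draw from the same distribution, so the share of "significant" results should sit near the threshold.

```python
import math
import random

def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))

def aa_false_positive_rate(runs=500, users_per_arm=1000, rate=0.10, alpha=0.05, seed=0):
    """Run many A/A tests and return the share flagged as significant."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(runs):
        # Both arms receive the identical experience, so any "effect" is noise.
        a = sum(rng.random() < rate for _ in range(users_per_arm))
        b = sum(rng.random() < rate for _ in range(users_per_arm))
        if z_test_p_value(a, users_per_arm, b, users_per_arm) < alpha:
            false_positives += 1
    return false_positives / runs

fp_rate = aa_false_positive_rate()
print(f"A/A false-positive rate: {fp_rate:.3f} (should be near 0.05)")
```

A rate far from the threshold would point to a problem in assignment, logging, or the statistics, not to a real effect.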
7
Q

Importance of Data Quality:
- Twyman’s Law:

A

“Any figure that looks interesting or different is usually wrong.” Surprising results should
be replicated before they are trusted.

8
Q

Importance of Data Quality:
- Segment Variability

A

Some user segments may react differently to an experiment, skewing the overall results. For
example, a bug in Internet Explorer 7 significantly distorted Bing’s test results.
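One way to spot a segment that skews the overall result is to break the lift down per segment. The sketch below assumes conversion counts are already aggregated by segment and arm. The segment names and numbers are invented for illustration and are not Bing data.

```python
def lift_by_segment(rows):
    """rows: iterable of (segment, arm, conversions, users); arm is 'control' or 'treatment'."""
    totals = {}
    for segment, arm, conversions, users in rows:
        seg = totals.setdefault(segment, {"control": [0, 0], "treatment": [0, 0]})
        seg[arm][0] += conversions
        seg[arm][1] += users
    lifts = {}
    for segment, arms in totals.items():
        rate_c = arms["control"][0] / arms["control"][1]
        rate_t = arms["treatment"][0] / arms["treatment"][1]
        lifts[segment] = rate_t - rate_c
    return lifts

# Hypothetical data: one small segment reacts very differently from the rest.
rows = [
    ("browser_a", "control", 500, 10000),
    ("browser_a", "treatment", 520, 10000),
    ("browser_b", "control", 100, 2000),
    ("browser_b", "treatment", 40, 2000),
]
lifts = lift_by_segment(rows)
for segment, lift in lifts.items():
    print(f"{segment}: lift={lift:+.3f}")
```

In this made-up data the overall effect is negative, yet the larger segment improves slightly. The drop comes entirely from the smaller segment, which is the kind of pattern that warrants a check for a segment-specific bug.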

9
Q

Avoiding Assumptions About Causality:

A
  • Correlation ≠ Causation: Observational studies of Microsoft Office falsely suggested that
    advanced features reduced attrition. In reality, heavy users (who use advanced features)
    naturally have lower attrition rates.
  • Controlled Testing Is Essential: Observational studies may misrepresent the impact of
    changes.
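The confounding in the Office example can be illustrated with a toy simulation. It assumes, purely for illustration, that churn depends only on whether a user is a heavy user and that the feature has no causal effect at all. Every probability below is invented.

```python
import random

def churn_rates(n_users=20000, seed=1):
    """Compare churn by feature use (observational) and by random assignment (experiment)."""
    rng = random.Random(seed)
    observed = {True: [0, 0], False: [0, 0]}    # uses feature -> [churned, users]
    randomized = {True: [0, 0], False: [0, 0]}  # assigned to feature -> [churned, users]
    for _ in range(n_users):
        heavy = rng.random() < 0.3
        # Assumption: churn depends only on user type; the feature has no causal effect.
        churned = rng.random() < (0.05 if heavy else 0.30)
        # Observational data: heavy users self-select into the advanced feature.
        uses_feature = rng.random() < (0.8 if heavy else 0.1)
        # Controlled experiment: a coin flip decides who gets the feature.
        assigned = rng.random() < 0.5
        for table, key in ((observed, uses_feature), (randomized, assigned)):
            table[key][0] += churned
            table[key][1] += 1
    rate = lambda cell: cell[0] / cell[1]
    return ({k: rate(v) for k, v in observed.items()},
            {k: rate(v) for k, v in randomized.items()})

obs, rnd = churn_rates()
print(f"observational: users={obs[True]:.3f}, non-users={obs[False]:.3f}")
print(f"randomized:    treatment={rnd[True]:.3f}, control={rnd[False]:.3f}")
```

Under these assumptions the observational comparison shows much lower churn among feature users, while the randomized comparison shows almost no difference between arms.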
10
Q

Defining Success with Metrics:

A
  • Overall Evaluation Criterion (OEC): Composite metrics should align with long-term strategic
    goals (e.g., revenue, engagement). Example: Bing tracks metrics like tasks completed per
    session to gauge user satisfaction.
  • Continuous Refinement: Successful experiments often result from understanding short- and
    long-term metric trade-offs.
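One simple way to form a composite metric is a weighted sum of relative changes between treatment and control. This is only a sketch of the idea. The metric names, weights, and changes below are hypothetical and do not describe how Bing computes its OEC.

```python
def oec_score(relative_changes, weights):
    """Weighted sum of relative metric changes (treatment vs. control)."""
    missing = set(weights) - set(relative_changes)
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    return sum(weights[name] * relative_changes[name] for name in weights)

# Hypothetical weights reflecting long-term priorities (they sum to 1 here by choice).
weights = {
    "tasks_completed_per_session": 0.5,
    "revenue_per_user": 0.3,
    "sessions_per_user": 0.2,
}
# Hypothetical relative changes observed in one experiment.
changes = {
    "tasks_completed_per_session": 0.02,
    "revenue_per_user": 0.05,
    "sessions_per_user": -0.04,
}
score = oec_score(changes, weights)
print(f"OEC score: {score:+.4f}")
```

The negative weight contribution from the falling metric makes the trade-off explicit: a short-term revenue gain can be partly offset by a drop in a metric tied to long-term engagement.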