A/B Testing Flashcards
A/B testing is a type of _________ testing
hypothesis
A/B testing and multi-armed bandits can help you determine if a change adds ______
value
In what situations might you want to use A/B testing?
Exploring usability improvements
Establishing the effectiveness of promotions
Staged rollout of major changes
What is the general process of performing A/B testing?
Form a hypothesis about the difference between A and B
Determine what data to collect (population, metric, size)
Randomly serve A to one population and B to the other
Use a t-test to measure the differences in the populations
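A minimal sketch of the last step, assuming two independent samples of a per-user metric. It computes Welch's t statistic with only the standard library; the p-value uses a normal approximation, which is an assumption that holds only for large samples. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def welch_t_test(a, b):
    """Return (t, df, approx_p) for two independent samples.

    approx_p uses a normal approximation to the t distribution,
    an assumption that is only reasonable for large samples.
    """
    # Squared standard error of each sample mean (sample variance / n).
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    approx_p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, approx_p

# Made-up metric values: B is shifted up by 0.1 relative to A.
group_a = [10.0 + 0.1 * (i % 7 - 3) for i in range(70)]
group_b = [10.1 + 0.1 * (i % 7 - 3) for i in range(70)]

t, df, p = welch_t_test(group_a, group_b)
print(f"t={t:.2f}, df={df:.0f}, approx p={p:.4f}")
```

A small p-value suggests rejecting the null hypothesis that A and B have the same mean. For small samples, a proper t distribution should replace the normal approximation.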
What are type 1 and type 2 errors?
Type 1 (false positive): Reject the null hypothesis even though it’s true
Type 2 (false negative): Fail to reject the null hypothesis even though it’s false
What are some problems when choosing populations for your hypothesis?
Hypothesis may not apply to everyone
Hypothesis may affect subpopulations differently
Population needs to be representative
What are some problems when choosing your hypothesis?
Goals must be clearly defined; otherwise the test results are useless
Testing many things increases the likelihood of false positives and p-hacking
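The false-positive point can be made concrete with a little arithmetic. The sketch below assumes the tests are independent and each uses the same significance level; the function name is hypothetical.

```python
def family_wise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests,
    each run at significance level alpha, when every null is true."""
    return 1 - (1 - alpha) ** num_tests

for num_tests in (1, 5, 20):
    rate = family_wise_error_rate(0.05, num_tests)
    print(f"{num_tests} tests: {rate:.3f}")
```

Under these assumptions, running 20 tests at the 5% level gives a better-than-even chance of at least one spurious "win".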
What are some problems when selecting stopping criteria and confidence?
The size of a test campaign must be fixed up front; running the test and stopping as soon as significance is reached inflates the false positive rate
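A rough simulation of why stopping at the first significant result is a problem. It assumes A and B have the same true conversion rate, so every "significant" result is a false positive, and it uses a two-proportion z-test as the significance check. All names and parameters are illustrative.

```python
import math
import random
from statistics import NormalDist

def z_significant(successes_a, successes_b, n, alpha=0.05):
    """Two-sided two-proportion z-test with equal group sizes n."""
    pooled = (successes_a + successes_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return False  # no variation observed yet, nothing to test
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    z = (successes_a - successes_b) / n / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def simulate(num_experiments=500, n_max=1000, check_every=100, rate=0.1, seed=0):
    """Return (fixed_rate, peeking_rate) of false positives."""
    rng = random.Random(seed)
    fixed_hits = peeking_hits = 0
    for _ in range(num_experiments):
        a = b = 0
        final = peeked = False
        for i in range(1, n_max + 1):
            a += rng.random() < rate  # A and B share the same true rate
            b += rng.random() < rate
            if i % check_every == 0:
                final = z_significant(a, b, i)  # last look is at n_max
                peeked = peeked or final        # a peeker stops at any hit
        fixed_hits += final
        peeking_hits += peeked
    return fixed_hits / num_experiments, peeking_hits / num_experiments

fixed_rate, peeking_rate = simulate()
print(f"fixed size: {fixed_rate:.3f}, with peeking: {peeking_rate:.3f}")
```

Because the peeking rule counts a hit at any look, including the final one, its false positive rate can never be lower than that of the fixed-size test.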
What is regression to the mean? Why is it a problem?
Following an extreme event, the next event is likely less extreme
Can cause the illusion of significance
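A small simulation can illustrate the effect. It assumes 200 hypothetical variants that all share the same true conversion rate, so any apparent winner is pure noise; the numbers are made up.

```python
import random
from statistics import mean

rng = random.Random(42)
true_rate = 0.10   # every variant has the same true conversion rate
visitors = 500
num_variants = 200

def observed_rate():
    """Observed conversion rate for one run of one variant."""
    return sum(rng.random() < true_rate for _ in range(visitors)) / visitors

first_run = [observed_rate() for _ in range(num_variants)]
second_run = [observed_rate() for _ in range(num_variants)]

# Pick the 20 variants that looked best in the first run.
top = sorted(range(num_variants), key=first_run.__getitem__, reverse=True)[:20]
top_first = mean(first_run[i] for i in top)
top_second = mean(second_run[i] for i in top)

print(f"top 20, first run:  {top_first:.3f}")
print(f"top 20, second run: {top_second:.3f}")
```

The apparent winners of the first run drift back toward the shared true rate in the second run, even though nothing about them changed.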
What are novelty effects?
The novelty of a change may temporarily alter how users in the sample behave, biasing the underlying results of the study
Review sequential hypothesis testing on video
What are multi-armed bandits used for?
Figuring out how to make a good choice now, by balancing exploration against exploitation
What is the epsilon-greedy multi-armed bandit strategy? What are some issues with it?
Choose a small probability, epsilon, and draw a random number. If the draw is over epsilon, pull the best arm so far. Otherwise pull a random arm. Then update the pulled arm's stats.
It is sensitive to variance and performs worse than other approaches
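A minimal sketch of epsilon-greedy, assuming rewards of 0 or 1 and a running mean as the "arm stats". The class name, the conversion rates, and the simulation loop are hypothetical.

```python
import random

class EpsilonGreedy:
    def __init__(self, num_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * num_arms    # pulls per arm
        self.values = [0.0] * num_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            # Explore: pull a uniformly random arm.
            return self.rng.randrange(len(self.counts))
        # Exploit: pull the arm with the best mean so far.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the running mean.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Made-up conversion rates for three variants.
true_rates = [0.05, 0.10, 0.20]
bandit = EpsilonGreedy(len(true_rates), epsilon=0.1, seed=0)
env = random.Random(1)
for _ in range(5000):
    arm = bandit.select_arm()
    reward = 1.0 if env.random() < true_rates[arm] else 0.0
    bandit.update(arm, reward)

print(bandit.counts)
print([round(v, 3) for v in bandit.values])
```

Because the exploit step trusts the running means, a lucky early streak on a weak arm can hold its attention for a while, which is one way the sensitivity to variance shows up.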
What is the Thompson Sampling multi-armed bandit strategy?
For each arm, sample a value from a distribution (commonly a Beta distribution) built from its successes and failures. Pull the arm that has the max sampled value
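A minimal sketch of Thompson Sampling for success/failure rewards. It assumes a Beta(successes + 1, failures + 1) distribution per arm, which corresponds to a uniform prior; the class name and the conversion rates are hypothetical.

```python
import random

class ThompsonSampling:
    def __init__(self, num_arms, seed=None):
        self.successes = [0] * num_arms
        self.failures = [0] * num_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw one plausible success rate per arm from its Beta distribution.
        samples = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        # Pull the arm whose sampled rate is highest.
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, success):
        if success:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

# Made-up conversion rates for two variants.
true_rates = [0.1, 0.8]
bandit = ThompsonSampling(len(true_rates), seed=0)
env = random.Random(1)
pulls = [0, 0]
for _ in range(2000):
    arm = bandit.select_arm()
    pulls[arm] += 1
    bandit.update(arm, env.random() < true_rates[arm])

print(pulls)
```

Arms with little data have wide distributions and still get sampled occasionally, so exploration fades on its own as evidence accumulates.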
Why might you prefer multi-armed bandits over A/B testing?
You can start using bandits immediately, as they don’t require a large population or much setup. They are not, however, suited to making long-term business decisions