Lecture 4 (1) Flashcards

1
Q

Main point of an experiment

A

Test some idea

Not to prove something

Most experiments “fail”

I.e., the change does not lead to an improvement

This is a good thing

Bad ideas fail quickly
Investment is typically small
As are the sample sizes

But failed experiments are not normally what one sees

In publications or when talking to a firm

2
Q

Steps to analyze experimental data

A

Build an understanding of the data structure

Compute some descriptive statistics

Visualize the data

Run (the correct) statistical test

(use test results to inform decision making)

3
Q

What descriptive statistics do you want to know

A

How many observations in total

How many observations per treatment

What is the mean/median number of sales per store in each treatment group

What is the standard deviation of the number of sales per store in each treatment group

Do observable characteristics of stores differ across treatments
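
A minimal sketch of these descriptive statistics, using only the Python standard library. The sales figures and the group names `promotion_1` and `promotion_2` are made up for illustration and are not from the lecture.

```python
from statistics import mean, median, stdev

# Hypothetical data: sales per store, keyed by treatment (promotion) group.
sales = {
    "promotion_1": [20, 22, 19, 24, 25],
    "promotion_2": [28, 30, 27, 26, 29],
}


def describe(groups):
    """Return count, mean, median and sample standard deviation per group."""
    return {
        name: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        }
        for name, values in groups.items()
    }


summary = describe(sales)
total_n = sum(stats["n"] for stats in summary.values())

print("total observations:", total_n)
for name, stats in summary.items():
    print(name, stats)
```

The same per-group summary can be run on observable store characteristics (for example store size) to check whether the treatment groups look balanced.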

4
Q

How to best visualize differences between promotions

A

Histogram/bar plot

Scatterplot

Boxplot

5
Q

What is the right statistical analysis to run?

A

Two-sample test of means
Limited to binary comparisons

Two sample test of proportions

ANOVA

Linear Regression

6
Q

Comparing means - two alternatives

A

Form null and alternative hypothesis

H0: μ1 - μ2 = 0
HA: μ1 - μ2 ≠ 0

Set a significance level α = 0.05

Test statistic (assuming unequal variances)

tstat= …
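
The card elides the formula. As an assumption, the sketch below uses the standard Welch statistic for unequal variances, t = (mean1 - mean2) / sqrt(s1²/n1 + s2²/n2), together with the Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_1, sample_2):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    v1, v2 = variance(sample_1), variance(sample_2)  # sample variances
    se = sqrt(v1 / n1 + v2 / n2)  # standard error of the difference in means
    t_stat = (mean(sample_1) - mean(sample_2)) / se
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    return t_stat, df


# Hypothetical sales per store under two promotions.
promotion_1 = [20, 22, 19, 24, 25]
promotion_2 = [28, 30, 27, 26, 29]

t_stat, df = welch_t(promotion_1, promotion_2)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

To reach a decision, |t| is compared with the critical value of the t distribution at the chosen alpha. A statistics library can supply the p-value directly, for example `scipy.stats.ttest_ind(a, b, equal_var=False)`.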

7
Q

Intermezzo: type 1 and type 2 errors

A

Type 1 error: False positive (rejecting a null hypothesis that is true)

Type 2 error: False negative (failing to reject a null hypothesis that is false)

8
Q

What if you want to compare all treatments?

A

Need ANOVA
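
A minimal sketch of the one-way ANOVA F statistic, which compares between-group variation with within-group variation. The function name `one_way_anova_f` and the three groups of sales figures are assumptions for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic and its (between, within) degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical sales per store under three promotions.
groups = [
    [20, 22, 19, 24, 25],
    [28, 30, 27, 26, 29],
    [23, 25, 24, 26, 22],
]

f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F relative to the F distribution with those degrees of freedom suggests that at least one treatment mean differs. It does not say which one.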

9
Q

ANOVA assumptions

A

(1) Independence of errors
(2) Constant variance
(3) Normality of errors

Of these, (2) is the most important

Homoskedasticity
Assuming errors are normally distributed, it is tested via Bartlett's test

10
Q

Bartlett's test

A

Assumes normality of errors

If errors are non-normal, use the Brown-Forsythe test

If non-constant variance, Cochran's test

11
Q

Bartlett's test explained

A

Null hypothesis: variances are equal across treatments
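
A sketch of how Bartlett's statistic can be computed, using the standard textbook formula: it compares the log of the pooled variance with the weighted logs of the group variances. The function name `bartlett_statistic` and the data are assumptions for illustration.

```python
from math import log
from statistics import variance


def bartlett_statistic(groups):
    """Return Bartlett's statistic and its degrees of freedom (k - 1)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances per group
    n_total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * log(pooled) - sum(
        (n - 1) * log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    return numerator / correction, k - 1


# Hypothetical sales per store under three promotions.
groups = [
    [20, 22, 19, 24, 25],
    [28, 30, 27, 26, 29],
    [23, 25, 24, 26, 22],
]

stat, df = bartlett_statistic(groups)
print(f"Bartlett statistic = {stat:.3f}, df = {df}")
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom. With 2 degrees of freedom the 5% critical value is about 5.99, so a value well below it gives no evidence against equal variances. `scipy.stats.bartlett` returns the statistic and its p-value directly.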

12
Q

Two-promotion comparison?

A

Use regression

13
Q

Why only an estimate for promotion 2 and not also for promotion 1?

How to interpret this regression?

A

When a constant is included in the regression, one category of the categorical variable must be left out (the reference category)

We have two categories since we have two treatments (promotion 1 and promotion 2)

beta0 is the average revenue for stores that were in promotion 1

beta0 + beta1 is the average revenue for stores that were in promotion 2

beta1 is the average difference in revenue between promotion 2 and promotion 1
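
A small check of this interpretation: regressing revenue on a constant and a 0/1 promotion 2 dummy reproduces the group means. The revenue numbers and the function name `ols_with_dummy` are hypothetical.

```python
from statistics import mean


def ols_with_dummy(y, d):
    """OLS of y on a constant and a 0/1 dummy d; returns (beta0, beta1)."""
    d_bar, y_bar = mean(d), mean(y)
    beta1 = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y)) / sum(
        (di - d_bar) ** 2 for di in d
    )
    beta0 = y_bar - beta1 * d_bar
    return beta0, beta1


# Hypothetical revenue per store; the dummy is 1 for stores in promotion 2.
revenue_promo_1 = [20, 22, 19, 24, 25]
revenue_promo_2 = [28, 30, 27, 26, 29]
y = revenue_promo_1 + revenue_promo_2
d = [0] * len(revenue_promo_1) + [1] * len(revenue_promo_2)

beta0, beta1 = ols_with_dummy(y, d)
print("beta0:", beta0, "mean of promotion 1:", mean(revenue_promo_1))
print("beta0 + beta1:", beta0 + beta1, "mean of promotion 2:", mean(revenue_promo_2))
```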

14
Q

Regression estimates from the analysis of experiments have causal interpretations. Why?

A

Counterfactual outcomes: compare to an alternative promotion

As-good-as-random assignment to treatments: lurking variables won't trouble us

No sample selection bias: the analyst picked the sample to match the group they care about

15
Q

Regression estimates from experiments allow us to

A

Test whether treatments have effect

Same as ANOVA or a t-test

Estimate the magnitude of the effect sizes (and standard errors)

Which our t-test and ANOVA didn't

16
Q

How to interpret a regression with log(y)

Could we also take the log of the x variable?

A

beta0 is the average log revenue of stores in promotion 1 (not very useful)

beta1 (times 100) is approximately the average percentage difference in revenue between promotion 2 and promotion 1
(exp(beta1) - 1 is the exact percentage difference)

We cannot take the log of the promotion 2 variable: it is either 0 (not in promotion 2) or 1 (in promotion 2),
and log(0) is undefined
That's OK, the interpretation is still nice
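
A sketch comparing the approximate and exact percentage readings of the promotion 2 coefficient. With a single 0/1 dummy, the OLS coefficients equal differences in group means of log revenue, so the sketch computes them that way. The revenue figures are made up.

```python
from math import exp, log
from statistics import mean

# Hypothetical revenue per store under each promotion.
revenue_promo_1 = [100.0, 110.0, 90.0, 105.0]
revenue_promo_2 = [120.0, 125.0, 115.0, 130.0]

# Coefficients of log(revenue) on a constant and a promotion 2 dummy.
beta0 = mean(log(y) for y in revenue_promo_1)
beta1 = mean(log(y) for y in revenue_promo_2) - beta0

approx_pct = 100 * beta1             # log-point approximation
exact_pct = 100 * (exp(beta1) - 1)   # exact conversion of the log difference

print(f"approximate: {approx_pct:.1f}%  exact: {exact_pct:.1f}%")
```

The two readings are close when beta1 is small and drift apart as it grows. Strictly, exp(beta1) - 1 is the proportional difference in geometric-mean revenue.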

17
Q

Three-promotion comparison with log(y)
How to interpret each coefficient from this regression?

A

beta0 is the average log revenue for stores in promotion 1
beta1 is approximately the average percentage difference in revenue for stores in promotion 2 compared to promotion 1

beta2 is approximately the average percentage difference in revenue for stores in promotion 3 compared to promotion 1

beta2 - beta1 is approximately the average percentage difference in revenue for stores in promotion 3 compared to promotion 2
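
A sketch of the three-promotion case, assuming promotion 1 is the omitted reference category, beta1 is the promotion 2 coefficient and beta2 is the promotion 3 coefficient. The promotion 3 versus promotion 2 comparison is then the difference between those two coefficients. The revenue figures are hypothetical.

```python
from math import log
from statistics import mean

# Hypothetical revenue per store, keyed by promotion number.
revenue = {
    1: [100.0, 110.0, 90.0, 105.0],
    2: [120.0, 125.0, 115.0, 130.0],
    3: [108.0, 112.0, 104.0, 116.0],
}
mean_log = {promo: mean(log(y) for y in ys) for promo, ys in revenue.items()}

# With dummies for promotions 2 and 3, OLS reproduces differences in mean logs.
beta0 = mean_log[1]
beta1 = mean_log[2] - mean_log[1]
beta2 = mean_log[3] - mean_log[1]
promo_3_vs_2 = beta2 - beta1

print(f"promotion 2 vs 1: {100 * beta1:.1f}%")
print(f"promotion 3 vs 1: {100 * beta2:.1f}%")
print(f"promotion 3 vs 2: {100 * promo_3_vs_2:.1f}%")
```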
