Communication Flashcards

Question 1

Q

How would you explain a confidence interval to an engineer with no statistics background? What does 95% confidence mean?

Answer

A

Suppose we are interested in some characteristic of a population; for example, the average height h of all adult males in the U.S. We can estimate h by drawing a random sample of adult males in the U.S. and calculating the average height H in the sample. This is called a point estimate of h. If the sample is large, H will be a good estimate of h, but by itself it does not tell you how good it is.

A 95% confidence interval is a different kind of estimate. It consists of two numbers L (lower) and U (upper), which are derived from the sample in some way without knowledge of the unknown h (or any other unknown parameters). The interval (L,U) is supposed to contain the unknown h. A procedure for finding (L,U) which does in fact contain h for 95% of the possible samples is called a 95% confidence interval. If the interval is short, it gives us a small range of “likely” values for h.
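As a rough illustration of how L and U can be derived from the sample alone, here is a minimal Python sketch. It assumes one common rule, the normal approximation (sample mean plus or minus 1.96 standard errors), which is only one of several possible procedures. The function name confidence_interval_95 and the simulated heights (mean 70 inches, standard deviation 3) are hypothetical values chosen for the example, not figures from the card.

```python
import math
import random
import statistics

def confidence_interval_95(sample):
    """Return (L, U), an approximate 95% confidence interval for the population mean.

    Assumes the normal approximation: sample mean +/- 1.96 * standard error.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    margin = 1.96 * std_err
    return mean - margin, mean + margin

# Hypothetical sample: 400 simulated heights in inches (illustrative values only).
rng = random.Random(0)
sample = [rng.gauss(70.0, 3.0) for _ in range(400)]

H = statistics.fmean(sample)          # point estimate of h
L, U = confidence_interval_95(sample) # interval estimate of h
print(f"H = {H:.2f}, interval = ({L:.2f}, {U:.2f})")
```

Note that the function never sees the true mean: both endpoints are computed from the sample only, as the definition requires.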

That is the definition. Now, a few comments. Why the strange word “confidence,” which is never used by itself in probability or statistics? Why the scare quotes around the word “likely” in the previous paragraph?

Confidence intervals are a tool of the frequentist school of statistics, which holds that we should use the concepts of probability and randomness only to describe the mechanics of certain kinds of sampling from populations, and not to describe our certainty or degree of belief. Frequentists aim to use probability in an objective way.

For a frequentist, a statement like “the probability that the average height h of all adult males in the U.S. lies between 70 and 74 inches is 95%” is meaningless: h is just a number we don’t know. It either lies in the interval (70, 74) or it doesn’t.

Confidence intervals are a trick frequentists use to make statements resembling the one above without violating their rules about how probability should be used. According to the definition given above, it is legitimate to write:
P(L≤h≤U)=95%
if (L, U) is derived according to a rule so that it does contain h for 95% of samples. This resembles a subjective statement about our certainty that h lies in the range (L, U). But it isn’t: it’s an objective statement about how often, in the long run, our random interval will contain the fixed but unknown h, according to the randomness in our sampling. The “subject” of the probability statement above looks like it is h (and that is the trick) but actually it is the interval.

The probability statement above makes sense only before we draw a sample. What happens after we draw the sample, when we find L=70 and U=74? There is a strong temptation to plug into the previous expression to get:
P(70≤h≤74)=95%
which is exactly the sentence held to be meaningless by frequentists earlier. h either lies between the other two numbers or not; there is no probability involved.

Any inference we draw about h must of course happen after we draw the sample, but frequentist rules prevent us from invoking probability at this point. So instead we refer back to the randomness which gave us the interval (70, 74). It is not that this particular interval contains h with 95% probability, but rather that an interval constructed in this way will contain h 95% of the time.
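The "95% of the time" claim can be checked by simulation. The sketch below is an illustration under stated assumptions, not a definitive procedure: it draws many samples from a made-up population whose true mean TRUE_H is known only because we invented it, builds an interval from each sample with the normal-approximation rule (mean plus or minus 1.96 standard errors), and counts how often the interval contains TRUE_H. All names and numeric values are hypothetical.

```python
import math
import random
import statistics

TRUE_H = 70.0  # hypothetical true mean; known here only because we simulate the population
SIGMA = 3.0    # hypothetical population standard deviation

def interval_from_sample(sample):
    """Approximate 95% interval: sample mean +/- 1.96 * standard error."""
    mean = statistics.fmean(sample)
    margin = 1.96 * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin

def coverage(num_samples=2000, sample_size=100, seed=1):
    """Fraction of simulated samples whose interval contains TRUE_H."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(TRUE_H, SIGMA) for _ in range(sample_size)]
        low, high = interval_from_sample(sample)
        if low <= TRUE_H <= high:
            hits += 1
    return hits / num_samples

print(f"fraction of intervals containing h: {coverage():.3f}")
```

The printed fraction should land close to 0.95. Each individual interval either contains TRUE_H or does not; the 95% describes the procedure across repeated samples.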

Everyone who uses confidence intervals, including every frequentist statistician on the planet, would actually interpret the interval (70, 74) as representing a “likely range” for h, implicitly invoking something like the illegal probability statement P(70≤h≤74)=95%. Without using some terms related to probability, it is almost impossible to explain what useful relation the interval (70, 74) bears to h.

Confidence intervals are useful mainly because we can misinterpret them as probability statements about unknown parameters.

Question 2

Q

How would you explain an A/B test to an engineer with no statistics background? A linear regression?

Answer

A

A/B testing, or more broadly, multivariate testing, is the testing of different elements of a user’s experience to determine which variation helps the business achieve its goal more effectively (e.g., increasing conversions). The elements tested can be copy on a web site, button colors, different user interfaces, different email subject lines, calls to action, offers, etc.
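One way to decide "which variation helps more" is to compare conversion rates between the two groups. The sketch below assumes a two-proportion z-test, a common choice that the card itself does not name; the function name and the visitor and conversion counts are hypothetical, made up purely for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns (rate_a, rate_b, z, p_value), where p_value is two-sided
    and based on the normal approximation with a pooled standard error.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical experiment: 5000 visitors per variant, 200 vs. 250 conversions.
rate_a, rate_b, z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
print(f"A: {rate_a:.3f}  B: {rate_b:.3f}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to sampling noise alone; whether the difference matters to the business is a separate question.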

(2 cards)