Hypothesis Testing in AI Flashcards

Question 1

Q

What are the 4 steps in hypothesis testing?

Answer

A

State the hypothesis
Design the statistical test
Calculate the test statistic
Reject or fail to reject the null hypothesis

State, Design, Calculate, Reject (or fail to reject)
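
The four steps can be sketched in code. This is a minimal illustration, not a prescribed method: it assumes a two-sided one-sample z-test with a known population standard deviation, and the function name `z_test`, the return figures, and `sigma=1.0` are all hypothetical.

```python
import math
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test laid out along the four steps."""
    # Step 1: state the hypotheses. H0: population mean == mu0; Ha: mean != mu0.
    # Step 2: design the test. Sigma is assumed known, so use a z-statistic
    #         with a two-sided rejection region at significance level alpha.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: calculate the test statistic from the observed sample.
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    # Step 4: reject or fail to reject the null hypothesis.
    reject = abs(z) > critical
    return z, critical, reject

# Hypothetical monthly excess returns in percent, made up for illustration.
returns = [1.2, -0.4, 0.9, 2.1, 0.3, 1.5, -0.2, 1.1, 0.8, 1.7]
z, critical, reject = z_test(returns, mu0=0.0, sigma=1.0)
print(f"z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

Note that step 4 never "accepts" the null hypothesis: the outcome is either a rejection or a failure to reject.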

Question 2

Q

What is a test statistic?

Answer

A

It is a function of the observed values of the random variables of interest, computed from the sample and used to decide whether to reject the null hypothesis.

Question 3

Q

What is meant by the p-value?

Answer

A

The probability in the tail of the test statistic's distribution under the null hypothesis.

The probability of observing a sample estimate at least as extreme as the one observed, assuming the null hypothesis is true.
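
As a rough sketch of the "probability in the tail" idea, the snippet below converts a z-statistic into a p-value. It assumes the test statistic is standard normal under the null hypothesis. The helper name `p_value_from_z` is hypothetical, and the one-sided value is taken in the direction of the observed statistic.

```python
from statistics import NormalDist

def p_value_from_z(z, two_sided=True):
    """Tail probability of a z-statistic, assuming N(0, 1) under H0."""
    # Area beyond |z| in one tail of the standard normal distribution.
    one_tail = 1 - NormalDist().cdf(abs(z))
    # A two-sided test counts both tails as "at least as extreme".
    return 2 * one_tail if two_sided else one_tail

p_two = p_value_from_z(1.96)
p_one = p_value_from_z(1.96, two_sided=False)
print(f"two-sided p = {p_two:.3f}, one-sided p = {p_one:.3f}")
```

A p-value below the chosen level of significance leads to rejecting the null hypothesis.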

Question 4

Q

What is meant by the level of significance?

Answer

A

The probability of making a Type I error

Question 5

Q

What is Type I error?

What is Type II error?

Answer

A

There are two types of error:

Type I: Wrongly rejecting a true null hypothesis.

Type II: Failing to reject (wrongly "accepting") an untrue null hypothesis.

The probability of a Type I error is the level of significance.
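
The link between the Type I error rate and the level of significance can be checked with a small simulation. This is a sketch under stated assumptions: data are drawn from a standard normal distribution so that the null hypothesis (mean of zero) is true by construction, a two-sided z-test with known standard deviation is used, and the function name and parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Share of two-sided z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw from N(0, 1), so H0: mean == 0 holds by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / math.sqrt(n))
        if abs(z) > critical:
            rejections += 1  # every rejection here is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Simulated Type I error rate: {rate:.3f}")
```

The simulated rate should land close to `alpha`, since every rejection of a true null hypothesis is a Type I error.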

Question 6

Q

What are the reasons for unrepresentative datasets?

Answer

A

Selection bias (leaving certain observations out);
Survivorship bias (leaving out funds no longer in existence);
Self-selection bias (fund managers choosing which funds to report, so certain funds are left out)

The 3 S’s

Question 7

Q

Data mining

Data dredging (Data snooping)

Answer

A

Data mining: Vigorously testing data until valid relationships are established.

Data dredging: Overusing statistical tests, for example running hundreds of tests to identify significant relationships without regard to economic rationale.
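
To illustrate why running hundreds of tests is a problem, the sketch below correlates many pure-noise "signals" with a noise "target" and counts how many look significant at the 5% level. Everything here is hypothetical: the data are random by construction, the names `pearson_r` and `dredge` are made up, and the critical value uses a normal approximation to the t-distribution.

```python
import math
import random
from statistics import NormalDist

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def dredge(n_signals=200, n_obs=60, alpha=0.05, seed=1):
    """Count noise signals whose correlation with a noise target looks significant."""
    rng = random.Random(seed)
    # Normal approximation to the t critical value (an assumption).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_signals):
        signal = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson_r(signal, target)
        # t-statistic for H0: true correlation is zero.
        t = r * math.sqrt((n_obs - 2) / (1 - r * r))
        if abs(t) > critical:
            hits += 1  # a "relationship" found in pure noise
    return hits

hits = dredge()
print(f"{hits} of 200 noise signals look significant at the 5% level")
```

With 200 tests at a 5% level, roughly ten spurious "relationships" are expected even though none exists, which is why economic rationale matters.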

Question 8

Q

Backtesting, overfitting and backfilling

Answer

A

- Backtesting: Applying a model to historical data to determine how well it would have explained the actual results. Sometimes good.
- Overfitting: Using so many parameters that the model fits the historical data closely but may not hold up on new data. Not good.
- Backfilling: Updating databases by inserting returns that pre-date the date of entry in the database. Backfill bias is also known as instant history bias. Generally not good.

Question 9

Q

Cherry picking and chumming

Answer

A

Cherry picking: Selectively reporting results.
Chumming: Making enough predictions that some of them will turn out correct.
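
The arithmetic behind chumming can be sketched as follows, assuming each prediction is independent and correct with the same probability. Both assumptions, and the function name, are illustrative rather than taken from the card.

```python
def prob_at_least_one_correct(p_correct, n_predictions):
    """P(at least one hit) for independent predictions, each right with p_correct."""
    # Complement of "every prediction is wrong".
    return 1 - (1 - p_correct) ** n_predictions

# Coin-flip predictions: the chance of at least one hit rises quickly with n.
for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one_correct(0.5, n), 4))
```

Even with coin-flip accuracy, the chance that at least one of many predictions is right approaches certainty as the number of predictions grows.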

Question 10

Q

Why are cumulative return charts deceptive?

How is deception avoided?

Answer

A

Because of compounding effects. The gap between two series widens over time when one has an early-period advantage.

Deception is avoided by using a cumulative log-return chart, because log returns are additive, not multiplicative.
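
A short sketch of why log returns are additive: simple returns compound through multiplication, while their logarithms sum to the log of the cumulative growth. The return figures below are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical periodic simple returns, made up for illustration.
simple_returns = [0.10, -0.05, 0.20, 0.03]

# Compounding: cumulative wealth is the product of (1 + r).
wealth = 1.0
for r in simple_returns:
    wealth *= 1 + r

# Log returns: ln(1 + r) values add up across periods.
log_returns = [math.log(1 + r) for r in simple_returns]
cumulative_log = sum(log_returns)

print(f"cumulative simple return: {wealth - 1:.4f}")
print(f"sum of log returns:       {cumulative_log:.4f}")
print(f"exp(sum) - 1:             {math.exp(cumulative_log) - 1:.4f}")
```

Because the log series is a running sum, equal percentage moves appear as equal vertical distances on the chart regardless of when they occur.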

(10 cards)