L29 - Methodological Issues Flashcards
What does Leamer believe people do to models to improve them?
Leamer calls this the typology of specification searches:
- Hypothesis testing search
- Interpretative search
- Simplification search
- Proxy search
- Data selection search
- Postdata model construction
1 to 3 can be thought of as ‘general to specific’ because they start with an unrestricted model and test restricted versions.
4 to 6 are ‘specific to general’ because they involve modifying the original model by introducing new or alternative variables.
What is Hypothesis testing search?
- starts from the unrestricted model and tests restrictions on it in order to choose between competing specifications
What is Data selection search?
- splits the data to see if the model behaves differently across subsamples
- asks whether the relationship is robust across different samples and data sets (see the sketch below)
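A minimal sketch of this idea in Python (all data simulated; variable names are illustrative only): estimate the same model on two halves of the sample and compare the slope estimates.

```python
# Data selection search sketch: fit the same model on two subsamples
# and compare the estimates. All data here are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)  # true slope is 0.5 throughout

for name, idx in [("first half", slice(0, n // 2)),
                  ("second half", slice(n // 2, n))]:
    res = sm.OLS(y[idx], sm.add_constant(x[idx])).fit()
    print(f"{name}: slope = {res.params[1]:.3f} (s.e. = {res.bse[1]:.3f})")
# If the slopes differ sharply, the relationship is not robust
# to the choice of sample.
```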
What is Proxy variable search?
- tests whether better proxies are available
- e.g. reported income might not be accurate, as some people may lie about their actual income
- attempts to see if the relationship is robust when different proxies are used
What is Post data model construction?
- modifies the original model after looking at the data, e.g. by introducing new or alternative variables suggested by the data
What is Interpretative Search?
- uses theory to test the model
What is Simplification Search?
- Testing restrictions on a model in order to reduce the number of parameters, which improves the efficiency of the remaining estimates (see the sketch below)
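A minimal sketch of a simplification search using statsmodels (simulated data; statsmodels names the regressors x1, x2, x3 by default): test the joint restriction that two coefficients are zero, and drop the variables if the F-test does not reject.

```python
# Simplification search sketch: test restrictions that reduce the number
# of parameters. Simulated data; x2 and x3 are truly irrelevant here.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
y = 2.0 + 1.5 * X[:, 0] + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(X)).fit()
# Joint F-test of the restriction beta_2 = beta_3 = 0.
print(res.f_test("x2 = 0, x3 = 0"))
# If the restriction is not rejected, re-estimating without x2 and x3
# improves the efficiency of the remaining estimates.
```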
What are the implications of the hypothesis testing, interpretative and simplification searches?
What are the three guilty secrets of econometrics?
- Economic significance and statistical significance are not the same thing.
- Data mining means that reported levels of significance are often not correct.
- Many apparently significant relationships are really spurious regressions.
How are Economic significance and Statistical significance different?
A test is statistically significant if we can reject the null hypothesis at a given level of significance.
A result is economically significant if it has an important influence on economic behaviour.
The two above definitions of significance are not the same thing.
(for example, an estimated elasticity can be economically large but statistically insignificant if it has a high standard error; the sketch below illustrates the opposite case, where a tiny effect is statistically significant)
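A minimal sketch of the distinction (simulated data; the tiny true effect and the sample size are illustrative assumptions): with a large enough sample, an economically trivial effect can still be highly statistically significant.

```python
# Statistical vs economic significance sketch: a tiny effect becomes
# statistically significant once the sample is large. Simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100_000
x = rng.normal(size=n)
y = 0.02 * x + rng.normal(size=n)  # true effect is economically trivial

res = sm.OLS(y, sm.add_constant(x)).fit()
print(f"slope = {res.params[1]:.4f}, t-ratio = {res.tvalues[1]:.1f}")
# The t-ratio is large (statistically significant) even though a 0.02
# effect may be far too small to matter for economic behaviour.
```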
What is Data mining?
- Data mining refers to the process of searching a data set for correlations between variables.
- The problem is that we cannot then use the same data set to test the significance of relationships we identify.
- For example, suppose we regress a variable of interest on 100 different explanatory variables. Even if ALL of these are unrelated to the variable of interest, we would expect to find around 5 ‘significant’ relationships at the 5% level (see the sketch below).
- If we wish to test a relationship detected through data mining then we need to do so on a new data set.
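A minimal sketch of the 100-regressor example (all data are simulated pure noise): at the 5% level, roughly 5 of the 100 unrelated regressors should appear ‘significant’ by chance alone.

```python
# Data mining sketch: regress pure noise on 100 unrelated regressors,
# one at a time, and count how many look "significant" at the 5% level.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 500
y = rng.normal(size=n)            # variable of interest: pure noise
X = rng.normal(size=(n, 100))     # 100 regressors, all unrelated to y

significant = sum(
    sm.OLS(y, sm.add_constant(X[:, j])).fit().pvalues[1] < 0.05
    for j in range(100)
)
print(f"'significant' regressors: {significant} of 100")  # expect ~5
```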
What is cherry-picking?
- one potential problem with ‘big data’ is that researchers comb through the data, running hundreds of regressions until they get the significant result they want
- e.g. reporting the best result from 10 regressions even though 8 of them show an insignificant relationship
- this is okay if you go on to test whether the relationship is robust, but by itself it is likely to lead to a poor model (see the sketch below)
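A minimal sketch of why cherry-picking misleads (simulated noise; the sample size and trial count are arbitrary choices): if you report only the best of 10 regressions, the chance of at least one ‘significant’ slope at the 5% level is roughly 1 - 0.95^10 ≈ 40%, not 5%.

```python
# Cherry-picking sketch: run 10 regressions on pure noise and keep only
# the best p-value. Count how often that "best" result looks significant.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n, trials, hits = 100, 500, 0
for _ in range(trials):
    y = rng.normal(size=n)
    X = rng.normal(size=(n, 10))  # 10 candidate regressors, all noise
    best_p = min(sm.OLS(y, sm.add_constant(X[:, j])).fit().pvalues[1]
                 for j in range(10))
    hits += best_p < 0.05          # cherry-pick the best result
print(f"'significant' best result in {hits / trials:.0%} of trials")
```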
What are spurious regressions?
- regressions that appear to show a significant relationship between variables that are in fact unrelated
- e.g. you can arrive at a spurious regression by cherry-picking the data
What is the most common reason for spurious regressions?
- the presence of unit roots in the data series (a series that contains a random walk element)
- as the data are not stationary, many standard statistical results do not apply to such series
- even though two series may be independent random walks, there can be a high level of correlation between them
- you usually spot this from a highly significant coefficient, a high R-squared and a very low Durbin-Watson (DW) statistic (see the sketch below)
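A minimal sketch of a spurious regression (simulated data): regress one independent random walk on another and look for the symptoms above - a large t-ratio, a high R-squared and a very low DW statistic.

```python
# Spurious regression sketch: two independent random walks, yet the OLS
# output shows a "significant" slope, high R-squared and very low DW.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(5)
n = 500
y = np.cumsum(rng.normal(size=n))  # random walk 1
x = np.cumsum(rng.normal(size=n))  # random walk 2, independent of y

res = sm.OLS(y, sm.add_constant(x)).fit()
print(f"t-ratio   = {res.tvalues[1]:.2f}")
print(f"R-squared = {res.rsquared:.3f}")
print(f"DW        = {durbin_watson(res.resid):.3f}")
```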
How does the Monte Carlo Simulation highlight the issue of spurious regression?
- regressing independent random walks on each other produced a significant t-ratio 67% of the time
- this is not to say you should never regress one unit-root process on another - it can be useful,
- but you need to be aware that you should not take the statistics at face value (see the sketch below)
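A minimal sketch of such a Monte Carlo exercise (simulated data; the sample length and trial count are arbitrary choices, so the exact frequency will vary around the 67% figure quoted above): repeatedly regress independent random walks on each other and record how often the t-ratio exceeds the nominal 5% critical value.

```python
# Monte Carlo sketch: how often does regressing one independent random
# walk on another give a nominally "significant" t-ratio at the 5% level?
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, trials, hits = 100, 1000, 0
for _ in range(trials):
    y = np.cumsum(rng.normal(size=n))
    x = np.cumsum(rng.normal(size=n))
    res = sm.OLS(y, sm.add_constant(x)).fit()
    hits += abs(res.tvalues[1]) > 1.96  # nominal 5% critical value
print(f"'significant' t-ratio in {hits / trials:.0%} of trials")
```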