Lecture 1 Week 1 Histogram Flashcards
What are parametric models and what are two examples of parametric models?
See slides 4 and 5.
What is the advantage and disadvantage of parametric models?
Advantage: parametric models are more efficient than non-parametric models when their underlying assumptions hold.
Disadvantage: parametric models can give misleading results when those assumptions are violated.
Non-parametric models, by contrast, impose far fewer assumptions and can recover information that a parametric model
would miss.
What is the conclusion based on the results of the parametric density and non-parametric density?
Summary of the results:
- Parametric density: The estimated log-normal distribution seems
not to change much over time.
- Non-parametric density: The mode of the density (maximum
value of the density) seems to change over time. The mode seems
to get lower.
Conclusion: The parametric density can be too restrictive because it
imposes a fixed structure on the shape of the distribution, so we see
almost no variation in the distribution over time. The
non-parametric (kernel) density is more flexible and captures some
of that variation.
What is the conclusion based on the summary of the results of the parametric regression and the non-parametric regression?
Parametric regression: The relationship between log wage and
School is forced to be linear, while the relationship between
log wage and Exp is forced to be quadratic.
Non-parametric regression: The model allows for a more general
relationship between the variables. For instance, it captures that the
relationship between log wage and School is flat for small values of
School.
In this course, we will explore several approaches for non-parametric
regression that range from kernel regression to Neural Networks.
What is the histogram, what does the histogram split and what is affected by the width of the bins?
The histogram is a non-parametric estimator of the density function
of the observations.
The histogram splits the data into intervals, called bins. Then the
frequency of the observations in each bin is represented by a vertical
bar.
The width of the bins affects the interpretation of the histogram.
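To make the effect of the bin width concrete, here is a minimal sketch of a histogram density estimate in plain Python. The function name `histogram_density`, the origin `x0`, and the toy data are illustrative assumptions, not taken from the slides.

```python
import math

def histogram_density(data, x0, h):
    """Return a function x -> histogram density estimate.

    x0 is the origin of the bin grid and h is the bin width (both assumed names).
    """
    n = len(data)
    counts = {}
    for xi in data:
        # Index of the bin [x0 + j*h, x0 + (j+1)*h) that contains xi.
        j = math.floor((xi - x0) / h)
        counts[j] = counts.get(j, 0) + 1

    def f_hat(x):
        # Relative frequency of the bin containing x, scaled by 1/h
        # so that the bars integrate to one.
        j = math.floor((x - x0) / h)
        return counts.get(j, 0) / (n * h)

    return f_hat

# Toy data, made up for illustration only.
data = [0.1, 0.4, 0.45, 0.9, 1.2, 1.3, 1.35, 2.6]
f_wide = histogram_density(data, x0=0.0, h=1.0)
f_narrow = histogram_density(data, x0=0.0, h=0.5)
print(f_wide(0.3), f_narrow(0.3))
```

The same data give different estimates at the same point once h changes, which is why the bin width affects how the histogram is read.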
How is the histogram constructed?
slide 21
What is the formal definition of a histogram?
slide 22
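The slide itself is not reproduced in these cards. For reference, the standard textbook form of the histogram estimator is sketched below. Its notation may differ from the slide's: the origin x_0, the bins B_j, and the indicator function are assumptions of this sketch.

```latex
\hat f_h(x) = \frac{1}{nh} \sum_{i=1}^{n} \sum_{j} \mathbf{1}(X_i \in B_j)\,\mathbf{1}(x \in B_j),
\qquad
B_j = \bigl[\,x_0 + (j-1)h,\; x_0 + jh\,\bigr)
```

Here n is the sample size and h is the bin width, so the estimate at x is the share of observations in the bin containing x, divided by h.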
What are the statistical properties of the histogram?
slide 26
What is the expectation and bias of the histogram and what happens to the bias as h → 0?
slide 27
What is the variance of the histogram and what is the approximate expression for the variance and when does the variance go to 0?
slide 28
What is the MSE of the histogram, what conclusions can be drawn from it, and when does the MSE go to 0?
slide 29
Explain the trade-off between bias and variance using h and the sample size n?
slide 30
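The trade-off can be checked numerically. The sketch below assumes standard normal data (an assumption made only so the bin probabilities are available in closed form) and uses the fact that the count in a bin is Binomial(n, p) to compute the exact bias and variance of the histogram at one point. The function names and the evaluation point are hypothetical.

```python
import math

def normal_cdf(z):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def histogram_bias_variance(x, h, n, x0=0.0):
    """Exact bias and variance of the histogram at x for N(0, 1) data."""
    j = math.floor((x - x0) / h)
    left = x0 + j * h
    # Probability that one observation falls into the bin containing x.
    p = normal_cdf(left + h) - normal_cdf(left)
    bias = p / h - normal_pdf(x)            # E[f_hat(x)] - f(x)
    variance = p * (1.0 - p) / (n * h * h)  # Var of Binomial(n, p) count / (n h)^2
    return bias, variance

for h in (1.0, 0.5, 0.1):
    bias, var = histogram_bias_variance(x=0.85, h=h, n=200)
    print(f"h={h}: bias={bias:+.5f}, variance={var:.5f}, mse={bias**2 + var:.5f}")
```

In this setup, shrinking h reduces the absolute bias but inflates the variance for fixed n. A larger n lowers the variance at every h.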
How to choose h?
slide 32
What is the MISE of the histogram and what is the approximate expression for the MISE (AMISE)?
How is ||f'||_2^2 interpreted?
slide 34
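The slide is not reproduced here. For reference, the standard textbook expressions are sketched below. The notation is an assumption and may differ from the slide's.

```latex
\mathrm{MISE}(\hat f_h) = \mathbb{E}\int \bigl(\hat f_h(x) - f(x)\bigr)^2\,dx,
\qquad
\mathrm{AMISE}(\hat f_h) = \frac{1}{nh} + \frac{h^2}{12}\,\|f'\|_2^2,
\qquad
\|f'\|_2^2 = \int f'(x)^2\,dx
```

The first AMISE term comes from the variance and the second from the squared bias. The quantity ||f'||_2^2 measures how steeply the density changes overall: a flat density gives a small value, a density with sharp rises and falls gives a large one.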
How can we select the optimal bandwidth?
What is the interpretation of the result?
slide 35
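As a hedged illustration, minimizing the textbook AMISE expression 1/(nh) + (h^2/12) ||f'||_2^2 over h gives h = (6 / (n ||f'||_2^2))^(1/3). Since f is unknown in practice, the sketch below plugs in the roughness of a normal density (a normal-reference assumption, which may not be the rule used on the slide). The function names and the sample are hypothetical.

```python
import math
import statistics

def optimal_bin_width(n, roughness):
    """h minimizing 1/(n h) + h^2/12 * roughness, where roughness = ||f'||_2^2."""
    return (6.0 / (n * roughness)) ** (1.0 / 3.0)

def normal_reference_bin_width(data):
    """Plug in the roughness of N(mu, sigma^2): ||f'||_2^2 = 1 / (4 sqrt(pi) sigma^3)."""
    sigma = statistics.stdev(data)
    roughness = 1.0 / (4.0 * math.sqrt(math.pi) * sigma ** 3)
    return optimal_bin_width(len(data), roughness)

# Toy sample, made up for illustration only.
sample = [2.1, 2.9, 3.4, 3.8, 4.0, 4.3, 4.7, 5.2, 5.9, 6.6]
h = normal_reference_bin_width(sample)
print(h)
```

Under the normal-reference assumption the rule reduces to roughly 3.49 times the sample standard deviation times n^(-1/3), so the bin width shrinks slowly as the sample grows and widens for smoother (less rough) densities.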