Questions #2 Flashcards
True or false : Putting too many parameters into a model results in overfitting the model
True
True or false : The best model should provide the most information
True
Can you tell me the definition of information?
The information of an outcome is defined as the decrease in uncertainty from observing the outcome
What are the 3 properties for a measure of uncertainty?
- Continuity : Should be a continuous function of the probabilities of the distribution
- Additivity : The uncertainty over independent combinations of events should be the sum of the separate uncertainties
- Monotonicity : The uncertainty should increase as the number of possible events increases
What is the unique measure of uncertainty that satisfies the 3 properties?
Information entropy
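As a minimal sketch, information entropy can be computed directly from the event probabilities (numpy, natural log; the example probabilities are made up):

```python
import numpy as np

def entropy(p):
    """Shannon information entropy H(p) = -sum_i p_i * log(p_i) (natural log)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # terms with p_i = 0 contribute 0 by convention
    return -np.sum(p * np.log(p))

# A fair coin is maximally uncertain; a biased coin less so
print(entropy([0.5, 0.5]))   # log(2) ≈ 0.693
print(entropy([0.9, 0.1]))
```

Note how uncertainty shrinks as the distribution becomes more concentrated.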
What is the definition of cross-entropy?
Cross-entropy is a measure of the uncertainty that arises from using a distribution with event probabilities q to estimate a distribution over the same events with probabilities p
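A sketch of the definition, with made-up probabilities p (true) and q (model):

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i): uncertainty from using q to predict p."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return -np.sum(p[mask] * np.log(q[mask]))

p = [0.3, 0.7]    # true event probabilities
q = [0.25, 0.75]  # model's probabilities for the same events
print(cross_entropy(p, q))  # slightly above the entropy of p
print(cross_entropy(p, p))  # equals the entropy H(p) when q = p
```

By Gibbs' inequality, H(p, q) is never below H(p), and the gap is the extra cost of using the wrong distribution.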
True or false : Cross-entropy is symmetric
False
True or false : Using a low-entropy distribution to predict a high-entropy distribution is worse than the opposite
True
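This asymmetry can be checked numerically. Below, a confident (low-entropy) distribution is used to predict a fair coin, and vice versa; the distributions are illustrative:

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return -np.sum(p[mask] * np.log(q[mask]))

high = [0.5, 0.5]    # high-entropy "true" distribution
low  = [0.99, 0.01]  # low-entropy, confident distribution

# Predicting the high-entropy target with the confident distribution...
cost_low_predicts_high = cross_entropy(high, low)
# ...costs more than the other way around
cost_high_predicts_low = cross_entropy(low, high)
print(cost_low_predicts_high, cost_high_predicts_low)
```

The confident q assigns almost no probability to one of the events, so it is punished heavily whenever that event occurs.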
True or false : The Kullback-Leibler Divergence grows as the estimate moves away from the true distribution
True
True or false : The Kullback-Leibler Divergence is symmetric
False
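Both claims can be illustrated in a few lines (illustrative probabilities):

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i): extra uncertainty from using q for p."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = [0.3, 0.7]
# The divergence is 0 at q = p and grows as q moves away from p...
for q1 in (0.3, 0.2, 0.1):
    print(q1, kl_divergence(p, [q1, 1 - q1]))
# ...and it is not symmetric
print(kl_divergence([0.3, 0.7], [0.1, 0.9]))
print(kl_divergence([0.1, 0.9], [0.3, 0.7]))
```

Note that D_KL(p, q) = H(p, q) − H(p): the divergence is the cross-entropy minus the irreducible entropy of the target.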
Define the Deviance formula
Deviance = -2 times the log-likelihood of the data under the model
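A sketch with a Gaussian likelihood and made-up data, showing that a better-fitting model has lower deviance:

```python
import numpy as np

def gaussian_loglik(y, mu, sigma):
    """Sum of Gaussian log-densities of the data y under (mu, sigma)."""
    y = np.asarray(y, float)
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (y - mu)**2 / (2 * sigma**2))

y = np.array([1.1, 0.8, 1.3, 0.9, 1.0])
dev_good = -2 * gaussian_loglik(y, mu=y.mean(), sigma=1.0)  # well-fitting mean
dev_bad  = -2 * gaussian_loglik(y, mu=3.0,      sigma=1.0)  # poorly-fitting mean
print(dev_good, dev_bad)  # the better fit has the lower deviance
```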
True or false : The lower the deviance, the better
True
Tell me the steps to calculate LPPD
- At each observed point, average the likelihood of that point over the posterior samples
- Take the log of each average
- Sum the logs over all points
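The steps above can be sketched as follows; the data and the "posterior" samples of the mean are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

y = np.array([1.1, 0.8, 1.3, 0.9, 1.0])      # observed points
mu_samples = rng.normal(1.0, 0.2, size=2000)  # hypothetical posterior samples of mu

def gaussian_pdf(y, mu, sigma=1.0):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Likelihood of each point under each posterior sample: shape (n_points, n_samples)
lik = gaussian_pdf(y[:, None], mu_samples[None, :])
avg = lik.mean(axis=1)        # step 1: average over the samples at each point
lppd = np.sum(np.log(avg))    # steps 2-3: log the averages, sum over points
print(lppd)
```

Averaging the likelihoods before taking the log (rather than averaging log-likelihoods) is what makes this the log-pointwise-predictive-density.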
True or false : Deviance is a measure of predictive accuracy, not of truth
True
True or false : When doing cross-validation, deviance or lppd must be computed on the test data to truly measure predictive accuracy
True, because deviance on the training data keeps decreasing as we add parameters, even when they are not relevant
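This can be sketched by fitting increasingly flexible polynomials to simulated data (the true relationship here is linear, so the extra terms are irrelevant by construction):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = np.linspace(-1, 1, n)
f = lambda x: 0.5 * x                    # true (linear) relationship
y_train = f(x) + rng.normal(0, 0.5, n)   # training sample
y_test  = f(x) + rng.normal(0, 0.5, n)   # independent test sample

def deviance(y, mu, sigma=0.5):
    """-2 * Gaussian log-likelihood with fixed sigma."""
    return -2 * np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (y - mu)**2 / (2 * sigma**2))

for degree in (1, 3, 5, 7):
    coef = np.polyfit(x, y_train, degree)  # fit on training data only
    mu = np.polyval(coef, x)
    print(degree, deviance(y_train, mu), deviance(y_test, mu))
# Training deviance can only drop as parameters are added;
# test deviance need not reward the extra, irrelevant parameters.
```

Because the models are nested, the training deviance is guaranteed to be non-increasing in the polynomial degree, which is exactly why it cannot be trusted as a measure of predictive accuracy.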