ch. 1 Flashcards
define: training set
data used to tune parameters
define: test set
data used to test accuracy
define: generalization
correctly predict outcome of new input data, according to the expected outcome
define: preprocessing/feature extraction
processing applied to input data to facilitate learning by reducing variability
define: classification
assigning inputs to a finite number of discrete categories
define: regression
assign input to one or more continuous variable(s)
define: supervised learning
training data is comprised of inputs with corresponding outputs
define: clustering
(in unsupervised learning) goal of distinguishing groups with similar properties
define: density estimation
(unsupervised learning) goal of finding distribution of data in input space
define: reinforcement learning
maximize reward
subject: “Here the learning algorithm is not given examples of optimal outputs, but must instead discover them by a process of trial and error. Typically there is a sequence of states and actions in which the learning algorithm is interacting with its environment. In many cases, the current action not only affects the immediate reward but also has an impact on the reward at all subsequent time steps.”
reinforcement learning
vocable: reinforcement learning problem in which reward is attributed to all steps in successful outcome
credit assignment problem
(reinforcement learning) define: exploration
system tries new actions to see their efficacy
(reinforcement learning) define: exploitation
system uses action that it knows yield high reward
vocable: system accurately predicts training data, but not test data
over-fitting
what happens to severity of over-fitting as the size of the data set increases
decreases
“By adopting a ___ approach, the over-fitting problem can be avoided. We shall see that there is no difficulty from a ___ perspective in employing models for which the number of parameters greatly exceeds the number of data points.”
Bayesian
“in a Bayesian model the effective number of parameters adapts automatically to the ___”
size of the data set
what is the goal of regularization
minimize over-fitting (in a non-Bayesian model)
what is the goal of regularization
minimize over-fitting (in a non-Bayesian model)
how does regularization work
adds penalty term to error function to discourage coefficients from reaching extremely large values
define: shrinkage
method that reduces the value of coefficients
shrinkage in context of neural-networks
weight decay