Exercise 2 - ML part 2 Flashcards
What does the model M do?
encodes, stores, and retrieves the outcomes of a learning process
What does H do?
The hypothesis space determines which aspects of the data are captured and how they are represented
What is it called when you combine hypotheses?
Ensemble method
What is boosting?
computes a strong learner by incrementally constructing an ensemble of hypotheses
What does the law of parsimony, stated in the principle of Occam’s razor, say?
Of two competing theories, the simpler explanation is to be preferred
What does it mean to discriminate data?
to compute a prediction y for an input x
What are discriminative models based on?
posterior probabilities P(y|x)
What are generative models based on?
the class-conditional likelihoods P(x|y) together with the class priors P(y)
- -> How likely is it to observe data of a certain label?
- -> The posterior P(y|x) can be computed via Bayes’ theorem
- -> Generative models are compact representations that have considerably fewer parameters
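A minimal sketch (not from the cards) of how a generative model's quantities yield a posterior via Bayes' theorem; the class names and probability values are made-up illustrative numbers:

```python
def posterior(likelihoods, priors):
    """P(y|x) = P(x|y) * P(y) / sum over y' of P(x|y') * P(y')."""
    evidence = sum(likelihoods[y] * priors[y] for y in priors)
    return {y: likelihoods[y] * priors[y] / evidence for y in priors}

# Hypothetical two-class example: P(x|y) for one fixed observation x.
likelihoods = {"spam": 0.8, "ham": 0.2}   # class-conditional likelihoods P(x|y)
priors = {"spam": 0.3, "ham": 0.7}        # class priors P(y)

post = posterior(likelihoods, priors)     # posterior P(y|x), sums to 1
```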
How can overfitting be detected?
By applying h to unseen data samples
Avoidance of overfitting through
- regularization
- more training data (increasing the size and diversity of the dataset)
- dataset augmentation (e.g. adding noise)
value domain of learning methods
- discrete (classification)
- continuous (regression)
different models
deterministic - stochastic
parametric - nonparametric
generative - discriminative
nonparametric
An easy-to-understand nonparametric model is the k-nearest neighbors algorithm, which makes predictions for a new data instance based on the k most similar training patterns. The method assumes nothing about the form of the mapping function other than that patterns that are close are likely to have a similar output variable.
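The k-nearest neighbors idea can be sketched in a few lines; the toy 2-D points and labels below are invented for illustration:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training
    patterns; `train` is a list of (features, label) pairs and the
    distance is squared Euclidean."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up 2-D data: two well-separated clusters.
train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
knn_predict(train, (0.5, 0.5))  # query near cluster A
```

Note that no parameters are fitted: the training patterns themselves are the model, which is what makes the method nonparametric.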
parametric
e.g. a linear function
- you make assumptions about the form of the mapping function
Types of reasoning
- inductive
- deductive
- transductive
perceptron
Linear classifier that is based on a single neuron with a hard threshold function. It outputs +1 if the weighted input sum is >= 0 and -1 otherwise.
- the perceptron criterion punishes only incorrectly classified samples
Can the perceptron learning rule find a solution?
If a solution exists, i.e. if the data set is linearly separable, then the perceptron learning algorithm finds a solution within a finite number of steps
- the solution depends on the initialization of the parameters and the order of presentation of the training samples
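A minimal sketch of the perceptron learning rule on a linearly separable toy set (logical AND with -1/+1 labels, chosen here for illustration):

```python
def perceptron_train(samples, epochs=100):
    """Perceptron learning rule: for each misclassified (x, y) with
    y in {+1, -1}, update w <- w + y * x. A constant bias input of 1
    is appended to every feature vector."""
    w = [0.0] * (len(samples[0][0]) + 1)
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            xb = list(x) + [1.0]
            activation = sum(wi * xi for wi, xi in zip(w, xb))
            pred = 1 if activation >= 0 else -1
            if pred != y:
                w = [wi + y * xi for wi, xi in zip(w, xb)]
                errors += 1
        if errors == 0:  # converged: every sample classified correctly
            break
    return w

def perceptron_predict(w, x):
    xb = list(x) + [1.0]
    return 1 if sum(wi * xi for wi, xi in zip(w, xb)) >= 0 else -1

# Linearly separable toy data: logical AND with -1/+1 labels.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w = perceptron_train(data)
```

Because the data is separable, the loop terminates after finitely many updates; the final weights depend on the sample order, matching the card above.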
What led to the AI winter?
- The perceptron learning rule only converges for linearly separable data sets. It therefore cannot classify the XOR dataset correctly. This led to the abandonment of connectionism for almost two decades.
Examples of basis functions
- linear functions
- sigmoid functions etc.
What is interpolation
a function that passes exactly through every training data instance
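Interpolation can be illustrated with the Lagrange form of the unique polynomial through a set of points; the training points below are made up:

```python
def lagrange_interpolate(points, x):
    """Evaluate the unique polynomial that passes exactly through all
    `points` (Lagrange form) at position x: the interpolant reproduces
    every training instance with zero error."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Made-up training data; evaluating at any xi returns the stored yi.
pts = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
```

Exactly this zero-training-error behavior is why pure interpolation tends to overfit noisy data.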
Is an analog neuron model a generalized linear model?
yes
Issues of Least Squares Linear Classification
sensitive to outliers
What solves the problem of Least Squares Linear Classification
Support Vector Machines (SVMs), which minimize the generalization error by maximizing the margin.