Machine Learning Flashcards
- What is (machine) learning?
- What different types of datasets exist?
- Which learning paradigms result from these types?
- How does the Perceptron learning rule work?
- What is a common method for supervised learning?
What is learning?
Learning is the acquisition of new information or knowledge or the process of doing so by trial and error.
What is Machine Learning?
Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.
What are the four components of machine learning?
- Dataset S: set of samples generated by some process; either labelled or unlabelled
- Model M: a representation of input/output relationships intended to model the process that generates S
- Objective Function L: a function that encodes the current performance of M (e.g. loss or reward)
- Algorithm A: the learning algorithm that adjusts M based on S and L
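The four components can be sketched with a toy example. This is a minimal illustration, not a library API: a 1-D linear model trained by gradient descent, with S, model, loss, and train standing in for dataset, model M, objective L, and algorithm A.

```python
# Dataset S: labelled samples (x, y) generated by a known toy process y = 2x
S = [(float(x), 2.0 * x) for x in range(10)]

# Model M: a single-parameter linear model y = w * x
def model(w, x):
    return w * x

# Objective function L: mean squared error of M over S
def loss(w):
    return sum((model(w, x) - y) ** 2 for x, y in S) / len(S)

# Algorithm A: plain gradient descent on the parameter w
def train(w=0.0, lr=0.01, steps=100):
    for _ in range(steps):
        grad = sum(2 * (model(w, x) - y) * x for x, y in S) / len(S)
        w -= lr * grad
    return w

w_star = train()   # converges towards the generating parameter w = 2
```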
What is the cognitive function of Autonomy within a cognitive system?
The ability to dynamically adapt to changes in the environment (e.g. continuous online learning from a live data stream)
What is the cognitive function of Perception within a cognitive system?
The ability to learn how to detect and categorize perceptual stimuli (e.g. unsupervised learning of visual features)
Why are conventional programming methods not possible in complex dynamic environments?
- The environment is changing continuously
- Dynamics and objects are too complex to be modelled explicitly
- The system itself is subject to change
What is a feature?
A feature is an individual measurable property or characteristic of a phenomenon being observed.
They abstract from raw data and represent semantic information in the data.
What is feature engineering?
Feature engineering is the process of using domain knowledge to extract features from raw data.
Selected features are grouped into a feature vector.
Designing features is one of the most challenging tasks in machine learning. Manual feature engineering can become extremely complex for real-world datasets.
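A hypothetical sketch of manual feature engineering: mapping raw text to a fixed-length feature vector. The chosen features (length, digit ratio, uppercase ratio) are illustrative, not a standard recipe.

```python
def feature_vector(text):
    """Map a raw string to a fixed-length feature vector."""
    n = max(len(text), 1)           # guard against empty strings
    return [
        len(text),                               # raw length
        sum(c.isdigit() for c in text) / n,      # fraction of digits
        sum(c.isupper() for c in text) / n,      # fraction of uppercase letters
    ]

fv = feature_vector("Call 911 NOW")
```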
What is the definition of the Machine Learning Task?
Train a model M in a hypothesis space H using a learning algorithm A so that M minimizes loss L for dataset S
Why can many real-world tasks not be modelled as pairs of input and desired outputs?
Many real-world tasks have complex dynamics and unknown goal representations.
What is the goal of Unsupervised Learning?
To discover structural features in the data set, given solely unlabelled data. Often applied as a pre-processing tool for initial data analysis
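A minimal example of discovering structure in unlabelled data: 1-D k-means with k=2. The data and parameters are illustrative; no labels are used.

```python
import random

data = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]   # two obvious clusters, no labels

def kmeans_1d(points, k=2, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # random initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                     # assignment step: nearest center
            clusters[min(range(k), key=lambda i: abs(p - centers[i]))].append(p)
        centers = [sum(c) / len(c) if c else centers[i]   # update step: means
                   for i, c in enumerate(clusters)]
    return sorted(centers)

centers = kmeans_1d(data)   # recovers the two cluster means
```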
What is semi-supervised learning?
Semi-supervised learning combines a small amount of labelled data with a large amount of unlabelled data during training. A priori assumptions on the input data are required.
What is a Model?
The model M of a machine learning system encapsulates the outcome of the learning process. The learning algorithm A’s goal is to find an optimal model M* = argmin_{M ∈ H} L(M).
It is often not possible to find a global minimum; therefore local optima have to suffice.
What is the hypothesis space H?
All possible models.
E.g. decision trees, polynomials, neural networks, …
What are ensemble methods?
Ensemble methods combine multiple learning algorithms to obtain better predictive performance.
What is Boosting?
Boosting is an ensemble method in supervised learning. By combining several weak learners, it is possible to build a strong learner.
Boosting algorithms compute a strong learner incrementally by constructing an ensemble of hypotheses and increasing the weights of misclassified samples. The training of new hypotheses focuses on samples with large weights.
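The weight-update idea can be sketched with a simplified AdaBoost on 1-D data. The dataset and the threshold-stump weak learner are illustrative; this is a sketch of the scheme, not a production implementation.

```python
import math

X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [+1, +1, +1, -1, -1, -1]

def best_stump(weights):
    """Weak learner: threshold/sign pair with lowest weighted error."""
    best = None
    for t in X:
        for sign in (+1, -1):
            preds = [sign if x < t + 0.5 else -sign for x in X]
            err = sum(w for w, p, yi in zip(weights, preds, y) if p != yi)
            if best is None or err < best[0]:
                best = (err, preds)
    return best

def adaboost(rounds=5):
    n = len(X)
    weights = [1.0 / n] * n        # start with uniform sample weights
    ensemble = [0.0] * n           # weighted sum of stump votes per sample
    for _ in range(rounds):
        err, preds = best_stump(weights)
        err = max(err, 1e-10)                      # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # this stump's vote weight
        ensemble = [f + alpha * p for f, p in zip(ensemble, preds)]
        # increase weights of misclassified samples, then renormalize
        weights = [w * math.exp(-alpha * p * yi)
                   for w, p, yi in zip(weights, preds, y)]
        s = sum(weights)
        weights = [w / s for w in weights]
    return [1 if f > 0 else -1 for f in ensemble]

preds = adaboost()   # the ensemble classifies the toy data correctly
```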
What is a weak learner?
A classifier that is just slightly better than randomly guessing.
How can the performance of a hypothesis be judged?
By how well it predicts seen data from the training set (“fit”) and unseen data (generalization).
What is Occam’s Razor?
It is desirable to choose a model that is as simple as possible (fewer parameters). “Of two competing theories, the simpler explanation of an entity is to be preferred.”
What are generative and discriminative models?
- Discriminative models learn the boundary between classes so they can discriminate between different kinds of data instances.
- Based on the posterior probabilities P(y|x)
- Generative models learn the distribution of data so they can generate new data instances.
- Based on the class-conditional distributions P(x|y) together with the class priors P(y); predictions can be computed by applying Bayes’ theorem
- Compact representations of the training dataset that have considerably fewer parameters than the dataset S
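A tiny generative classifier as an illustration: model P(x|y) as a 1-D Gaussian per class and classify via Bayes’ rule. The data is made up, and equal class priors are assumed so the argmax over P(x|y) suffices.

```python
import math

class_data = {+1: [1.0, 1.2, 0.8], -1: [5.0, 5.3, 4.7]}   # toy labelled data

def fit_gaussian(xs):
    """Estimate mean and variance of P(x|y) for one class."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs) + 1e-6  # avoid zero variance
    return mu, var

params = {y: fit_gaussian(xs) for y, xs in class_data.items()}

def log_likelihood(x, mu, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def predict(x):
    # equal priors P(y) assumed, so the argmax over P(x|y) decides
    return max(params, key=lambda y: log_likelihood(x, *params[y]))
```

Note the compactness: the whole training set is summarized by two parameters (mean, variance) per class.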
How can overfitting be detected?
By training on a training set while continuously assessing the performance and adjusting parameters based on a validation set. Finally, a test set is used to assess the performance of the final model.
What is cross-validation?
The dataset is partitioned into k subsets and learned on in k iterations, with a different subset chosen each time. The overall performance corresponds to the averaged performance of the k iterations.
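A minimal k-fold cross-validation sketch. The “model” is deliberately trivial (predict the mean of the training folds) so the partition-and-average mechanics stand out; data and k are illustrative.

```python
def k_fold_scores(data, k=3):
    """Return one squared-error score per held-out fold."""
    folds = [data[i::k] for i in range(k)]       # simple k-way partition
    scores = []
    for i in range(k):
        test = folds[i]                          # held-out subset
        train = [v for j, f in enumerate(folds) if j != i for v in f]
        pred = sum(train) / len(train)           # "train" the mean model
        scores.append(sum((v - pred) ** 2 for v in test) / len(test))
    return scores

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = k_fold_scores(data)
avg = sum(scores) / len(scores)   # overall performance = average over k folds
```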
How can overfitting be avoided?
- Regularization: include a regularization term in the objective function L that punishes complexity
- More training data: increase the size and diversity of the dataset
- Dataset augmentation: transform training samples (add noise, shifts, rotations, etc.)
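The regularization point can be sketched directly: the objective adds a penalty on model complexity to the data-fit term. Here the penalty is the squared weight magnitude (L2), and the strength lam is an illustrative choice.

```python
def mse(w, samples):
    """Data-fit term: mean squared error of the linear model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

def regularized_loss(w, samples, lam=0.1):
    # objective L = data fit + lam * complexity penalty (L2 on the weight)
    return mse(w, samples) + lam * w ** 2

samples = [(1.0, 2.0), (2.0, 4.0)]
# w = 2 fits the data perfectly; the penalty biases the optimum towards
# smaller weights, trading a little fit for lower complexity.
```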