L10 - Basics of Machine Learning Flashcards
Machine Learning and Artificial Intelligence
Machine learning means using learning algorithms on (big) data to make accurate predictions and detect previously unknown patterns.
Machine Learning, Deep Learning and AI all have different meanings. Not every system categorized as an intelligent system uses machine learning.
Supervised Learning: Regression
The goal is to make quantitative (real-valued) predictions on the basis of a (vector of) features or attributes. It models the statistical relationship between the independent variables (input) and the dependent variable (output); on its own, such a relationship does not prove causation.
Example Bundesliga: Estimate the position of a team based on the scored goals. Findings Slide 14
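A minimal sketch of a regression in the spirit of the Bundesliga example, assuming a simple straight-line (least squares) model. The function name fit_line and all numbers are hypothetical illustrations, not the findings from the slide.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical numbers, not real Bundesliga results.
goals = [30, 40, 50, 60, 70]      # feature: goals scored
position = [17, 13, 9, 5, 1]      # target: final table position

slope, intercept = fit_line(goals, position)
predicted_position = slope * 55 + intercept  # estimate for a team with 55 goals
print(slope, intercept, predicted_position)
```

In this made-up data, more goals go along with a better (lower) position, so the fitted slope is negative.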
Supervised Learning: Classification
The goal is to use training data to build a classification model that predicts the correct category (label) for previously unknown data with a high accuracy (Framework Slide 16).
Supervised Learning: Decision Trees
The method is to create a model that predicts the value of an output target variable at the leaf nodes of the tree, based on several input variables at the root and interior nodes of that tree.
How to build a decision tree?
• Trees are built from the root to the leaves. In each iteration, one further attribute is selected: the decision tree learning algorithm picks the attribute that provides the most information gain (IG)
• IG is a measure of how well an attribute splits the remaining data into disjoint groups
• The algorithm prefers an attribute with a higher IG over attributes with lower IGs
• The IG must be computed for each attribute (not relevant for exam)
Example Tennis Friends
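The attribute selection described above can be sketched as follows. This assumes the common entropy-based definition of information gain (as used in ID3), which the notes do not spell out. The tiny dataset and the attribute names are hypothetical and are not the Tennis Friends data from the lecture.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target before the split minus the weighted entropy after it."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data.
rows = [
    {"weather": "sunny", "wind": "weak", "play": "yes"},
    {"weather": "sunny", "wind": "strong", "play": "yes"},
    {"weather": "rainy", "wind": "weak", "play": "no"},
    {"weather": "rainy", "wind": "strong", "play": "no"},
]

gains = {a: information_gain(rows, a, "play") for a in ("weather", "wind")}
best = max(gains, key=gains.get)  # attribute the algorithm would pick next
print(gains, best)
```

Here "weather" separates the two classes perfectly, while "wind" leaves both groups as mixed as before, so the algorithm would choose "weather" for this node.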
Supervised Learning: k-Nearest-Neighbor
The goal is to use labeled data to predict the class of a previously unknown instance based on its similarity to other data points.
The idea is to identify the closest neighbor(s) and assign the new instance the same class. If the closest neighbors have different classes, perform a majority vote.
This approach requires no training phase, and no explicit model is built.
Example Slide 25
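A minimal sketch of k-nearest-neighbor classification, assuming Euclidean distance as the similarity measure (the notes do not fix one). The function name knn_predict and the data points are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; returns the majority label of the k closest points."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled data.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"),
    ((7.0, 5.5), "B"),
]

near_a = knn_predict(train, (2.0, 2.0), k=3)  # all three neighbors are "A"
near_b = knn_predict(train, (6.0, 5.0), k=3)  # neighbors are B, B, A: majority vote
print(near_a, near_b)
```

Note that the labeled data is only stored; all the work happens at prediction time.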
Training and Test Data
In supervised learning, we have labeled data that we can use to train our model. However, we also need data to test the trained model, and we must never train on test data. Split the data into a training set and a test set, e.g. use 900 data points to train and 100 to test. This is a tradeoff: data held back for testing cannot be used for training, while a small test set gives a less reliable evaluation.
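A minimal sketch of such a split, assuming the records can simply be shuffled and cut. The function name train_test_split, the seed and the placeholder records are hypothetical.

```python
import random


def train_test_split(records, test_fraction=0.1, seed=0):
    """Shuffle a copy of the records and hold back a fraction of them for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(1000))  # stand-in for 1000 labeled data points
train, test = train_test_split(records)
print(len(train), len(test))
```

Shuffling before cutting avoids a test set that only contains, say, the most recent records.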
The Results of Classification: The Confusion Matrix
Optical Character Recognition (OCR) is the conversion of handwritten or printed text into digital machine-readable text. An OCR algorithm needs to detect a character, analyze the glyph features and predict the character.
The relation between the predicted outcome and the actual outcome is visualized in a Confusion Matrix (Example Slide 28)
True vs False and Positive vs. Negative
From the confusion matrix of two or more classes, we can derive the confusion matrix for each individual class.
Definitions:
• Positive: The instance is predicted to belong to the class in question
• Negative: The instance is predicted not to belong to the class in question
• True: Prediction is correct
• False: Prediction is incorrect
So, the values in a Confusion Matrix can be TP, FP, TN and FN.
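Deriving the four values for one class from a multi-class confusion matrix can be sketched as follows. The layout (rows are actual classes, columns are predicted classes) is an assumption, and the labels and counts are hypothetical, not the OCR example from the slide.

```python
def per_class_counts(matrix, labels, target):
    """matrix[i][j] = number of instances of actual class labels[i] predicted as labels[j]."""
    t = labels.index(target)
    total = sum(sum(row) for row in matrix)
    tp = matrix[t][t]                          # actual target, predicted target
    fn = sum(matrix[t]) - tp                   # actual target, predicted something else
    fp = sum(row[t] for row in matrix) - tp    # actual something else, predicted target
    tn = total - tp - fn - fp                  # everything that does not involve target
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical counts for three characters.
labels = ["a", "b", "c"]
matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 3, 7],
]

counts_a = per_class_counts(matrix, labels, "a")
print(counts_a)
```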
Evaluating Classification Models: Accuracy
Accuracy is one metric for evaluating classification models. Informally, accuracy is the fraction of predictions our model got right. Formally: (TP + TN) / (TP + TN + FP + FN)
Evaluating Classification Models: Precision and Recall
Precision: What proportion of positive identifications was actually correct?
TP / (TP + FP)
Recall: What proportion of actual positives was identified correctly?
TP / (TP + FN)
Accuracy vs. Precision vs. Recall
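High accuracy is good first evidence but is not sufficient on its own: with imbalanced classes, a model can be highly accurate while missing most positives. High precision is important when false alarms are costly. High recall is important when it's vital to detect every single positive instance.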
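A short sketch comparing the three metrics on one set of counts. The counts are hypothetical and chosen to be imbalanced (1000 instances, only 10 actual positives).

```python
def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    return tp / (tp + fp)


def recall(tp, fn):
    return tp / (tp + fn)


# Hypothetical counts: the model finds only 2 of the 10 actual positives.
tp, fp, tn, fn = 2, 1, 989, 8

acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
rec = recall(tp, fn)
print(acc, prec, rec)
```

Accuracy looks excellent here because the many true negatives dominate, while recall shows that most positives were missed.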
Unsupervised Learning: Association Analysis
The goal is to find frequent patterns, associations or causal structures that exist in collections of objects.
The idea is that algorithms find rules in an unlabeled data set. (Slide 41 Example)
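One common way to score a candidate rule is by its support and confidence. These two measures are not named in the notes, so treat them as an assumption. The transactions below are hypothetical shopping baskets, not the slide example.

```python
def support(transactions, items):
    """Fraction of transactions that contain all of the given items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical unlabeled data.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "butter"})
rule_confidence = confidence(transactions, {"bread"}, {"butter"})  # rule: bread -> butter
print(rule_support, rule_confidence)
```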
Unsupervised Learning: Principal Component Analysis
The goal is to identify patterns in high dimensional data and reduce the number of dimensions without much loss of information.
The idea is that multiple features in one dataset are correlated and have a similar impact on the variance of the data. (Slide 42 Example)
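A minimal sketch for the two-feature case, assuming PCA is done via the eigenvalues of the 2x2 covariance matrix. The function name and the data points are hypothetical. It returns the share of the total variance captured by the first principal component.

```python
from math import sqrt


def first_component_variance_share(points):
    """Share of total variance explained by the first principal component of 2D data."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    c = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Larger eigenvalue of a symmetric 2x2 matrix in closed form.
    largest = (a + c) / 2 + sqrt(((a - c) / 2) ** 2 + b ** 2)
    return largest / (a + c)  # a + c is the total variance


# Hypothetical data: the second feature is roughly twice the first.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

share = first_component_variance_share(points)
print(share)
```

Because the two features are strongly correlated, one component carries almost all of the variance, so the second dimension could be dropped with little loss of information.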
Unsupervised Learning: Clustering
The goal is to divide data into meaningful or useful groups (clusters)
The method is to divide the data points into a number of groups such that data points in the same group are more similar to each other than to data points in other groups. (Slide 43 Example)
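One possible clustering method is k-means, sketched below. The notes do not name a specific algorithm, so this choice is an assumption, as are the data points and the fixed starting centers.

```python
from math import dist


def k_means(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and recomputing the centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # New center = mean of the assigned points; an empty cluster keeps its old center.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

centers, clusters = k_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, clusters)
```

The number of groups (here 2) has to be chosen up front in this method.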
Reinforcement Learning: Games
The goal is to learn a behavior without the need for labeled data.
The method is that the algorithm learns only from the feedback it receives as a result of its actions. There are no correct input/output pairs presented to the machine. Instead, good outcomes are rewarded, and bad ones are punished. (Example Slide 45)
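Learning from rewards alone can be sketched with a very small example: an agent that repeatedly picks one of three actions and only sees the reward for the action it picked. This is a simplified, hypothetical setup (an epsilon-greedy action-value learner), not the game from the slide; the function name, rewards and parameters are assumptions.

```python
import random


def learn_by_feedback(rewards, steps=200, epsilon=0.1, step_size=0.5, seed=0):
    """Estimate the value of each action purely from the rewards received."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))  # explore: try a random action
        else:
            action = max(range(len(rewards)), key=lambda a: values[a])  # exploit best estimate
        reward = rewards[action]  # feedback: +1 is a reward, -1 is a punishment
        values[action] += step_size * (reward - values[action])
    return values


# Hypothetical environment: only the second action leads to a good outcome.
rewards = [-1.0, 1.0, -1.0]

values = learn_by_feedback(rewards)
best_action = max(range(len(values)), key=lambda a: values[a])
print(values, best_action)
```

The agent is never told which action is correct; it ends up preferring the rewarded one because punished actions receive lower value estimates.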