Algos Problem 6: Machine Learning - Classification Trees, Neural Networks Flashcards
a) What is meant by supervised and unsupervised learning? (2 points)
Supervised: class labels are known (observed).
We have objects from several classes and want to learn to distinguish between them.
Unsupervised/clustering: class labels are unknown (hidden).
The goal is to determine meaningful groupings of the samples.
Name two supervised learning methods and describe each briefly (1-2 sentences each) (4 points)
Neural networks (artificial neural networks, ANN)
Artificial neurons are algorithms that imitate the nerve cells in the brain.
By connecting many such neurons in complex ways, it becomes possible to solve tasks from statistics, computer science, and economics.
During training, the neural network keeps learning from data and can improve its own performance.
k-means clustering (note: k-means is an unsupervised method, so it does not answer this question; a supervised example from these cards is the classification tree)
k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster center or cluster centroid), which serves as a prototype of the cluster.
This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances).
Hierarchical clustering (also unsupervised)
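The k-means description above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the fixed random seed, and the six toy points are assumptions made for this example. Note that the algorithm never looks at class labels, which is why it counts as unsupervised.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Minimal k-means sketch: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point goes to the centroid with the
        # smallest squared Euclidean distance.
        new_assignment = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster: converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignment

# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other; which cluster gets index 0 depends on the random initialization.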
Classification tree - entropy?
Entropy - quantifies the amount of uncertainty associated with a specific probability distribution
Classification tree - Information Gain?
Information gain - the expected reduction of the entropy of the instance set S due to sorting on the attribute A
Classification tree - Formula for Shannon Entropy?
H(X) = - Sum from i = 1 to n of [ p(x_i) * log_2(p(x_i)) ]
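As a quick check of the formula, here is a small sketch that estimates p(x_i) from label frequencies and evaluates the sum. The function name `shannon_entropy` and the example labels are assumptions for illustration only.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """H(X) in bits, with p(x_i) estimated as the relative frequency of each label."""
    total = len(labels)
    counts = Counter(labels)
    # Labels that never occur are absent from `counts`, so log2(0) never arises.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

balanced = shannon_entropy(["yes", "no", "yes", "no"])   # two equally likely classes
pure = shannon_entropy(["yes", "yes", "yes", "yes"])     # only one class present
```

A 50/50 split of two classes gives 1 bit (maximum uncertainty for two classes), while a pure set gives 0 bits.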
Classification tree - Formula for conditional Entropy?
H(Y|X = x_i) = - Sum from j = 1 to m of [ p(y_j|x_i) * log_2(p(y_j|x_i)) ]
H(Y|X) = Sum from i = 1 to n of [ p(x_i) * H(Y|X = x_i) ]
Classification tree - Information Gain - definition?
IG(Y|X) = H(Y) - H(Y|X)
Classification tree - How to choose a characteristic as the parent node / next node?
Choose the one with the highest information gain.
IG(Y|X) = H(Y) - H(Y|X)
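The selection rule can be sketched as follows, assuming the training set is a list of dictionaries. The helper names (`entropy`, `conditional_entropy`, `information_gain`, `best_attribute`) and the tiny "play" data set are hypothetical and chosen only to make the numbers easy to verify by hand.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy H(Y) in bits, from label frequencies."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def conditional_entropy(rows, attribute, target):
    """H(Y|X): entropy of the target within each attribute value, weighted by p(x_i)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    total = len(rows)
    return sum(len(g) / total * entropy(g) for g in groups.values())

def information_gain(rows, attribute, target):
    """IG(Y|X) = H(Y) - H(Y|X)."""
    return entropy([r[target] for r in rows]) - conditional_entropy(rows, attribute, target)

def best_attribute(rows, attributes, target):
    """Pick the attribute with the highest information gain as the next node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical toy data: "weather" determines the label, "windy" tells us nothing.
rows = [
    {"weather": "sunny", "windy": "no",  "play": "yes"},
    {"weather": "sunny", "windy": "yes", "play": "yes"},
    {"weather": "rainy", "windy": "no",  "play": "no"},
    {"weather": "rainy", "windy": "yes", "play": "no"},
]
root = best_attribute(rows, ["weather", "windy"], "play")
```

Here splitting on "weather" removes all uncertainty (gain of 1 bit), while splitting on "windy" leaves the 50/50 label mix unchanged (gain of 0 bits), so "weather" becomes the root.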
Neural networks
a) How are perceptrons connected to one another in a multilayer perceptron? (2 points)
b) What does the use of multiple layers make possible? (2 points)
a)
Each node is connected to every node in the adjacent layers (fully connected, feed-forward).
b)
Multiple layers can still be trained with an effective learning algorithm (backpropagation)
-> they can define non-linear classification functions
Regression: 2-layer networks (one hidden layer) can approximate any continuous function arbitrarily well, given enough hidden units
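XOR is a standard example of a non-linear classification function: no single perceptron can separate its classes with one line, but two layers can. The sketch below uses hand-picked weights (an assumption for illustration, not learned values) and a step activation.

```python
def step(a):
    """Threshold activation: fire (1) if the input is positive, else 0."""
    return 1 if a > 0 else 0

def neuron(inputs, weights, bias):
    """One perceptron: linear combination of the inputs, then the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_mlp(x1, x2):
    """Two-layer network with hand-set weights that computes XOR."""
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)    # hidden unit 1: fires if x1 OR x2
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)   # hidden unit 2: fires if x1 AND x2
    # Output unit: OR but not AND, which is exactly XOR.
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)

truth_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
```

Both hidden units see both inputs, which mirrors the fully connected wiring described in part a).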
Neural networks
What is a perceptron? (2 points)
The perceptron
* A Perceptron is an Artificial Neuron
* It is the simplest possible Neural Network
* Perceptrons are the building blocks of neural networks.
* One of the first machine learning algorithms
* Built as analog hardware in 1959 (weight updates with motors!)
* Defined as a “neuron”/compute node that takes a linear combination of inputs and passes it through an “activation function”
– Originally: threshold/ step function
– Sigmoid as activation function: perceptron == logistic regression!
– These days: tanh (differentiable) or “rectified linear” (ReLU)
The Perceptron Algorithm - Frank Rosenblatt suggested this algorithm:
Set a threshold value
Multiply each input by its weight
Sum all the results
Activate the output (output 1 if the sum exceeds the threshold, otherwise 0)
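The four steps above map directly onto a few lines of code. This is a minimal sketch of the forward computation only (no weight learning); the function name `perceptron` and the AND-gate weights and threshold are assumptions chosen for illustration.

```python
def perceptron(inputs, weights, threshold):
    """Forward pass of a single perceptron with a step activation."""
    # Step 2: multiply each input by its weight.
    weighted = [x * w for x, w in zip(inputs, weights)]
    # Step 3: sum all the results.
    total = sum(weighted)
    # Step 4: activate the output if the sum exceeds the threshold (step 1).
    return 1 if total > threshold else 0

# Hypothetical example: with weights (1, 1) and threshold 1.5 the unit acts as AND.
and_outputs = [perceptron([a, b], [1.0, 1.0], 1.5) for a in (0, 1) for b in (0, 1)]
```

Only the input (1, 1) reaches a sum of 2, which exceeds 1.5, so the outputs for (0,0), (0,1), (1,0), (1,1) are 0, 0, 0, 1.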
Neural networks
Give an example of a perceptron
Perceptron Example
- Imagine a perceptron (in your brain).
- The perceptron tries to decide if you should go to a concert.
- Is the artist good? Is the weather good?
- What weights should these facts have?
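To make the concert example concrete, the sketch below assigns made-up weights to the two facts. The values 0.7, 0.3 and the threshold 0.5 are pure assumptions; they only show how the choice of weights changes the decision.

```python
def decide_concert(artist_is_good, weather_is_good, weights=(0.7, 0.3), threshold=0.5):
    """Return True if the weighted evidence for going exceeds the threshold."""
    # Booleans count as 1 (True) or 0 (False) in the weighted sum.
    total = weights[0] * artist_is_good + weights[1] * weather_is_good
    return total > threshold

# With the artist weighted 0.7, a good artist alone is enough to go ...
good_artist_bad_weather = decide_concert(True, False)
# ... but good weather alone (weight 0.3) is not.
bad_artist_good_weather = decide_concert(False, True)
# Swapping the weights makes the weather the deciding factor instead.
weather_lover = decide_concert(False, True, weights=(0.3, 0.7))
```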
Neural networks
What is backpropagation?
Layman's terms:
Backpropagation is a way for a computer to learn from its mistakes while trying to do a task, like recognizing cats in pictures, by adjusting its rules based on feedback from a teacher. This allows it to improve its performance over time and become more accurate in its predictions.
From the lecture:
Abbreviation for ‘backward propagation of errors’
Specialized form of gradient descent for training MLPs
- computes the gradient (partial derivatives) of the loss function with respect to all the weights in the network
- passes the gradient to the optimization method, which updates the weights in an attempt to minimize the loss function
- can be done one sample at a time (“online”) or for a group of training samples (“batch”)
Relies on simple derivatives of activation functions
- tanh: h(a) = tanh(a) = …..
- Logistic: h(a) = sigm(a) = …..
watch this video! https://www.youtube.com/watch?v=IHZwWFHWa-w&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&index=2
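A minimal sketch of backpropagation for one training sample, assuming a tiny network with two inputs, two tanh hidden units, one logistic output, no bias terms, and a squared-error loss. All of these choices, the function names, and the numbers are assumptions made for illustration. The code relies on the standard derivatives tanh'(a) = 1 - tanh(a)^2 and sigm'(a) = sigm(a) * (1 - sigm(a)).

```python
import math

def sigm(a):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W1, w2):
    """Hidden layer h = tanh(W1 x), output y = sigm(w2 . h)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sigm(sum(w * hj for w, hj in zip(w2, h)))
    return h, y

def loss(x, t, W1, w2):
    """Squared error for one sample with target t."""
    _, y = forward(x, W1, w2)
    return 0.5 * (y - t) ** 2

def backprop(x, t, W1, w2):
    """Gradient of the loss with respect to all weights, via the chain rule."""
    h, y = forward(x, W1, w2)
    delta_out = (y - t) * y * (1.0 - y)            # error at the output pre-activation
    grad_w2 = [delta_out * hj for hj in h]
    # Propagate the error backwards through w2 and the tanh derivative.
    delta_hidden = [delta_out * w2[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_W1, grad_w2

def sgd_step(x, t, W1, w2, lr=0.5):
    """One 'online' update: move every weight against its gradient."""
    grad_W1, grad_w2 = backprop(x, t, W1, w2)
    new_W1 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, grad_W1)]
    new_w2 = [w - lr * g for w, g in zip(w2, grad_w2)]
    return new_W1, new_w2

# Hypothetical sample and starting weights.
x, t = [1.0, -2.0], 1.0
W1, w2 = [[0.1, -0.2], [0.4, 0.3]], [0.5, -0.6]
loss_before = loss(x, t, W1, w2)
W1_new, w2_new = sgd_step(x, t, W1, w2)
loss_after = loss(x, t, W1_new, w2_new)
```

Averaging the gradients of several samples before calling the update would give the "batch" variant mentioned above.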
Neural Networks - what is a convolutional neural network (CNN) ?
In deep learning, a convolutional neural network (CNN) is a class of artificial neural network most commonly applied to analyze visual imagery.[1] CNNs use a mathematical operation called convolution in place of general matrix multiplication in at least one of their layers.[2] They are specifically designed to process pixel data and are used in image recognition and processing.
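The core operation can be sketched in plain Python. The function below slides a small kernel over an image without padding and without flipping the kernel, which is the cross-correlation form that deep learning libraries usually call "convolution". The function name `conv2d_valid`, the toy image, and the edge kernel are assumptions for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding) and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical 3x4 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A 1x2 kernel that responds where brightness increases from left to right.
edge_kernel = [[-1, 1]]
feature_map = conv2d_valid(image, edge_kernel)
```

The feature map is non-zero only in the column where the dark-to-bright edge sits, and the same two kernel weights are reused at every image position.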
Definition of neural network?
An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals then processes them and can signal neurons connected to it. The “signal” at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold.
Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.