Machine Learning (part of exam 2/3) Flashcards

1
Q

Why can it be useful for a machine to learn?

A
  • it’s essential for unknown environments (i.e. when the designer isn’t omniscient)
  • it’s useful as a system construction method (i.e. expose the agent to reality rather than trying to write down reality)
2
Q

Assign the correct names to the following quotes:

  1. Learning is making useful changes in our minds.
  2. Learning denotes changes in the system that […] enable the system to do the same task or tasks drawn from the same population more efficiently and more effectively the next time.
  3. Learning is constructing or modifying representations of what is being experienced.
A
  1. Minsky
  2. Simon
  3. Michalski
3
Q

What is Mitchell’s definition of Machine Learning (1997)?

A

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

4
Q

What information needs to be given in order to reach the goal of improving the performance on a task?

A
  • a task T
  • a performance measure P
  • some experience E
5
Q

Using the example of teaching a machine to play Backgammon, what are the task T, the performance measure P and the experience E?

A
  • T: play backgammon
  • P: percentage of games won
  • E: previously played games
6
Q

What examples have we learned where machine learning is used in our daily life?

A
  • teaching machines to play games
  • recognizing spam mail
  • handwritten character recognition
  • classifying stars, galaxies, quasars, ...
  • market basket analysis (recommendation systems, store layouts)
7
Q

Using the example of spam mail, what are the task T, the performance measure P and the experience E?

A
  • T: sort emails into categories
  • P: weighted sum of mistakes (letting spam through is weighted less than misclassifying regular emails as spam)
  • E: emails hand-sorted by the user
8
Q

What’s the name of the learning method most spam filters use to learn to recognize spam mail?

A

Bayesian Learning

9
Q

Using the example of Handwritten Character Recognition, what are the task T, the performance measure P and the experience E?

A
  • T: recognize a handwritten character
  • P: recognition rate
  • E: MNIST handwritten digit database
10
Q

Using the example of Classifying Stars, what are the task T, the performance measure P and the experience E?

A
  • T: classification of celestial bodies
  • P: accuracy of classifying
  • E: classifications made by astronomers
11
Q

What method is used to classify stars?

A

learning of multiple decision trees and combining the best rules of each tree

12
Q

Using the example of the Market Basket Analysis, what are the task T, the performance measure P and the experience E?

A
  • T: discover items that are frequently bought together
  • P: unclear (possibly the revenue from those items)
  • E: Supermarket check-out data
13
Q

What types of different Learning Scenarios are there?

A
  • Supervised Learning
  • Semi-supervised Learning
  • Reinforcement Learning
  • Unsupervised Learning
14
Q

What is Supervised Learning?

A
  • a lot of labeled examples are provided for training purposes
  • the machine has to assign labels to new examples
  • concept learning, classification, regression
15
Q

What is Semi-supervised Learning?

A
  • only a few labeled examples are provided for training purposes (alongside many unlabeled ones)
  • the machine has to assign labels to the remaining examples
16
Q

What is Reinforcement Learning?

A
  • there are no labeled examples for training purposes
  • the machine only receives feedback (a reward) on the actions or label assignments it makes
17
Q

What is Unsupervised Learning?

A
  • there is no information except the training examples
  • clustering, subgroup discovery, association rule discovery
18
Q

Assign the following examples to the correct types of Learning Scenarios.

  1. In a video game you find out what to do by how much XP you receive for different actions.
  2. You download a few webpages and classify them into various types of webpages. Then you tell an algorithm to classify every webpage it finds.
  3. An algorithm receives a pack of thousands of tweets and the instruction to sort them into clusters.
  4. A handwritten letter is scanned and run through a handwritten character recognition software.
A
  1. Reinforcement Learning
  2. Semi-supervised Learning
  3. Unsupervised Learning
  4. Supervised Learning
19
Q

What is Inductive Learning?

A
  • Given: input x and output f(x) of a function
  • Not given: function f
  • Problem: given a set of training examples, find a hypothesis h that is as close to the function f as possible, on all examples (so it must generalize from the training examples)
  • it ignores prior knowledge
  • it assumes that examples are given
20
Q

What is Ockham’s Razor and how does it pertain to curve fitting in the Inductive Learning Method?

A

“The simplest explanation is often the best/correct explanation.”

When trying to fit a curve to data points, the best curve for machine learning is the curve that is both simple and mostly right (it doesn’t necessarily have to hit all points, but it should make it relatively easy to predict future data points).

21
Q

What is Overfitting?

A

A curve is overfitting if it is made to fit all points at the expense of becoming too complex and irregular. The curve can’t realistically be used for other data points because it’s fitted too closely to the example data points.

22
Q

How can Overfitting be avoided?

A

Keep a separate validation set (different from training and test sets) to watch the performance. If the error on the validation set rises, stop training.
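
The stopping rule above can be sketched in a few lines. This is only a minimal illustration: the function name `early_stopping_epoch`, the `patience` parameter and the error values are assumptions made for the example, not something taken from the lecture.

```python
def early_stopping_epoch(validation_errors, patience=1):
    """Return the index of the epoch whose weights should be kept.

    Training stops once the validation error has failed to improve
    for `patience` consecutive epochs (hypothetical parameter).
    """
    best_epoch = 0
    best_error = float("inf")
    epochs_without_improvement = 0
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            # Validation error still falling: remember this epoch.
            best_epoch, best_error = epoch, error
            epochs_without_improvement = 0
        else:
            # Validation error rose (or stalled).
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training here
    return best_epoch


# Made-up validation errors, one per training epoch.
errors = [0.9, 0.6, 0.4, 0.35, 0.38, 0.45]
print(early_stopping_epoch(errors))
```

In practice the training loop would compute each validation error on the fly instead of receiving a finished list.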

23
Q

How does Performance Measurement work? How do we know that we have reached the closest possible solution?

A
  • use theorems of computational and statistical learning theory
  • try the solution h on a new set of examples where f is known
24
Q

How did the “Pigeons as Art Experts” experiment (Watanabe et al 1995, 2001) work and what were its findings?

A

Pigeons were presented with paintings of Chagall and Van Gogh. They received food when they pecked on paintings by Van Gogh.
After some time the pigeons were able to differentiate between the two artists with 95% accuracy when shown paintings they had been trained on and 85% accuracy on previously unseen paintings.

25
Q

What are Neural Networks?

A

They are modelled on the human brain and nervous system. They can detect and extract patterns and generalise them to make predictions.

26
Q

Who were David Hubel and Torsten Wiesel?

A

They explored the visual cortex of cats in the 1960s and discovered line and edge detectors in the visual system.
They received the Nobel prize in 1981.

27
Q

How are biological neurons built and how do they work?

A

Each neuron/cell has a center (nucleus), a body (soma), dendrites and an axon. The dendrites and the axon extend from the cell body and are responsible for receiving input (dendrites) and sending output (axon).
When a dendrite connects with the axon of another cell, this connection is called a synapse.
If the input reaches a certain strength, the neuron fires, meaning the cell gets activated and passes the activation on to all connected cells along its axon.

28
Q

How are artificial neurons built and how do they work?

A

A neuron is called a node or unit.
The input (in) is the weighted sum of all incoming activations: in = Σ_i (w_i * a_i), where a_i are the inputs and w_i their weights.
The output (a) is determined by the activation function (g).

So we have:
a = g(in)
a = g(Σ_i (w_i * a_i))
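
As a minimal sketch, the formula a = g(in) could look like this in Python. The function names and example numbers are made up for illustration, and the sigmoid is just one possible choice for g.

```python
import math


def sigmoid(x):
    # One possible activation function g (an assumption for this sketch).
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, g=sigmoid):
    # in = sum over all (weight * input) pairs
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    # a = g(in)
    return g(weighted_sum)


# Two inputs with hand-picked weights: in = 0.4*1.0 + (-0.8)*0.5 = 0.0
print(neuron_output([1.0, 0.5], [0.4, -0.8]))
```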

29
Q

What is a perceptron (Rosenblatt 1957, 1960)?

A

It’s a single node that connects n input signals with one output signal, typically resulting in either -1 or +1.
The activation function is a simple threshold function.

30
Q

How can perceptrons and boolean functions be combined?

A

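The boolean functions “and”, “or” and “not” can each be modeled by a perceptron, because a single straight line (a linear separation) divides the inputs that give “true” from the rest.
More complex functions like “xor” are not linearly separable, so a single perceptron can’t model them.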
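
To make this concrete, here is a small sketch of threshold units for “and”, “or” and “not”. The weights and bias values are hand-picked assumptions (many other values work too), and the outputs are 0/1 instead of -1/+1 to keep the truth tables readable.

```python
def perceptron(inputs, weights, bias):
    # Threshold activation: fire (1) if the weighted sum plus bias is positive.
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def AND(x1, x2):
    return perceptron([x1, x2], [1.0, 1.0], -1.5)


def OR(x1, x2):
    return perceptron([x1, x2], [1.0, 1.0], -0.5)


def NOT(x):
    return perceptron([x], [-1.0], 0.5)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "and:", AND(x1, x2), "or:", OR(x1, x2))
print("not 0:", NOT(0), "not 1:", NOT(1))
```

No choice of two weights and a bias reproduces the “xor” truth table, which is what “not linearly separable” means here.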

31
Q

What is the Perceptron Learning Rule for Supervised Learning?

A

It’s an update rule that uses a learning rate (alpha) and the error (the difference between the desired and the actual output) to adjust the weights step by step until they fit the given inputs and outputs.
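
A common textbook form of this rule is w_i ← w_i + alpha * (target - output) * x_i. The sketch below applies it to the “or” function; the function names, the 0/1 outputs, the bias trick (a constant first input of 1) and the values for alpha and the number of epochs are all assumptions for illustration and may differ from the lecture’s notation.

```python
def predict(weights, x):
    # x[0] is always 1, so weights[0] plays the role of the bias.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def train_perceptron(examples, alpha=0.1, epochs=20):
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(weights, x)  # -1, 0 or +1
            # Perceptron learning rule: move each weight in the
            # direction that reduces the error on this example.
            weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
    return weights


# Training data for "or": ([bias input, x1, x2], target)
or_examples = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
weights = train_perceptron(or_examples)
print([predict(weights, x) for x, _ in or_examples])
```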

32
Q

How can the error of a network be measured?

A

The error of one training example x can be measured by the squared difference of the output value h(x) and the desired target value f(x).

For evaluating the performance of a network we can try the network on a set of datapoints and average the values (values are calculated like above).
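
Both measures can be sketched directly. The hypothesis `h` and the example data below are made up for illustration.

```python
def squared_error(output, target):
    # Error of one example: (h(x) - f(x))^2
    return (output - target) ** 2


def mean_squared_error(h, examples):
    # Average the per-example errors over a set of (x, f(x)) pairs.
    return sum(squared_error(h(x), fx) for x, fx in examples) / len(examples)


def h(x):
    # Hypothetical learned hypothesis.
    return 2 * x


examples = [(1, 2), (2, 5), (3, 6)]  # made-up (x, f(x)) pairs
print(mean_squared_error(h, examples))
```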

33
Q

How does a machine find the correct weights for given inputs and outputs?

A

It’s almost always a search problem. The machine has to search for the right values and it can use different methods for that, e.g. heuristic search functions, local search functions, ...

34
Q

What is Hypothesis Space?

A

It’s the space of all possible values for the weights we’re looking for.

35
Q

Out of all the values for weights in the hypothesis space, how do we know which ones are better than others?

A

We use an evaluation function that measures the error of each weight setting. So we search for weights that have a low error on the training data.

36
Q

What is the best weight setting for one example in an error landscape?

A

It’s where the error measure for this example is minimal.

37
Q

How can we find the best weight setting in an error landscape?

A

via Gradient Descent: go downhill in the direction where it is steepest (hill-climbing search).
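
As a minimal sketch, assume an error landscape with a single weight w and error (w - 3)^2, so the best weight is w = 3. The landscape, the step size alpha and the number of steps are made-up assumptions.

```python
def error(w):
    # Made-up error landscape with its minimum at w = 3.
    return (w - 3.0) ** 2


def error_gradient(w):
    # Derivative of the error with respect to w (the slope at w).
    return 2.0 * (w - 3.0)


def gradient_descent(w, alpha=0.1, steps=100):
    for _ in range(steps):
        # Step downhill: against the slope, scaled by the learning rate.
        w = w - alpha * error_gradient(w)
    return w


w = gradient_descent(w=0.0)
print(w, error(w))
```

With several weights the same idea applies per weight, using the partial derivative of the error for each one.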

38
Q

Why is the regular threshold activation function not useable in machine learning?

A

Because it’s not differentiable.

Suggested answer: The threshold function jumps directly from 0 to 1 without any “curve” in between, so its slope at the jump is undefined (and it is zero everywhere else). Gradient descent needs a slope to follow.

(??? honestly i don’t get this at all but this is what he said in the lecture and what it says on the slides so ¯_(ツ)_/¯ there is a lot of math involved here. If you wanna check it out for yourselves, it’s in the video of the lecture of 13.11.2024 at approx 1:18:00. have fun!)

39
Q

What is the commonly used activation function and why?

A

The Sigmoid Activation Function. Because it’s easy to differentiate and non-linear.
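
“Easy to differentiate” can be made concrete: for g(x) = 1 / (1 + e^(-x)) the derivative is g'(x) = g(x) * (1 - g(x)), so it can be computed from the function value alone. A short sketch (function names are assumptions):

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    # g'(x) = g(x) * (1 - g(x))
    s = sigmoid(x)
    return s * (1.0 - s)


print(sigmoid(0.0), sigmoid_derivative(0.0))
```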

40
Q

True or False: Every continuous function can be modeled with three layers (incl one hidden layer).

A

True

41
Q

What is Backpropagation Learning?

A

Usually we compute the error directly at the output, but if there is a hidden layer, we first calculate the error of the output layer and then propagate it backwards to the hidden layer.
Delta is the error term of the output node times the derivative of the activation function at its input.
This is done to update the weights and minimize the error, so in a way, this is how the machine learns.
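
The following is a rough sketch of one backpropagation step for a tiny network (two inputs, two hidden nodes, one output, all with sigmoid activation and no bias weights). The network size, the starting weights, the learning rate and all names are assumptions for illustration, not the lecture’s exact formulation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def backprop_step(x, target, w_hidden, w_out, alpha=0.5):
    hidden, output = forward(x, w_hidden, w_out)
    # Delta of the output node: error times sigmoid derivative.
    delta_out = (target - output) * output * (1.0 - output)
    # Deltas of the hidden nodes: output delta propagated back
    # through the (old) output weights.
    delta_hidden = [h * (1.0 - h) * w * delta_out
                    for h, w in zip(hidden, w_out)]
    new_w_out = [w + alpha * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w + alpha * d * xi for w, xi in zip(ws, x)]
                    for ws, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out


# One made-up training example and hand-picked starting weights.
x, target = [1.0, 0.0], 1.0
w_hidden, w_out = [[0.1, -0.2], [0.3, 0.4]], [0.5, -0.5]
print("before:", forward(x, w_hidden, w_out)[1])
for _ in range(1000):
    w_hidden, w_out = backprop_step(x, target, w_hidden, w_out)
print("after: ", forward(x, w_hidden, w_out)[1])
```

Repeating the step moves the output towards the target for this one example; real training cycles over many examples.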

42
Q

What is Deep Learning?

A

It’s learning with neural networks that have many hidden layers.

43
Q

In what area has Deep Learning made great successes and how?

A

Image classification.
The many layers are fully connected (every node on one level is connected to every node on the next level). There are layers that specialise in recognizing e.g. edges, corners, diagonal lines, faces, trees, ...

44
Q

What is needed for Deep Learning?

A
  • a lot of training data (big data)
  • fast processing
  • unsupervised pre-training of layers
45
Q

What is Convolution in Neural Networks?

A

It’s a technique in image processing, where for each pixel of an image a new feature is computed using a weighted combination of its neighborhood (n*n pixels around it).
Depending on the weights this could blur the image or have it show only the edges, etc.
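
A minimal sketch of this idea on a tiny grayscale image stored as a list of lists. It only computes features for pixels that have a full neighborhood (no padding), and like most neural network libraries it does not flip the kernel; both are simplifying assumptions, as are the function name, kernel and image.

```python
def convolve(image, kernel):
    n = len(kernel)  # kernel is n x n
    rows = len(image) - n + 1
    cols = len(image[0]) - n + 1
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Weighted combination of the n x n neighborhood.
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(n) for j in range(n)))
        result.append(row)
    return result


# Made-up 4x4 image: dark on the left, bright on the right.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]

# These weights respond to vertical edges (left vs. right difference).
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

print(convolve(image, vertical_edge))
```

Using the same weight (e.g. 1/9) everywhere in the kernel instead would average each neighborhood and blur the image.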

46
Q

What is Neural Artistic Style Transfer?

A

Using Deep Learning, an input image can be altered to look like a reference image (e.g. to look as if Van Gogh had painted it).

47
Q

What are Generative Adversarial Networks (GANs)?

A

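They pit two networks against each other: a generator that produces artificial examples and a discriminator that tries to tell them apart from real ones. A related adversarial idea is used to increase the robustness of a learning machine: “invisible” changes are made to an image to confuse the machine into making mistakes, so that further down the road it learns to recognize such manipulations.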

48
Q

What are Recurrent Neural Networks?

A

They make it possible to process sequential data by feeding the output of the network back in as part of the next input.
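
A very reduced sketch of this feedback loop with a single recurrent unit. The weights, the tanh activation and the function names are assumptions for illustration.

```python
import math


def rnn_step(x, previous_output, w_input, w_recurrent):
    # The new output depends on the current input AND the previous output.
    return math.tanh(w_input * x + w_recurrent * previous_output)


def run_rnn(sequence, w_input=0.5, w_recurrent=0.8):
    output = 0.0  # nothing has been seen yet
    outputs = []
    for x in sequence:
        output = rnn_step(x, output, w_input, w_recurrent)
        outputs.append(output)
    return outputs


# The input is only non-zero at the first step, but the later outputs
# are still non-zero because the earlier output is fed back in.
print(run_rnn([1.0, 0.0, 0.0]))
```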