Session 4 Flashcards

1
Q

what is information gain?

A

it is the most common splitting criterion and is based on entropy
-> it measures how much a split reduces entropy (measures the change in entropy before and after splitting)

2
Q

what does disorder correspond to?

A

how mixed (impure) a segment is with respect to the values of the attribute of interest

3
Q

what is entropy?

A

it is a measure of disorder in the data

4
Q

how can you calculate entropy?

A
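
A minimal sketch, assuming the standard Shannon entropy with a base-2 logarithm: entropy = -(p1 * log2(p1) + p2 * log2(p2) + ...), where p_i is the proportion of class i in the segment. The function name `entropy` and the example labels are illustrative and not taken from the session material.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    # Add up -p * log2(p) over the classes that actually occur in the segment.
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

mixed_segment = ["yes", "yes", "no", "no"]    # evenly mixed: maximum disorder for two classes
pure_segment = ["yes", "yes", "yes", "yes"]   # only one class: no disorder

print(entropy(mixed_segment))
print(entropy(pure_segment))
```

An evenly mixed two-class segment has the highest entropy, and a pure segment has entropy zero.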
5
Q

what is a parent set?

A

original set of examples (data points before splitting)

6
Q

what is a children set?

A

one of the k subsets that result when an attribute (e.g., age) segments the parent set

7
Q

When is an attribute chosen for splitting?

A

The attribute that reduces entropy the most (= has the highest information gain) is chosen for the split

8
Q

what is the formula for information gain?

A
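
A minimal sketch, assuming the usual definition: information gain = entropy(parent) - sum over all children of (size of child / size of parent) * entropy(child). The function names and the toy labels are illustrative, not taken from the session material.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent set minus the size-weighted entropy of its children sets."""
    n = len(parent)
    weighted_children = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted_children

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # both children are pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children are as mixed as the parent

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A split into pure children removes all of the parent's entropy, while a split that leaves the children as mixed as the parent gains nothing.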
9
Q

what are disadvantages of ID3 decision trees?

A
  • tends to prefer splits that result in a large number of partitions, each being pure but small (we get a very wide decision tree)
  • overfitting with less generalization capability (it will try to fit every outlier -> it will make a segment for Musk in a ranking of CEO salaries, even if he is the only one so high up)
  • cannot handle missing values
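
The first disadvantage can be illustrated with a small sketch. It assumes the standard entropy-based information gain; the toy rows, the attribute names `id`, `age`, `buys`, and the helper functions are hypothetical. An attribute with a unique value per row produces many tiny pure partitions and therefore wins the split, even though it would not generalize.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Gain from splitting `rows` on `attribute`, measured on the `target` labels."""
    parent = [row[target] for row in rows]
    groups = {}
    for row in rows:
        # One children set per distinct attribute value.
        groups.setdefault(row[attribute], []).append(row[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(parent) - weighted

rows = [
    {"id": 1, "age": "young", "buys": "no"},
    {"id": 2, "age": "young", "buys": "yes"},
    {"id": 3, "age": "old", "buys": "yes"},
    {"id": 4, "age": "old", "buys": "yes"},
]

gains = {a: information_gain(rows, a, "buys") for a in ("id", "age")}
best = max(gains, key=gains.get)
print(gains)
print(best)
```

Here the `id` attribute splits the data into four single-row partitions, each pure, so it gets the highest gain.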
10
Q

what are the application possibilities of ANN (artificial neural networks)?

A
  • spam detection
  • time series prediction
  • pattern recognition (e.g., how Van Gogh paints)
  • computer games
11
Q

how does ANN function?

A

it is loosely modeled on human neurons -> it learns by forming and adjusting connections between neurons

12
Q

what is a single perceptron algorithm ANN?

A

an ANN that uses no hidden layer (inputs connect directly to the output unit) and mimics a single biological neuron

13
Q

how does an ANN work?

A

inputs go into a propagation function that calculates the net input, then a transfer function calculates an activation level, and then we receive an output
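
The flow from inputs to output can be sketched for a single neuron. This is a minimal sketch under assumptions not stated on the card: the propagation function is a weighted sum plus a bias, and the transfer function is a sigmoid. The weights, bias, and input values are made up for illustration.

```python
import math

def propagation(inputs, weights, bias):
    """Net input: weighted sum of the inputs plus a bias (assumed form)."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def sigmoid(net):
    """Transfer function: squashes the net input into an activation level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-net))

def neuron_output(inputs, weights, bias):
    net = propagation(inputs, weights, bias)   # step 1: net input
    return sigmoid(net)                        # step 2: activation level = output

# Two hypothetical inputs, e.g. number of amenities and number of rooms.
output = neuron_output([3.0, 1.0], [0.5, -1.0], -0.5)
print(output)
```

With these values the net input is zero, so the sigmoid returns an activation level exactly halfway between its two extremes.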

14
Q

what is the propagation function?

A

the function that combines the inputs (independent variables, such as # of amenities) into the net input, typically as a weighted sum

15
Q

what is the activation function?

A

function/level that determines whether a neuron produces an output or not (i.e., whether it fires)

16
Q

how does learning work in ANNs?

A
  • by comparing computed (predicted) outputs to desired outputs (true target values) of historical cases
  • learning is defined as a change of the weights between units
17
Q

what are the three tasks in the process of learning in ANNs?

A
  1. compute temporary outputs
  2. compare outputs with desired targets
  3. adjust the weights and repeat process
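
The three tasks can be sketched as a training loop for a single perceptron. This is a minimal sketch, assuming a step activation and the classic perceptron update rule (weight change = learning rate * error * input); the AND data set, the learning rate, and the number of epochs are illustrative choices.

```python
def step(net):
    """Step activation: the neuron fires (1) when the net input is non-negative."""
    return 1 if net >= 0 else 0

def predict(inputs, weights, bias):
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)

def train_perceptron(cases, learning_rate=1.0, epochs=20):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in cases:
            output = predict(inputs, weights, bias)   # 1. compute temporary output
            error = target - output                   # 2. compare with desired target
            # 3. adjust the weights (no change when the output was already correct)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Historical cases for the logical AND function: (inputs, desired output).
and_cases = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_cases)
predictions = [predict(inputs, weights, bias) for inputs, _ in and_cases]
print(weights, bias, predictions)
```

Because AND is linearly separable, the loop stops changing the weights after a few passes and all four cases are predicted correctly.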
18
Q

when is a data set linearly separable?

A

if there exists a straight line (in 2D) or a hyperplane (in higher dimensions) that can perfectly separate all data points of one class from those of another class without any errors
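
A small sketch of this definition in 2D, using the well-known AND and XOR examples. It searches a coarse grid of candidate lines w1*x1 + w2*x2 + b = 0; the grid and the helper name `separable_on_grid` are hypothetical, and a grid search is only a demonstration, not a general test of separability.

```python
from itertools import product

def separable_on_grid(cases, grid):
    """Return True if some line with coefficients from `grid` separates the two classes."""
    for w1, w2, b in product(grid, repeat=3):
        # A point is put in class 1 when it lies strictly on the positive side of the line.
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(target) for (x1, x2), target in cases):
            return True
    return False

grid = [v / 2 for v in range(-6, 7)]   # candidate coefficients from -3.0 to 3.0 in steps of 0.5

and_cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
xor_cases = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(separable_on_grid(and_cases, grid))
print(separable_on_grid(xor_cases, grid))
```

A separating line is found for AND (for example x1 + x2 - 1.5 = 0), while none is found for XOR, which is known not to be linearly separable.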

19
Q

when do we need multilayer perceptron?

A
20
Q

what are the three layers in multilayer perceptrons?

A
  1. input layer: each node represents a single attribute
  2. hidden layers: the middle layer(s) of an ANN that has three or more layers; each additional layer increases the training effort exponentially
  3. output layer: the layer containing the solution of the problem
21
Q

what does the development process of an ANN look like?

A
22
Q

what is the activation function in MLPs?

A
  • relation between the internal activation level and the output
  • can be linear or non-linear
  • differentiability means that we can take derivatives of the function
  • there are different types
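
A few commonly used activation functions, as a minimal sketch. The selection (step, linear, sigmoid) is an assumption for illustration and not necessarily the set covered in the session; the sigmoid derivative shows what differentiability provides.

```python
import math

def step(net):
    """Non-linear and not differentiable at 0: output jumps from 0 to 1."""
    return 1.0 if net >= 0 else 0.0

def linear(net):
    """Linear: the output equals the internal activation level."""
    return net

def sigmoid(net):
    """Non-linear and differentiable everywhere."""
    return 1.0 / (1.0 + math.exp(-net))

def sigmoid_derivative(net):
    """Derivative of the sigmoid, expressed through the sigmoid itself."""
    s = sigmoid(net)
    return s * (1.0 - s)

for net in (-2.0, 0.0, 2.0):
    print(net, step(net), linear(net), round(sigmoid(net), 3), round(sigmoid_derivative(net), 3))
```

Gradient-based learning rules rely on such derivatives, which is why differentiable activation functions are preferred when hidden layers have to be trained.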
23
Q

what are the different types of activation functions?

A
24
Q

what are the four types of learning?

A
  1. supervised learning
  2. unsupervised learning
  3. reinforcement learning (you don't give the correct output, you only say whether the output was correct or incorrect)
  4. direct design methods
25
Q

what are the two types of training?

A
  1. incremental training (you adapt the model step by step, updating the weights each time a new case is presented)
  2. batch training (you update the weights once per batch of cases, after the whole batch has been presented, rather than after each case)
26
Q

what are the 5 learning rules in ANN?

A
  1. delta rule
  2. gradient descent
  3. back propagation
  4. hebbian rule
  5. competitive learning
27
Q

what is back propagation?

A
  • similar to delta rule, but also calculates weight changes for hidden layers
28
Q

what is gradient descent?

A
  • finding the combination of all weights that minimizes the sum of the squared errors F
  • but has high computational complexity in high-dimensional spaces
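
A minimal sketch of gradient descent on the sum of squared errors F, assuming a single linear unit with one weight `w` and a bias `b`. The data, learning rate, and number of steps are illustrative; a real network has many more weights, which is where the computational cost comes from.

```python
def sum_squared_errors(w, b, cases):
    """F: sum over all cases of (target - output)^2 for the linear unit output = w*x + b."""
    return sum((t - (w * x + b)) ** 2 for x, t in cases)

def gradient_descent(cases, learning_rate=0.05, steps=2000):
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Partial derivatives of F with respect to w and b.
        grad_w = sum(-2 * x * (t - (w * x + b)) for x, t in cases)
        grad_b = sum(-2 * (t - (w * x + b)) for x, t in cases)
        # Move a small step against the gradient, i.e. downhill on F.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

cases = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # generated from target = 2*x + 1
w, b = gradient_descent(cases)
print(w, b, sum_squared_errors(w, b, cases))
```

Starting from zero, the weights move toward the values that generated the data, and F shrinks toward zero.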
29
Q
A