Supervised Learning Flashcards

1
Q

Bias error

A

The model does not / cannot correctly represent the concept (underfit)

2
Q

Variance error

A

The model specializes in the training set (overfit).
Regularization (favoring smoother functions, whose output varies slowly with the input) helps to mitigate the variance error (see the sketch below).
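
A minimal sketch of the idea, assuming scikit-learn is available: L2 regularization (Ridge) penalizes large weights, which favors smoother functions. The synthetic data and the alpha value are illustrative assumptions only.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 10))          # few examples, many features: high variance risk
y = 3 * X[:, 0] + rng.normal(0, 0.1, size=30)  # only the first feature actually matters

plain = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)             # penalizes large weights -> smoother function

print("unregularized weights:", np.round(plain.coef_, 2))
print("ridge weights:        ", np.round(ridge.coef_, 2))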

3
Q

Multilinear Regression assumes

A
  • The relation between each xi and y is linear
  • All variables (x) have Normal distributions
  • The variables are independent and the residual/error variance is constant (see the sketch below)
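
A minimal sketch under those assumptions, using scikit-learn (assumed available); the synthetic data and coefficient values are illustrative, not from the course material.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))                                   # two independent predictors
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.3, size=200)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

print("intercept:", round(model.intercept_, 2), "coefficients:", np.round(model.coef_, 2))
print("residual mean:", round(residuals.mean(), 3), "residual spread:", round(residuals.std(), 3))
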
4
Q

The input of an artificial
neuron:

A

Comes from all the neurons of the previous layer, or it is an external input

5
Q

The output of an artificial neuron

A

Is sent to all the neurons of the next layer, or it is (part of) the network output
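
A minimal sketch of a single artificial neuron, tying the last two cards together; the weight values, bias and sigmoid activation are illustrative assumptions.

import numpy as np

def neuron_output(inputs, weights, bias):
    z = np.dot(weights, inputs) + bias       # weighted sum of all incoming signals
    return 1.0 / (1.0 + np.exp(-z))          # sigmoid activation

previous_layer_outputs = np.array([0.2, 0.7, 0.1])   # one value per neuron of the previous layer
weights = np.array([0.5, -0.3, 0.8])
print(neuron_output(previous_layer_outputs, weights, bias=0.1))   # value sent to the next layer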

6
Q

Backpropagation

A
  1. Present each example (x(i), d(i))
  2. Calculate the network response to x(i): f(x(i))
  3. Propagate the error backwards (iteratively building the error derivative at each layer)
  4. Save the partial derivatives
  5. After all examples are processed, update the weights (see the sketch below)
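
A minimal sketch of this batch loop for a tiny 2-2-1 network with sigmoid units; the architecture, the XOR data and the learning rate are illustrative assumptions, not the course's exact formulation.

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)            # input -> hidden
W2, b2 = rng.normal(size=(1, 2)), np.zeros(1)            # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)    # examples x(i)
D = np.array([[0], [1], [1], [0]], float)                # targets d(i)
lr = 0.5

for epoch in range(5000):
    gW1, gb1, gW2, gb2 = 0.0, 0.0, 0.0, 0.0
    for x, d in zip(X, D):                                # 1. present each example (x(i), d(i))
        h = sigmoid(W1 @ x + b1)                          # 2. network response: hidden layer...
        y = sigmoid(W2 @ h + b2)                          #    ...and output f(x(i))
        delta_out = (y - d) * y * (1 - y)                 # 3. error derivative at the output layer
        delta_hid = (W2.T @ delta_out) * h * (1 - h)      #    propagated back to the hidden layer
        gW2 += np.outer(delta_out, h); gb2 += delta_out   # 4. save (accumulate) partial derivatives
        gW1 += np.outer(delta_hid, x); gb1 += delta_hid
    W2 -= lr * gW2; b2 -= lr * gb2                        # 5. update weights after all examples
    W1 -= lr * gW1; b1 -= lr * gb1

print(np.round([sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)[0] for x in X], 2))  # responses after training
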
7
Q

Artificial Neural Networks are

A
  • Robust to noise and approximations
  • Based on a simplified model of a neuron
  • Support incremental training
  • Compress the information of many examples into a small model
8
Q

Deep Learning

A

Alternating prediction layers with feature detection and decorrelation

9
Q

Deep Learning - network structure

A
  • Convolutional layers: apply convolutions to obtain the feature maps
  • Pooling (sub-sampling) layers: reduce the feature maps’ dimensions (combine features and/or decorrelate)
  • Dense layers: similar to the “hidden” layers of a classical neural network (see the sketch below)
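
A minimal sketch of such a structure using Keras (assumed available); the input shape, filter counts and layer sizes are illustrative assumptions.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),            # e.g. grayscale 28x28 images (an assumption)
    layers.Conv2D(32, 3, activation="relu"),   # convolutional layer -> feature maps
    layers.MaxPooling2D(2),                    # pooling: reduces the feature maps' dimensions
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),       # dense ("hidden") layer, as in a classical network
    layers.Dense(10, activation="softmax"),    # prediction layer
])
model.summary()
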
10
Q

kNN problems

A
  • Defining the distance
  • Defining how the class is selected (both choices are shown in the sketch below)
  • Handling non-linear problems
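
A minimal sketch showing the first two design choices, assuming Euclidean distance and majority voting; the toy data and k are illustrative.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    distances = np.linalg.norm(X_train - x, axis=1)       # choice 1: the distance metric
    nearest = np.argsort(distances)[:k]
    votes = Counter(y_train[i] for i in nearest)          # choice 2: how the class is selected
    return votes.most_common(1)[0][0]

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])
print(knn_predict(X_train, y_train, np.array([4.5, 5.0])))   # -> "B"
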
11
Q

A set has the largest entropy if

A

each of its elements belongs to a different class

12
Q

PlayTennis(no/yes) - entropy(S)

A

−( P(no) × log₂(P(no)) + P(yes) × log₂(P(yes)) )
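
A minimal sketch evaluating this formula in Python; the 5 “no” / 9 “yes” counts of the classic PlayTennis set are used only as an illustration.

import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(entropy([5, 9]))   # P(no) = 5/14, P(yes) = 9/14  ->  about 0.940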

13
Q

Decision Tree - the best split

A

The best split is the split that results in the largest entropy reduction, that is, the largest information gain (IG)
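
A minimal sketch computing the information gain of a candidate split (it repeats the entropy helper from the previous card so it runs on its own); the class counts of the PlayTennis “Humidity” split are used only as an illustration.

import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, subsets_counts):
    total = sum(parent_counts)
    weighted = sum(sum(sub) / total * entropy(sub) for sub in subsets_counts)
    return entropy(parent_counts) - weighted             # IG = entropy(S) - weighted subset entropy

# S has 5 no / 9 yes; splitting on Humidity: high -> [4 no, 3 yes], normal -> [1 no, 6 yes]
print(round(information_gain([5, 9], [[4, 3], [1, 6]]), 3))   # 0.152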

14
Q

Decision Tree - C4.5 / C5.0

A

Similar to ID3, but...
* Supports continuous attributes (it discretizes them)
* Allows missing values (such examples are not used when calculating entropy)
* Allows different costs for attributes
* Pruning (see the sketch below)
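
A minimal sketch of the pruning idea. Scikit-learn's DecisionTreeClassifier is a CART-style tree, not C4.5, but it illustrates threshold splits on continuous attributes and pruning (cost-complexity pruning here); the dataset and the ccp_alpha value are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                        # four continuous attributes
full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)

print("leaves without pruning:", full.get_n_leaves())
print("leaves with pruning:   ", pruned.get_n_leaves())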

15
Q

Learning ensembles

A

Boosting (Kearns 88)
* Can a set of weak learners create a single strong learner?
* Classification combines the results of all the subtrees
* Misclassified examples become more important for the error in each iteration
* New trees are trained to fit the residual error

Bagging - Bootstrap aggregating (Breiman 96)
* Randomly selects the subsets (bootstrap samples)
* Trains several learners
* Classification by voting, regression by averaging (both approaches are sketched below)
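
A minimal sketch contrasting the two ensemble ideas with scikit-learn (assumed available); the base learners, dataset and hyperparameters are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Bagging: random bootstrap subsets, one learner per subset, classification by voting
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)

# Boosting: trees added sequentially, each fitted to the residual error of the ensemble so far
boosting = GradientBoostingClassifier(n_estimators=50, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    print(name, round(cross_val_score(model, X, y, cv=5).mean(), 3))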

16
Q

XGBoost (eXtreme Gradient
Boosting)

A
  • An optimized Gradient Boosting Machine
  • Uses many small trees
  • Classifies an example by summing the scores of the various trees
  • Trained by adding trees that improve the result, or by pruning
  • Trees are fitted to predict the residual error (see the sketch below)
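
A minimal sketch using the xgboost package (assumed installed); the dataset and hyperparameters are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Many small trees; each new tree is fitted to the residual error of the current ensemble,
# and an example is classified by summing the scores of all the trees.
model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

print("test accuracy:", round(accuracy_score(y_test, model.predict(X_test)), 3))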