3 - Neural Nets for Modern AI Flashcards

1
Q

Action Potential

A

A rapid spike in voltage across a neuron's membrane.

Biological brains signal this way, but current artificial neural networks do not model it.

2
Q

Perceptron

A

Inputs are combined with weights and passed through a step function to produce an output of 1 or 0.

x0 is an extra input called the bias and is fixed to 1.

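A minimal sketch of a perceptron in Python (NumPy assumed; the example inputs and weights are illustrative, not learned):

```python
import numpy as np

def perceptron(x, w):
    """Step-function perceptron: output 1 if w . x > 0, else 0.
    x and w include the bias term: x[0] is fixed to 1, w[0] is its weight."""
    return 1 if np.dot(w, x) > 0 else 0

# Illustrative values: two real inputs plus the fixed bias input x0 = 1.
x = np.array([1.0, 0.5, -0.2])   # [x0 (bias), x1, x2]
w = np.array([-0.1, 0.8, 0.3])   # [bias weight, w1, w2]
print(perceptron(x, w))          # -> 1, since w . x = 0.24 > 0
```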
3
Q

Perceptron Learning rule

A

Compute the network's output, compare it with the true value, and adjust the weights.

The new weight is the old weight plus the learning rate multiplied by the error (true value minus network output) multiplied by the input:

newW = oldW + alpha * (real - out) * input

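A minimal sketch of the rule in Python on a linearly separable problem (an AND gate); alpha and the epoch budget are illustrative choices:

```python
import numpy as np

def perceptron(x, w):
    return 1 if np.dot(w, x) > 0 else 0

# AND gate, with the bias input x0 = 1 prepended to each example.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w = np.zeros(3)
alpha = 0.1                                # learning rate (illustrative)
for epoch in range(20):
    for x, target in zip(X, y):
        out = perceptron(x, w)
        w += alpha * (target - out) * x    # newW = oldW + alpha*(real-out)*input

print([perceptron(x, w) for x in X])       # -> [0, 0, 0, 1]: AND learned
```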
4
Q

Issue with one perceptron

A

One perceptron cannot solve the XOR problem.

XOR is not linearly separable: place the two classes (. and |) at the corners of a square and no single straight line separates them:

. |
| .

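A minimal sketch of the failure, reusing the perceptron and learning rule from the previous cards (the 100-epoch budget is arbitrary; no budget would help):

```python
import numpy as np

def perceptron(x, w):
    return 1 if np.dot(w, x) > 0 else 0

# XOR, with the bias input x0 = 1 prepended to each example.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

w = np.zeros(3)
for epoch in range(100):
    for x, target in zip(X, y):
        w += 0.1 * (target - perceptron(x, w)) * x

correct = sum(perceptron(x, w) == t for x, t in zip(X, y))
print(f"{correct}/4 correct")   # never 4/4: no line separates the XOR classes
```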
5
Q

Limitations of the perceptron (other than the XOR issue)

A

Multi-layer perceptrons can solve more complex problems,

but

there is no method to train the weights w of stacked perceptrons: the perceptron learning rule does not extend through hidden layers.

6
Q

How can you train a network with multiple layers of perceptrons?

A

Avoid the step function and use an activation function that has a computable derivative.

This is so that the error can be propagated back through the layers.

The sigmoid, for example.

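A minimal sketch of the sigmoid and its derivative in Python (the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)) is standard; the sample points are illustrative):

```python
import numpy as np

def sigmoid(x):
    """Smooth squashing function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)): nonzero everywhere,
    so errors can flow backward (the step function's derivative is 0)."""
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-2.0, 0.0, 2.0])
print(sigmoid(xs))             # [0.119 0.5   0.881]
print(sigmoid_derivative(xs))  # [0.105 0.25  0.105]
```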
7
Q

Why use the sigmoid function?

A

You can compute its derivative and use it to backpropagate the error.

8
Q

Simple linear classifier example

A

f(x, W) = Wx + b

Matrix-multiply W and x, then add the bias vector b.

Image = 32 * 32 * 3 = 3072 values
W = 10 * 3072 (weight matrix: one row per class)
x = 3072 * 1 (image flattened into a column vector)
f = 10 * 1 (one score per class)
b = 10 * 1 (bias values)

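A minimal NumPy sketch with the CIFAR-10-style shapes above (random values stand in for a real image and trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((32, 32, 3))              # 32 x 32 RGB image (stand-in)
x = image.reshape(3072, 1)                   # flatten to a 3072 x 1 column

W = rng.standard_normal((10, 3072)) * 0.01   # 10 x 3072 weight matrix
b = np.zeros((10, 1))                        # 10 x 1 bias vector

f = W @ x + b                                # 10 x 1: one score per class
print(f.shape)                               # (10, 1)
```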
9
Q

Loss function

A

A measure of whether the system is doing well or not.

For example, the hinge loss:
Li = sum over j != yi of max(0, sj - s_yi + 1)

The +1 is a buffer: the true class's score must beat every other class's score by at least 1 before that term of the loss drops to 0.

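A minimal sketch of this loss for one example (the scores and true-class index are illustrative):

```python
import numpy as np

def hinge_loss(scores, y_i):
    """Multiclass SVM loss: sum over j != y_i of max(0, s_j - s_y_i + 1)."""
    margins = np.maximum(0, scores - scores[y_i] + 1)
    margins[y_i] = 0                  # the true class contributes no loss
    return margins.sum()

scores = np.array([3.2, 5.1, -1.7])  # class scores (illustrative)
print(hinge_loss(scores, y_i=0))     # max(0, 5.1-3.2+1) + max(0, -1.7-3.2+1) = 2.9
```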
10
Q

Limitation of hinge loss

A

The scores have no absolute meaning; they are only useful for comparing classes against each other. So use probabilities instead.

11
Q

Softmax

A

s = f(xi, W)

Exponentiate the score for each class, then divide by the sum of the exponentiated scores over all classes:

P(k) = e^(s_k) / sum(j) e^(s_j)

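A minimal sketch in Python (subtracting the maximum score first is a standard numerical-stability trick and does not change the result):

```python
import numpy as np

def softmax(scores):
    """Turn raw class scores into probabilities: e^(s_k) / sum(j) e^(s_j)."""
    exps = np.exp(scores - scores.max())   # shift for stability; same result
    return exps / exps.sum()

scores = np.array([3.2, 5.1, -1.7])        # illustrative class scores
probs = softmax(scores)
print(probs, probs.sum())                  # all positive, and they sum to 1
```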
12
Q

Softmax: why e^x?

A

Exponentiation makes all the values positive,

and after dividing by their sum the values add up to 1, so they can be interpreted as probabilities.

13
Q

Softmax loss is computed as

A

Li = -log(P(Y = yi | X = xi))

The loss is zero when the predicted probability of the true class is 1.

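A minimal sketch combining the softmax above with this loss (the scores and true class index are illustrative):

```python
import numpy as np

def softmax(scores):
    exps = np.exp(scores - scores.max())
    return exps / exps.sum()

scores = np.array([3.2, 5.1, -1.7])    # illustrative class scores
y_i = 0                                # index of the true class

L_i = -np.log(softmax(scores)[y_i])    # Li = -log P(Y = y_i | X = x_i)
print(L_i)                             # ~2.04; would be 0 if P were 1
```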
14
Q

Cross Entropy Loss

A

L = -(1/N) * sum(i = 1 to N) sum(k = 1 to K) y_ik * log(Yhat_ik)

where N is the number of observations,
y_ik is a binary indicator of whether observation i belongs to class k,
and Yhat_ik is the predicted probability that observation i belongs to class k.

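A minimal sketch over a small batch (the one-hot labels and predicted probabilities are illustrative):

```python
import numpy as np

def cross_entropy(Y_true, Y_pred):
    """L = -(1/N) * sum(i) sum(k) y_ik * log(Yhat_ik)."""
    N = Y_true.shape[0]
    return -(Y_true * np.log(Y_pred)).sum() / N

# N = 3 observations, K = 2 classes; each row of Y_pred sums to 1.
Y_true = np.array([[1, 0], [0, 1], [1, 0]])              # one-hot labels y_ik
Y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])  # predicted probs Yhat_ik
print(cross_entropy(Y_true, Y_pred))   # -(ln 0.9 + ln 0.8 + ln 0.6)/3 ≈ 0.28
```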