3 - Neural Nets for Modern AI Flashcards

1
Q

Action Potential

A

A rapid spike in voltage across a neuron's membrane.

Biological brains signal this way, but current artificial neural networks do not model it.

2
Q

Perceptron

A

Inputs are combined with weights and passed through a step function to produce an output of 1 or 0.

x0 is an extra input called the bias and is fixed to 1.

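A minimal sketch of a perceptron in Python (NumPy assumed; the example inputs and weights are illustrative, not learned):

```python
import numpy as np

def perceptron(x, w):
    """Step-function perceptron: output 1 if w . x > 0, else 0.
    x and w include the bias term: x[0] is fixed to 1, w[0] is its weight."""
    return 1 if np.dot(w, x) > 0 else 0

# Illustrative values: two real inputs plus the fixed bias input x0 = 1.
x = np.array([1.0, 0.5, -0.2])   # [x0 (bias), x1, x2]
w = np.array([-0.1, 0.8, 0.3])   # [bias weight, w1, w2]
print(perceptron(x, w))          # -> 1, since w . x = 0.24 > 0
```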
3
Q

Perceptron Learning rule

A

Compute the network's output, compare it with the true value, and adjust the weights.

The new weight is the old weight plus the learning rate multiplied by the error (true value minus network output) multiplied by the input:

newW = oldW + alpha * (real - out) * input

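A minimal sketch of the rule in Python on a linearly separable problem (an AND gate); alpha and the epoch budget are illustrative choices:

```python
import numpy as np

def perceptron(x, w):
    return 1 if np.dot(w, x) > 0 else 0

# AND gate, with the bias input x0 = 1 prepended to each example.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w = np.zeros(3)
alpha = 0.1                                # learning rate (illustrative)
for epoch in range(20):
    for x, target in zip(X, y):
        out = perceptron(x, w)
        w += alpha * (target - out) * x    # newW = oldW + alpha*(real-out)*input

print([perceptron(x, w) for x in X])       # -> [0, 0, 0, 1]: AND learned
```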
4
Q

Issue with one perceptron

A

One perceptron cannot solve the XOR problem.

XOR is not linearly separable: place the two classes (. and |) at the corners of a square and no single straight line separates them:

. |
| .

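A minimal sketch of the failure, reusing the perceptron and learning rule from the previous cards (the 100-epoch budget is arbitrary; no budget would help):

```python
import numpy as np

def perceptron(x, w):
    return 1 if np.dot(w, x) > 0 else 0

# XOR, with the bias input x0 = 1 prepended to each example.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

w = np.zeros(3)
for epoch in range(100):
    for x, target in zip(X, y):
        w += 0.1 * (target - perceptron(x, w)) * x

correct = sum(perceptron(x, w) == t for x, t in zip(X, y))
print(f"{correct}/4 correct")   # never 4/4: no line separates the XOR classes
```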
5
Q

Limitations of the perceptron (other than the XOR issue)

A

Multi-layer perceptrons can solve more complex problems,

but

there is no method to train the weights w of stacked perceptrons: the perceptron learning rule does not extend through hidden layers.

6
Q

How can you train a network with multiple layers of perceptrons?

A

Avoid the step function and use an activation function that has a computable derivative.

This is so that the error can be propagated back through the layers.

The sigmoid, for example.

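A minimal sketch of the sigmoid and its derivative in Python (the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)) is standard; the sample points are illustrative):

```python
import numpy as np

def sigmoid(x):
    """Smooth squashing function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)): nonzero everywhere,
    so errors can flow backward (the step function's derivative is 0)."""
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-2.0, 0.0, 2.0])
print(sigmoid(xs))             # [0.119 0.5   0.881]
print(sigmoid_derivative(xs))  # [0.105 0.25  0.105]
```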
7
Q

Why use the sigmoid function?

A

You can compute its derivative and use it to backpropagate the error.

8
Q

Simple linear classifier example

A

f(x, W) = Wx + b

Matrix-multiply W and x, then add the bias vector b.

Image = 32 * 32 * 3 = 3072 values
W = 10 * 3072 (weight matrix: one row per class)
x = 3072 * 1 (image flattened into a column vector)
f = 10 * 1 (one score per class)
b = 10 * 1 (bias values)

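A minimal NumPy sketch with the CIFAR-10-style shapes above (random values stand in for a real image and trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((32, 32, 3))              # 32 x 32 RGB image (stand-in)
x = image.reshape(3072, 1)                   # flatten to a 3072 x 1 column

W = rng.standard_normal((10, 3072)) * 0.01   # 10 x 3072 weight matrix
b = np.zeros((10, 1))                        # 10 x 1 bias vector

f = W @ x + b                                # 10 x 1: one score per class
print(f.shape)                               # (10, 1)
```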
9
Q

Loss function

A

A measure of whether the system is doing well or not.

For example, the hinge loss:
Li = sum over j != yi of max(0, sj - s_yi + 1)

The +1 is a buffer: the true class's score must beat every other class's score by at least 1 before that term of the loss drops to 0.

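A minimal sketch of this loss for one example (the scores and true-class index are illustrative):

```python
import numpy as np

def hinge_loss(scores, y_i):
    """Multiclass SVM loss: sum over j != y_i of max(0, s_j - s_y_i + 1)."""
    margins = np.maximum(0, scores - scores[y_i] + 1)
    margins[y_i] = 0                  # the true class contributes no loss
    return margins.sum()

scores = np.array([3.2, 5.1, -1.7])  # class scores (illustrative)
print(hinge_loss(scores, y_i=0))     # max(0, 5.1-3.2+1) + max(0, -1.7-3.2+1) = 2.9
```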
10
Q

Limitation of hinge loss

A

The scores have no absolute meaning; they are only useful for comparing classes against each other. So use probabilities instead.

11
Q

Softmax

A

s = f(xi, W)

Exponentiate the score for each class, then divide by the sum of the exponentiated scores over all classes:

P(k) = e^(s_k) / sum(j) e^(s_j)

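A minimal sketch in Python (subtracting the maximum score first is a standard numerical-stability trick and does not change the result):

```python
import numpy as np

def softmax(scores):
    """Turn raw class scores into probabilities: e^(s_k) / sum(j) e^(s_j)."""
    exps = np.exp(scores - scores.max())   # shift for stability; same result
    return exps / exps.sum()

scores = np.array([3.2, 5.1, -1.7])        # illustrative class scores
probs = softmax(scores)
print(probs, probs.sum())                  # all positive, and they sum to 1
```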
12
Q

Softmax: why e^x?

A

Exponentiation makes all the values positive,

and after dividing by their sum the values add up to 1, so they can be interpreted as probabilities.

13
Q

Softmax loss is computed as

A

Li = -log(P(Y = yi | X = xi))

The loss is zero when the predicted probability of the true class is 1.

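A minimal sketch combining the softmax above with this loss (the scores and true class index are illustrative):

```python
import numpy as np

def softmax(scores):
    exps = np.exp(scores - scores.max())
    return exps / exps.sum()

scores = np.array([3.2, 5.1, -1.7])    # illustrative class scores
y_i = 0                                # index of the true class

L_i = -np.log(softmax(scores)[y_i])    # Li = -log P(Y = y_i | X = x_i)
print(L_i)                             # ~2.04; would be 0 if P were 1
```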
14
Q

Cross Entropy Loss

A

L = -(1/N) * sum(i = 1 to N) sum(k = 1 to K) y_ik * log(Yhat_ik)

where N is the number of observations,
y_ik is a binary indicator of whether observation i belongs to class k,
and Yhat_ik is the predicted probability that observation i belongs to class k.

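A minimal sketch over a small batch (the one-hot labels and predicted probabilities are illustrative):

```python
import numpy as np

def cross_entropy(Y_true, Y_pred):
    """L = -(1/N) * sum(i) sum(k) y_ik * log(Yhat_ik)."""
    N = Y_true.shape[0]
    return -(Y_true * np.log(Y_pred)).sum() / N

# N = 3 observations, K = 2 classes; each row of Y_pred sums to 1.
Y_true = np.array([[1, 0], [0, 1], [1, 0]])              # one-hot labels y_ik
Y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])  # predicted probs Yhat_ik
print(cross_entropy(Y_true, Y_pred))   # -(ln 0.9 + ln 0.8 + ln 0.6)/3 ≈ 0.28
```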