Activation Functions Flashcards
What are the advantages of linear activation functions?
Outputs are continuous and proportional to the input, rather than being limited to just 0 and 1.
What is a disadvantage of linear activation functions?
Gradient descent is ineffective because the derivative of a linear function is a constant, irrespective of the input. Also, any combination of linear functions is itself linear, so stacking linear layers collapses the network into a single linear layer (see the sketch below).
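A minimal NumPy sketch of the collapse argument (the shapes and weights are arbitrary illustrations, not from the flashcards): two linear layers applied in sequence are exactly equivalent to one linear layer whose weight matrix is the product of the two.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first "layer" weights (illustrative shapes)
W2 = rng.normal(size=(2, 4))   # second "layer" weights
x = rng.normal(size=3)         # an arbitrary input

two_linear_layers = W2 @ (W1 @ x)   # two linear layers, no activation in between
single_layer = (W2 @ W1) @ x        # one layer with the combined weight matrix
print(np.allclose(two_linear_layers, single_layer))  # True: the stack collapses
```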
What are the advantages of the sigmoid function?
1/(1 + e^-x)
- Unlike the step function, sigmoid has a smooth curve, so its output can be interpreted as the probability of belonging to a class.
- Outputs are normalised between 0 and 1.
What are the disadvantages of the sigmoid function?
1/(1 + e^-x)
- Vanishing gradient problem: for very large or very small inputs the output barely changes, so the gradient is close to zero and learning is slow.
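A small NumPy sketch to make the saturation concrete (the sample inputs are arbitrary): as x grows, the output creeps towards 1 and barely moves, which is why the gradient vanishes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for x in [2.0, 5.0, 10.0, 20.0]:
    print(x, sigmoid(x))
# 2.0   ~0.8808
# 5.0   ~0.9933
# 10.0  ~0.9999546
# 20.0  ~0.999999998  -> outputs barely change, so the gradient is nearly zero
```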
What are the advantages of the tanh function?
(e^x - e^-x ) / (e^x + e^-x )
- Also S-shaped, giving smooth, continuous outputs.
- Outputs range between -1 and 1, so they are zero-centred.
- Tanh's gradient can reach a maximum of 1, whereas sigmoid's can only reach 0.25.
What are the disadvantages of the tanh function?
(e^x - e^-x ) / (e^x + e^-x )
- Vanishing gradient
- Computationally expensive due to the exponentials
How are tanh and sigmoid related?
tanh(x) = 2 * sigmoid(2x) - 1
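A quick numerical check of the identity (NumPy assumed; the test points are arbitrary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
print(np.allclose(np.tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0))  # True
```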
what is the derivative of sigmoid?
sig(x)(1 - sig(x))
What is the derivative of tanh?
1 - tanh(x)^2
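Both derivative cards can be sanity-checked with a central finite difference; a minimal sketch (test points and step size chosen arbitrarily):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

h = 1e-6
for x in [-2.0, 0.0, 1.5]:
    sig_numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
    tanh_numeric = (np.tanh(x + h) - np.tanh(x - h)) / (2 * h)
    print(np.isclose(sig_numeric, sigmoid(x) * (1 - sigmoid(x))),   # sig(x)(1 - sig(x))
          np.isclose(tanh_numeric, 1 - np.tanh(x) ** 2))            # 1 - tanh(x)^2
# prints True True at each test point
```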
What are the advantages of the relu function?
- Simple and cheap to compute, leading to fast convergence
- No vanishing gradient problem for positive inputs, since the gradient there is a constant 1
What are the disadvantages of the relu function?
- Negative inputs always produce 0, which can cause a neuron to become permanently inactive; this is known as dying relu.
- Not a zero-centred function, and the derivative at zero does not exist.
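A minimal sketch of relu and its gradient (NumPy assumed): the gradient is 0 for every negative input, which is why a neuron whose inputs stay negative stops learning (dying relu). Returning 0 at exactly x = 0 is an implementation convention, since the true derivative is undefined there.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise; the value at exactly 0 is a convention
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))       # [0. 0. 0. 2.]
print(relu_grad(x))  # [0. 0. 0. 1.] -> no gradient flows for negative inputs
```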
What is leaky relu?
Like relu, but negative inputs are multiplied by a small slope (e.g. 0.01) instead of being set to 0.
What are the advantages of Leaky Relu?
- easy to compute
- no vanishing gradient
- no dying relu
What are the disadvantages of Leaky Relu?
Not zero-centred.
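A small sketch of leaky relu (NumPy assumed; the slope 0.01 is a common choice for the hyperparameter, not a fixed part of the definition):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # negative inputs are scaled by alpha instead of being zeroed out
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(leaky_relu(x))  # [-0.03  -0.005  0.     2.   ]
```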
What are the advantages of ELU?
- Smoother than leaky relu around x = 0 (no kink)
- no dying relu
- no vanishing gradient
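For completeness, a small sketch of ELU (NumPy assumed; alpha = 1 is the usual default): it behaves like relu for positive inputs and smoothly saturates towards -alpha for very negative inputs, avoiding the hard zero of relu.

```python
import numpy as np

def elu(x, alpha=1.0):
    # x for x > 0, alpha * (exp(x) - 1) for x <= 0
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(elu(x))  # approx [-0.95  -0.393  0.     2.   ]
```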