Chapter 8 - Neural Networks Flashcards

1
Q

what type of neural network demonstrates above-human-level performance in chess and Go?

A

convolutional neural networks

2
Q

what is AlexNet?

A

a convolutional neural network that outperformed other models in the 2012 ImageNet challenge

3
Q

what is a spiking neural network?

A

a network that aims to mimic biological neurons more closely, communicating through discrete spikes rather than continuous activations

4
Q

give a neuron mathematically in sum notation, y(x, w) = ?

A

f(Σ_i w_i x_i + b)

5
Q

give a neuron mathematically in matrix notation

A

f(W·X), with the bias b absorbed into W by appending a constant 1 to the input X

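both notations can be sketched in a few lines of Python (a minimal illustration; the names `neuron` and `neuron_matrix` and the threshold activation are my own choices, not from the cards):

```python
import numpy as np

def neuron(x, w, b, f):
    """Sum notation: y = f(sum_i w_i * x_i + b)."""
    return f(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)

def neuron_matrix(X, W, f):
    """Matrix notation: y = f(W . X); the bias is absorbed into W
    by appending a constant 1 to the input X."""
    return f(W @ X)

threshold = lambda z: 1.0 if z >= 0 else 0.0  # a simple activation

x, w, b = [0.5, -1.0], [2.0, 1.0], 0.25
X, W = np.array(x + [1.0]), np.array(w + [b])
print(neuron(x, w, b, threshold))      # 1.0: z = 2*0.5 + 1*(-1) + 0.25 = 0.25 >= 0
print(neuron_matrix(X, W, threshold))  # 1.0: the identical computation in matrix form
```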
6
Q

give three different activation functions?

A

threshold, sigmoid, softmax

7
Q

what is the activation function used for logistic regression?

A

sigmoid

8
Q

give the sigmoid activation function y(X,W) = ?

A

1 / (1 + e^-z), where z = W·X

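the formula transcribes directly into Python (a minimal sketch; the function name is illustrative):

```python
import math

def sigmoid(z):
    """y = 1 / (1 + e^-z): squashes any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # 0.5
print(sigmoid(4))    # ~0.982, saturating towards 1
print(sigmoid(-4))   # ~0.018, saturating towards 0
```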
9
Q

when is the softmax activation function used?

A

when we have multiple, mutually exclusive classes

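a minimal softmax sketch (the max-subtraction is a standard numerical-stability trick, not part of the card):

```python
import math

def softmax(z):
    """Exponentiate and normalise so the outputs are positive and sum
    to 1 - a probability distribution over mutually exclusive classes."""
    exps = [math.exp(v - max(z)) for v in z]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # roughly [0.66, 0.24, 0.10]
print(sum(probs))  # 1.0
```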
10
Q

softmax is an extension of…?

A

the logistic function, generalised from two to multiple classes

11
Q

give the equation for gradient descent, w_new = ?

A

w_old - λ (dL/dw), where λ is the learning rate

12
Q

give the equation for squared error loss, L = ?

A

0.5(y-t)^2

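the update rule and the squared error loss combine into one worked step (a sketch: the linear neuron y = w*x and the chosen values are illustrative, not from the cards):

```python
def gd_step(w_old, x, t, lam=0.1):
    """One gradient-descent step on L = 0.5*(y - t)^2 for a linear
    neuron y = w*x: dL/dw = (y - t)*x, so w_new = w_old - lam*(dL/dw)."""
    y = w_old * x
    dL_dw = (y - t) * x
    return w_old - lam * dL_dw

w = 0.0
for _ in range(50):
    w = gd_step(w, x=1.0, t=2.0)  # target output 2 for input 1
print(round(w, 3))  # 1.99, converging towards the loss-minimising weight w = 2
```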
13
Q

what is backpropagation?

A

the application of the chain rule to compute the gradient of the loss with respect to every weight in the network, working backwards from the output layer

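the chain rule can be traced by hand for a single path through a tiny network (a sketch; the two-weight network and all values are made up for illustration):

```python
import math

# For the path x -> z = w1*x -> h = sigmoid(z) -> y = w2*h,
# with L = 0.5*(y - t)^2, the chain rule gives
# dL/dw1 = dL/dy * dy/dh * dh/dz * dz/dw1.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, t = 1.0, 1.0
w1, w2 = 0.5, -0.3

# forward pass
z = w1 * x
h = sigmoid(z)
y = w2 * h

# backward pass: the chain rule, factor by factor
dL_dy = y - t         # from L = 0.5*(y - t)^2
dy_dh = w2
dh_dz = h * (1 - h)   # derivative of the sigmoid
dz_dw1 = x
dL_dw1 = dL_dy * dy_dh * dh_dz * dz_dw1
print(dL_dw1)  # ~0.0837
```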
14
Q

what are the two stopping criteria we use for neural networks?

A

a maximum number of epochs

early stopping (e.g. halt when the validation error stops improving)

15
Q

what does the learning rate determine?

A

how large an adjustment we make to each weight at each iteration

16
Q

what neural network structure should be sufficient to approximate any function?

A

a multilayer perceptron with one hidden layer (the universal approximation theorem)

17
Q

what is the advantage of adding more layers to a model, rather than more neurons?

A

increases flexibility with fewer free parameters

18
Q

what are the three approaches to establishing neural network architecture?

A

experimentation, heuristics, pre-trained models (transfer learning)

19
Q

when we add too many layers/neurons to a model, we risk…?

A

overfitting

20
Q

what is bias error?

A

error due to an erroneous assumption in the model

21
Q

what is variance error?

A

error due to the algorithm fitting to noise in the training data

22
Q

what kind of error decreases as we make a model more complex?

A

bias

23
Q

describe the idea behind a dropout scheme

A

begin with an overly complex model

during training, the output of any individual neuron is ignored with probability p
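a minimal sketch of that idea (the 1/(1-p) rescaling of survivors, "inverted dropout", is a common implementation detail not mentioned on the card):

```python
import random

def dropout(activations, p, training=True):
    """During training, zero each neuron's output with probability p;
    rescale the survivors by 1/(1-p) so the expected activation is
    unchanged. At test time, pass everything through untouched."""
    if not training:
        return list(activations)
    return [0.0 if random.random() < p else a / (1 - p) for a in activations]

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5))  # survivors doubled, dropped neurons 0.0
print(dropout([1.0, 2.0], p=0.5, training=False))  # unchanged at test time
```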

24
Q

what is the traditional test-error curve for simpler models?

A

test error decreases up until the model is sufficiently complex and then increases

25
Q

what is double descent?

A

if we continue to increase the model's complexity (e.g. more hidden units or layers), the test error decreases again

26
Q

what are the two problems that deep(er) neural networks face?

A

long training times

the vanishing gradient problem

27
Q

what is the vanishing gradient problem?

A

the gradients, and hence the weight updates, in the early layers can be extremely close to zero, so those layers learn very slowly

28
Q

what are the two ways to fix the vanishing gradient problem?

A

the ReLU (rectified linear unit) activation function

skip connections: feed the output of a neuron directly into a later stage of the network

29
Q

describe the ReLU activation function

A

all negative inputs are set to 0; positive inputs pass through unchanged, giving a gradient of 1
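that description is exactly max(0, z) (a minimal sketch; the function names are illustrative):

```python
def relu(z):
    """max(0, z): zero for negative inputs, the identity for positive ones."""
    return max(0.0, z)

def relu_grad(z):
    """Gradient: 0 for negative inputs, 1 for positive inputs."""
    return 1.0 if z > 0 else 0.0

print([relu(z) for z in [-2.0, -0.5, 0.0, 0.5, 2.0]])  # [0.0, 0.0, 0.0, 0.5, 2.0]
print([relu_grad(z) for z in [-2.0, 2.0]])             # [0.0, 1.0]
```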

30
Q

what is the input to a CNN?

A

a raw 2D image

31
Q

how do we represent each layer of a CNN?

A

as a rectangle, reflecting the 2D spatial structure of its feature maps

32
Q

what is a fully connected layer?

A

each neuron in the current layer is connected to all neurons in the next layer

33
Q

what do the different layers of the CNN learn, and how is this different from a standard MLP? (hint: features)

A

the early layers learn the features used for classification, rather than these having to be hand-selected in advance (as with a standard MLP)

34
Q

give a kernel for horizontal lines

A

1 1 1
0 0 0
-1 -1 -1

35
Q

give a kernel for vertical lines

A

1 0 -1
1 0 -1
1 0 -1
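both kernels can be applied with a small "valid" convolution, stride 1, no padding (a sketch; the 4x4 test image with one horizontal edge is made up for illustration):

```python
horizontal = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]
vertical   = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

def convolve(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image and
    take the element-wise product sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

image = [[9, 9, 9, 9],   # bright region above ...
         [9, 9, 9, 9],
         [0, 0, 0, 0],   # ... a dark region: one horizontal edge
         [0, 0, 0, 0]]
print(convolve(image, horizontal))  # [[27, 27], [27, 27]]: strong response to the edge
print(convolve(image, vertical))    # [[0, 0], [0, 0]]: no vertical lines present
```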

36
Q

what is padding?

A

extend the original image in a non-informative way (e.g. with zeros) so that the output of the convolution stays the same size as the input

37
Q

what is a pooling operation?

A

select the (max/min/average) value in a local area
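a minimal max-pooling sketch with non-overlapping 2x2 windows (stride = window size is one common choice, not specified on the card):

```python
def max_pool(feature_map, size=2):
    """Keep only the maximum value in each size x size local area."""
    out = []
    for i in range(0, len(feature_map), size):
        row = []
        for j in range(0, len(feature_map[0]), size):
            row.append(max(feature_map[i + a][j + b]
                           for a in range(size) for b in range(size)))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
print(max_pool(fmap))  # [[4, 2], [2, 8]]: one max per 2x2 window
```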

38
Q

what is the effect of pooling, why do we do it?

A

it makes the model more robust to small variations in the position of features, and reduces the size of the feature map

39
Q

what is the flatten layer?

A

converts the 2d map into a 1d array of features
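the flatten step is a one-liner (a sketch; row-major order is the usual convention):

```python
def flatten(feature_map):
    """Convert a 2D map into a 1D array of features, row by row."""
    return [v for row in feature_map for v in row]

print(flatten([[1, 2], [3, 4]]))  # [1, 2, 3, 4]
```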

40
Q

what is a recurrent neural network?

A

a network that includes loops in its hidden-layer neurons, allowing it to use historical information

41
Q

on what type of data are recurrent neural networks useful?

A

time series

sequence data

42
Q

what is a generative adversarial network?

A

a network in which a generator learns to produce data that the discriminator cannot distinguish from the training set

43
Q

what are the two parts of a generative adversarial network?

A

generator and a discriminator

44
Q

ReLU vs sigmoid

A

ReLU is faster to compute and avoids the vanishing gradient problem, since its gradient does not saturate for positive inputs

45
Q

cons of neural networks (2)

A

standard (fully connected) networks do not take spatial or temporal structure into account

they are a black box: difficult to interpret

46
Q

what is a capsule neural network?

A

a network in which we have a representation of not only the image's features but also their pose