lecture 5: backpropagation Flashcards

1
Q

single-layer perceptrons

A
  • are limited to linearly separable problems
  • we need to add layers to make them universal function approximators
2
Q

how to find the weights of a multilayer perceptron

A

with backpropagation

3
Q

What is the main purpose of backpropagation?

A

Backpropagation calculates and propagates errors backward through the network to adjust weights, enabling the network to learn by minimizing error.

4
Q

backpropagation steps

A
  1. forward sweep
  2. compare predicted output to true output
  3. compute the error term
  4. update the weights between the hidden and output layer
  5. propagate the error back and update the weights of the deeper layers
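A minimal numpy sketch of these five steps for a one-hidden-layer network (the sigmoid activation, layer sizes and variable names are illustrative assumptions; the update rules follow the formulas on the later cards):

```python
import numpy as np

def g(a):            # sigmoid activation (assumed)
    return 1.0 / (1.0 + np.exp(-a))

def g_prime(a):      # derivative of the sigmoid
    s = g(a)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
x = rng.normal(size=3)                # input
t = np.array([1.0])                   # target output
W_hx = rng.normal(size=(4, 3)) * 0.1  # input -> hidden weights
W_yh = rng.normal(size=(1, 4)) * 0.1  # hidden -> output weights
eps = 0.1                             # learning rate

# 1. forward sweep
a_h = W_hx @ x
h = g(a_h)
a_y = W_yh @ h
y = g(a_y)

# 2./3. compare predicted output to the target and compute the error term
delta_y = g_prime(a_y) * (t - y)

# 4./5. backpropagate the error and update both weight matrices
delta_h = g_prime(a_h) * (W_yh.T @ delta_y)   # uses the pre-update weights
W_yh += eps * np.outer(delta_y, h)            # hidden -> output update
W_hx += eps * np.outer(delta_h, x)            # input -> hidden update
```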
5
Q

What is forward propagation in a neural network?

A
  • passing input data 𝑥 through the network to compute the output 𝑦 via the intermediate hidden layer activations ℎ.
6
Q

How is information flow represented mathematically in forward propagation?

A
  • the flow is x→h→y
  • Hidden layer activations: h=W^{hx}x
  • Output: y=W^{yh}h
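A minimal numpy illustration of the flow x → h → y (sizes and values are made up; in practice a nonlinearity g is applied to the hidden pre-activations, as the g′ terms in the later error formulas assume):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])          # input vector
W_hx = np.ones((4, 3)) * 0.1            # input -> hidden weights (illustrative)
W_yh = np.ones((2, 4)) * 0.2            # hidden -> output weights (illustrative)

h = W_hx @ x                            # hidden layer activations: h = W^{hx} x
y = W_yh @ h                            # output:                   y = W^{yh} h
print(h, y)
```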
7
Q

What is the first step in backpropagation after the forward sweep?

A

Compare the predicted output y to the target t to calculate the error in the output layer.

8
Q

How is the error δ for the output layer calculated?

A
  • δ_j = g′(a_j)⋅(t_j − y_j)
  • output error = derivative of the activation * difference between target and predicted output
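A tiny sketch, assuming a sigmoid activation (the card only requires some differentiable g):

```python
import numpy as np

def g(a):                     # sigmoid (assumed activation)
    return 1.0 / (1.0 + np.exp(-a))

a_j = 0.8                     # pre-activation of output unit j (illustrative)
t_j = 1.0                     # target
y_j = g(a_j)                  # predicted output
delta_j = y_j * (1.0 - y_j) * (t_j - y_j)   # g'(a_j) * (t_j - y_j), since g' = g*(1-g) for the sigmoid
print(delta_j)
```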
9
Q

How are weights connected to the output layer updated in backpropagation?

A
  • Δw_jk = ϵ⋅δ_j⋅h_k
  • weight change = learning rate * error term * input to the weight (the sign is positive here because δ_j is defined with t_j − y_j)
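The same rule in plain Python, with illustrative numbers (ϵ is the learning rate):

```python
eps = 0.1            # learning rate
delta_j = 0.05       # output error term from the previous card (illustrative)
h_k = 0.7            # activation of hidden unit k feeding into weight w_jk
w_jk = 0.3           # current weight (illustrative)

w_jk += eps * delta_j * h_k   # delta_w = learning rate * error term * input
```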
10
Q

How does backpropagation work for the hidden layers?

A
  • propagates the error from the output layer back to the hidden layers
  • δ_i = g′(a_i)⋅Σ_j w_ji⋅δ_j
  • i.e., the hidden unit's activation derivative times the weighted sum of the error terms of the units it feeds into
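A sketch for one hidden unit i that feeds several output units j (illustrative numbers):

```python
import numpy as np

g_prime_a_i = 0.2                        # g'(a_i) for hidden unit i (illustrative)
w_ji = np.array([0.4, -0.1, 0.3])        # weights from hidden unit i to output units j
delta_j = np.array([0.05, 0.02, -0.01])  # output error terms

# hidden error = local derivative * weighted sum of downstream errors
delta_i = g_prime_a_i * np.sum(w_ji * delta_j)
print(delta_i)
```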
11
Q

What are the two key steps in backpropagation?

A
  1. Compute the error term δ for each layer.
  2. Update the weights using the error term and the learning rate.
12
Q

neural network architectures

A
  1. recurrent neural network (RNN)
  2. convolutional neural network (CNN)
13
Q

RNNs

A
  • time as a factor: memory, sequence analysis, temporal predictions, language
  • feed their outputs back into themselves, time-step by time-step
14
Q

RNNs: unrolling

A

expanding the RNN over time steps, treating each time step as a layer in the network for backpropagation through time (BPTT)
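A minimal sketch of the unrolled forward pass: the same weights are reapplied at every time step, so each iteration of the loop plays the role of one layer (the tanh nonlinearity and all sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_hid = 5, 3, 8
xs = rng.normal(size=(T, n_in))             # input sequence x_1 ... x_T
W_xh = rng.normal(size=(n_hid, n_in)) * 0.1
W_hh = rng.normal(size=(n_hid, n_hid)) * 0.1

h = np.zeros(n_hid)                         # initial hidden state
for t in range(T):                          # one "layer" per time step
    h = np.tanh(W_xh @ xs[t] + W_hh @ h)
```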

15
Q

RNNs: feedback loop

A

lets RNNs maintain a memory of past inputs and integrate information over time

16
Q

RNNs: learning

A
  • backpropagation through time (BPTT)
  • backpropagation is extended over time to compute gradients for all time steps in the sequence.
  • this allows RNNs to learn how earlier time steps influence later ones.
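A compact BPTT sketch for a tanh RNN with a single linear readout of the last hidden state and a squared-error loss (all of these choices are assumptions made to keep the example short); the backward loop accumulates gradients over every time step:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_in, n_hid = 5, 3, 8
xs = rng.normal(size=(T, n_in))
target = 1.0
W_xh = rng.normal(size=(n_hid, n_in)) * 0.1
W_hh = rng.normal(size=(n_hid, n_hid)) * 0.1
w_out = rng.normal(size=n_hid) * 0.1

# forward sweep, storing every hidden state for the backward pass
hs = [np.zeros(n_hid)]
for t in range(T):
    hs.append(np.tanh(W_xh @ xs[t] + W_hh @ hs[-1]))
y = w_out @ hs[-1]

# backward sweep: gradients are accumulated over all time steps
dW_xh = np.zeros_like(W_xh)
dW_hh = np.zeros_like(W_hh)
dw_out = (y - target) * hs[-1]          # d(0.5*(y - target)^2)/dw_out
dh = (y - target) * w_out               # error arriving at the last hidden state
for t in reversed(range(T)):
    da = dh * (1.0 - hs[t + 1] ** 2)    # through tanh: 1 - tanh^2
    dW_xh += np.outer(da, xs[t])
    dW_hh += np.outer(da, hs[t])
    dh = W_hh.T @ da                    # pass the error one step further back in time

eps = 0.01                              # gradient-descent update
W_xh -= eps * dW_xh
W_hh -= eps * dW_hh
w_out -= eps * dw_out
```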
17
Q

RNNs: pros

A
  • can learn all sorts of sequences, even dependencies on remote events in a sequence
  • “remembering” information from earlier in the sequence allows them to learn long-term patterns
  • dynamic, semantic information processing allows for speech recognition and language modeling
18
Q

RNNs: cons

A
  1. Recurrence issues: feedback loops can lead to numerical instability and training difficulties.
  2. Scaling Challenges: Long sequences create many time steps, leading to many layers, a large number of parameters, and vanishing gradients
19
Q

How is the problem of many time points solved in RNNs?

A

LSTM (Long Short-Term Memory) RNNs

20
Q

general problems with making networks bigger

A
  1. larger number of parameters
  2. vanishing gradients
21
Q

problems with a large number of parameters

A
  1. training takes a long time
  2. many local minima
  3. very sensitive to biases in the training set
22
Q

what are vanishing gradients?

A
  • the sigmoid squashes values into [0,1], and its derivative is at most 0.25
  • the error signal is multiplied by these small derivatives at every layer, so it gets diluted, becoming smaller and smaller
  • lower layers therefore get only tiny weight changes, so they learn very slowly
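A quick numeric illustration of the dilution: the backpropagated signal is multiplied by one sigmoid derivative (at most 0.25) per layer, so it shrinks geometrically:

```python
import numpy as np

def sigmoid_prime(a):
    s = 1.0 / (1.0 + np.exp(-a))
    return s * (1.0 - s)

signal = 1.0
for layer in range(10):                  # 10 layers deep (illustrative)
    signal *= sigmoid_prime(0.0)         # 0.25 is the *best* case (a = 0)
print(signal)                            # 0.25**10 ~= 9.5e-07
```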
23
Q

convolutional neural networks

A
  • convolution makes it possible to add more layers without the number of parameters exploding
  • great for classification tasks: image/sound recognition
24
Q

training CNNs

A

standard backpropagation

25
Q

convolution

A
  • core operation of CNNs
  • mathematical operation where a kernel slides across an input image (or feature map) to produce an output feature map
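A naive numpy sketch of the sliding-kernel operation (strictly a cross-correlation, which is what CNN libraries actually compute; the 'valid' output size and example kernel are assumptions):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image` and return the output feature map."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image = np.random.rand(6, 6)
edge_kernel = np.array([[1.0, 0.0, -1.0]] * 3)   # crude vertical-edge detector
feature_map = conv2d(image, edge_kernel)          # shape (4, 4)
```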
26
Q

Why do CNNs have fewer weights to train compared to fully connected networks?

A
  • Each neuron in a layer is connected only to a small local patch of the input image, rather than the entire image.
  • The same kernel is reused across the entire image, reducing the number of parameters.
  • i.e., only local connections, leading to fewer weights to train
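A back-of-the-envelope comparison (the 28x28 image and 3x3 kernel sizes are illustrative assumptions):

```python
n_pixels = 28 * 28                       # e.g. a 28x28 grey-scale image
fully_connected = n_pixels * n_pixels    # every pixel to every unit: 614,656 weights
convolutional = 3 * 3                    # one shared 3x3 kernel: 9 weights
print(fully_connected, convolutional)
```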
27
Q

How do CNNs build feature representations?

A
  • from layer to layer
  • convolution is applied with increasingly ‘complex’ kernels
  • e.g., edges or colors → shapes or textures → complex features, such as object parts
28
Q

How are CNNs similar to the brain’s visual system?

A

Both CNNs and the visual system use hierarchical, localized processing (similar to neuronal receptive fields) to build an understanding of the input.

29
Q

What activation function for CNNs

A
  • ReLU (Rectified Linear Unit)
  • has a constant, non-vanishing gradient when active and zero output (and gradient) for negative input, making training faster.
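The function and its gradient in a few lines of numpy:

```python
import numpy as np

def relu(a):
    return np.maximum(0.0, a)        # 0 for negative input, identity otherwise

def relu_prime(a):
    return (a > 0).astype(float)     # gradient is exactly 1 whenever the unit is active

print(relu(np.array([-2.0, 3.0])), relu_prime(np.array([-2.0, 3.0])))
```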
30
Q

max-pooling

A

takes the maximum value within a small neighborhood (e.g., 2x2) in the feature map to reduce dimensions while retaining important information.
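A minimal 2x2, stride-2 max-pooling sketch in numpy (even input dimensions assumed):

```python
import numpy as np

def max_pool_2x2(fmap):
    """Keep the maximum of every non-overlapping 2x2 block."""
    H, W = fmap.shape
    return fmap[:H - H % 2, :W - W % 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(fmap))    # 4x4 feature map -> 2x2
```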

31
Q

SoftMax

A
  • Converts raw scores into probabilities, ensuring that:
    1. all outputs are positive
    2. they sum to 1
  • this makes the output interpretable as a probability distribution
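A numerically stable numpy sketch:

```python
import numpy as np

def softmax(scores):
    z = scores - np.max(scores)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())                        # all positive, sums to 1
```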
32
Q

layer stacking

A

convolutional layers with max pooling stacked on top of one another, followed by fully connected (dense) layers

33
Q

describe the inner structure of a convnet

A
  • after each convolutional block, a max-pooling layer reduces the size of the feature map while retaining the most important features.
  • after all convolutional and pooling layers, the flattened feature maps are passed into fully connected layers.
  • the final dense layer has as many neurons as there are classes and outputs a probability for each class using softmax.
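A sketch of this stacking in Keras (the 28x28 grey-scale input, 10 classes and layer sizes are illustrative assumptions, not the lecture's exact model):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),              # e.g. grey-scale images
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),                   # shrink the feature maps
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),                              # flatten for the dense part
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),        # one output per class
])
```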
34
Q

what does training these convnets require

A
  1. many parameters
  2. large training sets
  3. fast computers
35
Q

result of convnets

A

better-than-human performance on visual recognition tasks

36
Q

What are the benefits of using ReLU over sigmoid?

A
  • avoids the vanishing gradient problem of sigmoid
  • has a simple computation
  • allows for sparse activation, speeding up learning.
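A tiny best-case comparison of the gradient signal surviving 10 layers:

```python
sigmoid_prime_max = 0.25         # largest possible sigmoid derivative (at a = 0)
relu_prime_active = 1.0          # ReLU derivative whenever the unit is active

print(sigmoid_prime_max ** 10)   # ~1e-6: the signal has all but vanished
print(relu_prime_active ** 10)   # 1.0: the signal passes through undiminished
```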
37
Q

Why can’t backpropagation alone lead to Artificial General Intelligence (AGI)?

A
  1. Backpropagation is not biologically plausible.
  2. Most human learning is unsupervised, while backpropagation focuses on supervised learning.
38
Q

What does Yann LeCun say about supervised learning?

A

He describes supervised learning as “the cherry on the cake of neural networks” because most learning, including human learning, is unsupervised.

39
Q

What is Geoffrey Hinton’s argument regarding the brain’s capacity?

A
  • The human brain has about 10^14 synapses but only about 10^9 seconds in a lifetime.
  • There are far more parameters than data, so most learning must be unsupervised.
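  • worked out: 10^14 synapses ÷ 10^9 seconds = 10^5 parameters per second of life, far more than labeled examples alone could constrain.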
40
Q

Why is backpropagation considered biologically implausible?

A
  1. It requires global knowledge of all gradients and weights to compute local error contributions, which is impossible in the brain.
  2. Backpropagation through time (BPTT) needs error signals to “trickle down” to lower layers, which is impractical as the original input is long gone.
41
Q

What is an example of backpropagation’s limitations in achieving AGI?

A

While backpropagation enables Convolutional Neural Networks (CNNs) to perform specific tasks like image recognition, it cannot replicate the flexible learning and reasoning of the human brain, which involves unsupervised and adaptive learning.

42
Q

Geoffrey Hinton

A

‘The brain has about 10^14 synapses and we only live for about 10^9 seconds. So we have a lot more parameters than data. We must do a lot of unsupervised learning.’