Quiz 3 Flashcards

Question

CNN: During backpropagation in a convolutional layer, what operation is performed to compute the gradients for the kernel?

Answer 1

Element-wise multiplication between the gradient of the loss wrt the output and the corresponding input patch, then summed.
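
As a rough illustration of the answer above, here is a minimal pure-Python sketch of the kernel gradient for a single-channel, stride-1, unpadded layer. The function name `kernel_gradient`, the 3x3 input, and the all-ones upstream gradient are made up for the example.

```python
def kernel_gradient(x, grad_out, kh, kw):
    """dL/dK[a][b] = sum over output positions (i, j) of grad_out[i][j] * x[i+a][j+b].

    Assumes one channel, stride 1, and no padding.
    """
    out_h, out_w = len(grad_out), len(grad_out[0])
    grad_k = [[0.0] * kw for _ in range(kh)]
    for a in range(kh):
        for b in range(kw):
            for i in range(out_h):
                for j in range(out_w):
                    # Multiply the upstream gradient by the input value that
                    # kernel weight (a, b) touched at this output position.
                    grad_k[a][b] += grad_out[i][j] * x[i + a][j + b]
    return grad_k


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical input
grad_out = [[1, 1], [1, 1]]             # hypothetical upstream gradient
grad_k = kernel_gradient(x, grad_out, 2, 2)
print(grad_k)
```

The result has the same shape as the kernel, since every kernel weight gets exactly one gradient value.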

Answer 2

To preserve spatial dimensions. Otherwise deep layers become smaller and smaller.

Answer 3

Valid: No padding; the window always stays within the input image.
Same: Padding is added to keep the output size equal to the input (at stride 1).

Answer 4

Reduces spatial dimensions through downsampling. Adds invariance to translation of features.

Answer 5

Property where a model is robust to certain transformations in the input. Practically, this explains how a CNN may be able to classify an object in an image regardless of where in the image it is located.

Answer 6

Property where transforming the input transforms the output in a corresponding way, so the relationship between different elements is maintained after the transformation (e.g. scaling, rotation, time shift).

Answer 7

Shared weights and bias

Answer 8

Convolution layers maintain spatial relationships between features. E.g. if an image is shifted, the feature map shifts with it. (Standard convolutions are equivariant to translation, not to rotation.)

Answer 9

Multiply each downstream gradient element by the kernel and place the result over the corresponding receptive field. Then add all the overlapping receptive fields together.

Answer 10

False. Adding more convolutional layers increases the receptive field size linearly: at stride 1, each extra layer increases the receptive field size by the kernel size minus one.
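
A small sketch of how the receptive field grows with depth, using the usual recurrence in which each layer adds (kernel size - 1) times the product of the preceding strides. The helper name `receptive_field` is hypothetical.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (along one axis) of a stack of conv layers."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1  # jump = spacing of this layer's units measured in input pixels
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf


# Stacked 3x3, stride-1 layers: the receptive field grows by 2 per layer.
sizes = [receptive_field([3] * depth) for depth in range(1, 5)]
print(sizes)
```

With stride 1 throughout, growth is linear in depth; strides greater than 1 make later layers add more input pixels per layer.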

Answer 11

Reshape weights for a node back into size of image

Answer 12

For each kernel scale values from 0-255 and visualize. Each kernel becomes a feature map.

Answer 13

Performs a non-linear mapping of high-dimensional data to 2D space. Tries to preserve pairwise distances.

Answer 14

Given an input image and a convolution kernel in the network, we can view which area of the image produced the highest activation for that kernel.

Answer 15

1. No intrinsic measure of utility. Need user studies to measure usefulness of visualization.
2. Neural networks learn "distributed representation" - 1:1 mapping of node to feature not guaranteed.

Answer 16

Updates the input in the direction of the gradient (rather than the opposite direction, as in gradient descent).
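
A minimal sketch of the update rule, using a made-up one-dimensional score f(x) = -(x - 3)^2 in place of a network's class score. The function name `gradient_ascent` and the learning rate are assumptions for illustration.

```python
def gradient_ascent(grad_fn, x, lr=0.1, steps=100):
    """Repeatedly move x in the direction of the gradient to increase the score."""
    for _ in range(steps):
        x = x + lr * grad_fn(x)  # '+' ascends; gradient descent would use '-'
    return x


# Hypothetical score f(x) = -(x - 3)^2, whose gradient is -2 * (x - 3).
x_star = gradient_ascent(lambda x: -2.0 * (x - 3.0), x=0.0)
print(x_star)  # approaches the maximizer of f
```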

Answer 17

Applies ReLU in the forward pass and, in addition to the usual ReLU backward masking, zeroes out negative gradients. Improves visualization by only keeping positive gradients.
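
The difference from an ordinary ReLU backward pass can be sketched on flat lists of values. Both function names and the sample numbers are made up for the example.

```python
def relu_backward(x, grad_out):
    """Ordinary ReLU backward: pass the gradient only where the forward input was positive."""
    return [g if xi > 0 else 0.0 for xi, g in zip(x, grad_out)]


def guided_relu_backward(x, grad_out):
    """Guided backprop: additionally zero out negative incoming gradients."""
    return [g if (xi > 0 and g > 0) else 0.0 for xi, g in zip(x, grad_out)]


x = [1.0, -2.0, 3.0, 4.0]          # hypothetical forward inputs to the ReLU
grad_out = [0.5, 0.7, -0.2, 0.9]   # hypothetical incoming gradients
plain = relu_backward(x, grad_out)
guided = guided_relu_backward(x, grad_out)
print(plain, guided)
```

The third entry is the one that differs: the forward input was positive, so ordinary backprop keeps the negative gradient, while guided backprop drops it.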

Answer 18

Visualizes area of the image with high gradients

Answer 19

See which area of the image the network focused on (using dog vs snow to classify wolf example)

Answer 20

Generates heat maps highlighting regions of an input image that contribute the most to a specific class prediction

Answer 21

Computes the gradient of the target class score with respect to the feature maps of the last convolutional layer. Averages the gradients over each channel to get per-channel weights, takes the weighted sum of the feature maps, and applies ReLU.
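
A pure-Python sketch of the combination step, assuming the feature maps and their gradients have already been obtained from a network. The function name `grad_cam` and the tiny 2-channel inputs are hypothetical.

```python
def grad_cam(feature_maps, grads):
    """feature_maps, grads: lists of C channels, each an H x W nested list."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    # One weight per channel: the spatial average of that channel's gradient.
    weights = [sum(sum(row) for row in g) / (h * w) for g in grads]
    cam = [[0.0] * w for _ in range(h)]
    for k, fmap in enumerate(feature_maps):
        for i in range(h):
            for j in range(w):
                cam[i][j] += weights[k] * fmap[i][j]
    # ReLU keeps only locations that push the class score up.
    return [[max(0.0, v) for v in row] for row in cam]


feature_maps = [[[1, 0], [0, 1]], [[0, 2], [2, 0]]]   # made-up activations
grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]      # made-up gradients
heatmap = grad_cam(feature_maps, grads)
print(heatmap)
```

In practice the resulting map is upsampled to the input image size before being overlaid as a heat map.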

Answer 22

Guided Grad-CAM multiplies guided backprop and Grad-CAM.

Answer 23

Class visualization

Answer 24

Attacker has complete picture of the target model (network, params, data)

Answer 25

Attacker has limited or no picture of the target model. Generally uses trial-and-error attempts to craft adversarial examples.

Answer 26

CNNs tend to be more biased towards texture than shape. Remediating this bias improves accuracy and robustness.

Answer 27

Style-loss function - minimize the squared difference between the Gram matrices of the style image and the generated image.
Content-loss function - match the features of the content image and the generated image.

Answer 28

Square matrix that represents relationships between vectors: each entry is the inner product of a pair of vectors.
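
A minimal sketch of a Gram matrix and a squared-difference loss built on it, treating each feature channel as a flattened vector. The names `gram_matrix` and `style_loss` and the sample features are assumptions, and normalization constants used in practice are omitted.

```python
def gram_matrix(features):
    """features: list of C flattened channel vectors. Returns the C x C matrix of inner products."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]


def style_loss(gram_generated, gram_style):
    """Sum of squared differences between two Gram matrices (no normalization)."""
    return sum(
        (g - s) ** 2
        for row_g, row_s in zip(gram_generated, gram_style)
        for g, s in zip(row_g, row_s)
    )


style_features = [[1, 2], [3, 4]]       # made-up features of the style image
generated_features = [[1, 2], [3, 5]]   # made-up features of the generated image
g_style = gram_matrix(style_features)
g_gen = gram_matrix(generated_features)
print(g_style, g_gen, style_loss(g_gen, g_style))
```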

Answer 29

Predicts classes for each pixel.

Answer 30

The decoder is symmetrical to the encoder. It takes small feature maps and upsamples them back to the original image size.

Answer 31

Puts the max output value back into its original position in the receptive field when decoding. Non-max pixels are left as zero.
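
A sketch of 2x2 max pooling that remembers where each maximum came from, followed by the matching unpooling step. The function names and the 4x4 input are made up, and the input is assumed to have even height and width.

```python
def max_pool_2x2(x):
    """Return the pooled values and the (row, col) position of each maximum."""
    pooled, indices = [], []
    for i in range(0, len(x), 2):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), 2):
            cells = [(x[i + di][j + dj], (i + di, j + dj))
                     for di in range(2) for dj in range(2)]
            value, position = max(cells)
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool_2x2(pooled, indices, h, w):
    """Place each pooled value back at its remembered position; everything else stays 0."""
    out = [[0] * w for _ in range(h)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


x = [[1, 2, 0, 1],
     [3, 4, 5, 0],
     [0, 1, 2, 3],
     [9, 0, 1, 4]]
pooled, indices = max_pool_2x2(x)
restored = max_unpool_2x2(pooled, indices, 4, 4)
print(pooled, restored)
```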

Answer 32

Each pixel in the input is multiplied by all of the kernel's values, and the scaled kernel is then "stamped" onto the output; overlapping stamps are summed.
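
The "stamping" can be sketched for a single channel with no padding. The function name `transposed_conv2d` and the sample values are assumptions for illustration.

```python
def transposed_conv2d(x, kernel, stride=1):
    """Stamp a copy of the kernel, scaled by each input pixel, onto the output."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - 1) * stride + kh
    out_w = (w - 1) * stride + kw
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    # Overlapping stamps accumulate by addition.
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out


x = [[1, 2], [3, 4]]
kernel = [[1, 1], [1, 1]]
overlapping = transposed_conv2d(x, kernel, stride=1)   # 3x3 output, stamps overlap
upsampled = transposed_conv2d(x, kernel, stride=2)     # 4x4 output, no overlap
print(overlapping, upsampled)
```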

Answer 33

Uses skip connections like ResNet but in an encoder-decoder network.

Answer 34

Task of identifying and setting a bounding box for an identified object.

Answer 35

Cross-entropy loss for classification + Mean squared error for bounding box

Answer 36

When an architecture performs several tasks with shared features.

Answer 37

Uses a grid and, for each grid cell, makes K bounding boxes. Estimates refined boxes across multiple layers. Selects the box with the highest confidence score among each group of overlapping boxes for an object.

Answer 38

Predict bounding box + classification in a single pass. Special loss function to minimize both errors at once.

Answer 39

Take the intersection of the bounding boxes (pred vs truth) and divide it by their union to determine goodness of fit. For each class, calculate the precision/recall curve and its average precision, then average over all classes.
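
A minimal sketch of the intersection-over-union step, assuming boxes are given as (x1, y1, x2, y2) corner coordinates. The function name `iou` and the box format are choices made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred = (0, 0, 2, 2)
truth = (1, 1, 3, 3)
score = iou(pred, truth)   # overlap area 1, union area 7
print(score)
```

A prediction is then typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, which feeds the precision/recall curve.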

Answer 40

Step 1 - determine regions of interest
Step 2 - classify those regions

Answer 41

Unsupervised learning

Answer 42

Use bounding boxes within feature maps, then map to input image.

Answer 43

Reuses computation

Answer 44

Applies a fixed grid to the region of interest on the feature map and applies max pooling to each cell in the grid, so every region yields an output of the same size.
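
A sketch of grid-based max pooling over a region of a single-channel feature map. The function name `roi_max_pool`, the half-open (r0, c0, r1, c1) region format, and the floor/ceil binning are assumptions; real implementations differ in how they round bin edges.

```python
def roi_max_pool(feature_map, roi, out_size):
    """Max-pool the region roi = (r0, c0, r1, c1) into an out_size x out_size grid."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    out = []
    for i in range(out_size):
        # Bin edges: floor for the start, ceil for the end, so no bin is empty.
        rs = r0 + (i * h) // out_size
        re = r0 + -(-((i + 1) * h) // out_size)
        row = []
        for j in range(out_size):
            cs = c0 + (j * w) // out_size
            ce = c0 + -(-((j + 1) * w) // out_size)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re) for c in range(cs, ce)))
        out.append(row)
    return out


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
whole = roi_max_pool(feature_map, (0, 0, 4, 4), 2)
partial = roi_max_pool(feature_map, (1, 1, 4, 4), 2)   # 3x3 region, bins overlap
print(whole, partial)
```

Both regions produce a 2x2 output even though they differ in size, which is what lets a fixed-size classifier head follow.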

Answer 45

Can backpropagate

Answer 46

Uses a region proposal network (RPN) to generate candidate regions. Take top-K and classify.

Answer 47

Applies a mask to each box to determine which pixels belong to the object.

Answer 48

For the top-left element of the output: (1*1) + (2*0) + (4*0) + (5*(-1)) = -4
For the top-right element of the output: (2*1) + (3*0) + (5*0) + (6*(-1)) = -4
For the bottom-left element of the output: (4*1) + (5*0) + (7*0) + (8*(-1)) = -4
For the bottom-right element of the output: (5*1) + (6*0) + (8*0) + (9*(-1)) = -4
The resulting 2x2 output matrix:
-4 -4
-4 -4
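
The four sums above can be evaluated with a short sliding-window sketch. The 3x3 input and 2x2 kernel are inferred from the products listed in the answer, and the function name `conv2d_valid` is hypothetical.

```python
def conv2d_valid(x, k):
    """Slide the kernel over the input with stride 1 and no padding (no kernel flip)."""
    kh, kw = len(k), len(k[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # inferred from the listed products
k = [[1, 0], [0, -1]]                   # inferred from the listed products
output = conv2d_valid(x, k)
print(output)
```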

Answer 49

dL/d(1) = (1*1) + (2*2) + (3*3) + (4*4) + (5*5) + (6*6) + (7*7) + (8*8) + (9*9) = 285

Answer 50

Output size = (4 - 2) / 2 + 1 = 2
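
The same formula as a small helper, with padding included as an optional term. The function name `conv_output_size` and its parameter names are made up for the example.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one axis: floor((input + 2*padding - kernel) / stride) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


size = conv_output_size(4, 2, stride=2)   # the case in the answer above
print(size)
```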

Answer 51

2x2 (same as output)

Answer 52

2x2 (same shape as kernel)

Answer 53

760
Formula = (Channels * Kernel * Kernel + Bias) * Filters = (3 * 5 * 5 + 1) * 10 = 760
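
The parameter count can be checked with a one-line helper. The function name `conv_param_count` is hypothetical, and it assumes square kernels with one bias per filter.

```python
def conv_param_count(in_channels, kernel_size, num_filters, bias=True):
    """Learnable parameters in a conv layer with square kernels."""
    per_filter = in_channels * kernel_size * kernel_size + (1 if bias else 0)
    return per_filter * num_filters


count = conv_param_count(in_channels=3, kernel_size=5, num_filters=10)
print(count)
```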

Answer 54

2353. Memory requirement is the product of the input size and the number of channels.

(83 cards)