CNN Flashcards

1
Q

IMAGENET dataset

A

Dataset of everyday objects and animals
Images commonly resized to 227 × 227 × 3 (height × width × channels); original resolutions vary
* 1,281,167 training images,
* 50,000 validation images, and
* 100,000 test images
* 1000 classes

2
Q

CNN

A

Special kind of neural network for processing data that has a grid-like topology, like time series data (1D) or image data (2D).

A CNN consists of:
1. Convolution layer
2. Pooling layer
3. Fully connected layer (ANN)

3
Q

Why not use ANN on image data?

A
  1. Computational complexity (large number of pixels, hence a large number of weights)
  2. Overfitting
  3. Loss of spatial arrangement (since the 2D image is flattened into a 1D vector)
4
Q

How does CNN work on image data?

A

The initial convolutional layers extract primitive features (e.g., edges).
Deeper in the network, progressively more complex features are extracted.

5
Q

Greyscale image

A

Shades of grey (often loosely called B/W)
Single channel
Pixel values in the range [0, 255] (for 8-bit images)

6
Q

Colored Image

A

RGB (red, green, blue)
Three channels

7
Q

What is convolution?

A

Convolution slides a kernel (filter) over the input; at each position the kernel is multiplied element-wise with the underlying pixels and the products are summed to give one value of the feature map.

The process of detecting features in an image is called convolution.
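
To make the multiply-and-sum step concrete, here is a minimal pure-Python sketch. It assumes no padding and a stride of 1; the 4×4 image and the 2×2 vertical-edge kernel are made-up illustration values (in a real CNN the kernel weights are learned).

```python
def conv2d(image, kernel):
    """No padding, stride 1: slide the kernel and sum the element-wise products."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for i in range(n_rows - k_rows + 1):
        row = []
        for j in range(n_cols - k_cols + 1):
            # Multiply the kernel element-wise with the patch, then sum.
            total = 0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


# Hypothetical image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, kernel)
# feature_map == [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The feature map is non-zero only in the column where the edge sits, which is what "detecting a feature" means here.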

8
Q

How are the values of a filter (kernel) decided?

A

Initialized with random values.
Learned during training: backpropagation computes the gradients used to update them.

9
Q

What is a filter?

A

A matrix of weights that slides over the input pixels, performing element-wise multiplication and summing the products to give a single output pixel at each position.

10
Q

What is padding?

A

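Edge pixels are covered by fewer filter positions than central pixels, so they contribute less to the output. Padding adds extra border pixels (usually zeros) around the input to reduce this imbalance; it also keeps the feature map from shrinking.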
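
The imbalance can be counted directly. The sketch below is an illustration, assuming a square image, a square filter, and stride 1; the function name `coverage` is made up for this card. It counts how many filter positions touch each original pixel, with and without padding.

```python
def coverage(n, f, p=0):
    """Count how many f x f windows (stride 1) touch each pixel of an
    n x n image that has been padded with p rings of border pixels."""
    size = n + 2 * p
    counts = [[0] * size for _ in range(size)]
    for i in range(size - f + 1):
        for j in range(size - f + 1):
            for a in range(f):
                for b in range(f):
                    counts[i + a][j + b] += 1
    # Keep only the counts of the original (unpadded) pixels.
    return [row[p:size - p] for row in counts[p:size - p]]


no_pad = coverage(4, 3, p=0)
# [[1, 2, 2, 1], [2, 4, 4, 2], [2, 4, 4, 2], [1, 2, 2, 1]]
one_pad = coverage(4, 3, p=1)
# [[4, 6, 6, 4], [6, 9, 9, 6], [6, 9, 9, 6], [4, 6, 6, 4]]
```

Without padding a corner pixel is used once and a central pixel four times; with one ring of padding the ratio improves to 4 versus 9. In this small example the counts only become fully equal at p = f − 1.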

11
Q

What is stride?

A

Stride decides how far the filter (weight matrix) moves across the input at each step, e.g., one pixel or two.

12
Q

Valid padding

A

No padding

13
Q

Same padding

A

Padding chosen automatically so that the feature map has the same size as the input image (with stride 1).

14
Q

Formula to find the output size after convolution

A

⌊(n + 2p − f) / s⌋ + 1, where n = input size, p = padding, f = filter size, s = stride
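
A quick way to check the formula is to code it. This sketch assumes square inputs and filters and rounds down when the stride does not divide evenly; the example sizes are illustrative.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output width/height for input size n, filter size f, padding p, stride s."""
    return (n + 2 * p - f) // s + 1


valid = conv_output_size(n=28, f=3, p=0, s=1)     # 26: valid padding shrinks the map
same = conv_output_size(n=28, f=3, p=1, s=1)      # 28: p = (f - 1) / 2 keeps the size
strided = conv_output_size(n=227, f=11, p=0, s=4)  # 55
floored = conv_output_size(n=8, f=3, p=0, s=2)     # 3: (8 - 3) / 2 is rounded down
```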

15
Q

Why are strides required?

A
  1. Keep only coarser, higher-level features
  2. Limit the number of features, which reduces computational complexity
16
Q

Why is pooling required?

A

Convolution alone has two issues:

  1. Memory issue (large feature maps)
  2. Translation variance (features are tied to their exact location)

Increasing the stride addresses the memory issue, but it does not solve the translation variance problem.

Pooling downsamples the feature map.

17
Q

Translation Invariance

A

The ability to ignore positional shifts, or translations, of the target in the image.
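
A small sketch of how max pooling gives a limited form of this invariance. It assumes 2×2 non-overlapping windows and made-up 4×4 feature maps; the invariance only holds for shifts that stay inside the same pooling window.

```python
def max_pool(fmap, size=2):
    """Non-overlapping max pooling (stride equals the window size)."""
    out = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[i + a][j + b] for a in range(size) for b in range(size)]
            row.append(max(window))
        out.append(row)
    return out


original = [[9, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
shift_by_1 = [[0, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
shift_by_2 = [[0, 0, 9, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

same_output = max_pool(original) == max_pool(shift_by_1)       # True
different_output = max_pool(original) != max_pool(shift_by_2)  # True
```

The one-pixel shift is absorbed by the pooling window, while the two-pixel shift moves the activation into the next window.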

18
Q

Types of pooling

A
  1. Max pooling
  2. Avg pooling
  3. Global pooling (Global max & Global avg)
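
The three types can be compared on one feature map. This is a minimal sketch assuming 2×2 non-overlapping windows; the helper names `pool` and `mean` and the numbers are illustrative only.

```python
def pool(fmap, size, reduce_fn):
    """Apply reduce_fn (e.g. max or mean) to each non-overlapping size x size window."""
    out = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[i + a][j + b] for a in range(size) for b in range(size)]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


fmap = [
    [1, 3, 2, 0],
    [5, 7, 4, 6],
    [8, 2, 1, 1],
    [0, 6, 3, 3],
]
max_pooled = pool(fmap, 2, max)    # [[7, 6], [8, 3]]
avg_pooled = pool(fmap, 2, mean)   # [[4.0, 3.0], [4.0, 2.0]]

# Global pooling reduces the whole channel to a single number.
flat = [v for row in fmap for v in row]
global_max = max(flat)    # 8
global_avg = mean(flat)   # 3.25
```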
19
Q

Advantages of pooling

A
  1. Reduced image size (due to down sampling)
  2. Translation invariance
  3. Enhanced feature (Only in max pooling)
20
Q

Disadvantages of pooling

A
  1. Not recommended for image segmentation tasks (fine spatial detail is lost)
  2. Loss of information
21
Q

ANN vs CNN

A

Similarity:
1. Both compute input × weights + bias. In a CNN, the weights are the filters’ weights.

Differences:
1. The number of learnable parameters in a convolutional layer does not depend on the input size.
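
The difference shows up when parameters are counted. This sketch assumes one fully connected layer with 64 units versus one convolutional layer with 64 filters of size 3×3 over 3 input channels; the layer sizes are chosen only for illustration.

```python
def dense_params(n_inputs, n_units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return n_inputs * n_units + n_units


def conv_params(kernel_size, in_channels, n_filters):
    """Each filter has kernel_size^2 * in_channels weights plus one bias."""
    return (kernel_size * kernel_size * in_channels + 1) * n_filters


dense_small = dense_params(32 * 32 * 3, 64)     # 196,672
dense_large = dense_params(227 * 227 * 3, 64)   # 9,893,632
conv_layer = conv_params(3, 3, 64)              # 1,792 for any image size
```

The dense layer grows with the number of pixels, while the convolutional layer's count has no term for image height or width.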

22
Q

How to reduce overfitting in CNN model?

A
  1. Add more data
  2. Data Augmentation
  3. L1/L2 Regularization
  4. Batch Normalization
  5. Dropout
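
As one example from the list, dropout can be sketched in a few lines. This is the common "inverted dropout" variant, shown as an assumption about how it is usually implemented rather than any specific library's code; the function name and arguments are illustrative.

```python
import random


def dropout(activations, rate, rng):
    """Training-time inverted dropout for rate in [0, 1): zero each activation
    with probability `rate` and scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only so the sketch is repeatable
dropped = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng)
# Every entry of `dropped` is either 0.0 (dropped) or 2.0 (kept and rescaled).
```

At inference time dropout is switched off and the activations pass through unchanged.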
23
Q

Why do we need data augmentation?

A
  1. To generate more data
  2. To reduce overfitting (improves generalization)

It includes image rotation, scaling, flipping, and zooming.
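
Two of these transforms can be sketched on a tiny image stored as nested lists. The helper names are made up for this card; real pipelines typically use an image library instead.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Reverse the row order, then transpose."""
    return [list(col) for col in zip(*image[::-1])]


image = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(image)       # [[3, 2, 1], [6, 5, 4]]
rotated = rotate_90_clockwise(image)   # [[4, 1], [5, 2], [6, 3]]
```

Each transformed copy keeps the original label, so one labeled image yields several training examples.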

24
Q

Why do we need pretrained models?

A
  1. Lack of sufficient labeled data
  2. CNNs are computationally expensive to train from scratch
25
Q

What is pooling?

A

Also called downsampling, as it reduces the spatial dimensions of the feature map. Keeping only a summary of each region reduces complexity (fewer computations, and fewer parameters in later layers) and can help limit overfitting.

The pooling operation slides a two-dimensional window over each channel of the feature map and summarizes the features within the region it covers.

26
Q

Feature map

A

The output of a convolutional layer: a numerical representation of the image that is used to identify patterns in it.

27
Q

ResNet

A

ResNet stands for residual network, which refers to the residual blocks that make up the architecture of the network.

28
Q

Residual connections

A

Skip connections, also known as residual connections, are implemented by adding the output of an earlier layer to the output of a later layer.

They preserve information from earlier layers, which helps the network learn better representations of the input data and mitigates the vanishing gradient problem.
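
The addition can be sketched with small vectors. This is a simplified illustration that uses fully connected layers in place of convolutions; the function names and weights are made up for the example.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def linear(v, weights):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is linear -> relu -> linear."""
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0]
zero = [[0.0, 0.0], [0.0, 0.0]]
passthrough = residual_block(x, zero, zero)   # [1.0, 2.0]
```

Even when the block's weights contribute nothing, the input still reaches the output through the skip path, and gradients can flow back the same way.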

29
Q

Key Features of ResNet-50

A
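  • ILSVRC’15 classification winner (3.57% top-5 error, achieved by an ensemble of ResNets)
  • Has other variants also (with 18, 34, 101, 152 layers)
  • Residual blocks in ResNet-50 are bottleneck blocks (1×1, 3×3, 1×1 convolutions); the shallower ResNet-18/34 use two 3×3 convolution layers per block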
  • No FC layer, except one last 1000 FC softmax layer for classification
  • Global average pooling layer after the last convolution
  • Batch Normalization after every convolution layer
  • SGD + momentum (0.9)
  • No dropout used