LESSON 9 - Supervised deep learning - discriminative Flashcards

1
Q

What is the primary use of deep learning, especially in the context of neural networks?

A

Deep learning is mainly used for classification problems, such as object recognition, and relies on hierarchical processing that mirrors the hierarchical organization of neural circuits.

2
Q

How does the structure of hierarchical processing in the brain differ from traditional neural network architectures?

A

Unlike traditional neural network architectures with a limited hierarchy (input – hidden layer – output), the brain’s circuits are more sophisticated, involving many successive layers of processing.

3
Q

What neuroscientific model reflects the processing specialized in object recognition, and how was it implemented?

A

The model is a hierarchy in which simple cells extract local features from the image, and their responses are then combined by complex cells. It was implemented as a computer simulation of this hierarchy, without learning.

4
Q

What hindered the application of deep learning in the past, and what changed to make it successful?

A

Limited computational power in the past hindered deep learning. Success was facilitated by the availability of parallel computing architectures, particularly Graphics Processing Units (GPUs), which significantly enhanced computational capabilities.

5
Q

How does deep learning differ in its approach to supervised learning compared to other machine learning methods?

A

In deep learning, learning is typically run end-to-end, meaning the model learns directly from raw data (e.g., pixels) to object classes without the need for pre-processing or feature extraction.

6
Q

What is the role of Rectified Linear Units (RELU) in addressing the problem of weak gradients in deep learning?

A

RELU replaces the Sigmoid activation function. Its response is linear for positive inputs (and zero otherwise), so the gradient does not saturate; this prevents the weakening (vanishing) of gradients during backpropagation and accelerates learning.
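
A minimal sketch (not from the lecture, assuming NumPy) that contrasts the two activation functions; the input values are illustrative:

    import numpy as np

    def sigmoid(x):
        # Saturating response: the derivative sigmoid(x) * (1 - sigmoid(x)) never exceeds 0.25
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        # Zero for negative inputs, identity for positive inputs: the gradient is 1 whenever the unit is active
        return np.maximum(0.0, x)

    x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
    print(relu(x))                        # [0.  0.  0.  0.5 3. ]
    print(sigmoid(x) * (1 - sigmoid(x)))  # all values <= 0.25, the source of weak gradients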

7
Q

What challenges arise when dealing with many hidden layers in deep learning, and how can overfitting be addressed?

A

With many hidden layers, backpropagating errors becomes challenging. Overfitting, due to the model’s complexity, is tackled by employing stronger regularization methods, such as weight decay and dropout.
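
As an illustration of weight decay (dropout is sketched under a later card), here is a minimal sketch assuming NumPy; the function name, learning rate, and decay coefficient are illustrative choices, not values from the lecture:

    import numpy as np

    def sgd_step_with_weight_decay(w, grad, lr=0.1, decay=1e-4):
        # Weight decay adds an L2 penalty: besides following the data gradient,
        # every update also shrinks the weights slightly toward zero,
        # discouraging the overly large weights typical of overfitting.
        return w - lr * (grad + decay * w)

    w = np.array([2.0, -1.0, 0.5])
    grad = np.array([0.3, -0.2, 0.1])
    print(sgd_step_with_weight_decay(w, grad))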

8
Q

What is the significance of sparse connectivity in convolutional neural networks (CNNs)?

A

Sparse connectivity, as used in CNNs, means that not every neuron in one layer is connected to every neuron in the next layer; each hidden neuron sees only a small patch of the input. This mirrors the receptive-field idea from visual neuroscience and is crucial for detecting local patterns.

9
Q

How are convolutional layers implemented in CNNs, and what do they capture?

A

Convolutional layers slide (convolve) filters over the input image, computing weighted sums of local patches. The resulting feature maps capture features such as oriented edges, which are crucial for recognizing patterns.
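
A minimal sketch of a single-channel convolution (strictly, the cross-correlation most libraries implement), assuming NumPy; the image and the vertical-edge filter are illustrative:

    import numpy as np

    def conv2d(image, filt):
        # Slide the filter across the image; each output value is the weighted sum
        # of the patch under the filter, using the same weights at every position.
        h, w = image.shape
        fh, fw = filt.shape
        out = np.zeros((h - fh + 1, w - fw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + fh, j:j + fw] * filt)
        return out

    image = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [0, 0, 1, 1]], dtype=float)
    vertical_edge = np.array([[-1, 0, 1],
                              [-1, 0, 1],
                              [-1, 0, 1]], dtype=float)
    print(conv2d(image, vertical_edge))  # strong responses wherever the vertical edge falls in the window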

10
Q

What is the purpose of pooling layers in CNNs?

A

Pooling layers reduce the dimensionality of images and emphasize salient features. The pooling operation, like max pooling, helps compress information while highlighting important aspects of the image.
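
A minimal sketch of 2x2 max pooling over a small feature map, assuming NumPy; the numbers are illustrative:

    import numpy as np

    def max_pool_2x2(feature_map):
        # Keep only the largest value in each non-overlapping 2x2 block,
        # halving both spatial dimensions while preserving the strongest responses.
        h, w = feature_map.shape
        return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

    fm = np.array([[1, 3, 0, 2],
                   [4, 2, 1, 0],
                   [0, 1, 5, 6],
                   [2, 0, 7, 1]], dtype=float)
    print(max_pool_2x2(fm))  # [[4. 2.]
                             #  [2. 7.]]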

11
Q

How does the soft-max activation function in the output layer of a neural network work, and what does it represent?

A

The soft-max activation function ensures that all activations sum up to 1, allowing the output to be interpreted as probabilities. It represents the confidence of the network in various output classes.
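
A minimal sketch of the soft-max computation, assuming NumPy; the raw scores are illustrative, and subtracting the maximum is a standard numerical-stability trick rather than something from the lecture:

    import numpy as np

    def softmax(scores):
        # Exponentiate and normalize so every activation is positive and they sum to 1.
        e = np.exp(scores - np.max(scores))  # subtract the max for numerical stability
        return e / e.sum()

    scores = np.array([2.0, 1.0, 0.1])  # raw network outputs for three classes
    probs = softmax(scores)
    print(probs, probs.sum())           # approx. [0.66 0.24 0.10], summing to 1.0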

12
Q

What is the role of weight sharing in CNNs, and how does it contribute to the network’s architecture?

A

Weight sharing, where neurons looking at different portions of the input image share the same connection weights (the same filter), is a key aspect of CNNs. The same feature detector is thus applied across the whole image, enhancing the network’s ability to recognize patterns wherever they appear.

13
Q

What was the significance of LeCun’s convolutional neural network, developed before the deep learning era?

A

LeCun’s convolutional neural network was a pioneering model, applied in particular to the MNIST handwritten-digit problem. It laid the foundation for convolutional architectures and is considered a precursor to deep learning.

14
Q

What marked the start of deep learning in image processing and vision in 2012, and which neural network was instrumental?

A

AlexNet, which won the ImageNet recognition challenge in 2012, marked the beginning of deep learning in image processing and vision and was a game-changer for the field.

15
Q

How did GoogLeNet contribute to the progress of deep learning, and what characterized its architecture?

A

GoogLeNet contributed significantly to deep learning progress with a much deeper and more sophisticated architecture. Training it required weeks on supercomputers, highlighting the model’s complexity.

16
Q

How has deep learning made advancements in medical imaging, specifically in skin lesion classification?

A

Deep learning, particularly CNNs, has made progress in classifying skin lesions as benign or malignant, demonstrating accuracy on a test set that surpassed that of expert dermatologists.

17
Q

In terms of computational efficiency, why are mini-batches preferred in stochastic gradient descent during training?

A

Mini-batches offer a compromise: computing the gradient on a small number of examples is far cheaper than using the entire training dataset, while averaging over the mini-batch yields more stable gradient estimates than fully stochastic (single-example) updates.
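
A minimal sketch of a mini-batch SGD loop on a toy linear-regression problem, assuming NumPy; the batch size, learning rate, and synthetic data are illustrative choices:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))                                  # toy inputs
    y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=1000)

    w = np.zeros(5)
    lr, batch_size = 0.1, 32
    for epoch in range(20):
        idx = rng.permutation(len(X))                               # reshuffle each epoch
        for start in range(0, len(X), batch_size):
            b = idx[start:start + batch_size]
            # Mean-squared-error gradient estimated on the mini-batch only:
            # far cheaper than the full dataset, less noisy than a single example.
            grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    print(w)  # close to the true weights [1, -2, 0.5, 0, 3]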

18
Q

How does the soft-max activation function facilitate the interpretation of neural network output in terms of probabilities?

A

The soft-max activation function ensures that the output of each neuron can be interpreted as a probability, with the sum of all probabilities across neurons totaling 1.

19
Q

What is the purpose of dropout as a regularization method in deep learning, and how does it work?

A

Dropout is a regularization method that randomly removes hidden neurons during training, so the network cannot come to rely on any single unit. Removing a neuron means setting all of its connections to 0, so it effectively does not exist for that training step.
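
A minimal sketch of dropout applied to a layer’s activations during training, assuming NumPy; the drop probability is illustrative, and the rescaling of the surviving activations (so-called inverted dropout) is a common implementation detail not mentioned in the card:

    import numpy as np

    rng = np.random.default_rng(0)

    def dropout(activations, p_drop=0.5):
        # Silence each hidden neuron with probability p_drop by zeroing its output;
        # rescale the survivors so the layer's expected activation is unchanged.
        mask = rng.random(activations.shape) >= p_drop
        return activations * mask / (1.0 - p_drop)

    h = np.array([0.5, 1.2, 0.9, 2.3, 0.7])
    print(dropout(h))  # roughly half the values are set to 0 for this training step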

20
Q

Why is sparse connectivity considered an architectural constraint in convolutional neural networks?

A

Sparse connectivity, which restricts each neuron’s connections to a small portion of the image, is crucial for capturing local features and aligns with the concept of receptive fields.

21
Q

How does pooling help in reducing the dimensionality of images in convolutional neural networks?

A

Pooling layers reduce dimensionality and emphasize important features by selecting, for instance, the maximum value from a group of values in a matrix (max pooling). This aids in compressing and highlighting salient information.

22
Q

What was the role of weight sharing in convolutional neural networks, and how does it contribute to the network’s architecture?

A

Weight sharing involves neurons sharing the same connection weights, so the same feature detector is applied to different portions of the input image. This contributes to the network’s ability to recognize patterns regardless of where they appear.

23
Q

Why is the output of a neural network using the soft-max activation function interpreted as probabilities?

A

The soft-max activation function ensures that the activations of neurons in the output layer sum up to 1, allowing them to be interpreted as probabilities. Each activation represents the probability of a particular class.

24
Q

What characterized LeCun’s convolutional neural network developed before the deep learning era?

A

LeCun’s network was a pioneering model for the MNIST handwritten-digit problem. It took a 27x27 image as input and stacked convolutional and pooling layers, followed by fully connected layers and an output layer. This model laid the foundation for convolutional architectures and is considered a precursor to the deep learning advancements that followed.
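
To illustrate how such a stack shrinks the image, here is a minimal sketch of the size arithmetic for a generic conv/pool pipeline; the 28-pixel input and 5x5 filters are illustrative choices, not the exact sizes LeCun used:

    def conv_output_size(size, filter_size, stride=1):
        # "Valid" convolution: the output shrinks by (filter_size - 1) at stride 1.
        return (size - filter_size) // stride + 1

    size = 28                          # illustrative input width/height
    size = conv_output_size(size, 5)   # 5x5 convolution -> 24
    size = size // 2                   # 2x2 pooling     -> 12
    size = conv_output_size(size, 5)   # 5x5 convolution -> 8
    size = size // 2                   # 2x2 pooling     -> 4
    print(size)                        # the remaining 4x4 maps are flattened for the fully connected layers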

25
Q

How did the computational power contribute to the success of deep learning, and what type of architecture played a crucial role?

A

The success of deep learning is attributed to the availability of computing power, especially through parallel computing architectures. Graphics Processing Units (GPUs), which consist of many simple processors computing in parallel, played a crucial role in making large neural network simulations efficient to run.

26
Q

In deep learning, how does the approach differ regarding the learning process from raw data to output compared to traditional machine learning methods?

A

In deep learning, the learning process occurs end-to-end, starting from raw data (e.g., pixel values in images) to the output classes or categories. Unlike traditional machine learning methods, deep learning does not require preprocessing data to extract features; instead, the network learns to encode and discover relevant features during the learning process.

27
Q

What challenges arise when there are many hidden layers in a neural network, and how is the issue of weakened error gradients addressed, particularly when using the Sigmoid activation function?

A

Challenges arise with many hidden layers because the error gradients computed at the output layer must be backpropagated through many hidden layers. The problem is particularly severe with the Sigmoid activation function, whose small derivative weakens the gradients and slows learning. It is addressed by using Rectified Linear Units (RELU), whose response is linear for positive inputs, which helps prevent vanishing gradients.
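
A small worked illustration (ignoring the weight matrices, which also enter the product) of why Sigmoid gradients vanish over many layers while RELU gradients do not:

    # The Sigmoid derivative sigma(x) * (1 - sigma(x)) is at most 0.25 (at x = 0).
    # Backpropagating through 10 layers multiplies roughly 10 such factors together:
    print(0.25 ** 10)   # about 9.5e-07: the error signal all but disappears
    # RELU's derivative is 1 for any positive input, so the corresponding factor stays 1:
    print(1.0 ** 10)    # 1.0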

28
Q

What is the significance of sparse connectivity in the context of convolutional neural networks, and how is the concept of weight sharing implemented?

A

Sparse connectivity in convolutional neural networks (CNNs) means that not every neuron in one layer is connected to every neuron in the next layer: each hidden neuron connects only to a small, restricted part of the visual space, its receptive field, resembling the receptive fields of visual neurons. Weight sharing complements this constraint: the neurons covering different receptive fields all use the same set of connection weights (the same filter), so one feature detector is replicated across the whole input.