Kursusgang 9 (Deep learning and transfer learning) Flashcards

1
Q

What is deep learning?

A

A class of machine learning techniques that exploit many layers of non-linear information processing for feature extraction and transformation and for pattern analysis and classification.

Deep learning can learn complex features in the data and handle large amounts of data, both labeled and unlabeled. Thanks to ever-increasing computational power, very large models are now feasible, e.g. large language models with a few trillion parameters.

2
Q

What are the key models in neural networks?

A

Deep Neural Network (DNN)
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Long Short-Term Memory (LSTM) RNNs
Generative Adversarial Networks (GANs)

3
Q

Why did deep learning become popular in the early 2010s?

A

Significantly more processing power, which allows higher model complexity and more adjustments to model structure.
We are in the big-data era, allowing interpolation rather than extrapolation.

4
Q

Why use deep learning, when single-hidden-layer neural networks are universal approximators?

A

Deep machines can represent more complex functions with fewer parameters, and they can learn the same function from less training data by reusing low-level feature detectors. Furthermore, features at higher layers are more invariant (less sensitive to small shifts in the input) and more discriminative than features at lower layers.

5
Q

True or false: The output layer of a deep neural network is typically a softmax function.

A

True; the softmax guarantees that the outputs are non-negative and sum to one, i.e. they form a probability distribution over the classes.
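A minimal NumPy sketch of the softmax function (the example logits are made up):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged
    # because softmax is invariant to adding a constant to all logits.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # non-negative values, one per class
print(probs.sum())  # sums to 1.0: a valid probability distribution
```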

6
Q

What are the drawbacks of deep neural networks?

A

They do not explicitly exploit known structures (e.g. translational variability) in the input data.
Furthermore, they do not explicitly apply operations that reduce variability (e.g., pooling and aggregation).
Hyperparameters can be hard to determine.

7
Q

What makes convolutional neural networks unique?

A

They replace the matrix multiplication in normal neural networks with convolution to
* Explicitly exploit the data structure
* Automatically generalize across spatial translations of inputs
* Be applicable to any input that is laid out on a grid (1-D, 2-D, 3-D, and so on)
Furthermore, they use pooling to reduce variability.
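The idea can be illustrated with a minimal 1-D "valid" convolution in NumPy (note: deep learning libraries typically compute cross-correlation, i.e. no kernel flip, and still call it convolution; the signal and kernel below are made-up examples):

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """'Valid' 1-D convolution: slide the kernel along the signal and
    take a dot product at each position (no padding)."""
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

# A finite-difference kernel responds to changes in the input no matter
# WHERE they occur: shifting the input shifts the output the same way
# (translation equivariance), which is how CNNs generalize spatially.
signal = np.array([0., 0., 0., 1., 1., 1.])
kernel = np.array([-1., 1.])
response = conv1d_valid(signal, kernel)
print(response)  # a single peak at the step edge
```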

8
Q

What is pooling?

A

Pooling divides the feature map into smaller, typically non-overlapping regions and summarizes each region by applying a pooling function:
* Max pooling: Selects the maximum value within each region.
* Average pooling: Calculates the average value within each region.
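A minimal NumPy sketch of both pooling functions on a hypothetical 4x4 feature map:

```python
import numpy as np

def pool2d(fmap, size=2, mode="max"):
    """Divide a feature map into non-overlapping size x size regions
    and summarize each region with a single value."""
    h, w = fmap.shape
    out = np.empty((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            region = fmap[i:i + size, j:j + size]
            out[i // size, j // size] = (region.max() if mode == "max"
                                         else region.mean())
    return out

fmap = np.array([[1., 2., 5., 6.],
                 [3., 4., 7., 8.],
                 [0., 0., 1., 1.],
                 [0., 4., 1., 1.]])
print(pool2d(fmap, mode="max"))  # [[4. 8.] [4. 1.]]
print(pool2d(fmap, mode="avg"))  # [[2.5 6.5] [1. 1.]]
```

Note how each 2x2 region collapses to one number, halving both spatial dimensions and making the output less sensitive to small shifts in the input.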

9
Q

How are convolutional neural networks specifically designed for classification?

A

Early layers learn simple patterns like edges or gradients, while deeper layers capture more abstract patterns like shapes or objects. The network alternates between convolutional and pooling layers. Just before the output, a deep neural network is stacked on top to perform the classification based on the features extracted by the convolutional layers.

10
Q

What are the limitations of convolutional neural networks?

A

They mainly deal with translational variability. They cannot take advantage of dependencies and correlations between samples (and labels) in a sequence.

11
Q

What makes recurrent neural networks unique?

A

They can model sequential data in a natural way, which is important when memory matters for decision making.
Recurrent neural networks have an internal state, often called a “hidden state,” that stores information about past inputs. This memory allows them to process sequential data effectively.
Recurrent connections loop information from previous steps back into the current step, enabling the network to maintain context and capture dependencies between elements in a sequence.
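The recurrence above can be sketched as a single NumPy step function (all dimensions, weights, and the input sequence are illustrative, not from any particular model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration.
input_dim, hidden_dim = 3, 4
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1   # input -> hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # recurrent weights
b_h  = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous
    # state -- this recurrence is the network's "memory".
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_dim)                  # initial hidden state
sequence = rng.normal(size=(5, input_dim))
for x_t in sequence:
    h = rnn_step(x_t, h)                  # h carries context forward
print(h)  # final state summarizes the whole sequence
```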

12
Q

What are the limitations of recurrent neural networks?

A

Simple recurrent neural networks are difficult to train due to vanishing and exploding gradients over time, and they have difficulty modeling long-range dependencies.

13
Q

What is a long short-term memory recurrent neural network?

A

It is a specific type of recurrent neural network designed to deal with the vanishing/exploding gradient problem of standard recurrent neural networks. It does so with gating mechanisms (input, forget, and output gates) that control what is written to, kept in, and read from an internal cell state.

14
Q

What is a generative adversarial network?

A

Generative Adversarial Networks are a type of deep learning architecture that can generate highly realistic synthetic data. They are based on pitting two neural networks against each other in a competitive game.

A generator network tries to produce outputs, such as images, that are as realistic as possible, while a discriminator network tries to differentiate the real images from a dataset from the fake images produced by the generator network.

15
Q

How is a generative adversarial network trained?

A

Both the generator and discriminator networks are initialized with random weights.
Alternate Training Steps:

Train the discriminator:
Feed real data samples to the discriminator and train it to classify them as real.
Feed generated data samples from the generator to the discriminator and train it to classify them as fake.

Train the generator:
Generate new data samples.
Feed these generated samples to the discriminator.
Train the generator to “fool” the discriminator by maximizing the probability that the discriminator classifies the generated samples as real.

These training steps continue iteratively.
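The alternating steps above can be sketched as a toy 1-D GAN in NumPy with hand-derived cross-entropy gradients; the linear generator, logistic discriminator, data distribution, learning rate, and step count are all illustrative assumptions, not a definitive recipe:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy setup: real data ~ N(3, 0.5); generator x = wg*z + bg with
# z ~ N(0, 1); discriminator D(x) = sigmoid(wd*x + bd).
wg, bg = 1.0, 0.0   # generator parameters
wd, bd = 0.1, 0.0   # discriminator parameters
lr = 0.01

for step in range(2000):
    # --- train the discriminator ---
    x_real = rng.normal(3.0, 0.5)
    z = rng.normal()
    x_fake = wg * z + bg
    d_real = sigmoid(wd * x_real + bd)  # want -> 1 (classify as real)
    d_fake = sigmoid(wd * x_fake + bd)  # want -> 0 (classify as fake)
    # Cross-entropy gradient w.r.t. the logit is (prediction - target).
    g_real, g_fake = d_real - 1.0, d_fake - 0.0
    wd -= lr * (g_real * x_real + g_fake * x_fake)
    bd -= lr * (g_real + g_fake)

    # --- train the generator (try to "fool" the discriminator) ---
    z = rng.normal()
    x_fake = wg * z + bg
    d_fake = sigmoid(wd * x_fake + bd)
    g_logit = d_fake - 1.0              # generator's target is "real"
    g_x = g_logit * wd                  # backprop through discriminator
    wg -= lr * g_x * z
    bg -= lr * g_x
```

Note the generator is updated only through the discriminator's judgment of its samples, exactly the adversarial game described above.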

16
Q

What loss function does a generative adversarial network use?

A

Both the generator and discriminator use a cross-entropy loss function.
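As a sketch, the two cross-entropy losses can be written out in NumPy (the discriminator outputs here are made-up numbers):

```python
import numpy as np

def bce(prediction, target):
    # Binary cross-entropy for a single predicted probability.
    return -(target * np.log(prediction)
             + (1 - target) * np.log(1 - prediction))

d_real, d_fake = 0.9, 0.2  # example discriminator outputs

# Discriminator: real samples get target 1, generated samples target 0.
disc_loss = bce(d_real, 1) + bce(d_fake, 0)

# Generator: wants the discriminator to call its samples real (target 1).
gen_loss = bce(d_fake, 1)
print(disc_loss, gen_loss)
```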

17
Q

What is transfer learning?

A

Train a model for one task and then reuse it on a related task, sometimes with great success. Typically the early, general-purpose layers are kept and only the later layers are retrained (fine-tuned) on the new task.
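A minimal NumPy sketch of the idea: a hypothetical "pretrained" feature layer is frozen, and only a new output head is trained on the target task. The pretrained weights here are random stand-ins (in practice they would come from training on a large source task), and the data and labels are toy examples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained feature extractor; its weights are FROZEN.
W_pre = rng.normal(size=(8, 4))
features = lambda x: np.maximum(0.0, W_pre @ x)  # fixed ReLU layer

# New target task with only a handful of labeled examples.
X = rng.normal(size=(20, 4))
y = (X[:, 0] > 0).astype(float)  # toy labels

# Train ONLY a logistic-regression head on top of the frozen features.
w, b = np.zeros(8), 0.0
for _ in range(500):
    for x_i, y_i in zip(X, y):
        f = features(x_i)
        p = 1.0 / (1.0 + np.exp(-(w @ f + b)))
        w -= 0.1 * (p - y_i) * f  # gradient step on the head only
        b -= 0.1 * (p - y_i)      # W_pre is never updated
```

Because only the small head is trained, far less target-task data is needed than for training the whole network from scratch.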