Fine-Tuning, Hyperparameters, and Transfer Learning Flashcards

1
Q

What is the purpose of Grid Search in hyperparameter tuning?

A

Grid Search exhaustively searches over a specified parameter grid by evaluating all possible combinations of hyperparameter values to find the best configuration.
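A minimal sketch of the idea using scikit-learn's `GridSearchCV` (the dataset, estimator, and grid values here are illustrative choices, not part of the card):

```python
# Grid Search: every combination in param_grid is evaluated with
# cross-validation; the grid below yields 3 x 2 = 6 candidate configs.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)  # the best of all 6 evaluated combinations
```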

2
Q

How does Random Search differ from Grid Search in hyperparameter tuning?

A

Random Search randomly samples hyperparameter combinations within defined ranges, which can be more efficient than Grid Search when dealing with large parameter spaces.
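A comparable sketch with scikit-learn's `RandomizedSearchCV`; instead of a grid it samples a fixed budget of configurations from distributions (the ranges below are illustrative):

```python
# Random Search: n_iter=10 configurations are sampled from continuous
# distributions rather than enumerating a grid.
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_distributions = {"C": loguniform(1e-3, 1e3), "gamma": loguniform(1e-4, 1e1)}
search = RandomizedSearchCV(SVC(), param_distributions, n_iter=10, cv=3,
                            random_state=0)
search.fit(X, y)
print(search.best_params_)  # best of the 10 sampled configurations
```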

3
Q

What is Bayesian Optimization in the context of hyperparameter tuning?

A

Bayesian Optimization uses probabilistic models to guide the search for optimal hyperparameters, balancing exploration and exploitation.
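A toy sketch of the loop, assuming a Gaussian-process surrogate and an expected-improvement (EI) acquisition function (one common choice among several); the 1-D objective is a stand-in for a validation score:

```python
# Bayesian Optimization sketch: fit a GP surrogate to past evaluations,
# then pick the next point by maximizing expected improvement, which
# balances exploitation (high mean) and exploration (high uncertainty).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):               # pretend "validation accuracy"
    return -(x - 2.0) ** 2      # true maximum at x = 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(3, 1))      # a few initial random evaluations
y = objective(X).ravel()
candidates = np.linspace(0, 5, 200).reshape(-1, 1)

for _ in range(10):
    gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.max()
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print(X[np.argmax(y)][0])  # should land near the true optimum at 2
```

In practice one would reach for a library such as Optuna or scikit-optimize rather than hand-rolling this loop.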

4
Q

What is the objective of Learning Rate Adjustment in fine-tuning methods?

A

Learning Rate Adjustment involves modifying the learning rate during training to improve convergence, commonly using techniques like step decay, exponential decay, or learning rate schedulers.
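The two named decay schemes can be written in a few lines (constants below are illustrative defaults, not prescribed values):

```python
import math

def step_decay(lr0, epoch, drop=0.5, epochs_per_drop=10):
    # multiply the learning rate by `drop` every `epochs_per_drop` epochs
    return lr0 * drop ** (epoch // epochs_per_drop)

def exponential_decay(lr0, epoch, k=0.05):
    # smooth continuous decay: lr = lr0 * exp(-k * epoch)
    return lr0 * math.exp(-k * epoch)

print(step_decay(0.1, 0))    # 0.1
print(step_decay(0.1, 10))   # 0.05
print(exponential_decay(0.1, 20))
```

Framework schedulers such as PyTorch's `torch.optim.lr_scheduler.StepLR` implement the same idea on top of an optimizer.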

5
Q

What does Feature Extraction mean in fine-tuning?

A

Feature Extraction involves freezing the pretrained model layers and using them to extract features from the data, typically followed by training a new classifier on top.
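A minimal PyTorch sketch of the pattern; the tiny `Sequential` backbone is a placeholder for a real pretrained network:

```python
# Feature extraction: freeze the backbone, train only a new head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(16, 8), nn.ReLU())  # pretend-pretrained
for param in backbone.parameters():
    param.requires_grad = False        # freeze: no gradient updates

head = nn.Linear(8, 3)                 # new task-specific classifier
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # head only

x, y = torch.randn(4, 16), torch.tensor([0, 1, 2, 0])
loss = nn.functional.cross_entropy(head(backbone(x)), y)
loss.backward()                        # gradients reach the head only
optimizer.step()
```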

6
Q

What does Full Fine-Tuning entail in neural network training?

A

Full Fine-Tuning involves retraining the entire pretrained model on new data, allowing all layers to adapt to the specific task.
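A hedged PyTorch sketch: every parameter stays trainable, and a common refinement (shown here, though not required) is to give earlier layers a smaller learning rate than later ones. The model is again a stand-in for a real pretrained network:

```python
# Full fine-tuning: all layers receive gradient updates.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 3))
optimizer = torch.optim.Adam([
    {"params": model[0].parameters(), "lr": 1e-5},  # earlier layer: tiny LR
    {"params": model[2].parameters(), "lr": 1e-4},  # later layer: larger LR
])

x, y = torch.randn(4, 16), torch.tensor([0, 1, 2, 0])
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()   # every layer is adapted to the new task
```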

7
Q

What is the role of Adding Task-Specific Layers in fine-tuning?

A

Adding Task-Specific Layers involves appending custom layers to a pretrained model for specialized tasks, such as classification or regression, while optionally freezing other layers.

8
Q

What is Pretrained Model Utilization in transfer learning?

A

Pretrained Model Utilization leverages models trained on large datasets to solve related tasks by adapting their learned features to new data.

9
Q

How can a pretrained model be used for classification tasks in transfer learning?

A

A pretrained model can be used by replacing its final classification layer with a new layer tailored to the target classes, and then fine-tuning or training only that layer.
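In PyTorch this head swap is a one-line module replacement; the "pretrained" 1000-class model below is a placeholder:

```python
# Replace the final classification layer with one sized for the target
# classes; the rest of the model keeps its pretrained weights frozen.
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                      nn.Linear(128, 1000))   # pretend 1000-class model
for p in model.parameters():
    p.requires_grad = False                   # keep pretrained weights
model[3] = nn.Linear(128, 5)                  # new 5-class head, trainable
```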

10
Q

What does utilizing a pretrained model for predictions involve?

A

It involves using the pretrained model as is to make predictions on new data without further training, relying on its existing learned features.

11
Q

What is the purpose of logits in transfer learning?

A

Logits, the raw output of a model before applying an activation function like softmax, can be used as intermediate representations for downstream tasks or ensembling models.
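A small NumPy illustration of both points: softmax turns logits into probabilities, and averaging two models' logits is one simple ensembling scheme (the values are illustrative):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

logits_a = np.array([2.0, 0.5, -1.0])  # model A's raw outputs
logits_b = np.array([1.5, 1.0, -0.5])  # model B's raw outputs
ensembled = softmax((logits_a + logits_b) / 2)
print(ensembled.argmax())  # class chosen by the averaged-logit ensemble
```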

12
Q

How is an encoder from a pretrained model used in transfer learning?

A

The encoder, typically the feature extraction part of the model, is used to generate embeddings from input data, which can be fed into custom classifiers or task-specific layers.

13
Q

Why is Grid Search computationally expensive?

A

Grid Search evaluates all possible combinations of hyperparameters, which can be infeasible for large search spaces or computationally intensive models. The number of combinations grows exponentially with the number of hyperparameters.
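A quick arithmetic check of that growth: with v values per hyperparameter and h hyperparameters, Grid Search must run v**h fits (times the number of cross-validation folds). The grid below is illustrative:

```python
from math import prod

grid_sizes = {"lr": 5, "batch_size": 5, "dropout": 5, "weight_decay": 5}
print(prod(grid_sizes.values()))  # 5**4 = 625 combinations to evaluate
```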

14
Q

What is the primary benefit of Random Search over Grid Search?

A

Random Search can cover a wide range of hyperparameter values without evaluating all combinations, often finding good configurations faster.

15
Q

What is an advantage of Bayesian Optimization for tuning?

A

Bayesian Optimization focuses on promising regions of the hyperparameter space, reducing the number of evaluations required compared to exhaustive methods.

16
Q

When is Feature Extraction preferred over Full Fine-Tuning?

A

Feature Extraction is preferred when the dataset is small or similar to the pretrained model’s original dataset, reducing the risk of overfitting.

17
Q

Why is learning rate crucial in fine-tuning neural networks?

A

The learning rate determines the step size for updates to model parameters, impacting convergence speed and the risk of overshooting the optimal solution.

18
Q

Why is transfer learning effective for tasks with limited data?

A

Transfer learning leverages knowledge from models pretrained on large datasets, reducing the need for extensive labeled data and training time on new tasks.

19
Q

How does transfer learning differ from training from scratch?

A

Transfer learning starts from a pretrained model's weights, which typically lets the model converge faster, whereas training from scratch initializes the model randomly and trains it entirely on the target dataset.