chatGPT questions exam2 Flashcards by Blink128 YourLove

What does Stochastic Gradient Descent (SGD) optimize in machine learning models?

SGD is used to minimize the loss function of a model.

How does SGD differ from traditional gradient descent?

Unlike traditional gradient descent that uses the entire dataset for updates, SGD updates model parameters using only a single sample or a small batch of samples.

What are the key benefits and drawbacks of using SGD?

SGD is more computationally efficient for large datasets and can help escape local minima, but it may lead to slower and less stable convergence.

What does the FROC curve evaluate in medical image analysis?

The FROC (free-response receiver operating characteristic) curve evaluates detection and localization performance by plotting sensitivity against the average number of false positives per image.

Why can accuracy be a misleading performance metric?

Accuracy can be misleading under class imbalance, where a high overall score can hide poor performance on the minority class.
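
A minimal sketch of the imbalance effect, using a made-up test set of 95 negatives and 5 positives and a trivial classifier that always predicts the majority class (both are illustrative assumptions, not part of the card):

```python
import numpy as np

# Hypothetical imbalanced test set: 95 negatives, 5 positives.
y_true = np.array([0] * 95 + [1] * 5)

# A trivial "classifier" that always predicts the majority class.
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_pred == y_true)

# Recall on the minority (positive) class: TP / (TP + FN).
tp = np.sum((y_pred == 1) & (y_true == 1))
fn = np.sum((y_pred == 0) & (y_true == 1))
minority_recall = tp / (tp + fn)

print(accuracy)         # 0.95
print(minority_recall)  # 0.0
```

The accuracy looks strong even though no positive case is ever found.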

What was the major limitation of the original R-CNN in object detection?

The original R-CNN was slow because it relied on selective search to generate region proposals and then ran a separate CNN forward pass on each proposal, with no computation shared between them.

How did Fast R-CNN improve over R-CNN?

Fast R-CNN improved efficiency by sharing computations across region proposals.

What innovation did Faster R-CNN introduce to object detection?

Faster R-CNN introduced a Region Proposal Network (RPN), allowing for end-to-end training and faster proposal generation.

How does setting a high threshold for predictions affect sensitivity and specificity?

A high threshold generally increases specificity but may decrease sensitivity by rejecting true positives with lower confidence.
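
The trade-off can be sketched on a small set of made-up scores. The function name `sensitivity_specificity` and the data are illustrative assumptions:

```python
import numpy as np

def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity), predicting positive when score >= threshold."""
    y_pred = scores >= threshold
    pos = y_true == 1
    neg = ~pos
    tp = np.sum(y_pred & pos)
    fn = np.sum(~y_pred & pos)
    tn = np.sum(~y_pred & neg)
    fp = np.sum(y_pred & neg)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels and classifier confidence scores.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1])

low = sensitivity_specificity(y_true, scores, 0.35)
high = sensitivity_specificity(y_true, scores, 0.75)

print(low)   # sensitivity 1.0, specificity 0.75
print(high)  # sensitivity 0.5, specificity 1.0
```

Raising the threshold removes the one false positive but also rejects two true positives with lower confidence.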

What approach do convolutional networks use for segmentation tasks?

They utilize architectures with downsampling and upsampling layers, and sometimes strided or dilated convolutions, to produce a pixel-wise segmentation map.

How is the number of trainable parameters in a convolutional layer calculated?

The formula is (filter height × filter width × input channels + 1) × number of filters, where “+1” accounts for the bias term.
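
As a quick check of the formula, here is a small helper (the name `conv_params` is an illustrative assumption) applied to a layer of 64 filters of size 3 × 3 on a 3-channel input:

```python
def conv_params(filter_h, filter_w, in_channels, num_filters, bias=True):
    """Trainable parameters in a standard 2D convolutional layer."""
    weights_per_filter = filter_h * filter_w * in_channels
    # Each filter has one bias term when bias is enabled.
    return (weights_per_filter + (1 if bias else 0)) * num_filters

print(conv_params(3, 3, 3, 64))  # (3*3*3 + 1) * 64 = 1792
```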

Why is the learning rate considered a critical hyperparameter in neural network training?

It influences the training dynamics: too high a rate may cause divergence, while too low a rate results in slow convergence.

What does dropout do in neural networks?

Dropout prevents overfitting by randomly dropping units and their connections during training, simulating training multiple networks in parallel.
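
A minimal NumPy sketch of dropout, assuming the common "inverted" variant in which kept activations are rescaled during training so that nothing needs to change at inference time. The function name and arguments are illustrative:

```python
import numpy as np

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero units at random and rescale the survivors."""
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    # Each unit is kept independently with probability keep_prob.
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((1000, 100))
y = dropout(x, 0.5, rng)

print(np.unique(y))  # only 0.0 (dropped) and 2.0 (kept, rescaled)
print(y.mean())      # close to 1.0 on average
```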

What are the advantages of ReLU over sigmoid functions?

ReLU mitigates the vanishing gradient problem, accelerates convergence, and maintains gradient flow for positive inputs.
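
The gradient behaviour can be compared directly. This sketch evaluates the derivative of each activation at a few arbitrary inputs:

```python
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)  # never exceeds 0.25

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 1 for positive inputs, 0 otherwise

x = np.array([-5.0, 0.5, 5.0])
print(sigmoid_grad(x))  # small at both extremes, where the sigmoid saturates
print(relu_grad(x))     # [0. 1. 1.]
```

Multiplying many sigmoid derivatives across layers shrinks the gradient quickly, whereas the ReLU derivative stays at 1 wherever the input is positive.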

Why is feature normalization important before training a machine learning model?

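It scales features to have zero mean and unit variance so that all features contribute on a comparable scale. The mean and variance should be computed on the training set only and then applied to the test set, which prevents information leakage from the test set.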
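
A minimal sketch of standardization with statistics taken from the training split only. The random data and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature matrices: rows are samples, columns are features.
X_train = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
X_test = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

# Fit the statistics on the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# Apply the same transformation to both splits.
X_train_norm = (X_train - mean) / std
X_test_norm = (X_test - mean) / std
```

The test features end up near, but not exactly at, zero mean and unit variance, which is expected because the test set never influenced the statistics.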

How do residual networks (ResNets) facilitate the training of deep models?

ResNets use skip connections to learn residual mappings, alleviating the vanishing gradient problem and enabling deeper architectures.

What does the receptive field refer to in convolutional networks?

It refers to the size of the input area that influences the network’s output, determined by the cumulative effect of convolution and pooling operations.
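
The cumulative effect can be computed layer by layer. This sketch assumes a simple chain of convolution or pooling layers without dilation, each described by a hypothetical `(kernel_size, stride)` tuple:

```python
def receptive_field(layers):
    """Receptive field (along one axis) of a stack of conv/pool layers.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between adjacent units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

print(receptive_field([(3, 1), (3, 1)]))          # 5
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # 8
```

Two stacked 3 × 3 convolutions see a 5 × 5 input region, and inserting a 2 × 2 pooling layer with stride 2 between them widens it to 8 × 8.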

How does the softmax function work in neural networks?

It converts raw output scores into probabilities by taking the exponential of each output and normalizing these values by the sum of all exponentials.
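
A minimal NumPy sketch of softmax. Subtracting the maximum score before exponentiating is a common numerical-stability trick that does not change the result; it is an addition here, not something the card states:

```python
import numpy as np

def softmax(logits):
    # Shifting by the max avoids overflow in exp without changing the output.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)        # largest probability goes to the largest score
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```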

What does a false negative (FN) indicate in detection tasks?

A FN occurs when an object is present but not detected by the classifier, which is critical in applications where missing detections can have serious consequences.

What is a dilated (atrous) convolution and its purpose?

It increases the receptive field without increasing the number of weights by inserting spaces between kernel elements, allowing for broader spatial aggregation.
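
To illustrate, a dilated kernel can be written out as a larger kernel with zeros between the original weights. The helper names below are illustrative assumptions, and the sketch handles square kernels only:

```python
import numpy as np

def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel once gaps of (dilation - 1) are inserted."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def dilate_kernel(kernel, dilation):
    """Spread the weights of a square kernel apart, filling the gaps with zeros."""
    size = effective_kernel_size(kernel.shape[0], dilation)
    dilated = np.zeros((size, size), dtype=kernel.dtype)
    dilated[::dilation, ::dilation] = kernel
    return dilated

kernel = np.ones((3, 3))
dilated = dilate_kernel(kernel, dilation=2)

print(dilated.shape)              # (5, 5)
print(np.count_nonzero(dilated))  # 9
```

The kernel now covers a 5 × 5 area while still holding only the original 9 weights.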

How does U-Net architecture specialize for medical image segmentation?

U-Net uses a symmetric architecture with a downsampling (contracting) path that captures context and an upsampling (expanding) path that restores resolution, with skip connections between the two enabling precise localization and context integration.

How is specificity calculated in a model’s performance evaluation?

Specificity is calculated as the number of true negatives divided by the sum of true negatives and false positives.

What determines the output size of a valid convolution operation?

For a square F × F filter with stride 1 and no padding, the output size is (W − F + 1) × (H − F + 1), where W and H are the width and height of the image.
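
The formula can be cross-checked by counting the filter positions that fit inside an image. This sketch assumes stride 1 and a square filter, and uses NumPy's `sliding_window_view` only to enumerate the positions:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def valid_conv_output_size(width, height, filter_size):
    """Output (width, height) of a valid convolution with stride 1."""
    return (width - filter_size + 1, height - filter_size + 1)

# Hypothetical image with height 28 and width 32; NumPy shape order is (H, W).
image = np.zeros((28, 32))
windows = sliding_window_view(image, (5, 5))

print(valid_conv_output_size(32, 28, 5))  # (28, 24)
print(windows.shape[:2])                  # (24, 28), i.e. (out_height, out_width)
```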

What makes the YOLO object detection framework unique?

YOLO processes the entire image in a single evaluation and makes predictions for each grid cell, combining bounding box predictions and class probabilities.

How does backpropagation work in training neural networks?

It computes gradients of the loss with respect to network parameters by applying the chain rule, allowing for efficient parameter updates.
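
A minimal sketch of the chain rule on a tiny two-weight network (one tanh hidden unit, squared-error loss). The network, values, and function names are illustrative assumptions, and the analytic gradients are compared against a finite-difference estimate:

```python
import numpy as np

def loss_fn(w1, w2, x, target):
    h = np.tanh(w1 * x)  # hidden activation
    y = w2 * h           # network output
    return 0.5 * (y - target) ** 2

def gradients(w1, w2, x, target):
    # Forward pass, keeping the intermediate values.
    h = np.tanh(w1 * x)
    y = w2 * h
    # Backward pass: apply the chain rule from the loss back to each weight.
    dloss_dy = y - target
    dloss_dw2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    dloss_dw1 = dloss_dh * (1.0 - h ** 2) * x  # d tanh(u)/du = 1 - tanh(u)^2
    return dloss_dw1, dloss_dw2

w1, w2, x, target = 0.5, -1.2, 2.0, 1.0
grad_w1, grad_w2 = gradients(w1, w2, x, target)

# Central finite differences as an independent check.
eps = 1e-6
num_w1 = (loss_fn(w1 + eps, w2, x, target) - loss_fn(w1 - eps, w2, x, target)) / (2 * eps)
num_w2 = (loss_fn(w1, w2 + eps, x, target) - loss_fn(w1, w2 - eps, x, target)) / (2 * eps)
```

The backward pass reuses `dloss_dy` for both weights, which is where the efficiency of backpropagation comes from.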

What are histogram-based features in image processing?

These features, such as mean, variance, skewness, and kurtosis, describe the distribution of pixel intensities. They ignore spatial relationships and correlations between pixels, so images with identical histograms can have different textures.
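
A small sketch of this limitation, using two made-up 8 × 8 patterns (vertical stripes and a checkerboard) that contain the same pixel values in different arrangements. Kurtosis is computed here in its non-excess form, which is an assumption about the intended definition:

```python
import numpy as np

def histogram_features(image):
    """Mean, variance, skewness, and (non-excess) kurtosis of pixel intensities."""
    x = image.astype(float).ravel()
    mean = x.mean()
    var = x.var()
    z = (x - mean) / np.sqrt(var)
    return mean, var, np.mean(z ** 3), np.mean(z ** 4)

# Two different textures made of the same pixels: half 0, half 255.
stripes = np.tile([0, 255], (8, 4))
checker = np.indices((8, 8)).sum(axis=0) % 2 * 255

print(histogram_features(stripes))
print(histogram_features(checker))
print(np.array_equal(stripes, checker))  # False
```

Both calls return the same four numbers even though the images differ.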

How does the shift-and-stitch algorithm benefit fully convolutional networks?

It enables the generation of fine-grained outputs by reconstructing the original resolution through multiple shifted input passes.

What improvement does Fast R-CNN offer over the original R-CNN?

Fast R-CNN processes the entire image at once and uses RoI pooling to extract features, improving efficiency and speed.

How does histogram matching adjust an image's brightness and contrast?

It modifies an image so that its histogram matches that of a reference image, using cumulative distribution functions for mapping pixel values.
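
One way to sketch the CDF-based mapping in NumPy is shown below. It is a minimal single-channel version, and the function name `match_histogram` is an illustrative assumption:

```python
import numpy as np

def match_histogram(source, reference):
    """Map source pixel values so their distribution follows the reference."""
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical cumulative distribution functions of both images.
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source value, find the reference value at the same CDF level.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_idx].reshape(source.shape)

# Hypothetical example: the reference is a brighter, higher-contrast version.
source = np.arange(16, dtype=float).reshape(4, 4)
reference = source * 2.0 + 10.0
matched = match_histogram(source, reference)
```

Because the two images here have the same CDF shape, the matched result reproduces the reference intensities.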

What are the implications of not using pooling layers in CNNs?

Omitting pooling layers requires convolutional layers with strides for downsampling, potentially leading to higher memory and computational costs.

How does backpropagation facilitate learning in CNNs?

It calculates gradients (using chain rule) for all network parameters by propagating errors backward through the network, enabling parameter updates for learning.

What are the steps in mini-batch gradient descent training?

The steps are selecting a mini-batch, performing a forward pass, computing loss, backpropagating to compute gradients, and updating weights.
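
These steps can be sketched for a one-feature linear model in NumPy. The toy data, learning rate, and batch size are illustrative assumptions, and the gradients are written out by hand in place of an automatic backward pass:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 3x + 1 plus a little noise.
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * X[:, 0] + 1.0 + 0.01 * rng.standard_normal(200)

w, b = 0.0, 0.0
learning_rate, batch_size = 0.1, 20

for epoch in range(200):
    order = rng.permutation(len(X))  # reshuffle the data every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]  # 1. select a mini-batch
        xb, yb = X[idx, 0], y[idx]
        pred = w * xb + b                      # 2. forward pass
        error = pred - yb
        loss = np.mean(error ** 2)             # 3. compute the loss (MSE)
        grad_w = 2 * np.mean(error * xb)       # 4. gradients of the loss
        grad_b = 2 * np.mean(error)
        w -= learning_rate * grad_w            # 5. update the weights
        b -= learning_rate * grad_b

print(w, b)  # close to 3 and 1
```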

How does the choice of pooling size and stride affect max-pooling operations?

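The stride relative to the pooling size determines how the windows cover the feature map: a stride equal to the pooling size gives non-overlapping windows, a smaller stride gives overlapping windows, and a larger stride skips inputs entirely. If the windows do not tile the feature map evenly, border values are dropped or must be padded, so coverage and downsampling become non-uniform.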

What strategy is used in transfer learning with small datasets?

Freeze most of the pre-trained network's parameters and fine-tune the last few layers to adapt the features to the new task.

How does dilated convolution affect the receptive field in CNNs?

Dilated convolution expands the receptive field without increasing the number of weights, allowing for broader spatial information aggregation.

What is the purpose of the softmax function in neural networks?

It converts logits into probabilities, facilitating multi-class classification by ensuring output values sum to 1.

What does the term "dying ReLU" problem refer to?

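It refers to neurons that stop learning because they always output zero, which also makes their gradient zero. This can stem from improper initialization or from a large weight update (for example, with a high learning rate) that pushes the pre-activations permanently negative. Leaky ReLU mitigates the problem by keeping a small non-zero slope for negative inputs.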

How is the number of trainable parameters in a dense layer calculated?

It is calculated as (number of inputs + 1) × number of outputs, accounting for each input connection and bias term.

What does zero padding do in convolutional layers?

Zero padding allows control over output dimensions, maintaining spatial sizes or adjusting them for specific convolutional effects.
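
A short sketch of the effect on output size, using the standard size formula for a convolution with padding and stride. The helper name `conv_output_size` is an illustrative assumption:

```python
import numpy as np

def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Output length along one axis for a convolution with zero padding."""
    return (input_size + 2 * padding - filter_size) // stride + 1

image = np.ones((5, 5))
# Add a one-pixel border of zeros around the image.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)                        # (7, 7)
print(conv_output_size(5, 3))              # 3: the output shrinks without padding
print(conv_output_size(5, 3, padding=1))   # 5: spatial size is maintained
```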

How is the "best model" identified during neural network training?

The best model is often the one with the lowest validation loss, indicating effective learning and generalization capabilities without overfitting.
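
In practice this usually means tracking the validation loss after each epoch and keeping the checkpoint with the lowest value. A minimal sketch with made-up loss values:

```python
# Hypothetical validation losses recorded after each epoch.
val_losses = [0.90, 0.62, 0.48, 0.45, 0.47, 0.52]

best_epoch, best_loss = None, float("inf")
for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
        # In a real training loop, the model weights would be saved here.
        best_epoch, best_loss = epoch, loss

print(best_epoch, best_loss)  # 3 0.45
```

The rising loss after epoch 3 is the usual sign that the model has started to overfit.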

(40 cards)