Project Flashcards
USP of our paper
o Identification of the P. vivax parasite: little research so far, although it is the most geographically widespread species & no vaccine exists on the market yet (in contrast to P. falciparum)
o Zhao et al. (2020) tested a falciparum-trained model to identify vivax -> overfitting
o Binary classification task with this dataset (others only multi-class) -> because information about the presence of malaria is most important to obtain with high accuracy (the stage can be identified in a next step) -> results show that the CNN1 model matched or outperformed multi-label classification on the same BBBC dataset with 98.3% (Li et al. 98.3% & Meng et al. 94.17%)
Importance
o Rapid diagnosis & treatment are the best method to prevent severe outcomes & deaths
o No vaccine
o PCR costly & lack of equipment & health personnel
o Lack of health experts in low resource countries for manual microscopy identification
-> ML = cost-efficient, less expert knowledge necessary (the machine detects) & less equipment (no PCR, just a microscope & the ML tool)
Business Case
o Our business case focuses on leveraging machine learning to improve malaria detection, specifically targeting the P. vivax strain. With over 250 million cases and 600,000 deaths reported annually, malaria remains a significant global health challenge affecting 84 countries. By addressing the neglected P. vivax strain and utilizing advanced machine learning techniques, we aim to provide accurate and cost-effective malaria detection solutions
o Target businesses: clinics (in low resource countries), pharmaceutical companies, public healthcare agencies or NGOs, researchers
o Advantages: cost- & expert-efficient, enhanced diagnosis accuracy & research advancement
o Market: 84 countries affected by malaria (developing countries)
Why SVM?
o Used in the literature for similar tasks -> allows comparison with other results
o Classical machine learning comparison to deep learning
o Ability to handle high-dimensional data using kernel trick
(like random forest; in contrast to KNN, Naïve Bayes, linear models (logistic & linear regression))
o Robust against overfitting (by tuning C -> regularization parameter deciding size of hyperplane margin, if high -> low training error but lower generalization)
o Effective for binary classification (due to separation by hyperplane)
o Memory efficient (image with many pixel): Only have to save support vectors for decision boundaries (<-> KNN)
Why CNN? Why did use CNN and not fully connected network?
o A fully connected network needs the image flattened into a 1D vector -> all spatial information is lost (position and layout of elements in two-dimensional space); we want to identify the infected cell part, and it should not matter where exactly it is located
o Solution: convolutional layers -> each unit connects only to a local patch of the previous layer -> reduced number of connections needed
o Ability to learn & identify intricate patterns & structures
o Learn hierarchical representations
o Widely used for image classification
Why different CNNs?
o One built from scratch
o One adapted from similar classification task
o Comparing whether a CNN from a similar classification task is also good for another purpose (ours: vivax parasite, this dataset) -> is there a need for other, specifically trained models?
Why scaling between 0 & 1 of the image pixels (normalizing)?
o Equal importance of features: prevents features with large values from dominating the learning process -> otherwise the model is biased towards features with larger magnitude
o Enhancing convergence speed (convergence = reaching a stable & optimal state)
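A minimal sketch of this scaling step (the array shape and names are assumptions, not the project code):

```python
import numpy as np

# Hypothetical batch: 10 RGB images of 128x128 with uint8 pixels in [0, 255]
images = np.random.randint(0, 256, size=(10, 128, 128, 3), dtype=np.uint8)

# Scale to [0, 1] so no pixel/feature dominates purely because of its magnitude
images_scaled = images.astype("float32") / 255.0
print(images_scaled.min(), images_scaled.max())  # ~0.0 and ~1.0
```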
How did you handle unbalanced dataset?
o Undersampling: randomly choosing 5,000 uninfected images
o Oversampling: ADASYN (Adaptive Synthetic Sampling) -> see sketch below
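A minimal sketch of the two resampling ideas with imbalanced-learn (the toy arrays, class counts, and random seeds are assumptions):

```python
import numpy as np
from imblearn.under_sampling import RandomUnderSampler
from imblearn.over_sampling import ADASYN

# Hypothetical flattened features & binary labels (0 = uninfected, 1 = infected)
X = np.random.rand(6000, 100)
y = np.array([0] * 5500 + [1] * 500)

# Undersampling: randomly keep only 5,000 uninfected samples
rus = RandomUnderSampler(sampling_strategy={0: 5000}, random_state=42)
X_under, y_under = rus.fit_resample(X, y)

# Oversampling: ADASYN synthesizes new minority-class samples
ada = ADASYN(random_state=42)
X_bal, y_bal = ada.fit_resample(X_under, y_under)
print(np.bincount(y_bal))  # roughly balanced classes
```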
What is Data Augmentation?
o Increasing the amount of training data by creating new samples through transformations of existing data
o We used: geometric transformations: rotation & flipping (horizontal & vertical)
o Not: scaling or cropping -> would lose important details of the cell images (bounding boxes are close to each other)
o Used: ImageDataGenerator from TensorFlow
o Only for CNN (SVM serves as baseline: beneficial to keep the model as simple as possible to have a clear foundation for comparison with other more advanced approaches) -> could have tried with data augmentation as well
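A sketch of such an augmentation setup with ImageDataGenerator (the rotation range, batch size, and toy arrays are assumptions):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Geometric transformations only: rotation + horizontal/vertical flips
# (no scaling or cropping, so details near the bounding boxes are kept)
datagen = ImageDataGenerator(
    rescale=1.0 / 255,     # normalization as described above
    rotation_range=20,     # assumed value
    horizontal_flip=True,
    vertical_flip=True,
)

# Hypothetical stand-ins for the cell images & labels
X_train = np.random.randint(0, 256, size=(32, 128, 128, 3), dtype=np.uint8)
y_train = np.random.randint(0, 2, size=(32,))
augmented_batches = datagen.flow(X_train, y_train, batch_size=16)
```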
Why Data Augmentation?
o Introduces more variation and diversity into the training data -> model generalizes better
o Mitigating overfitting: larger & more diverse dataset -> the model learns general characteristics instead of memorizing training samples -> better generalization
Why not use Data Augmentation for SVM?
o The ImageDataGenerator from TensorFlow does not fit easily into the SVM workflow
o Baseline model -> should have “true” images
What optimizer?
Adam: Adaptive Moment estimation
- Extension of gradient descent
- Combines two algorithms: computes a first-order moment (moving average of the gradients) and a second-order moment (moving average of the squared gradients) of the loss function's gradients
-> adapts the learning rate for each parameter based on its historical gradients & momentum
- GD with momentum = taking into consideration the 'exponentially weighted average' of the gradients -> averaging = converges faster
- RMSProp algorithm = taking into consideration the 'exponential moving average' of the squared gradients (average change over time)
ADAM - elements
- Adaptive Learning Rate (dynamically adapts it for each parameter in network -> smooth & fast convergence)
- Momentum (helps accelerate convergence by adding a fraction of the previous gradient’s direction to the current gradient update -> overcome local minima)
- Bias correction (to address the bias of the moment estimates towards zero at the start of training)
Why did we use Adam?
Used in a similar study
Pro: efficiency & robustness, converges faster, works well on noisy & sparse data
o Difference to normal stochastic GD
Learning rate is adaptive (GD: maintains a single learning rate during training)
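A minimal sketch of plugging Adam (learning rate 0.001, its default) into a Keras model for binary classification; the tiny model here is only a placeholder, not the project architecture:

```python
from tensorflow import keras

# Placeholder model: the real CNN is sketched further below
model = keras.Sequential([
    keras.layers.Input(shape=(128, 128, 3)),
    keras.layers.Flatten(),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),  # adaptive per-parameter updates
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```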
What is momentum?
o Technique where a term is added to the parameter updates that accounts for the previous direction of movement
o E.g. a momentum value of 0.9: 90% of the previous direction is retained, and only 10% is influenced by the current gradient
What are different ways to choose the learning rate in gradient descent?
o Fixed learning rate: constant learning rate is set before training
o Learning rate schedules: adjust learning rate over time
Step decay: Learning rate is reduced by a certain factor after a fixed number of epochs or iterations
Exponential decay: The learning rate decreases exponentially over time.
Performance-Based Schedules: The learning rate is adjusted based on the performance on a validation set
o Adaptive learning rates: dynamically adjust the learning rate based on the behavior of the optimization process.
!!Adam combines adaptive learning rates with momentum. It adapts the learning rate for each parameter based on both the first-order (gradient) and second-order (gradient squared) moments of the gradients.
o Learning Rate Search: e.g. grid search
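Two hedged Keras examples of these options (the decay factors and patience value are assumptions): an exponential decay schedule and a performance-based schedule.

```python
from tensorflow import keras

# Exponential decay: learning rate shrinks by decay_rate every decay_steps steps
exp_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001, decay_steps=1000, decay_rate=0.96
)
optimizer_with_decay = keras.optimizers.Adam(learning_rate=exp_schedule)

# Performance-based schedule: halve the learning rate when validation loss plateaus
reduce_on_plateau = keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=3
)
# (use one approach or the other, not both at once)
```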
What does a learning rate of 0.001 mean?
o Represents the step size at which a machine learning algorithm updates the model parameters during training = magnitude of the adjustments made to the model based on the calculated gradients
o hyperparameter
o 0.001 = relatively low learning rate
o Small step size: doesn't overshoot the minimum but takes a longer time to converge
What does binary cross entropy loss mean?
o Cost function that measures the difference between the predicted probability distribution and the true probability distribution for a binary classification problem
o calculates the average of the logarithmic loss for each instance, where a higher loss is assigned to incorrect predictions and a lower loss to correct predictions
- Loss: H_p(q) = -1/N * Σ_i [ y_i * log(p(y_i)) + (1 - y_i) * log(1 - p(y_i)) ]
- y_i = true outcome (1 or 0)
- p(y_i) = predicted probability of that outcome
- log: if p = 1, then -log(p) = 0; the smaller p, the larger -log(p) -> higher loss
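A hand-computed example of this loss on a few hypothetical predictions:

```python
import numpy as np

y_true = np.array([1, 0, 1, 0])          # true labels
p_pred = np.array([0.9, 0.1, 0.6, 0.4])  # predicted probability of class 1

# Binary cross-entropy: average logarithmic loss over all instances
bce = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
print(bce)  # ~0.308: confident correct predictions contribute little loss
```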
What is overfitting?
o Performance high on training data but low on the validation set (low bias, but high variance)
o observing the learning curves reveals a pattern of overfitting when the training loss consistently decreases, but the validation loss begins to rise or remains stagnant.
o Goal: small gap between training & validation curve -> generalization good
What was done to prevent overfitting?
- CNN: Dropout layer, early stopping
- SVM: PCA: removes noise in the data and keeps only the most important features. Fewer dimensions reduce the risk of overfitting.
- Grid search & cross-validation
- Look at the result of each fold of cross-validation -> if approximately the same, no overfitting
How do you choose the number of folds (cross-validation)?
o higher number of folds -> more accurate estimate of performance but require more computational resources and time
o Smaller datasets may benefit from a higher number of folds, while larger datasets may be adequately assessed with fewer folds
o We used 3 folds (cv=3) to reduce the CPU time needed (we could have used 5)
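A small sketch of 3-fold cross-validation on toy data (the dataset and model here are placeholders); comparing the per-fold scores is the overfitting check described above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Toy data as a stand-in for the extracted image features
X, y = make_classification(n_samples=300, n_features=50, random_state=42)

scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=3)  # cv=3 keeps CPU time low
print(scores)         # one accuracy per fold; similar values -> no strong overfitting
print(scores.mean())
```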
What is GPU?
o GPU stands for Graphics Processing Unit
- is a specialized electronic circuit designed to quickly process and render graphics
- widely used for accelerating computations in areas such as ML
How did you tune hyperparameters in the CNNs? And which ones?
o With babysitting (manual trial-and-error tuning)
o number of hidden layers, units per layer, learning rate, epochs, batch size, early stop patience
What are epochs?
o Number of times entire training dataset is passed through NN during training process
o Each epoch consists of one forward & one backward pass (update of weights) for all training examples
o Result: NN can learn & refine parameters gradually
What is the activation function ReLu and why does it work well for images?
o Replaces all negative values (pixels) with 0, keep all positive values the same
o ReLU is easy to compute and computationally efficient; its gradient is cleanly defined and piecewise constant (0 for negative inputs, 1 for positive inputs)
o Introduces non-linearity into the network, allowing it to learn complex patterns (tiny sketch below)
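Tiny illustration of ReLU on a made-up feature map:

```python
import numpy as np

feature_map = np.array([[-2.0, 0.5],
                        [ 3.0, -0.1]])
relu_out = np.maximum(0, feature_map)  # negatives -> 0, positives unchanged
# [[0.  0.5]
#  [3.  0. ]]
```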
Why PCA only for SVM?
The CNN, as a deep learning model, learns its own feature representations (its weights act as feature extractors)
The SVM, as a classical machine learning model, works directly on the input features -> PCA reduces the dimensionality & keeps the most important features beforehand
What are kernel functions? (SVM)
To classify non-linearly separable data
Transform the data into a higher-dimensional space and then find the optimal decision boundary in this new high-dimensional space
Functions include: linear, polynomial, radial basis function
Which kernel function did we use & why?
Radial basis function (RBF): can be viewed as combining polynomial kernels of all degrees -> projects the non-linearly separable data into a (potentially infinite-dimensional) higher-dimensional space
Selected via grid search
Why grid search to select hyperparameters
- Random search & trial-and-error: need more time, are random and not a structured search, and thus expected to provide worse hyperparameters
- PCA -> fewer dimensions -> grid search stays computationally feasible
What are relevant hyperparameters of SVM?
C = controls balance between maximizing margin & minimizing training error
* Small -> wide "street" (margin) separating the 2 classes, but datapoints might fall within the margin -> larger training error but better generalization
* Large -> small "street" -> lower training error but lower generalization
Gamma = influences how the decision boundary adapts to individual datapoints = defines how many datapoints are considered for the hyperplane
* Small -> considers many datapoints
* Large -> considers few datapoints
What were your optimal hyperparameters? What did they tell you? (SVM)
C = 100
* Relatively high -> small street, lower training error, but lower generalization
gamma = 0.0001
* relatively low -> decision boundary less adjusted to individual datapoints, considers a broader range of training instances when determining the decision boundary -> more generalizable (see grid-search sketch below)
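A sketch of how such a PCA + RBF-SVM grid search can be set up in scikit-learn (toy data; the number of PCA components and most grid values are assumptions, with C=100 and gamma=0.0001 included in the grid):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy stand-in for the flattened cell-image features
X, y = make_classification(n_samples=500, n_features=200, random_state=42)

pipe = Pipeline([
    ("pca", PCA(n_components=50)),   # fewer dimensions -> grid search stays tractable
    ("svm", SVC(kernel="rbf")),
])
param_grid = {
    "svm__C": [1, 10, 100],                # margin vs. training-error trade-off
    "svm__gamma": [0.0001, 0.001, 0.01],   # how local the decision boundary is
}
search = GridSearchCV(pipe, param_grid, cv=3)  # 3 folds to limit CPU time
search.fit(X, y)
print(search.best_params_)
```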
Accuracy
o Accuracy: measures the overall correctness of the model, ratio of correctly predicted instances to the total number of instances
Precision
o Precision: measures the model's ability to predict positive instances correctly -> ratio of true positives to all predicted positives (TP / (TP + FP)), i.e. reflects how many false positives occur
Recall
o Recall: ratio of true positives to the sum of true positives and false negatives (TP / (TP + FN)) -> reflects how many false negatives occur (see metrics sketch below)
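A quick sketch of the three metrics with scikit-learn on made-up predictions:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = infected, 0 = uninfected (hypothetical)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy_score(y_true, y_pred))   # correct predictions / all predictions
print(precision_score(y_true, y_pred))  # TP / (TP + FP): penalizes false positives
print(recall_score(y_true, y_pred))     # TP / (TP + FN): penalizes false negatives
```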
Which CNN models did you choose? And what are the differences?
CNN1: self-developed model <-> CNN2: adapted from similar paper
Differences between CNN models
o CNN1:
padding (because of cropping)
more layers (5 conv., 5 maxpooling, 2 dropout, 2 dense)
dropout
smaller kernel size (for malaria important -> consider local features, don’t miss info)
max pooling (<-> average) (better because max focuses on most important / striking features in surrounding)
2 (<-> 3) dense layers
Dense layers with 512 units (<-> 256), more information -> capture more complex patterns & nuances in the input (also due to larger image size)
dense layers have activation function (introduction of non-linearity)
Potential explanations based on differences from the CNN2 paper:
o Other parasite
o different images (background etc.)
o no cropping
o smaller images (44x44 vs. our 128x128) -> we can use more pooling layers -> using their architecture directly does not work well (feature maps stay large with only a few pooling layers); a rough architecture sketch follows below
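A rough Keras sketch of a CNN1-style architecture following the bullets above (5 conv + max-pooling blocks, 3x3 kernels, 'same' padding, dropout, 2 dense layers, 128x128 input); the filter counts and dropout rates are assumptions, not the exact project values:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(128, 128, 3)),
    # 5 conv + max-pooling blocks with small 3x3 kernels and 'same' padding
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),   # 128 -> 4 after five poolings
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(512, activation="relu"),  # dense layer with 512 units + activation
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # binary output: infected vs. uninfected
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```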
Which kernel size should you use?
a. In our case: used small one (3,3) -> because malaria part can be small
b. Also don't want to risk missing information or patterns that indicate malaria
Why always ReLU?
o introduces non-linearity into the network, allowing it to learn complex patterns
o computationally efficient
o avoiding vanishing gradients
o Images typically consist of pixel intensities ranging from 0 to 255 -> black = 0, no negative values needed
o Non-linearity: for negative inputs (x < 0) -> output 0. This kink at 0 breaks the overall linearity of the network
o For positive inputs: still linear (identity) -> the network can capture both linear & non-linear relationships
What is padding? Why did we use it?
Zero borders added around the image pixels; information sits close to the border (we had bounding boxes near the edges) -> wanted to classify this well
Why did CNN 2 use average and max pooling and which one is better
- Max pooling better for image classification
- features tend to encode the spatial presence of some pattern or concept over the different tiles of the feature map (hence, the term feature map),
- more informative to look at maximal presence of different features than at their average presence
- Max pooling keeps the full contrast (strongest activation) instead of diluting it by averaging
Why not KNN?
- Does not scale well: as the dataset grows, KNN becomes increasingly inefficient, compromising overall model performance (scaling problem)
- prone to overfitting
- Curse of dimensionality: volume of space grows exponentially with dimensions, Need more points to ‘fill’ a high-dimensional volume
Why not Random Forest?
- RF better for multi-class classification
- does not handle high-dimensional data as well as SVM
- research showed that RF performs better with 10 to 100 features, SVM with > 100 features (based on a paper comparing studies that used RF or SVM & how they performed)
- could be a good option as well (but literature said that SVM outperformed for Malaria detection)
Logistic regression
- linear model -> cannot really capture the non-linear structure of the images
- risk of overfitting with many independent variables
- limited in capturing complexity
In the CNN, why use dropout and early stopping?
They address different aspects of overfitting:
* Early stopping: control the capacity of the model by stopping training before it starts overfitting
* Dropout: introduces randomness during training, preventing the network from relying too heavily on any specific set of features or neurons
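A minimal sketch of the early-stopping callback (the patience value is an assumption); dropout layers appear in the CNN1 sketch above:

```python
from tensorflow import keras

# Stop training once validation loss stops improving and keep the best weights
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)
# Hypothetical training call (model & data as defined elsewhere):
# model.fit(X_train, y_train, validation_split=0.2, epochs=50, callbacks=[early_stop])
```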