Troubleshooting Flashcards

Question 1

Q

Why does your model perform worse than the authors'?

Answer

A

Implementation bugs
Hyperparameter choices - the model could be extremely sensitive to them
Data/model fit - your data differs from the data used in the paper
Dataset construction - in industry, most time is spent on datasets rather than on models

Question 2

Q

In general, what should the strategy be?

Answer

A

Pessimism.

Because debugging is hard, start with the very simplest things and then increase the complexity step by step.

Question 3

Q

In general, what is the process of building a model?

Answer

A

Start simple - choose a simple model and simple data

Implement and debug the model

Evaluate the results

Tune the hyperparameters

Improve the data/model

Question 4

Q

When starting simple, which architecture should you choose?

Answer

A

For images, start with a LeNet-like architecture, then move to ResNet

For sequences, start with a Transformer/attention model, then move to a WaveNet-like model

For other data, start with a simple fully connected network

For multiple inputs (say, a picture with a phrase), start by mapping each input into a lower-dimensional feature space: for example, run the image through a ConvNet and flatten the result, and run the sequence through an LSTM and keep the final vector. Then concatenate everything and pass the result through a fully connected layer
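
The multiple-input recipe above can be sketched roughly as follows. This is a shape-level illustration in NumPy, not a trainable model: a flatten-and-project step stands in for the ConvNet, a plain tanh RNN stands in for the LSTM, and all sizes, weights, and function names (`encode_image`, `encode_sequence`, `fuse`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def encode_image(image, weights):
    """Stand-in for a ConvNet: flatten the image and project to a small feature vector."""
    return np.tanh(image.reshape(-1) @ weights)


def encode_sequence(tokens, w_in, w_hidden):
    """Stand-in for an LSTM: a plain tanh RNN that keeps only the final hidden state."""
    hidden = np.zeros(w_hidden.shape[0])
    for token_vec in tokens:
        hidden = np.tanh(token_vec @ w_in + hidden @ w_hidden)
    return hidden


def fuse(image, tokens, params):
    """Encode each input separately, concatenate, then apply one fully connected layer."""
    image_features = encode_image(image, params["image"])
    sequence_features = encode_sequence(tokens, params["seq_in"], params["seq_hidden"])
    combined = np.concatenate([image_features, sequence_features])
    return combined @ params["fc_w"] + params["fc_b"]


# Hypothetical sizes, chosen only for illustration.
IMAGE_SHAPE, TOKEN_DIM, FEATURE_DIM, NUM_CLASSES = (8, 8), 5, 16, 3
params = {
    "image": rng.normal(scale=0.1, size=(IMAGE_SHAPE[0] * IMAGE_SHAPE[1], FEATURE_DIM)),
    "seq_in": rng.normal(scale=0.1, size=(TOKEN_DIM, FEATURE_DIM)),
    "seq_hidden": rng.normal(scale=0.1, size=(FEATURE_DIM, FEATURE_DIM)),
    "fc_w": rng.normal(scale=0.1, size=(2 * FEATURE_DIM, NUM_CLASSES)),
    "fc_b": np.zeros(NUM_CLASSES),
}

image = rng.random(IMAGE_SHAPE)           # one grayscale "picture"
tokens = rng.normal(size=(4, TOKEN_DIM))  # a "phrase" of 4 token vectors
logits = fuse(image, tokens, params)
print(logits.shape)  # (3,)
```

Because each encoder outputs a fixed-size vector, the fully connected layer does not depend on the image size or the phrase length.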

Question 5

Q

Optimizer defaults

Answer

A

Adam with a learning rate of 3e-4
ReLU for CNNs, tanh for LSTMs
Regularization - none
Normalization - none (meaning layers such as batch normalization, not normalization of the input)
Both are none because they are a source of bugs
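
One way to keep these defaults in a single place is a small config object, sketched below. The `BaselineConfig` class and its field names are hypothetical and not tied to any framework. Only the values come from the card.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Default activation per model family, as listed on the card.
DEFAULT_ACTIVATIONS = {"cnn": "relu", "lstm": "tanh"}


@dataclass(frozen=True)
class BaselineConfig:
    model_family: str
    optimizer: str = "adam"
    learning_rate: float = 3e-4
    regularization: Optional[str] = None  # off at first: a source of bugs
    normalization: Optional[str] = None   # e.g. batch norm; input normalization is separate

    @property
    def activation(self) -> str:
        """Look up the default activation for this model family."""
        return DEFAULT_ACTIVATIONS[self.model_family]


config = BaselineConfig(model_family="cnn")
print(asdict(config))
print(config.activation)  # relu
```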

Question 6

Q

Should I normalize the input data?

Answer

A

Yes!
Make sure to do it!
Also check that it is not already being done automatically, so you do not normalize twice
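
A minimal sketch of both points, assuming the inputs are plain lists of numeric features and that "normalize" means standardizing each feature to zero mean and unit variance. The helper names (`fit_normalizer`, `normalize`, `looks_normalized`) and the tolerance are hypothetical.

```python
from statistics import fmean, pstdev


def fit_normalizer(train_rows):
    """Compute per-feature mean and standard deviation from the training rows only."""
    columns = list(zip(*train_rows))
    means = [fmean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def normalize(rows, means, stds):
    """Standardize every row using previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def looks_normalized(rows, tolerance=0.1):
    """Rough check that every feature already has mean near 0 and std near 1."""
    means, stds = fit_normalizer(rows)
    return all(abs(m) <= tolerance for m in means) and all(
        abs(s - 1.0) <= tolerance for s in stds
    )


train = [[0.0, 100.0], [5.0, 200.0], [10.0, 300.0]]
means, stds = fit_normalizer(train)
if not looks_normalized(train):  # skip if something upstream already did it
    train = normalize(train, means, stds)
print(looks_normalized(train))  # True
```

The same `means` and `stds` fitted on the training set would then be reused for validation and test data.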

Question 7

Q

How can you simplify the problem so you can start easy?

Answer

A

Small training set (fewer than 10,000 examples)
Fewer classes/objects, smaller pictures, etc.
Create a simple synthetic training set
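
These simplifications might look like the sketch below, assuming a dataset is a list of `(features, label)` pairs. Both helper functions and the toy labeling rule are hypothetical, chosen only to illustrate the idea.

```python
import random


def shrink_dataset(examples, keep_classes, max_examples=10_000, seed=0):
    """Keep only some classes and at most max_examples randomly chosen examples."""
    filtered = [(x, y) for x, y in examples if y in keep_classes]
    rng = random.Random(seed)
    rng.shuffle(filtered)
    return filtered[:max_examples]


def make_synthetic_dataset(num_examples=1_000, seed=0):
    """Toy binary task: the label is 1 when the two features sum to more than 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_examples):
        x = (rng.random(), rng.random())
        data.append((x, int(x[0] + x[1] > 1.0)))
    return data


synthetic = make_synthetic_dataset()
small = shrink_dataset(synthetic, keep_classes={1}, max_examples=100)
print(len(synthetic), len(small))
```

A task with a known rule like this makes it easy to tell whether a poor result comes from the model code or from the data.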

Question 8

Q

Three general pieces of advice for implementing the model

Answer

A

Lightweight implementation
Fewer than 200 lines of code…
Off-the-shelf components
Build complicated pipelines later