Lesson 4 Flashcards
How does NLP transfer learning work?
1) Fit a language model that predicts the next word of a sentence. This is hard! You need to know a lot about English and a lot about the world! E.g., fit on the WikiText-103 dataset: most of the largest articles on Wikipedia, about 103M tokens. This is the pre-trained model. 2) Transfer learning… fine-tune it to predict the next word of your domain, aka the target corpus (e.g., movie reviews). Don’t need any labels at all! “Self-supervised model”. 3) Fine-tune a classifier with labels on a smaller set.
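To see why steps 1 and 2 need no labels, here is a minimal sketch (a toy, not the fastai pipeline): the raw text supplies both the input and the target. The function name next_word_pairs and the context_size parameter are made up for illustration.

```python
# Toy illustration of self-supervision: the next word is the label,
# so plain text is enough to build training pairs.
def next_word_pairs(text, context_size=3):
    """Return (context, target) pairs built from raw text alone."""
    tokens = text.lower().split()
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tuple(tokens[i - context_size:i])  # the words seen so far
        pairs.append((context, tokens[i]))           # the word to predict
    return pairs


pairs = next_word_pairs("the movie was great and the acting was superb")
for context, target in pairs[:3]:
    print(context, "->", target)
```

A real language model uses a much longer context and a learned tokenizer; the point is only that no human labelling is involved.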
Trick in creating language model?
Use all the text, train and test together, to train the language model! It needs no labels.
What is process to fit and fine tune language model?
language_model_learner(…) creates an RNN; drop_mult=0.3 is a multiplier applied to the model’s dropout. Then: lr_find; fit_one_cycle; unfreeze; learn.fit_one_cycle(10, …). 0.30 accuracy is great (so ~1/3 of the time you can predict the exact next word!) ** This training could take over a day ** Use learn.predict(…) to check the output is sensible; you are generating sentences. 26:30
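What "0.30 accuracy" means can be sketched in a few lines: the fraction of positions where the predicted token is exactly the true next token. This is a toy calculation with made-up predictions, not the fastai metric implementation.

```python
def next_word_accuracy(predicted, actual):
    """Fraction of positions where the predicted token equals the true next token."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up example: 3 of the 10 guesses match the real next word.
actual = ["was", "great", "and", "the", "acting", "was", "superb", ".", "i", "loved"]
predicted = ["was", "a", "and", "it", "plot", "was", "good", ",", "we", "liked"]
acc = next_word_accuracy(predicted, actual)
print(f"{acc:.2f}")
```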
How do you go from language model to classifier?
Save the encoder with learn.save_encoder('fine_tuned_enc') (don’t need the decoder, which is the generator). Need to ensure you use the SAME VOCABULARY as the language model. Then: learn = text_classifier_learner(clf, drop_mult=0.5); learn.load_encoder('fine_tuned_enc'); learn.freeze(); lr_find; learn.fit_one_cycle(…)
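Why the vocabulary must match: the encoder’s embedding rows are indexed by token id, so the same word has to map to the same id in both models. A minimal sketch with hypothetical helpers build_vocab and numericalize (fastai builds its vocab differently; only the "xxunk" marker is borrowed from it):

```python
UNK = "xxunk"  # fastai's unknown-token marker, used here only as a label


def build_vocab(tokens):
    """Map each distinct token to an id; id 0 is reserved for unknown tokens."""
    itos = [UNK] + sorted(set(tokens))
    stoi = {tok: i for i, tok in enumerate(itos)}
    return itos, stoi


def numericalize(tokens, stoi):
    """Turn tokens into ids, falling back to the unknown id."""
    return [stoi.get(tok, stoi[UNK]) for tok in tokens]


# Vocabulary fixed when the language model was trained.
lm_itos, lm_stoi = build_vocab("the movie was great".split())

# Reusing it: known words keep their ids, unseen words become unknown.
ids = numericalize("the movie was awful".split(), lm_stoi)

# Rebuilding a vocab from the classifier text shifts the ids,
# so the saved embedding rows would no longer line up.
other_itos, other_stoi = build_vocab("a great movie".split())
print(lm_stoi["great"], other_stoi["great"])
```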
What does learn.freeze_to(-2) mean?
Freeze everything except the last two layer groups, i.e. just unfreeze the last two.
What is process to fine tune classifier?
fit_one_cycle(); learn.freeze_to(-2); fit_one_cycle(); learn.freeze_to(-3); fit_one_cycle(). It helps with text classification to unfreeze one layer at a time; lastly, unfreeze the entire thing. 31:29
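The freeze_to indexing can be sketched with a toy learner that only tracks which layer groups are trainable. ToyLearner, LayerGroup and the group names are hypothetical stand-ins; the slicing rule (groups before index n frozen, the rest trainable) is the part that mirrors fastai 1.x.

```python
class LayerGroup:
    def __init__(self, name):
        self.name = name
        self.trainable = True


class ToyLearner:
    """Hypothetical stand-in for a learner; it only tracks trainable groups."""

    def __init__(self, names):
        self.layer_groups = [LayerGroup(n) for n in names]

    def freeze_to(self, n):
        # Groups before index n are frozen; groups from n onward are trainable.
        for g in self.layer_groups[:n]:
            g.trainable = False
        for g in self.layer_groups[n:]:
            g.trainable = True

    def freeze(self):
        self.freeze_to(-1)  # only the last group (the head) trains

    def unfreeze(self):
        self.freeze_to(0)  # every group trains

    def trainable_names(self):
        return [g.name for g in self.layer_groups if g.trainable]


learn = ToyLearner(["embedding", "rnn1", "rnn2", "rnn3", "head"])
schedule = []
for step in (learn.freeze, lambda: learn.freeze_to(-2),
             lambda: learn.freeze_to(-3), learn.unfreeze):
    step()  # in real training a fit_one_cycle() call would follow each step
    schedule.append(learn.trainable_names())

for names in schedule:
    print(names)
```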
Discriminative learning rate
How much do I decrease the learning rate as I move from layer to layer
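What is 2.6?
The factor by which the learning rate drops from one layer group to the next in the NLP classifier, e.g. slice(lr/(2.6**4), lr). 35:4- Stephen Merity, Frank Hutter; how you can use a random forest to find optimal hyperparameters. Like AutoML.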
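A minimal sketch of the arithmetic, assuming 2.6 is the per-layer-group learning rate factor and the model has five layer groups (so the lowest rate is lr/2.6**4). The helper discriminative_lrs is hypothetical; fastai derives the same kind of geometric spread from a slice(start, end).

```python
def discriminative_lrs(max_lr, n_groups, factor=2.6):
    """Learning rate per layer group, earliest group first.

    The last group gets max_lr; each earlier group gets `factor` times less.
    """
    return [max_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]


lrs = discriminative_lrs(1e-2, n_groups=5)
for i, lr in enumerate(lrs):
    print(f"group {i}: {lr:.2e}")
```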
What do you use embeddings for?
Categorical data is converted to embeddings; continuous data is fed in as is.
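As a rough sketch of that split: a categorical column becomes a lookup into a table of small vectors, and continuous columns are appended unchanged. The column names and the helper functions are invented for illustration, and the vectors are random here, whereas a real model learns them.

```python
import random

random.seed(0)


def make_embedding(categories, dim):
    """One small random vector per category; in a real model these are learned."""
    return {c: [random.uniform(-1, 1) for _ in range(dim)] for c in categories}


day_emb = make_embedding(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"], dim=4)


def encode_row(day_of_week, temperature, distance_km):
    # Categorical column -> embedding lookup; continuous columns pass through.
    return day_emb[day_of_week] + [temperature, distance_km]


x = encode_row("Sat", 21.5, 3.2)
print(len(x), x[-2:])
```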
How do we deal with missing data?
Replace missing values with the median and add a binary is_missing column.
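The median-plus-flag idea in plain Python, as a sketch for a single continuous column where None marks a missing value. fill_missing is a hypothetical helper, not the fastai transform.

```python
from statistics import median


def fill_missing(values):
    """Return (filled_values, is_missing_flags) for one continuous column."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no observed values")
    fill = median(present)  # median of the observed values only
    filled = [fill if v is None else v for v in values]
    flags = [v is None for v in values]  # keeps the fact that a value was missing
    return filled, flags


filled, flags = fill_missing([3.0, None, 10.0, 5.0, None])
print(filled)
print(flags)
```

The flag column matters because missingness itself can be predictive; filling alone would hide it from the model.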
How can you make a validation set with contiguous periods in fastai 1.x?
TabularList.from_df(…).split_by_idx(valid_idx), where valid_idx holds the indices of the contiguous validation rows
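A sketch of how the indices for a contiguous validation period might be chosen, assuming the rows are sorted oldest first so the last rows are the most recent period. Both helpers are toy stand-ins; split_by_idx here only mimics the idea of the fastai method of the same name.

```python
def contiguous_valid_idx(n_rows, valid_size):
    """Indices of the last `valid_size` rows, assuming rows are sorted by date."""
    if not 0 < valid_size < n_rows:
        raise ValueError("valid_size must be between 1 and n_rows - 1")
    return list(range(n_rows - valid_size, n_rows))


def split_by_idx(rows, valid_idx):
    """Toy split: rows at valid_idx go to validation, the rest to training."""
    valid = set(valid_idx)
    train_rows = [r for i, r in enumerate(rows) if i not in valid]
    valid_rows = [r for i, r in enumerate(rows) if i in valid]
    return train_rows, valid_rows


days = [f"day{i:02d}" for i in range(10)]
valid_idx = contiguous_valid_idx(len(days), valid_size=3)
train_rows, valid_rows = split_by_idx(days, valid_idx)
print(valid_idx, valid_rows)
```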
How do you make the tabular learner?
get_tabular_learner(data, layers=[200,100], metrics=…); later fastai 1.x releases call it tabular_learner
What is collab filtering?
Recommender system… who likes what, across a bunch of users. The simplest dataset is: userid, movieid, numberofstars. Think of it as a big sparse matrix with movies on one axis, users on the other, and the rating as the value.
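The (userid, movieid, numberofstars) triples and the sparse-matrix view can be sketched like this. The ratings are made up, and the nested dict is just one simple way to hold a sparse matrix.

```python
# Made-up ratings: (userid, movieid, numberofstars)
ratings = [
    (1, 10, 5.0),
    (1, 20, 3.0),
    (2, 10, 4.0),
    (3, 30, 2.0),
]


def to_sparse(triples):
    """Store only the ratings that exist: matrix[user][movie] = stars."""
    matrix = {}
    for user_id, movie_id, stars in triples:
        matrix.setdefault(user_id, {})[movie_id] = stars
    return matrix


matrix = to_sparse(ratings)
users = sorted(matrix)
movies = sorted({m for row in matrix.values() for m in row})
filled = sum(len(row) for row in matrix.values())
density = filled / (len(users) * len(movies))  # share of cells with a rating

print(matrix[1].get(20))  # a known rating
print(matrix[2].get(30))  # None: user 2 has not rated movie 30
```

Collaborative filtering is about predicting those empty cells from the filled ones.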
Cold start problem?
Have a second, metadata-driven model for new users or new movies; or, like the Netflix UX, ask a bunch of questions at sign-up.
For tabular time series
Jeremy says not to use an RNN when there are other features you can use (store open? promotion? weather? day of week? etc.).