Deep Learning Flashcards
How can we use deep learning for NLP
a data-driven approach:
take the input text and embed it in a high-dimensional vector space
run a prediction model on these embeddings
both the embedding and the prediction model are usually neural networks
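a minimal sketch of this pipeline in PyTorch; the sizes, the mean pooling, and the single linear layer are illustrative assumptions, not from the card:

```python
import torch
import torch.nn as nn

# hypothetical sizes: 10k-token vocabulary, 128-dim embeddings, 3 output classes
VOCAB_SIZE, EMBED_DIM, NUM_CLASSES = 10_000, 128, 3

embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)    # the embedding network
predictor = nn.Linear(EMBED_DIM, NUM_CLASSES)      # the prediction model on top

token_ids = torch.randint(0, VOCAB_SIZE, (1, 12))  # a fake 12-token input text
vectors = embedding(token_ids)                     # (1, 12, 128): text in vector space
pooled = vectors.mean(dim=1)                       # crude pooling over the sequence
scores = predictor(pooled)                         # one score per class
```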
How can we use time series for NLP
a time series is a set of data points in time order. for us, the data points are words and the time order is their order of appearance in the text
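a small illustration of this view, assuming a toy whitespace tokenizer and a made-up vocabulary:

```python
# the "time series" is just the tokens in appearance order
text = "the cat sat on the mat"
vocab = {w: i for i, w in enumerate(sorted(set(text.split())))}  # hypothetical vocab
series = [vocab[w] for w in text.split()]  # ordered data points, one per word
print(series)  # [4, 0, 3, 2, 4, 1] -- order is preserved, unlike bag of words
```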
What tasks can we perform using deep learning (4)
sequence classification
sequence labelling
span extraction
sequence to sequence translation
what is sequence classification
the output is a probability distribution over the classes the text may belong to.
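continuing the sketch above: classification means the per-class scores are normalised into one probability distribution for the whole text (the class labels here are made up):

```python
import torch

scores = torch.tensor([2.0, -1.0, 0.5])   # per-class scores for one text
probs = torch.softmax(scores, dim=-1)     # one distribution over all classes
for label, p in zip(["sports", "politics", "tech"], probs.tolist()):
    print(f"{label}: {p:.2f}")            # probabilities sum to 1
```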
what is span extraction and how is it treated as a classification problem
return 2 probability distributions over token positions,
one for each position being the start of the span
one for each position being the end of the span
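a minimal sketch of the two-distribution idea, assuming per-token hidden states from some encoder (shapes and random inputs are illustrative stand-ins):

```python
import torch
import torch.nn as nn

SEQ_LEN, HIDDEN = 10, 64
hidden_states = torch.randn(SEQ_LEN, HIDDEN)      # per-token encoder outputs (stand-in)

span_head = nn.Linear(HIDDEN, 2)                  # one score for start, one for end
start_logits, end_logits = span_head(hidden_states).unbind(dim=-1)

start_probs = torch.softmax(start_logits, dim=0)  # distribution over positions: span start
end_probs = torch.softmax(end_logits, dim=0)      # distribution over positions: span end
span = (start_probs.argmax().item(), end_probs.argmax().item())
print(span)                                       # predicted (start, end) token positions
```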
what does an ML model for sequence labelling output
the output is a probability distribution over classes for each token
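a sketch of the same pattern, but now every token gets its own distribution (the tag count is a made-up example):

```python
import torch
import torch.nn as nn

SEQ_LEN, HIDDEN, NUM_TAGS = 5, 64, 4              # e.g. 4 hypothetical POS tags
hidden_states = torch.randn(SEQ_LEN, HIDDEN)      # per-token encoder outputs (stand-in)

tag_head = nn.Linear(HIDDEN, NUM_TAGS)
tag_probs = torch.softmax(tag_head(hidden_states), dim=-1)  # (SEQ_LEN, NUM_TAGS)
print(tag_probs.sum(dim=-1))                      # each row sums to 1: one distribution per token
```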
examples of sequence labelling
such as POS tagging, named entity recognition, and open information extraction (question type classification, by contrast, is a sequence classification task)
how is sequence to sequence translation performed
the input is a sequence and the output is another sequence, typically of a different length
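a compact sketch of one common way to do this, an encoder-decoder built from GRUs; everything here (sizes, greedy decoding, the begin-token id) is an illustrative assumption:

```python
import torch
import torch.nn as nn

VOCAB, EMBED, HIDDEN, BOS = 1000, 32, 64, 0       # hypothetical sizes; token 0 = "begin"

embed = nn.Embedding(VOCAB, EMBED)
encoder = nn.GRU(EMBED, HIDDEN, batch_first=True)
decoder = nn.GRU(EMBED, HIDDEN, batch_first=True)
out_head = nn.Linear(HIDDEN, VOCAB)

src = torch.randint(0, VOCAB, (1, 8))             # fake 8-token source sequence
_, state = encoder(embed(src))                    # compress the input into a state

token, output = torch.tensor([[BOS]]), []
for _ in range(6):                                # greedily emit up to 6 output tokens
    dec_out, state = decoder(embed(token), state)
    token = out_head(dec_out[:, -1]).argmax(dim=-1, keepdim=True)
    output.append(token.item())
print(output)                                     # the generated (untrained, random) sequence
```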
applications for sequence to sequence translation
translation, summarisation, text generation, question answering
what is the major disadvantage of bag of words
we lose the word order: "dog bites man" and "man bites dog" get identical representations
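a quick demonstration in plain Python:

```python
from collections import Counter

a = Counter("dog bites man".split())
b = Counter("man bites dog".split())
print(a == b)  # True: identical bag-of-words counts despite opposite meanings
```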
what is an RNN
a recurrent neural network: when processing each token it uses the hidden state produced for the previous token, so information is carried along the sequence
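a minimal sketch of the recurrence with numpy; the sizes and random weights are illustrative:

```python
import numpy as np

INPUT, HIDDEN, rng = 8, 16, np.random.default_rng(0)
W_x = rng.normal(size=(HIDDEN, INPUT))            # input-to-hidden weights
W_h = rng.normal(size=(HIDDEN, HIDDEN))           # hidden-to-hidden (the recurrent part)
b = np.zeros(HIDDEN)

h = np.zeros(HIDDEN)                              # state carried between tokens
for x_t in rng.normal(size=(5, INPUT)):           # 5 fake token vectors, in time order
    h = np.tanh(W_x @ x_t + W_h @ h + b)          # new state depends on the previous one
print(h[:4])                                      # final state summarises the whole sequence
```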
what is the vanishing gradient problem
backpropagating through many time steps multiplies together many factors that are typically smaller than one, so the gradients shrink towards zero and the weight updates for early time steps become negligible
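a toy numeric illustration; the 0.9 factor is an arbitrary assumption standing in for a per-step gradient factor below one:

```python
grad = 1.0
for step in range(100):          # backpropagating through 100 time steps
    grad *= 0.9                  # each step multiplies in a factor below 1
print(grad)                      # ~2.7e-05: effectively zero update for early tokens
```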
why is the vanishing gradient problem prevalent in nlp
because of long-term dependencies: words that are far apart in the text can depend on each other (e.g. subject-verb agreement across a long clause), so gradients must survive many time steps
what is the solution to the vanishing gradient problem
the LSTM (long short-term memory network)
what is an LSTM
it maintains a context (cell) vector that retains information from past calculations.
at each step its gates decide how much to add to the context vector and whether anything should be discarded
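a minimal sketch of one LSTM step with numpy; the gate weights are random stand-ins and the sizes are illustrative, but the update follows the standard gating equations:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

INPUT, HIDDEN, rng = 8, 16, np.random.default_rng(0)
W = {g: rng.normal(size=(HIDDEN, INPUT + HIDDEN)) for g in "fiog"}  # forget/input/output/candidate

def lstm_step(x_t, h, c):
    z = np.concatenate([x_t, h])
    f = sigmoid(W["f"] @ z)          # forget gate: what to discard from the context
    i = sigmoid(W["i"] @ z)          # input gate: how much new information to add
    o = sigmoid(W["o"] @ z)          # output gate: what to expose as the new hidden state
    g = np.tanh(W["g"] @ z)          # candidate values to write into the context
    c = f * c + i * g                # updated context (cell) vector
    return o * np.tanh(c), c

h, c = np.zeros(HIDDEN), np.zeros(HIDDEN)
for x_t in rng.normal(size=(5, INPUT)):  # 5 fake token vectors, in time order
    h, c = lstm_step(x_t, h, c)
```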