Chapter 11 Flashcards
Natural-language processing (NLP)
getting computers to deal with human language
In the 1990s, rule-based NLP approaches were overshadowed by more successful statistical approaches, in which massive data sets were employed to train machine-learning algorithms.
Most recently, this statistical data-driven NLP approach has focused on
deep learning
deep learning’s first major success in NLP
automated speech recognition
automated speech recognition is still not at “human level”
- Background noise can significantly hurt the accuracy of these systems.
- these systems are occasionally thrown off by unusual words or phrases in a way that highlights their lack of understanding of the speech they are transcribing.
sentiment classification
An AI system that could accurately classify a sentence (or longer passage) as to its sentiment—positive, negative, or some other degree of opinion
Some early NLP systems looked for the presence of individual words or short sequences of words as indications of the sentiment of a text.
Looking at single words or short sequences in isolation is generally not sufficient to glean the overall sentiment;
it’s necessary to capture the semantics of words in the context of the whole sentence.
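To make that limitation concrete, here is a minimal sketch of the word-list approach (the word lists and function are hypothetical, not taken from any actual system):

```python
import re

# Hypothetical word lists -- real systems used much larger, hand-built lists.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "awful", "boring", "terrible"}

def keyword_sentiment(sentence):
    words = re.findall(r"[a-z]+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(keyword_sentiment("A great, enjoyable film"))        # -> positive
print(keyword_sentiment("This movie was not good at all")) # -> positive (wrong!)
```

The second call shows the failure mode: the sentence is negative, but because "good" appears in it, word counting calls it positive. Negation is invisible to this approach.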
recurrent neural networks (RNNs)
inspired by ideas on how the brain interprets sequences
key difference between a traditional neural network and a recurrent neural network
The most important difference is that the RNN's hidden units have additional "recurrent" connections: each hidden unit has a connection to itself and to the other hidden units.
Unlike a traditional neural network, an RNN operates over a series of time steps
At each time step, the RNN is fed an input and computes the activation of its hidden and output units just as a traditional neural network does.
But in an RNN each hidden unit computes its activation based on both the input and the activations of the hidden units from the previous time step.
This gives the network a way to interpret the words it “reads” while remembering the context of what it has already “read.”
At each time step, the hidden units’ activations constitute:
the network’s encoding of the partial sentence it has seen so far.
The network keeps refining that encoding as it continues to process words.
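A minimal numpy sketch of this time-step computation (the sizes, weight names, and tanh activation are illustrative assumptions, not the book's actual network):

```python
import numpy as np

vocab_size, hidden_size = 20_000, 2   # illustrative sizes

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.01, size=(hidden_size, vocab_size))   # input -> hidden weights
W_hh = rng.normal(scale=0.01, size=(hidden_size, hidden_size))  # recurrent connections
b_h = np.zeros(hidden_size)

def rnn_encode(one_hot_inputs):
    """one_hot_inputs: one vector per time step (one per word)."""
    h = np.zeros(hidden_size)   # hidden activations start at zero
    for x in one_hot_inputs:
        # Each hidden unit's new activation depends on the current input
        # AND the hidden activations from the previous time step.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h   # the network's encoding of the words seen so far
```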
END symbol
After the last word in the sentence, the network is given a special END symbol, which tells the network that the sentence is finished.
appended by humans to each sentence before the text is fed to the network
Because the network stops encoding the sentence only when it encounters the END symbol, the system can in principle encode sentences of any length into a fixed-length set of numbers.
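A short sketch of that property, using the same toy recurrence as above: sentences of different lengths come out as encodings of the same fixed size, because processing simply runs until END (the token numbering is illustrative):

```python
import numpy as np

hidden_size, vocab_size = 2, 20_000
END = 0   # reserve input number 0 for the special END symbol

rng = np.random.default_rng(1)
W_xh = rng.normal(scale=0.01, size=(hidden_size, vocab_size))
W_hh = rng.normal(scale=0.01, size=(hidden_size, hidden_size))

def encode(token_ids):
    h = np.zeros(hidden_size)
    for t in token_ids:
        x = np.zeros(vocab_size)
        x[t] = 1.0                   # one-hot input for this word
        h = np.tanh(W_xh @ x + W_hh @ h)
        if t == END:                 # END tells the network the sentence is done
            break
    return h

print(encode([12, 7, END]).shape)                 # (2,)
print(encode([12, 7, 3, 981, 44, 5, END]).shape)  # (2,) -- same fixed length
```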
NLP output
the output unit in this network processes the hidden units’ activations (the “encoding”) to give the network’s confidence that the input sentence has a positive sentiment.
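A sketch of that readout step, assuming a sigmoid output unit with hypothetical weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w_out = np.array([0.5, -1.2])   # hidden -> output weights (hypothetical values)
b_out = 0.1

def sentiment_confidence(h):
    # Squash the weighted encoding into (0, 1): the network's confidence
    # that the input sentence has a positive sentiment.
    return sigmoid(w_out @ h + b_out)

print(sentiment_confidence(np.array([0.3, -0.7])))   # ~0.75
```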
backpropagation
Given a set of sentences that humans have labeled as “positive” or “negative” in sentiment, the encoder network can be trained from these examples via backpropagation.
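The book gives no code, so as an illustration only, here is a minimal training sketch in PyTorch (the framework, sizes, and names are all assumptions) showing the pieces together: one-hot word inputs, an RNN encoder, a single output unit, and backpropagation on a labeled example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden_size = 20_000, 2   # illustrative sizes

class SentimentRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(vocab_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, 1)   # the single output unit

    def forward(self, x):                  # x: (batch, time, vocab_size) one-hots
        _, h = self.rnn(x)                 # h: final hidden activations
        return self.out(h[-1]).squeeze(1)  # one logit per sentence

model = SentimentRNN()
loss_fn = nn.BCEWithLogitsLoss()           # binary cross-entropy on the logit
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# One toy "human-labeled sentence": five random word ids, labeled positive (1).
x = F.one_hot(torch.randint(vocab_size, (1, 5)), vocab_size).float()
y = torch.tensor([1.0])

for _ in range(10):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                        # backpropagation (through time)
    opt.step()
```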
vocabulary of a network
the set of all words that the network will be able to accept as inputs.
scheme for encoding words as numbers
- assign each word in the vocabulary an arbitrary number between 1 and 20,000.
- give the neural network 20,000 inputs, one per word in the vocabulary
- At each time step, only one of those inputs—the one corresponding to the actual input word—will be “switched on.” (one-hot encoding)
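A minimal sketch of this one-hot scheme (the sample vocabulary is hypothetical; positions here are 0-based, equivalent to the card's 1-to-20,000 numbering):

```python
import numpy as np

vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}   # ... continuing up to 20,000 words
vocab_size = 20_000

def one_hot(word):
    x = np.zeros(vocab_size)
    x[vocab[word]] = 1.0   # only this word's input is "switched on"
    return x

x = one_hot("great")
print(int(x.sum()), int(np.argmax(x)))   # 1 3 -- exactly one input active
```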