CHAPTER 5 part 1 Flashcards
………….is the core component of modern Natural Language Processing (NLP).
language model
T/F A language model is a probabilistic statistical model that determines the probability of a given sequence of words occurring in a sentence, based on the previous words.
T
More about language models
⚫ It helps predict which word is most likely to appear next in a sentence.
⚫ It’s a tool that analyzes the patterns of human language to predict words.
⚫ Language models analyze bodies of text data to provide a basis for their word predictions.
⚫ Widely used in NLP applications like chatbots and search engines.
How Does a Language Model Work?
⚫ Language models determine the probability of the next word by analyzing text data.
⚫ These models interpret the data by feeding it through algorithms.
⚫ The algorithms are responsible for creating rules for the context in natural language.
⚫ The models are prepared for the prediction of words by learning the features and characteristics of a language.
⚫ With this learning, the model prepares itself for understanding phrases and predicting the next words in sentences (a minimal sketch follows below).
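To make this concrete, here is a minimal sketch (Python, using a made-up toy corpus, so the counts are purely illustrative) of how a count-based model learns which word is most likely to come next:

from collections import Counter, defaultdict

# Toy corpus; a real model would be trained on far more text.
corpus = "there was heavy rain . there was heavy rain . there was heavy flood .".split()

# Count which word follows each word.
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

# The most likely word after "heavy" is the one seen most often after it.
print(next_word_counts["heavy"].most_common(1))  # [('rain', 2)]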
Types of Language Models
There are primarily two types of language models:
- Statistical Language Models
- Neural Language Models
…………….predict the next word based on previous words.
They use probabilistic techniques to analyze text patterns.
Statistical models
Popular Statistical Models
1- N-Gram Model – Uses a fixed-length sequence of words.
2- Bidirectional Model – Considers both past and future words.
3- Exponential Model – Assigns probabilities based on exponential functions.
4-Continuous Space Model – Represents words in a continuous vector space.
5- Neural Language Model – Uses neural networks.
N-Gram Model
This is one of the simplest approaches to language modelling.
The value of ‘n’ defines the size of the sequence (e.g., n=4 for a 4-gram, like “can you help me”).
– ‘n’ represents the amount of context the model considers when predicting the next word.
T/F An N-Gram model creates a probability distribution for a sequence of ‘n’ tokens (words)
T
There are different types of N-Gram models, such as …………, ……….
bigrams, trigrams
Let’s understand N-gram with an example
⚫ Example Sentence:
“I like learning and practice NLP in this lecture”
– Unigrams:
“I”, “like”, “learning”, “and”, “practice”, “NLP”, “in”, “this”, “lecture”
– Bigram Example:
(“I”, “like”), (“like”, “learning”), (“learning”, “and”), (“and”, “practice”),
(“practice”, “NLP”), (“NLP”, “in”), (“in”, “this”), (“this”, “lecture”)
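A minimal sketch (Python) of extracting these n-grams; the simple whitespace tokenisation is an assumption made for brevity:

def ngrams(tokens, n):
    """Return all n-grams (as tuples) from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "I like learning and practice NLP in this lecture".split()
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams: ('I', 'like'), ('like', 'learning'), ...

Calling ngrams(tokens, 3) on the same token list yields the trigrams discussed later in this chapter.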
Calculating N-Gram Probabilities
- The N-Gram model assigns probabilities to sequences of words based on their occurrence in a training corpus.
- For example, given the sentences:
- “There was heavy rain” vs. “There was heavy flood”
⚫ An N-Gram model will tell us that “heavy rain” occurs much more often than “heavy flood” in the training corpus.
⚫ Hence, the N-Gram model assigns a higher probability to “There was heavy rain” (sketched in code below).
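A sketch of this calculation (Python; the tiny corpus below is a hypothetical stand-in for real training data such as Europarl):

from collections import Counter

corpus = ("there was heavy rain . heavy rain fell . "
          "there was heavy flood .").split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

# Maximum-likelihood estimate: P(w2 | w1) = count(w1 w2) / count(w1)
p_rain  = bigrams[("heavy", "rain")]  / unigrams["heavy"]
p_flood = bigrams[("heavy", "flood")] / unigrams["heavy"]
print(p_rain, p_flood)  # 2/3 vs 1/3: "heavy rain" gets the higher probability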
…………….is a collection of text data consisting of the proceedings of the European Parliament from 1996 to 2012.
Europarl Corpus
T/F For a bigram model, the prediction of the next word depends only on the previous word.
For an n-gram model, only the preceding (n-1) words are considered.
T
T/F The Markov assumption simplifies language modeling by stating that only the most recent words in a sentence matter when predicting the next word
T
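In symbols (a standard formulation, stated here for reference), the chain rule together with the bigram approximation under the Markov assumption is:

P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1}) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})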
Problem with n-gram
⚫ One problem with N-Gram models is data sparsity.
⚫ This occurs when the model encounters word sequences (N-Grams) that were not seen during training. As a result, the model assigns them a zero probability.
⚫ Techniques to solve this problem include smoothing, backoff, and interpolation.
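One of those techniques, add-one (Laplace) smoothing, can be sketched in a few lines of Python; the toy corpus is an assumption for illustration:

from collections import Counter

corpus = "there was heavy rain . there was heavy flood .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
V = len(unigrams)  # vocabulary size (of this toy corpus)

def p_laplace(w1, w2):
    """Add-one smoothed bigram probability: never exactly zero."""
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

print(p_laplace("heavy", "rain"))  # seen bigram: 2/8
print(p_laplace("heavy", "snow"))  # unseen bigram: still non-zero, 1/8

Backoff and interpolation instead fall back on, or mix in, lower-order (e.g., unigram) estimates.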
……….is a sequence of two words coming together to form a meaning
⚫ Example:
“I like”, “like learning”, “learning and”, “and practice”, “practice NLP”, “NLP in”, “in this”, “this lecture”.
bigram
………………..is a sequence of three words coming together to form a meaning
⚫ Example:
For the previous sentence, the trigrams would simply be:
“I like learning”, “like learning and”, “learning and practice”, “and practice NLP”, “practice NLP in”, “NLP in this”, “in this lecture”.
trigram
T/F Unlike n-gram models, which analyze text in one direction (backwards), bidirectional models analyze text in both directions (backwards and forwards).
These models can predict any word in a sentence or body of text by using every other word in the text
T
Bidirectional (cont.)
T/F Examining text bidirectionally increases result accuracy
This type is often utilized in machine learning and speech generation applications
Example: Google uses a bidirectional model to process search queries
T
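As a toy illustration of using context on both sides (made-up corpus, simplified far beyond what a real search engine does), this sketch predicts a middle word from the words on its left and right:

from collections import Counter, defaultdict

corpus = "there was heavy rain . there was heavy flood . there was light rain .".split()

# For every word, record the (left, right) context pair around it.
middle = defaultdict(Counter)
for left, word, right in zip(corpus, corpus[1:], corpus[2:]):
    middle[(left, right)][word] += 1

# Fill the blank in "was ___ rain" using both neighbours at once.
print(middle[("was", "rain")].most_common())  # [('heavy', 1), ('light', 1)]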
This type of statistical model evaluates text using an equation that combines n-grams and feature functions.
⚫ Here the features and parameters of the desired results are already specified.
⚫ This model makes fewer statistical assumptions, which means the chances of accurate results are higher.
Exponential
⚫ In this type of statistical model, words are arranged as a non-linear combination of weights in a neural network.
⚫ The process of assigning a weight to a word is known as word embedding.
⚫ This type of model proves helpful in scenarios where the dataset of words continues to grow and includes unique words.
Continuous Space
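A minimal sketch of the continuous vector space idea (Python; the 2-D vectors are hand-picked assumptions, whereas real embeddings are learned by a network and have hundreds of dimensions):

import math

embedding = {
    "rain":  [0.9, 0.1],
    "flood": [0.8, 0.2],
    "piano": [0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity: near 1 for words whose vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(embedding["rain"], embedding["flood"]))  # high: related words
print(cosine(embedding["rain"], embedding["piano"]))  # low: unrelated words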
JUST READ
⚫ These language models are based on neural networks and are often considered an advanced approach to executing NLP tasks.
⚫ Neural language models overcome the shortcomings of classical models such as n-grams and are used for complex tasks such as speech recognition and machine translation.
DONE