NLP Flashcards
“Lemmatization is more complex than stemming.” Justify this statement.
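Stemming only strips affixes by rule, so its output need not be a real word.
Lemmatization reduces a word to its meaningful base form (the lemma), which
requires looking the word up in a vocabulary and analysing its structure.
It therefore takes longer to execute than stemming, and is more complex.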
What will be the output of “bodies” in stemming and lemmatization?
a) The output of the word bodies after stemming will be – bodi
b) The output of the word bodies after Lemmatization will be – body
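The difference can be sketched with a toy example. This is only an illustration: the suffix rules and the small lemma table below are made up for this card, and real stemmers and lemmatizers use far richer rules and vocabularies.

```python
def toy_stem(word):
    """Strip a suffix by rule, without checking that the result is a real word."""
    if word.endswith("ies"):
        return word[:-2]  # "bodies" -> "bodi"
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]  # "cats" -> "cat"
    return word


# Hypothetical mini-vocabulary standing in for a lemmatizer's dictionary.
TOY_LEMMAS = {"bodies": "body", "feet": "foot", "went": "go"}


def toy_lemmatize(word):
    """Look the word up and return its base form, or the word itself if unknown."""
    return TOY_LEMMAS.get(word, word)


print(toy_stem("bodies"))       # bodi
print(toy_lemmatize("bodies"))  # body
```

The stemmer needs only string operations, while the lemmatizer depends on a vocabulary, which is where the extra cost comes from.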
What do you mean by syntax? Explain in detail.
Syntax is the grammatical structure of a sentence. It covers the parts of speech (nouns, verbs, adverbs, adjectives, and so on) and the rules for arranging them into a well-formed sentence.
Also give examples for:
Perfect syntax, no meaning
Perfect syntax, no meaning – A sentence that is grammatically correct but does
not make any sense. Human communication is complex, and a good balance of syntax
and semantics is needed for a sentence to be understood. For example:
Chickens feed extravagantly while the moon drinks tea.
Give examples for:
Multiple meanings of a word
Multiple meanings of a word – In natural language, a word can have several
meanings, and the meaning that applies is determined by the context of the sentence.
For example, "bank" can mean a financial institution or the side of a river.
TF definition
Term frequency (TF) is the number of times a word occurs in a single document.
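As a minimal sketch, term frequency can be counted as follows. It assumes words are lowercased and separated by whitespace; the function name `term_frequency` is chosen only for this example.

```python
from collections import Counter


def term_frequency(document):
    """Count how many times each lowercased, whitespace-separated word occurs."""
    return Counter(document.lower().split())


tf = term_frequency("The cat sat on the mat")
print(tf["the"])  # 2
print(tf["cat"])  # 1
```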
Applications of NLP
1) Automatic summarisation
2) Sentiment Analysis
3) Text classification
4) Virtual Assistants
Script bots
Handle conversations, e.g. customer care chatbots in companies.
Smart bots
Handle other tasks along with conversations, e.g. Alexa, Cortana, Siri.
Human Vs. Computer Language
Humans communicate through language, which we process all the time. Our brain keeps processing the sounds it hears around it and tries to make sense of them.
A computer, on the other hand, understands only the language of numbers. Everything that is sent to the machine has to be converted to numbers. If even a single mistake is made while typing, the computer throws an error and does not process that part. The communication carried out by machines is very basic and simple.
The challenges in natural language processing are:
1) Ambiguity in sentences
2) Handling emotions
3) Handling multiple meanings of the same word
4) Handling syntax (grammar)
5) Handling sentences with perfect syntax, no meaning
TF-IDF score
The TF-IDF score helps the computer understand how important each word is while processing natural language.
The higher the value, the more important the word is for a given corpus.
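A minimal sketch of the score is shown below. The card does not state a formula, so this assumes one common variant: TF-IDF(w) = TF(w) * log10(N / DF(w)), where N is the number of documents and DF(w) is the number of documents containing w. Libraries often use other logarithm bases or add smoothing, so their values will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: score} dict per document, using TF * log10(N / DF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    scores = []
    for words in tokenized:
        tf = Counter(words)
        scores.append(
            {w: tf[w] * math.log10(n_docs / doc_freq[w]) for w in tf}
        )
    return scores


docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
print(scores[0]["the"])                     # 0.0, since "the" is in every document
print(scores[1]["dog"] > scores[0]["cat"])  # True, since "dog" is rarer than "cat"
```

A word that occurs in every document scores zero under this variant, which is why TF-IDF is also useful for spotting stop words.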
Applications of TF-IDF
a) Document classification b) Topic modelling c) Information retrieval systems d) Stop-word filtering
Define corpus.
A corpus is a large and structured set of machine-readable texts that have been produced in a natural communicative setting.
What is meant by a dictionary in NLP?
A dictionary in NLP is the list of all the unique words occurring in the corpus. A word that is repeated across documents is written only once when the dictionary is created.
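Building such a dictionary can be sketched as follows. It assumes simple lowercasing and whitespace tokenization; the function name `build_dictionary` is chosen only for this example.

```python
def build_dictionary(corpus):
    """Collect every unique word across all documents, listed once each."""
    vocabulary = set()
    for document in corpus:
        vocabulary.update(document.lower().split())
    return sorted(vocabulary)


corpus = ["the cat sat", "the dog sat"]
print(build_dictionary(corpus))  # ['cat', 'dog', 'sat', 'the']
```

A set is used so that repeated words are stored once, and the result is sorted only to make the output stable.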