Intro To NLP Flashcards
What is NLP?
The development of systems with knowledge of human language
What subject areas does NLP combine? (4)
Linguistics, computer science, maths, psychology
What is the main challenge of NLP, and what does it mean?
Ambiguity: an utterance has more than one possible interpretation
What are the 4 types of ambiguity?
phonological, lexical, syntactic, semantic
Describe phonological ambiguity
Words that sound the same but have different meanings
e.g. red and read (past tense), flower and flour
Describe lexical ambiguity
A word has multiple senses
e.g. I am going to the bank (financial institution or river bank?)
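One way to picture lexical ambiguity is a lookup that scores each sense of "bank" by how many cue words from the sentence it shares. This is only a minimal sketch: the sense inventory `SENSES`, its cue words, and the function `disambiguate` are hypothetical and invented for illustration, not taken from a real dictionary.

```python
# Hypothetical two-sense inventory for "bank"; the cue words are illustrative only.
SENSES = {
    "financial institution": {"money", "deposit", "account", "loan", "cash"},
    "river edge": {"river", "water", "fishing", "shore", "boat"},
}

def disambiguate(sentence):
    """Return every sense whose cue words overlap most with the sentence."""
    words = set(sentence.lower().split())
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores.values())
    # If no cue word appears, all senses tie and the sentence stays ambiguous.
    return sorted(sense for sense, score in scores.items() if score == best)

print(disambiguate("I am going to the bank"))
print(disambiguate("I am going to the bank to deposit money"))
```

The bare sentence keeps both senses, while extra context such as "deposit money" narrows it to one.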
Describe syntactic ambiguity
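A sentence has more than one possible structure, for example because a word has more than one possible part of speech
e.g. I saw her duck (duck as a noun: the bird she owns; duck as a verb: she lowered her head)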
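The two readings of "I saw her duck" can be sketched by listing every part-of-speech assignment and keeping those that a toy grammar accepts. The mini-lexicon `LEXICON`, the accepted tag patterns in `VALID_PATTERNS`, and the function `readings` are all assumptions made for this example (for simplicity, "saw" is treated only as a verb).

```python
from itertools import product

# Hypothetical mini-lexicon: each word maps to its possible parts of speech.
LEXICON = {
    "i": ["PRON"],
    "saw": ["VERB"],            # simplified: the noun sense of "saw" is ignored
    "her": ["PRON", "DET"],     # object pronoun vs. possessive
    "duck": ["NOUN", "VERB"],
}

# Tag sequences this toy grammar accepts as a sentence (an assumption).
VALID_PATTERNS = {
    ("PRON", "VERB", "DET", "NOUN"),   # I saw the duck belonging to her
    ("PRON", "VERB", "PRON", "VERB"),  # I saw her lower her head
}

def readings(sentence):
    """Return every tag sequence for the sentence that the toy grammar accepts."""
    words = sentence.lower().split()
    candidates = product(*(LEXICON[w] for w in words))
    return [tags for tags in candidates if tags in VALID_PATTERNS]

for tags in readings("I saw her duck"):
    print(tags)
```

More than one accepted tag sequence for the same sentence is what makes it syntactically ambiguous.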
Describe semantic ambiguity
Due to a lack of knowledge of the world: multiple interpretations are possible unless world knowledge is available.
e.g. "The children ate the cookies because they were very hungry"
Were the children hungry, or were the cookies hungry?
What are the two approaches to NLP?
Symbolic or statistical/ML-based (plus hybrid)
Describe the symbolic approach to NLP
rule and dictionary based
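A rule- and dictionary-based system can be sketched as a tiny sentiment classifier: a hand-written polarity dictionary plus one negation rule. The dictionary `POLARITY`, the word list `NEGATORS`, and the function `classify` are hypothetical and chosen only for illustration.

```python
# Hypothetical hand-written dictionary: word -> polarity.
POLARITY = {"good": 1, "great": 1, "bad": -1, "awful": -1}
# Hypothetical rule: a negator flips the polarity of the next dictionary word.
NEGATORS = {"not", "never"}

def classify(text):
    """Label text as positive, negative, or neutral using only the rules above."""
    score = 0
    negate = False
    for word in text.lower().split():
        if word in NEGATORS:
            negate = True
            continue
        value = POLARITY.get(word, 0)
        if value:
            score += -value if negate else value
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("a great film"))
print(classify("the film was not good"))
print(classify("the film was fine"))
```

Every decision can be traced back to a dictionary entry or the negation rule, and no labelled data is needed. A word missing from the dictionary, such as "fine", is simply ignored until someone writes an entry for it.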
What are the advantages of a symbolic approach? (3)
- expert knowledge yields highly predictive results
- interpretable results
- good when labelled data is hard to obtain
What are the disadvantages of a symbolic approach? (3)
- shortage of experts
- laborious rule writing
- domain adaptation problematic
Describe the statistical/ML approach to NLP
Use a large amount of data to discover patterns
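Discovering patterns from data can be sketched with a very small Naive Bayes text classifier, used here as one example of a statistical method rather than the only one. The training set `TRAIN` is made up for illustration and far smaller than anything used in practice, and the function names `train` and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up labelled dataset (illustrative only).
TRAIN = [
    ("a great and enjoyable film", "positive"),
    ("great acting and a good story", "positive"),
    ("an awful and boring film", "negative"),
    ("bad acting and a boring story", "negative"),
]

def train(examples):
    """Count how often each word occurs under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(text, model):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(predict("a boring film", model))
print(predict("great story", model))
```

Neither test sentence appears in the training data. The labels come from word statistics learned from the examples, not from hand-written rules.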
What are the advantages of a statistical/ML approach? (2)
- can generalise to unseen examples
- good when a dictionary is unavailable
What are the disadvantages of a statistical/ML approach? (3)
- needs labelled data, which needs human annotators and is therefore time-consuming
- must be retrained for a new domain
- black box: can't inspect how decisions are made