Introduction - Week 1 Flashcards
What makes an application a language processing application?
It requires the use of knowledge about human language
Is Unix wc an example of a language processing application?
Yes, when it counts words
No, when it counts lines or bytes. Lines and bytes are computer artefacts, not linguistic entities
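A minimal sketch of the three counts, assuming wc's usual behaviour (lines as newline characters, words as whitespace-separated tokens, bytes as UTF-8 bytes). The function name wc_counts and the sample text are made up for illustration. Only the word count needs any idea of what a word is, and even that idea is crude, since "duck." with its full stop counts as one word.

```python
def wc_counts(text: str) -> dict:
    """Return line, word, and byte counts, roughly as Unix wc reports them."""
    data = text.encode("utf-8")
    return {
        "lines": data.count(b"\n"),  # computer artefact: newline characters
        "words": len(text.split()),  # linguistic entity: whitespace-separated tokens
        "bytes": len(data),          # computer artefact: encoded bytes
    }


# Illustrative sample text (not from the course material).
sample = "I made her duck.\nDon't panic!\n"
counts = wc_counts(sample)
print(counts)
```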
Is Google Search an NLP application?
Yes, it uses knowledge about human languages
Why is NLP hard?
Text is structured only for the human reader; for the machine it is often almost fully unstructured, and sometimes ‘semi-structured’, like HTML
Semi-structured text
Text that is partially structured for the machine, such as HTML
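A small sketch of what "partially structured" can mean in practice, using Python's standard html.parser module. The class name TextAndTags and the sample page are assumptions made for illustration. The tags give the machine some structure to work with, but the text inside them is still plain human language.

```python
from html.parser import HTMLParser


class TextAndTags(HTMLParser):
    """Collect the tag names (machine structure) and the text (human language)."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Illustrative page, not taken from any real site.
page = "<html><body><h1>Ducks</h1><p>I made her duck.</p></body></html>"
parser = TextAndTags()
parser.feed(page)
parser.close()

print(parser.tags)  # the structured part
print(parser.text)  # the unstructured part
```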
Natural Language Processing (NLP)
The processing steps a machine needs in order to “understand” data expressed in a human language
Umbrella terms related to NLP
- Text mining
- Text analytics
- Computational Linguistics
- (human) language technology
NLP Tasks and Applications
Information Retrieval
- Searching for relevant documents
Document classification
- Sorting documents into categories
Question answering
- Short answer for a question
Text summarisation
- Summarise a set of documents
Sentiment analysis
- Product reviews, Twitter, hate crime detection
Machine translation
- One of the first motivations for NLP
Natural language generation
- Generating text from data (data-to-text)
Authoring and marking tools
- Check spelling, grammar, style
- Automated marking of essays
Conversational Agents
- Dialogue, voice recognition, text-to-speech, speech-to-text
etc. (many others)
NLP main problems
Variability
Ambiguity
Variability
Numerous ways to say the same thing
Ambiguity
Words and sentences are often ambiguous, and can have multiple meanings
Word-level ambiguity
Apple (company) or Apple (fruit)
Sentence-level ambiguity
I made her duck (this has at least 5 meanings)
Lexical Ambiguity
A word with multiple part-of-speech (POS) tags, e.g. duck can be a verb or a noun
Lexical-semantic ambiguity
A word with different senses
e.g. bank can be a financial institution or the land beside a river (river bank)
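The two kinds of word-level ambiguity can be told apart with a toy lexicon. This is only a sketch: the LEXICON entries, the tag names, and the helper functions are invented for illustration and do not come from any real NLP library.

```python
# Hypothetical toy lexicon: each word maps to (POS tag, sense) pairs.
LEXICON = {
    "duck": [("NOUN", "a water bird"), ("VERB", "to lower the head quickly")],
    "bank": [("NOUN", "a financial institution"), ("NOUN", "the land beside a river")],
}


def pos_tags(word):
    """Distinct POS tags listed for the word, in sorted order."""
    return sorted({pos for pos, _ in LEXICON.get(word.lower(), [])})


def senses(word):
    """All senses listed for the word."""
    return [sense for _, sense in LEXICON.get(word.lower(), [])]


def is_pos_ambiguous(word):
    """Lexical ambiguity: more than one POS tag."""
    return len(pos_tags(word)) > 1


def is_sense_ambiguous(word):
    """Lexical-semantic ambiguity: more than one sense."""
    return len(senses(word)) > 1


for w in ("duck", "bank"):
    print(w, pos_tags(w), is_pos_ambiguous(w), is_sense_ambiguous(w))
```

In this toy lexicon, duck is ambiguous in both ways, while bank has a single POS tag but two senses.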