Word-level processing Flashcards
intro
What are 5 aspects of NLP
- Machine translation
- Information retrieval
- Sentiment Analysis
- Information Extraction
- Question Answering
What are the 8 levels of the classical NLP pipeline
- Tokenization
- Sentence splitting
- Part-of-speech tagging
- Morphological analysis
- Named entity recognition
- Syntactic parsing
- Coreference resolution
- Other annotators
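The first two levels can be sketched with regular expressions. This is a minimal illustration, not a production tokenizer: the function names `split_sentences` and `tokenize` are hypothetical, the rules are deliberately naive (abbreviations such as "Dr." would be split wrongly), and splitting sentences before tokenizing is just a convenience for this sketch.

```python
import re

def split_sentences(text):
    # Naive rule: a sentence ends at ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    # A token is a word (optionally with an internal apostrophe)
    # or a single punctuation character.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)

text = "Eliza was built in 1966. It asks questions back!"
for sentence in split_sentences(text):
    print(tokenize(sentence))
```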
Symbolic way to build a question-answering system. What are the pros and cons
Pros
* transparency, any prediction is grounded in a rule or dictionary entry
* generalization-by-default, thanks to recursion of rules
Cons
* creating rules is labor intensive
* systems generalize only within their own scope
What is Eliza
An NLP system designed by Joseph Weizenbaum in 1966. Its goal is to simulate a psychotherapist. It is responsive: it essentially asks questions back at the user. It works by pattern matching the input to generate a substitution-based output.
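The pattern-matching and substitution idea can be sketched in a few lines. The rules, the `REFLECT` table, and the function names below are invented for illustration and are not Weizenbaum's original script.

```python
import re

# Hypothetical rules: (input pattern, response template).
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."

# Swap first-person words for second-person ones in the captured text.
REFLECT = {"i": "you", "me": "you", "my": "your", "am": "are"}

def reflect(fragment):
    return " ".join(REFLECT.get(word.lower(), word) for word in fragment.split())

def respond(utterance):
    cleaned = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            # Substitute the reflected capture into the template.
            return template.format(reflect(match.group(1)))
    return FALLBACK

print(respond("I am sad about my job."))
# Why do you say you are sad about your job?
```

The first matching rule wins, so rule order acts as a crude priority; anything no rule covers gets the fallback, which shows how such systems generalize only within their own scope.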
What is a statistical/machine learning way to build an answer-generating machine. Name pros and cons
Pros:
* Interpretability, as the statistics reflect whatever data we processed
* Generalisation-by-default, thanks to the grounding in a symbolic format
* Makes use of large annotated corpora
Cons:
* A reliance on handcrafted features
* Often makes too many independence assumptions to be robust
* Predictions are not always accurate
Give one example of statistics/machine learning NLP
autocorrect
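A minimal sketch of a statistical autocorrect, assuming a toy corpus in place of a large text collection: generate every string one edit away from the input and pick the candidate seen most often. The names `CORPUS`, `edits1`, and `correct` are illustrative choices.

```python
from collections import Counter

# Toy corpus standing in for large text data (an assumption for this sketch).
CORPUS = "the cat sat on the mat the cat ate the fish".split()
COUNTS = Counter(CORPUS)
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    # All strings one deletion, swap, replacement, or insertion away.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)

def correct(word):
    if word in COUNTS:
        return word  # already a known word
    candidates = [w for w in edits1(word) if w in COUNTS]
    # Prefer the candidate with the highest corpus frequency.
    return max(candidates, key=COUNTS.get) if candidates else word

print(correct("teh"))  # the
print(correct("cta"))  # cat
```

The word counts are the only "knowledge" here, which is why the output reflects whatever data was processed, and why an unseen word like "xyz" is returned unchanged.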
What is the neural way to create a question-answering NLP system. Give pros and cons
Pros:
* Can model statistical dependence
* Little to no feature engineering required
* Makes use of large corpora
* Very successful in a wide array of typical NLP tasks
Cons
* very limited transparency
* limited theoretical insights
* need to rediscover the features/knowledge encoded in the network, if that is possible at all
What is the classical NLP pipeline
- Morphology
- Syntax
- Lexical semantics
- Compositional semantics
- Pragmatics
What is morphology
Tokenization, lemmatization
What is syntax
part of speech tagging, grammars and parsing
What is lexical semantics
logical forms, word embedding
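A word embedding represents each word as a vector, so similarity in meaning can be measured as the cosine of the angle between vectors. The 3-dimensional vectors below are made up for illustration; real embeddings are learned from corpora and have far more dimensions.

```python
import math

# Invented toy vectors, not taken from any trained model.
EMBEDDINGS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    # Dot product divided by the product of the vector lengths.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

king, queen, apple = (EMBEDDINGS[w] for w in ("king", "queen", "apple"))
print(cosine(king, queen) > cosine(king, apple))  # True
```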
What is compositional semantics
sentence embeddings, natural language inference
What is pragmatics
Question answering, dialogue modelling