Lecture 10 Natural language inference Flashcards
SICK-NL
Dataset for Dutch natural language inference
RTE (recognizing textual entailment) / NLI (natural language inference)
Recognizing textual entailment is an established task in NLP
* Strictly speaking, RTE covers more than logical entailment
SICK dataset -> sentences involving compositional knowledge
* SNLI is larger; like SICK, it consists of descriptions of scenes
* MultiNLI, more diverse text ‘genre’ sources
Entailment datasets
At the word level, entailment is hypernymy: the entailed word is more general (dog -> animal)
Categorical vs graded entailment
* Chemistry -> science (10.0)
* Enemy -> crocodile (0.33)
Phrase level
* Parrot -> pet
* Dead parrot -/> pet
Sentence level
What a sentence entails: entailment, contradiction, neutral
Dataset formats
- Entailment
- Non-entailment: contradiction, UNKNOWN / neutral
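A minimal sketch of how the two formats relate, assuming labels are stored as plain strings. The function name `to_two_way` and the exact label spellings are illustrative assumptions, not taken from any particular dataset:

```python
def to_two_way(label: str) -> str:
    """Collapse a three-way NLI label into the two-way format.

    Assumes labels are plain strings; the spellings below are hypothetical.
    """
    normalized = label.strip().lower()
    if normalized == "entailment":
        return "entailment"
    # Contradiction and neutral/unknown both count as non-entailment.
    if normalized in {"contradiction", "neutral", "unknown"}:
        return "non-entailment"
    raise ValueError(f"unexpected NLI label: {label!r}")


if __name__ == "__main__":
    for label in ["entailment", "contradiction", "NEUTRAL", "UNKNOWN"]:
        print(label, "->", to_two_way(label))
```

The mapping only goes one way: a two-way label cannot be expanded back into three classes without re-annotation.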
(Hella) SWAG
* Different focus: commonsense reasoning
* Different format: multiple choice
NLI in language models
- LLMs are fine-tuned on NLI data, but the fine-tuned models don’t transfer well to new inference data
- Achieve good accuracy
- Textual entailment differs from logical entailment because it isn’t exact
- Issue: datasets vary in whether they treat a pair as a logical contradiction or a referential contradiction, e.g.
"a man is smoking" vs "a man is not smoking"
Interannotator agreement
Cohen’s kappa (for two annotators) and Fleiss’ kappa (for more than two annotators) measure agreement between annotators, corrected for the agreement expected by chance
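A small worked sketch of Cohen's kappa for two annotators, using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohens_kappa` and the toy labels (E, C, N) are assumptions made for illustration:

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement: chance that both pick the same label,
    # given each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    annotator_1 = ["E", "E", "C", "N"]
    annotator_2 = ["E", "C", "C", "N"]
    print(cohens_kappa(annotator_1, annotator_2))
```

In the toy example the annotators agree on 3 of 4 items (p_o = 0.75) and the expected agreement is 5/16, so kappa is 7/11, roughly 0.64: lower than the raw agreement because part of it could arise by chance.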