Lecture 6 Flashcards
What is Natural Language Generation (NLG)?
NLG is the process of transforming structured data into human-readable text, also known as Data-to-Text.
Name three applications of NLG.
Structured Report Generation (e.g., BabyTalk project), Weather Reporting, and Question Answering Systems.
What are the main types of NLG systems?
Classical (rule-based), Template-Based, Statistical/Neural (data-driven), and Hybrid Systems.
What is a Template-Based NLG System?
A system that uses fixed text templates with slots for variable content, often used in predictable domains.
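The slot-filling idea above can be sketched in a few lines; the weather-report template and slot names here are illustrative assumptions, not from a specific system.

```python
# Minimal sketch of a template-based NLG system: a fixed template
# with slots for variable content (hypothetical weather domain).
TEMPLATE = "The temperature in {city} will reach {temp} degrees on {day}."

def realize(data: dict) -> str:
    # Fill the template's slots from the structured input record.
    return TEMPLATE.format(**data)

print(realize({"city": "Oslo", "temp": 21, "day": "Tuesday"}))
# -> The temperature in Oslo will reach 21 degrees on Tuesday.
```

This predictability is exactly why template systems suit narrow, repetitive domains: the output is guaranteed grammatical, but every sentence shape must be authored by hand.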
Describe the advantage and disadvantage of Rule-Based NLG Systems.
Advantage: Produces highly accurate text for specific domains. Disadvantage: Requires extensive manual effort to define rules.
What are the key components of a Classical NLG System?
Content determination, discourse structuring, aggregation, referring expression generation, lexical choice, realization, and fluency ranking.
What is Content Determination in NLG?
The process of deciding what information to include in the generated text.
What is Discourse Structuring in NLG?
Organizing the information into a coherent and logical flow within the generated text.
Name three Decoding Strategies in NLG.
Greedy Sampling, Beam Search, and Top-K Sampling.
What is Top-K Sampling?
A decoding method where only the top k most probable words are considered for generation, adding diversity.
How does Temperature Sampling work in NLG?
Adjusts the probability distribution with a temperature parameter to control creativity and focus; lower values make generation more focused.
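The effect of the temperature parameter can be seen by scaling logits before the softmax; the example logits below are arbitrary.

```python
import math

def softmax_with_temperature(logits: list, temperature: float) -> list:
    # Divide logits by the temperature before the softmax:
    # T < 1 sharpens the distribution (more focused),
    # T > 1 flattens it (more diverse/creative).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 1.0))
print(softmax_with_temperature(logits, 0.5))  # top token gets more mass
print(softmax_with_temperature(logits, 2.0))  # probabilities even out
```

At T → 0 this approaches greedy decoding (all mass on the argmax); large T approaches uniform sampling.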
What types of data are commonly used to train LLMs?
Web text from Common Crawl, Colossal Clean Crawled Corpus (C4), Wikipedia, news sites, and patents.
Why is Data Quality important in NLG?
Low-quality data can introduce biases, toxicity, and unsafe content in the generated text.
What is Prompting in the context of LLMs?
Using an input prompt to guide a language model to generate relevant text, sometimes known as in-context learning.
What is the difference between Zero-Shot and Few-Shot Prompting?
Zero-shot prompting includes no examples, while few-shot prompting includes labeled examples to improve model performance.
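The contrast can be made concrete by assembling both prompt styles for a toy sentiment task; the reviews and label names below are invented for illustration.

```python
# Zero-shot: the task description alone, no labeled examples.
zero_shot = "Classify the sentiment of: 'I loved this film.'\nSentiment:"

# Few-shot: prepend a handful of labeled examples before the query,
# so the model can infer the task format in context.
examples = [
    ("The plot was dull.", "negative"),
    ("A wonderful performance!", "positive"),
]
few_shot = "".join(f"Review: {text}\nSentiment: {label}\n\n"
                   for text, label in examples)
few_shot += "Review: I loved this film.\nSentiment:"

print(few_shot)
```

Nothing about the model changes between the two settings; only the prompt does, which is why this is described as in-context learning.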