Developing Generative Artificial Intelligence Solutions Flashcards
What are the 5 parts of the generative AI application lifecycle?
1) Defining a business use case
2) Selecting a foundation model (FM)
3) Improving the performance of an FM
4) Evaluating the performance of an FM
5) Deployment and its impact on business objectives
Challenges of generative AI
Regulatory violations
Social risks
Data security and privacy concerns
Toxicity
Hallucinations
Interpretability
Nondeterminism
A few criteria to consider when deciding between a pre-trained model and building a new one
Cost, modality, latency, multilingual support, model size and complexity, customization, input/output length, responsibility considerations, deployment and integration
What is prompt engineering?
A technique used to improve the performance of a model by crafting the input prompts or instructions given to the model to generate desired outputs or behaviors.
What are key aspects of prompt engineering?
- Design
- Augmentation (incorporating additional information)
- Tuning (iteratively refining and adjusting the prompts)
- Ensembling (combining multiple prompts)
- Mining (exploring and identifying effective prompts)
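The aspects above can be sketched with a simple prompt builder. This is a minimal illustration, not any specific framework's API; `build_prompt` and the stand-in model call are hypothetical names.

```python
# Minimal sketch of prompt design and augmentation. The function name
# and prompt wording are illustrative, not a real library's API.
def build_prompt(question, context=None, examples=None):
    """Assemble a prompt from an instruction, optional context, and few-shot examples."""
    parts = ["Answer the question concisely."]  # design: the base instruction
    if context:
        # augmentation: incorporate additional information into the prompt
        parts.append(f"Context: {context}")
    if examples:
        # few-shot examples show the model the desired response format
        for q, a in examples:
            parts.append(f"Q: {q}\nA: {a}")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "What is an FM?",
    context="FM stands for foundation model.",
    examples=[("What is RAG?", "Retrieval Augmented Generation.")],
)
print(prompt)
```

Tuning, in this picture, is iterating on the instruction wording and examples and comparing the model's outputs.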
What is RAG?
Retrieval Augmented Generation - a prompt-engineering technique in natural language processing (NLP). It combines the capabilities of retrieval systems and generative language models.
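A minimal sketch of the RAG idea: retrieve the most relevant document, then augment the prompt with it. Real systems use vector embeddings and an actual LLM call; here retrieval is a simple word-overlap stand-in, and all names are illustrative.

```python
# Toy RAG sketch: word-overlap retrieval plus prompt augmentation.
# A production system would use embeddings and a vector store instead.
def retrieve(query, documents):
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(documents, key=lambda d: len(q & set(d.lower().split())))

def build_rag_prompt(query, documents):
    context = retrieve(query, documents)
    return f"Use this context to answer.\nContext: {context}\nQuestion: {query}"

docs = [
    "Amazon Titan is a family of foundation models from AWS.",
    "BLEU measures similarity between generated and reference translations.",
]
print(build_rag_prompt("What is Amazon Titan?", docs))
```

The assembled prompt would then be sent to a generative model, which answers using the retrieved context rather than only its training data.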
Fine-tuning
A method of improving the performance of a foundation model: taking a pre-trained language model and further training it on specific tasks or a domain-specific dataset.
What are the 2 ways of fine-tuning a model?
1) Instruction fine-tuning (uses examples of how the model should respond to specific instructions)
2) Reinforcement learning from human feedback (RLHF) provides human feedback data (better aligned with human preferences)
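An instruction fine-tuning dataset is typically a collection of prompt/response pairs. The records below are illustrative; the field names are a common convention, not any specific service's schema.

```python
# Hypothetical instruction fine-tuning records: each pairs an
# instruction (and optional input) with the desired response.
# Field names are illustrative, not a specific service's schema.
instruction_dataset = [
    {"instruction": "Summarize the text.",
     "input": "Foundation models are large models pre-trained on broad data.",
     "output": "FMs are large models trained on broad data."},
    {"instruction": "Translate to French.",
     "input": "Hello, world.",
     "output": "Bonjour, le monde."},
]
print(len(instruction_dataset))
```

RLHF, by contrast, collects human preference rankings over model outputs and uses them to train a reward model that guides further tuning.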
3 evaluation types
1) Human evaluation 2) Benchmark datasets 3) Automated metrics
What are agents?
Software components or entities designed to perform specific actions or tasks autonomously or semi-autonomously, based on predefined rules or algorithms. Examples of actions include: task coordination, reporting and logging, integration and communication
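The definition above can be sketched as a tiny rule-based agent that coordinates tasks and logs what it did. All class and method names here are illustrative.

```python
# Minimal rule-based agent sketch: maps task names to actions,
# executes them, and logs the results. Names are illustrative.
class Agent:
    def __init__(self):
        self.log = []        # reporting and logging
        self.actions = {}    # task name -> callable

    def register(self, name, fn):
        """Make an action available to the agent."""
        self.actions[name] = fn

    def run(self, name, *args):
        """Task coordination: look up the action, execute it, log the result."""
        result = self.actions[name](*args)
        self.log.append((name, result))
        return result

agent = Agent()
agent.register("add", lambda a, b: a + b)
print(agent.run("add", 2, 3))  # -> 5
```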
Bilingual Evaluation Understudy (BLEU)
measures the similarity between a generated text and one or more reference translations, considering both precision and brevity. Used for evaluating machine translations.
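A simplified sketch of the BLEU idea: clipped unigram precision multiplied by a brevity penalty. Real BLEU combines n-gram precisions up to 4-grams over a whole corpus; this toy version handles only single words and one reference.

```python
import math
from collections import Counter

# Simplified BLEU-1 sketch: clipped unigram precision times a brevity
# penalty. Real BLEU geometric-averages 1- to 4-gram precisions.
def bleu1(candidate, reference):
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clipping: each candidate word counts at most as often as it
    # appears in the reference, so repetition cannot inflate the score.
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = overlap / len(cand)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

print(bleu1("the cat sat on the mat", "the cat sat on the mat"))  # -> 1.0
```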
Bidirectional Encoder Representations from Transformers (BERT)
A metric (BERTScore) that uses BERT embeddings to evaluate the semantic similarity between a generated text and one or more reference texts. Used for assessing the semantic similarity between two sentences.
Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
measures the quality of a generated summary or translation by comparing it to one or more reference summaries or translations.
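Where BLEU is precision-oriented, ROUGE is recall-oriented. A simplified sketch of ROUGE-1 recall: the fraction of reference unigrams that also appear in the generated summary. Real ROUGE also reports precision/F1 and n-gram or longest-common-subsequence variants.

```python
from collections import Counter

# Simplified ROUGE-1 recall sketch: how much of the reference's
# vocabulary the generated summary recovers.
def rouge1_recall(summary, reference):
    sum_counts, ref_counts = Counter(summary.split()), Counter(reference.split())
    # Count each reference word at most as often as the summary uses it.
    overlap = sum(min(c, sum_counts[w]) for w, c in ref_counts.items())
    return overlap / sum(ref_counts.values())

print(rouge1_recall("the cat sat", "the cat sat on the mat"))  # -> 0.5
```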
What is Amazon Titan?
A family of high-performing foundation models from AWS, offering text, image, and multimodal model options. Can be customized with your own data.
Context window
The number of tokens an LLM can consider when generating new text.
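One practical consequence: input that exceeds the context window must be truncated. A minimal sketch, using whitespace splitting as a stand-in for real tokenization (actual tokenizers are subword-based):

```python
# Sketch of respecting a context window: keep only the most recent
# tokens. Whitespace splitting stands in for a real subword tokenizer.
def truncate_to_window(text, max_tokens):
    tokens = text.split()
    return " ".join(tokens[-max_tokens:])  # keep the newest tokens

print(truncate_to_window("a b c d e f", 3))  # -> "d e f"
```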