Developing Generative AI Solutions Flashcards
This course is a primer for the generative AI courses that dive deeper into concepts related to customizing a foundation model (FM) using prompt engineering, Retrieval Augmented Generation (RAG), and fine-tuning.
What are the capabilities of Generative AI?
- Adaptability
- Responsiveness
- Simplicity
- Creativity and exploration
- Data efficiency
- Personalization
- Scalability
What are the challenges of Generative AI?
- Regulatory violations
- Social risks
- Data security and privacy concerns
- Toxicity
- Hallucinations
- Interpretability
- Nondeterminism
What is the Generative AI Application lifecycle?
The generative AI application lifecycle refers to the end-to-end process of building, deploying, and maintaining applications or systems that use generative AI models.
What are the stages of the Generative AI lifecycle?
1 - Define a use case
2 - Select a foundation model
3 - Improve performance
4 - Evaluate the results
5 - Deploy the application
After deployment, user feedback, usage data, and performance metrics are continuously collected and analyzed to identify areas for improvement or new requirements. Based on this feedback, the generative AI model might be retrained, fine-tuned, or updated to enhance its performance and address any identified issues.
It’s important to note that the generative AI application lifecycle is an iterative process, and different stages might have to be revisited or repeated as the application evolves, user needs change, or new advancements in generative AI technologies emerge.
What is included in the first stage of the Generative AI lifecycle?
The first stage in the generative AI application lifecycle is defining a use case. This phase is the foundation that sets the path for the entire project by doing the following:
- Defining the problem to be solved
- Gathering relevant requirements
- Aligning stakeholder expectations
What is a business use case?
A business use case is a structured narrative that describes how a system or process should behave from the perspective of an actor or stakeholder. It helps to communicate the functional requirements of a system or process.
What does a well-defined business use case consist of?
1 - Use case name
A short and descriptive name that identifies the use case
2 - Brief description
A high-level summary of the use case’s purpose and objective
3 - Actors
The entities or stakeholders that interact with the system or process. These can be human actors (for example, customers or employees) or external systems.
4 - Preconditions
The conditions that must be true before the use case can be initiated
5 - Basic flow (main success scenario)
A step-by-step description of the actions and interactions that occur when the use case is completed successfully, from start to finish. This is the primary path, or happy path: a list of each step necessary to achieve success.
6 - Alternative flows (extensions)
Additional scenarios or paths that might occur due to exceptional conditions, errors, or alternative user choices. These describe how the system should handle these situations—for example, contingency plans.
7 - Postconditions
The state or conditions that must be true after the successful completion of the use case
8 - Business rules
Any business policies, constraints, or regulations that govern the behavior of the system or process within the context of the use case
9 - Nonfunctional requirements
Any nonfunctional requirements, such as performance, security, or usability considerations, that are relevant to the use case
10 - Assumptions
Any assumptions made about the system, environment, or context that are necessary for the use case to be valid or applicable
11 - Notes or additional information
Any additional notes, explanations, or supplementary information that might be helpful for understanding or implementing the use case
What are the key metrics employed when addressing business use cases with AI?
Cost savings
One of the primary metrics is the potential cost savings that can be achieved by using generative AI. This includes reductions in labor costs, process optimization, and efficiency gains.
Time savings
Generative AI can automate and streamline various tasks, leading to significant time savings. Measuring the reduction in time required for specific processes or activities can be a valuable metric.
Quality improvement
Generative AI can enhance the quality of outputs, such as written content, creative designs, or analytical insights. Metrics like accuracy, coherence, and creativity can be used to measure quality improvements.
Customer satisfaction
If generative AI is used to improve customer interactions or experiences, metrics like customer satisfaction scores, net promoter score (NPS), or sentiment analysis can be valuable indicators.
Productivity gains
Generative AI can augment human capabilities, leading to increased productivity. Metrics like output volume, error rates, or task completion times can measure productivity improvements.
What are the key approaches employed when addressing business use cases with AI?
Process automation
Generative AI can be used to automate repetitive or time-consuming tasks, such as content generation, data analysis, or customer service interactions. This approach can lead to significant efficiency gains and cost savings.
Augmented decision-making
Generative AI can be used to enhance decision-making processes by providing insights, recommendations, and decision support. By analyzing large and complex datasets, generative AI models can uncover patterns, trends, and actionable insights that can inform and improve business decisions, ultimately leading to better outcomes.
Personalization and customization
Generative AI can be used to create personalized and customized content, products, or experiences for customers or stakeholders. This approach can improve customer satisfaction, engagement, and loyalty.
Creative content generation
Generative AI can be employed to generate creative content, such as written text, images, videos, or audio. This approach can be valuable for marketing, advertising, entertainment, or educational purposes.
Exploratory analysis and innovation
Generative AI can be used to explore new ideas, concepts, or solutions by generating novel combinations or variations. This approach can foster innovation and help businesses stay at the forefront of technology.
What are some selection criteria for choosing and using pre-trained models?
Cost
Pre-trained models can be expensive, especially for larger and more complex models. The cost might include licensing fees, computational resources for inference, and potential customization or fine-tuning costs. It’s essential to evaluate the budget constraints and weigh the cost against the expected benefits.
Modality
Generative AI models can be designed for different modalities, such as text generation, image generation, audio generation, or multimodal generation (combining multiple modalities). The choice of modality depends on the desired output format and the target application.
Latency
Some applications require real-time or low-latency generation, and others can tolerate longer processing times. The model’s inference speed and the available computational resources should be evaluated to ensure acceptable latency for the target use case.
Multi-lingual support
If the application requires generating content in multiple languages, selecting a model that supports the desired languages or can be adapted to new languages through techniques like transfer learning is crucial.
Model size
Larger models generally have higher computational requirements and can be more resource intensive during inference. However, they often perform better on complex tasks. The model size should be balanced against the available computational resources and performance requirements.
Model complexity
More complex models, such as those based on transformer architectures or large language models, can handle more advanced tasks but might be more challenging to deploy and optimize. Simpler models might be preferred for resource-constrained environments or simpler use cases.
Customization
Some pre-trained models offer the ability to fine-tune or adapt them to specific domains or tasks. This customization can improve performance but might require additional computational resources and labeled data.
Input/output length
Generative models might have limitations on the maximum input or output sequence lengths that they can handle. Applications requiring long-form generation or processing of extensive input data should consider models capable of handling the desired input/output lengths (a minimal token-counting sketch follows this list).
Responsibility considerations
It’s important to evaluate the responsible AI implications of using pre-trained generative AI models, such as potential biases, misinformation risks, or misuse. Models should be vetted for their training data sources and potential societal impacts.
Deployment and integration
The ease of deployment, compatibility with existing infrastructure, and availability of tools or libraries for integrating the model into the target application should be considered.
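As a minimal illustration of the input/output length criterion above, the sketch below counts prompt tokens with the open-source tiktoken tokenizer and checks them against an assumed context window. The 8,192-token window, the reserved output budget, and the cl100k_base encoding are illustrative assumptions, not values from this course.

```python
# Sketch: checking a prompt against an assumed context-window limit.
# The 8,192-token window and the reserved output budget are illustrative
# assumptions, not values taken from this course.
import tiktoken

CONTEXT_WINDOW = 8192      # assumed model limit, in tokens
MAX_OUTPUT_TOKENS = 1024   # tokens reserved for the model's response

def fits_in_context(prompt: str) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    encoding = tiktoken.get_encoding("cl100k_base")
    prompt_tokens = len(encoding.encode(prompt))
    return prompt_tokens + MAX_OUTPUT_TOKENS <= CONTEXT_WINDOW

print(fits_in_context("Summarize the attached quarterly report in three bullet points."))
```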
What is Prompt Engineering?
Prompt engineering refers to the process of carefully crafting the input prompts or instructions given to the model to generate desired outputs or behaviors.
What are some key aspects of prompt engineering?
- Design: Crafting clear, unambiguous, and context-rich prompts that effectively communicate the desired task or output to the model
- Augmentation: Incorporating additional information or constraints into the prompts, such as examples, demonstrations, or task-specific instructions, to guide the model’s generation process
- Tuning: Iteratively refining and adjusting the prompts based on the model’s outputs and performance, often through human evaluation or automated metrics
- Ensembling: Combining multiple prompts or generation strategies to improve the overall quality and robustness of the outputs
- Mining: Exploring and identifying effective prompts through techniques like prompt searching, prompt generation, or prompt retrieval from large prompt libraries
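To make the design and augmentation aspects concrete, here is a minimal sketch that builds a clear, context-rich prompt with task instructions, an output constraint, and one worked example. The classification task is invented for illustration, and the commented-out generate() call is a hypothetical stand-in for whichever model client you actually use.

```python
# Sketch: prompt design (clear instructions, output constraint) plus
# augmentation (a worked example). The task is invented for illustration,
# and generate() is a hypothetical stand-in for a real model client.

def build_prompt(ticket_text: str) -> str:
    instructions = (
        "You are a support assistant. Classify the customer ticket below into "
        "exactly one category: billing, technical, or account.\n"
        "Respond with the category name only."
    )
    example = (
        "Ticket: I was charged twice for my subscription this month.\n"
        "Category: billing"
    )
    return f"{instructions}\n\n{example}\n\nTicket: {ticket_text}\nCategory:"

prompt = build_prompt("The mobile app crashes whenever I open settings.")
print(prompt)
# response = generate(prompt)  # hypothetical call to your model of choice
```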
What are some common prompt engineering techniques?
- Zero-shot prompting
- Few-shot prompting
- Chain-of-thought (CoT) prompting
- Self-consistency
- Tree of thoughts (ToT)
- Retrieval Augmented Generation (RAG)
- Automatic Reasoning and Tool-use (ART)
- ReAct prompting
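As a small illustration of two of these techniques, the sketch below combines few-shot prompting with chain-of-thought prompting: the worked exemplar spells out its intermediate reasoning so the model is nudged to reason step by step before answering. The example problems are invented for illustration, and generate() again stands in for a real model call.

```python
# Sketch: few-shot prompting combined with chain-of-thought prompting.
# The worked exemplar shows the reasoning steps we want the model to imitate.
cot_prompt = """\
Q: A warehouse has 120 boxes. 45 are shipped out and 30 more arrive. How many boxes are there now?
A: Start with 120 boxes. Shipping 45 leaves 120 - 45 = 75. Receiving 30 more gives 75 + 30 = 105. The answer is 105.

Q: A team of 8 reviewers each checks 12 documents. How many documents are checked in total?
A:"""

print(cot_prompt)
# response = generate(cot_prompt)  # hypothetical model call, as above
```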
What is RAG?
RAG is a natural language processing (NLP) technique that combines the capabilities of retrieval systems and generative language models to produce high-quality and informative text outputs. RAG incorporates two main components:
1 - A Retrieval System: This component retrieves relevant information from a large corpus of text data, such as a knowledge base, web pages, or other textual sources. The retrieval system uses techniques like information retrieval, sparse indexing, or dense retrieval to identify the most relevant passages or documents for a given input query or context.
2 - A Generative Language Model: This component is a large pre-trained language model, such as GPT-3, BART, or T5, that can generate natural language text. The language model takes the input query or context along with the retrieved relevant information and generates a coherent, fluent text output that combines the retrieved knowledge with its own understanding and language generation capabilities.
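A minimal sketch of how these two components fit together is shown below. The hashed bag-of-words embed() is a toy stand-in for a real dense encoder, the documents are invented, and the final prompt would be handed to a generative language model rather than printed.

```python
# Sketch of RAG's two components: a retrieval step and a generation step.
# embed() is a toy bag-of-words stand-in for a real embedding model.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy embedding: a hashed bag-of-words vector (real systems use a dense encoder)."""
    vec = np.zeros(256)
    for token in text.lower().split():
        vec[hash(token) % 256] += 1.0
    return vec

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Retrieval system: return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Generation input: ground the language model in the retrieved passages."""
    context = "\n\n".join(retrieve(query, documents))
    return (
        "Use only the context below to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

docs = [
    "Our return window is 30 days from delivery.",
    "Support is available by chat from 9am to 5pm.",
]
print(build_rag_prompt("How long do I have to return an item?", docs))
# In a real system, the printed prompt would be sent to the generative model.
```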
What are some RAG business applications?
Building intelligent question-answering systems
RAG can be used to build intelligent question-answering systems that can retrieve relevant information from large knowledge bases and generate natural language responses. This can be useful in customer support, virtual assistants, or any domain where users need quick and accurate information.
Expanding and enriching existing knowledge bases
RAG can also expand and enrich existing knowledge bases by generating new knowledge or rephrasing existing information in a more natural and understandable way. This can improve the accessibility and usability of knowledge bases for various applications.
Generating high-quality content
RAG can also be used to generate high-quality content, such as articles, reports, or summaries, by combining retrieved information from various sources with the language generation capabilities of the model. This can be useful in domains like journalism, research, or content marketing.