Developing Generative AI Solutions Flashcards

This course is a primer for deeper generative AI courses, which dive into concepts related to customizing an FM using prompt engineering, Retrieval Augmented Generation (RAG), and fine-tuning.

1
Q

What are the capabilities of Generative AI?

A
  • Adaptability
  • Responsiveness
  • Simplicity
  • Creativity and exploration
  • Data efficiency
  • Personalization
  • Scalability
2
Q

What are the challenges of Generative AI?

A
  • Regulatory violations
  • Social risks
  • Data security and privacy concerns
  • Toxicity
  • Hallucinations
  • Interpretability
  • Nondeterminism
3
Q

What is the Generative AI Application lifecycle?

A

The generative AI application lifecycle refers to the process of using generative AI models within applications or systems.

4
Q

What are the stages of the Generative AI lifecycle?

A

1 - Define a use case
2 - Select a foundation model
3 - Improve performance
4 - Evaluate the results
5 - Deploy the application

After deployment, user feedback, usage data, and performance metrics are continuously collected and analyzed to identify areas for improvement or new requirements. Based on this feedback, the generative AI model might be retrained, fine-tuned, or updated to enhance its performance and address any identified issues.

It’s important to note that the generative AI application lifecycle is an iterative process, and different stages might have to be revisited or repeated as the application evolves, user needs change, or new advancements in generative AI technologies emerge.

5
Q

What is included in the first stage of the Generative AI lifecycle?

A

The first stage in the generative AI application lifecycle is defining a use case. This phase is the foundation that sets the path for the entire project by doing the following:

  • Defining the problem to be solved
  • Gathering relevant requirements
  • Aligning stakeholder expectations
6
Q

What is a business use case?

A

A business use case is a structured narrative that describes how a system or process should behave from the perspective of an actor or stakeholder. It helps to communicate the functional requirements of a system or process.

7
Q

What does a well-defined business use case consist of?

A

1 - Use case name
A short and descriptive name that identifies the use case

2 - Brief description
A high-level summary of the use case’s purpose and objective

3 - Actors
The entities or stakeholders that interact with the system or process. These can be human actors (for example, customers or employees) or external systems.

4 - Preconditions
The conditions that must be true before the use case can be initiated

5 - Basic flow (main success scenario)
A step-by-step description of the actions and interactions that occur when the use case is completed successfully, from start to finish. This is the primary path or happy path—for example, a list of each step necessary to achieve success.

6 - Alternative flows (extensions)
Additional scenarios or paths that might occur due to exceptional conditions, errors, or alternative user choices. These describe how the system should handle these situations—for example, contingency plans.

7 - Postconditions
The state or conditions that must be true after the successful completion of the use case

8 - Business rules
Any business policies, constraints, or regulations that govern the behavior of the system or process within the context of the use case

9 - Nonfunctional requirements
Any nonfunctional requirements, such as performance, security, or usability considerations, that are relevant to the use case

10 - Assumptions
Any assumptions made about the system, environment, or context that are necessary for the use case to be valid or applicable

11 - Notes or additional information
Any additional notes, explanations, or supplementary information that might be helpful for understanding or implementing the use case

8
Q

What are the key metrics employed when addressing business use cases with AI?

A

Cost savings

One of the primary metrics is the potential cost savings that can be achieved by using generative AI. This includes reductions in labor costs, process optimization, and efficiency gains.

Time savings

Generative AI can automate and streamline various tasks, leading to significant time savings. Measuring the reduction in time required for specific processes or activities can be a valuable metric.

Quality improvement

Generative AI can enhance the quality of outputs, such as written content, creative designs, or analytical insights. Metrics like accuracy, coherence, and creativity can be used to measure quality improvements.

Customer satisfaction

If generative AI is used to improve customer interactions or experiences, metrics like customer satisfaction scores, net promoter score (NPS), or sentiment analysis can be valuable indicators.

Productivity gains

Generative AI can augment human capabilities, leading to increased productivity. Metrics like output volume, error rates, or task completion times can measure productivity improvements.

9
Q

What are the key approaches employed when addressing business use cases with AI?

A

Process automation

Generative AI can be used to automate repetitive or time-consuming tasks, such as content generation, data analysis, or customer service interactions. This approach can lead to significant efficiency gains and cost savings.

Augmented decision-making

Generative AI can be used to enhance decision-making processes by providing insights, recommendations, and decision support. By analyzing large and complex datasets, generative AI models can uncover patterns, trends, and actionable insights that can inform and improve business decisions, ultimately leading to better outcomes.

Personalization and customization

Generative AI can be used to create personalized and customized content, products, or experiences for customers or stakeholders. This approach can improve customer satisfaction, engagement, and loyalty.

Creative content generation

Generative AI can be employed to generate creative content, such as written text, images, videos, or audio. This approach can be valuable for marketing, advertising, entertainment, or educational purposes.

Exploratory analysis and innovation

Generative AI can be used to explore new ideas, concepts, or solutions by generating novel combinations or variations. This approach can foster innovation and help businesses stay at the forefront of technology.

10
Q

What are some selection criteria for choosing and using pre-trained models?

A

Cost

Pre-trained models can be expensive, especially for larger and more complex models. The cost might include licensing fees, computational resources for inference, and potential customization or fine-tuning costs. It’s essential to evaluate the budget constraints and weigh the cost against the expected benefits.

Modality

Generative AI models can be designed for different modalities, such as text generation, image generation, audio generation, or multimodal generation (combining multiple modalities). The choice of modality depends on the desired output format and the target application.

Latency

Some applications require real-time or low-latency generation, and others can tolerate longer processing times. The model’s inference speed and the available computational resources should be evaluated to ensure acceptable latency for the target use case.

Multi-lingual support

If the application requires generating content in multiple languages, selecting a model that supports the desired languages or can be adapted to new languages through techniques like transfer learning is crucial.

Model size

Larger models generally have higher computational requirements and can be more resource intensive during inference. However, they often perform better on complex tasks. The model size should be balanced against the available computational resources and performance requirements.

Model complexity

More complex models, such as those based on transformer architectures or large language models, can handle more advanced tasks but might be more challenging to deploy and optimize. Simpler models might be preferred for resource-constrained environments or simpler use cases.

Customization

Some pre-trained models offer the ability to fine-tune or adapt them to specific domains or tasks. This customization can improve performance but might require additional computational resources and labeled data.

Input/output length

Generative models might have limitations on the maximum input or output sequence lengths that they can handle. Applications requiring long-form generation or processing of extensive input data should consider models capable of handling the desired input/output lengths.

Responsibility considerations

It’s important to evaluate the responsible AI implications of using pre-trained generative AI models, such as potential biases, misinformation risks, or misuse. Models should be vetted for their training data sources and potential societal impacts.

Deployment and integration

The ease of deployment, compatibility with existing infrastructure, and availability of tools or libraries for integrating the model into the target application should be considered.

11
Q

What is Prompt Engineering?

A

Prompt engineering refers to the process of carefully crafting the input prompts or instructions given to the model to generate desired outputs or behaviors.

12
Q

What are some key aspects of prompt engineering?

A
  • Design: Crafting clear, unambiguous, and context-rich prompts that effectively communicate the desired task or output to the model
  • Augmentation: Incorporating additional information or constraints into the prompts, such as examples, demonstrations, or task-specific instructions, to guide the model’s generation process
  • Tuning: Iteratively refining and adjusting the prompts based on the model’s outputs and performance, often through human evaluation or automated metrics
  • Ensembling: Combining multiple prompts or generation strategies to improve the overall quality and robustness of the outputs
  • Mining: Exploring and identifying effective prompts through techniques like prompt searching, prompt generation, or prompt retrieval from large prompt libraries
13
Q

Name some common prompt engineering techniques.

A

  • Zero-shot prompting
  • Few-shot prompting
  • Chain-of-thought (CoT) prompting
  • Self-consistency
  • Tree of thoughts (ToT)
  • Retrieval Augmented Generation (RAG)
  • Automatic Reasoning and Tool-use (ART)
  • ReAct prompting
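To make the first two techniques concrete, here is a minimal sketch of how zero-shot and few-shot prompts differ in construction. It is not tied to any specific model API; the sentiment task, labels, and example reviews are invented for illustration.

```python
def zero_shot_prompt(text: str) -> str:
    # Zero-shot: the instruction alone, with no worked examples.
    return (
        "Classify the sentiment of this review as Positive or Negative.\n"
        f"Review: {text}\nSentiment:"
    )

def few_shot_prompt(text: str) -> str:
    # Few-shot: a handful of labeled examples precede the new input,
    # guiding the model toward the desired output format.
    examples = [
        ("I loved every minute of it.", "Positive"),
        ("A complete waste of time.", "Negative"),
    ]
    demos = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
    return (
        "Classify the sentiment of each review as Positive or Negative.\n"
        f"{demos}\nReview: {text}\nSentiment:"
    )

prompt = few_shot_prompt("The plot was thin but the acting was superb.")
```

Either string would then be sent to the model as-is; the only difference is whether demonstrations are included.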

14
Q

What is RAG?

A

RAG is a natural language processing (NLP) technique that combines the capabilities of retrieval systems and generative language models to produce high-quality and informative text outputs. RAG incorporates two main components:

1 - A Retrieval System: This component retrieves relevant information from a large corpus of text data, such as a knowledge base, web pages, or other textual sources. The retrieval system uses techniques like information retrieval, sparse indexing, or dense retrieval to identify the most relevant passages or documents for a given input query or context.

2 - A Generative Language Model: This component is a large pre-trained language model, such as GPT-3, BART, or T5, that can generate natural language text. The language model takes the input query or context along with the retrieved relevant information and, from these, generates a coherent and fluent text output that combines the retrieved knowledge with its own understanding and language generation capabilities.
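The two components can be sketched in a few lines of plain Python. This is a toy illustration only: the corpus is invented, the word-overlap scorer stands in for a real sparse or dense retriever, and the final prompt is what would be passed to an actual generative model.

```python
# Toy corpus standing in for a knowledge base.
corpus = [
    "RAG combines a retrieval system with a generative language model.",
    "BLEU is a metric for evaluating machine translation quality.",
    "Fine-tuning adapts a pre-trained model to a specific task.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Component 1: retrieval system. A crude word-overlap score stands in
    # for real techniques like sparse indexing or dense retrieval.
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Component 2's input: the query plus retrieved context. A generative
    # model would turn this augmented prompt into a grounded answer.
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

top = retrieve("What does RAG combine?", corpus)
prompt = build_prompt("What does RAG combine?", top)
```

The key design point is the separation of concerns: retrieval quality and generation quality can be improved independently.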

15
Q

What are some RAG business applications?

A

Building intelligent question-answering systems

RAG can be used to build intelligent question-answering systems that can retrieve relevant information from large knowledge bases and generate natural language responses. This can be useful in customer support, virtual assistants, or any domain where users need quick and accurate information.

Expanding and enriching existing knowledge bases

RAG can also expand and enrich existing knowledge bases by generating new knowledge or rephrasing existing information in a more natural and understandable way. This can improve the accessibility and usability of knowledge bases for various applications.

Generating high-quality content

RAG can also generate high-quality content, such as articles, reports, or summaries, by combining retrieved information from various sources with the language generation capabilities of the model. This can be useful in domains like journalism, research, or content marketing.

16
Q

What is model fine-tuning?

A

Fine-tuning is another way to improve the performance of a foundation model even further. Fine-tuning refers to the process of taking a pre-trained language model and further training it on a specific task or domain-specific dataset.

17
Q

What are the ways to fine-tune a model?

A

There are two ways to fine-tune a model:

1 - Instruction fine-tuning uses examples of how the model should respond to a specific instruction. Prompt tuning is a type of instruction fine-tuning.

2 - Reinforcement learning from human feedback (RLHF) provides human feedback data, resulting in a model that is better aligned with human preferences.

18
Q

What are the steps in fine-tuning a model?

A

Start with a pre-trained language model.

Large language models are trained on vast amounts of general-purpose text data. This helps them to develop a broad understanding of language and acquire general knowledge.

Prepare a task-specific dataset.

Collect a dataset that is relevant to the task or domain that you want the model to specialize in. This dataset should contain examples of inputs and desired outputs for the specific task.

Add task-specific layers.

The pre-trained model’s architecture is often modified by adding additional layers or components specific to the target task. For example, a classification layer might be added for text classification tasks or a decoder component for text generation tasks.

Fine-tune the model.

The pre-trained model, with the added task-specific layers, is then fine-tuned on the task-specific dataset. During fine-tuning, the model’s parameters are updated to better capture the patterns and nuances present in the task-specific data.

Evaluate and iterate.

After fine-tuning, the model’s performance is evaluated on a test set for the target task. If the performance is not satisfactory, the fine-tuning process can be repeated with different hyperparameters, more data, or different task-specific architectures.
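The steps above can be illustrated with a deliberately tiny, plain-Python analogue: a frozen "pre-trained" feature extractor, an added task-specific layer (a logistic-regression head), and a fine-tuning loop. Real fine-tuning updates a transformer's weights on accelerators; the features, dataset, and learning rate here are all invented for clarity.

```python
import math

def pretrained_features(text: str) -> list[float]:
    # Step 1: stand-in for a pre-trained model's representation (frozen:
    # these "weights" are never updated during fine-tuning).
    return [len(text) / 20.0, text.count("!") / 2.0]

# Step 2: task-specific dataset (toy spam vs. not-spam examples).
data = [("win money now!!", 1), ("meeting at noon", 0),
        ("free prize!!", 1), ("see you tomorrow", 0)]

# Step 3: added task-specific layer: a logistic classification head.
w, b = [0.0, 0.0], 0.0

def predict(x: list[float]) -> float:
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Step 4: fine-tune the head's parameters on the task data (SGD on log loss).
for _ in range(200):
    for text, y in data:
        x = pretrained_features(text)
        err = predict(x) - y  # gradient of the log loss w.r.t. z
        for i in range(len(w)):
            w[i] -= 0.5 * err * x[i]
        b -= 0.5 * err

# Step 5: evaluate; if unsatisfactory, iterate with different settings.
accuracy = sum(
    (predict(pretrained_features(t)) > 0.5) == y for t, y in data
) / len(data)
```

The structure mirrors the five steps directly: only the added head's parameters (`w`, `b`) change, while the "pre-trained" part stays fixed.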

19
Q

What are the types of model evaluation?

A

Human evaluation:

Human evaluation involves having humans interact with the foundation model and assess its performance based on specific criteria. This can involve tasks such as open-ended conversations, question-answering, text generation, or other specific use cases. Human evaluators can provide qualitative feedback on factors like coherence, relevance, factuality, and overall quality of the model’s outputs. Although human evaluation is often considered the gold standard, it can be time consuming and expensive, especially for large-scale evaluations.

Benchmark datasets:

Benchmark datasets are curated collections of data designed specifically for evaluating the performance of language models or other AI systems. These datasets often consist of carefully selected examples or tasks that cover a wide range of topics, complexities, and linguistic phenomena. Models are evaluated by running them on these benchmark datasets and measuring their performance using predefined metrics or tasks. Some popular benchmark datasets for natural language processing tasks include the following:

  • The General Language Understanding Evaluation (GLUE) benchmark is a collection of datasets for evaluating language understanding tasks like text classification, question answering, and natural language inference.
  • SuperGLUE is an extension of GLUE with more challenging tasks and a focus on compositional language understanding.
  • Stanford Question Answering Dataset (SQuAD) is a dataset for evaluating question-answering capabilities.
  • Workshop on Machine Translation (WMT) is a series of datasets and tasks for evaluating machine translation systems.

These benchmark datasets provide a standardized way to compare the performance of different foundation models and track progress over time.

Automated metrics:

Although human evaluation is considered the gold standard, automated metrics can provide a quick and scalable way to evaluate foundation model performance. These metrics typically measure specific aspects of the model’s outputs, such as the following:

  • Perplexity (a measure of how well the model predicts the next token)
  • BLEU score (for evaluating machine translation)
  • F1 score (for evaluating classification or entity recognition tasks)
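The first automated metric above is simple enough to sketch directly: perplexity is the exponential of the average negative log-probability the model assigns to each token. The per-token probabilities below are invented; in a real evaluation they would come from the model itself.

```python
import math

def perplexity(token_probs: list[float]) -> float:
    # Perplexity = exp of the mean negative log-probability per token.
    # Lower is better: the model is less "surprised" by the text.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

confident = perplexity([0.9, 0.8, 0.95])  # model predicts tokens well
uncertain = perplexity([0.2, 0.1, 0.3])   # model is frequently surprised
```

A uniform probability of 0.5 per token gives a perplexity of exactly 2, matching the intuition of "choosing between 2 equally likely options" at each step.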
20
Q

What are some relevant metrics to assess model performance?

A

1 - Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a set of metrics used for evaluating automatic summarization and machine translation systems. It measures the quality of a generated summary or translation by comparing it to one or more reference summaries or translations.

2 - Bilingual Evaluation Understudy (BLEU) is a metric used to evaluate the quality of machine-generated text, particularly in the context of machine translation. It measures the similarity between a generated text and one or more reference translations, considering both precision and brevity.

3 - BERTScore is a metric that evaluates the semantic similarity between a generated text and one or more reference texts. It uses pre-trained Bidirectional Encoder Representations from Transformers (BERT) models to compute contextualized embeddings for the input texts, and then calculates the cosine similarity between them.
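A simplified sketch of the first metric above: ROUGE-1 recall, the fraction of reference unigrams that also appear in the generated text (with counts clipped). Real ROUGE implementations also report precision and F1 plus n-gram and longest-common-subsequence variants; the example sentences are invented.

```python
from collections import Counter

def rouge1_recall(generated: str, reference: str) -> float:
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped unigram matches: a word counts at most as often as it
    # appears in the reference.
    overlap = sum(min(gen[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

score = rouge1_recall(
    "the cat sat on the mat",
    "the cat is on the mat",
)
```

Here 5 of the 6 reference unigrams are matched, so the recall is 5/6.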

21
Q

What are some key factors to consider when deploying a generative AI model?

A

1 - Cost: Pay for the resources that you use with no minimum fees.

2 - Regions: Model deployment is limited to certain AWS Regions.

3 - Quotas: Ensure that you have adequate service resources for your AWS account.

4 - Security: If your model is deployed in AWS infrastructure, the security responsibility is shared between the company and AWS. If accessing a model outside of AWS, security considerations must be evaluated for data leaving the AWS account.