AIF-C01 Flashcards
You ONLY want to manage Applications and Data. Which type of Cloud Computing model should you use?
* On-premises
* Infrastructure as a Service (IaaS)
* Software as a Service (SaaS)
* Platform as a Service (PaaS)
Platform as a Service model
Is EC2 a PaaS or IaaS?
IaaS
Give an example of a PaaS.
AWS Elastic Beanstalk
What is the pricing model of Cloud Computing?
* Discounts over time
* Pay-as-you-go pricing
* Pay once a year
* Flat-rate pricing
Pay as you go
Which Global Infrastructure component is composed of one or more discrete data centers with redundant power, networking, and connectivity, and is used to deploy infrastructure?
* Edge Locations
* Availability Zones
* Regions
Availability Zones
Which of the following is NOT one of the Five Characteristics of Cloud Computing?
* Rapid elasticity and scalability
* Multi-tenancy and resource pooling
* Dedicated Support Agent to help you deploy applications
* On-demand self service
Dedicated Support Agent to help you deploy applications
Which are the 3 pricing fundamentals of the AWS Cloud?
* Compute, Storage, and Data transfer in the AWS Cloud
* Compute, Networking, and Data transfer out of the AWS Cloud
* Compute, Storage, and Data transfer out of the AWS Cloud
* Storage, Functions, and Data transfer in the AWS Cloud
Compute, Storage, and data transfer out of the AWS Cloud are the 3 pricing fundamentals of the AWS Cloud.
Which of the following options is NOT a point of consideration when choosing an AWS Region?
* Compliance with data governance
* Latency
* Capacity availability
* Pricing
Capacity availability. Capacity is treated as unlimited in the cloud, so you do not need to worry about it. The 4 points of consideration when choosing an AWS Region are: compliance with data governance and legal requirements, proximity to customers, available services and features within a Region, and pricing.
Which of the following is NOT an advantage of Cloud Computing?
* Trade capital expense (CAPEX) for operational expense (OPEX)
* Train your employees less
* Go global in minutes
* Stop spending money running and maintaining data centers
You must train your employees more so they can use the cloud effectively.
AWS Regions are composed of?
* Two or more Edge Locations
* One or more discrete data centers
* Three or more Availability Zones
AWS Regions consist of multiple, isolated, and physically separate Availability Zones within a geographic area.
Which of the following services has a global scope?
* EC2
* IAM
* Lambda
* Rekognition
IAM is a global service (encompasses all regions).
Which of the following is the definition of Cloud Computing?
* Rapidly develop, test and launch software applications
* Automatic and quick ability to acquire resources as you need them and release resources when you no longer need them
* On-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user
* Change resource types when needed
On-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user.
What defines the distribution of responsibilities for security in the AWS Cloud?
* AWS Pricing Fundamentals
* The Shared Responsibility Model
* AWS Acceptable Use Policy
* The AWS Management Console
The Shared Responsibility Model defines who is responsible for what in the AWS Cloud.
A company would like to benefit from the advantages of the Public Cloud but would like to keep sensitive assets in its own infrastructure. Which deployment model should the company use?
* Private Cloud
* Public Cloud
* Hybrid Cloud
Using a Hybrid Cloud deployment model allows you to benefit from the flexibility, scalability and on-demand storage access while keeping security and performance of your own infrastructure.
What are you NOT authorized to do on AWS according to the AWS Acceptable Use Policy?
* Building a gaming application
* Deploying a website
* Run analytics on stolen content
* Backup your data
- Run analytics on stolen content
You can run analytics on AWS, but you cannot run analytics on fraudulent content. Refer to the AWS Acceptable Use Policy to see what you are not authorized to do on AWS.
GenAI is a subset of X which is a subset of Y which is a subset of Z.
X - Deep Learning
Y - Machine Learning
Z - Artificial Intelligence
What does GenAI generate?
New data or content that is similar to the data it was trained on, such as text, images, audio, code, and video.
What is the cost of training a foundation model, and why? Who can do this?
Tens of millions of dollars for training, so it can only be done by large companies that can afford it.
Which famous GenAI models are open source?
Models from Meta (e.g. Llama) and Google’s BERT
Which famous GenAI models are commercial and not open source?
Models from OpenAI (e.g. GPT) and Anthropic (e.g. Claude)
What does LLM stand for?
Large Language Model
What is an LLM?
A type of GenAI model that generates coherent, human-like text.
What is the most famous LLM?
ChatGPT
What is a prompt?
The input a user gives to a GenAI model, i.e. the question or instruction the user sends to the model.
What does it mean that the response to a prompt is non-deterministic?
Two users sending the same prompt to the same GenAI model may get different answers.
Name a famous image generative AI method
Diffusion Models - e.g. Stable Diffusion
How is a diffusion model trained, and how does it generate images?
It is trained with the forward diffusion process (noise is added to a picture over multiple steps) and it generates images with the reverse diffusion process (noise is removed step by step).
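A minimal sketch of the forward (noising) half only, in plain Python. The update rule, the `beta` noise level, and the function name are assumptions for illustration rather than something the card states; the reverse process needs a trained network and is not shown.

```python
import math
import random


def forward_diffusion(pixels, steps, beta=0.1, seed=0):
    """Return the list of progressively noisier versions of `pixels`.

    Assumed update rule: x_t = sqrt(1 - beta) * x_(t-1) + sqrt(beta) * noise,
    where noise is drawn from a standard normal distribution.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    current = list(pixels)
    history = [current]
    for _ in range(steps):
        current = [
            math.sqrt(1 - beta) * value + math.sqrt(beta) * rng.gauss(0, 1)
            for value in current
        ]
        history.append(current)
    return history


# A tiny 4-pixel "image" noised over 5 steps.
noisy_versions = forward_diffusion([0.0, 0.5, 1.0, 0.5], steps=5)
print(len(noisy_versions))  # the original plus one entry per step
```

With enough steps the picture becomes indistinguishable from random noise, which is the starting point for generation.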
What is Amazon Bedrock?
A fully managed AWS service for building GenAI applications
Is my training data secure in Bedrock?
Yes. Everything stays within your account and never leaves it. Any foundation model you use is a copy of the original model; that copy is trained on the customer’s data and stored within the account.
What are the elements of Bedrock service?
- Foundation Models
- Interactive Playground for users
- Knowledge Bases (RAG): to fetch data from external data sources to generate more relevant and accurate responses.
- Fine-Tuning: Update the model with your data
- Unified APIs across all the models used by GenAI applications.
Does enabling a model cost anything?
No, we only pay for using a model.
What is Amazon Titan?
- High-performing Foundation Models from AWS
- Image, text, and multimodal model choices via fully managed APIs
- Can be customized with your own data
What is the Amazon Titan Text Express model good for?
- High-performance text model, 100+ languages
- Content creation, classification, education…
What is the Claude model from Anthropic (a leading AI company) good for?
- High-capacity text generation, multi-language
- Analysis, forecasting, document comparison…
What is the Llama-2 70b-chat model good for?
- Large-scale tasks, dialogue, English
- Text generation, customer service…
What is the Stable Diffusion model from Stability.ai good for?
Image creation for advertising, media…
How can we try out models in the AWS console and choose the right one?
The playground in the console has a compare-models feature: you can run the same prompt across multiple models and compare their results.
Compare by response content, pricing, response time, etc.
What is a Custom Model in Bedrock?
By choosing a base model, and tuning based on our own data, we can create a custom model.
What are the options to customize a model in Bedrock?
- Fine Tune (one off)
- Continued Pre-training (ongoing)
Which models can be fine tuned?
Not all of them. Usually the open-source ones, e.g. models from Amazon, Cohere, and Meta.
For fine-tuning a model, how is the data provided?
- from S3
- must follow the required dataset format
- use SageMaker Ground Truth to create and label training datasets
What are Hyperparameters?
Settings of the machine learning algorithm that are chosen before training starts (as opposed to the parameters the model learns).
What is the Epochs hyperparameter?
The number of complete passes over the entire training dataset while training the model.
What is the Batch Size hyperparameter?
The number of samples processed before the model parameters are updated.
What is the Learning Rate hyperparameter?
The rate at which model parameters are updated after each batch of training data. Basically, how fast the model learns.
What is the Learning Rate Warmup Steps hyperparameter?
The number of iterations over which the learning rate is gradually increased to the specified initial rate.
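To see how these hyperparameters interact, here is a small sketch. The linear shape of the warmup and both function names are assumptions made for illustration; real training services may ramp the rate differently.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of parameter updates in one pass over the training data."""
    return math.ceil(num_samples / batch_size)


def learning_rate_at(step, target_rate, warmup_steps):
    """Learning rate used at a given update step (0-based).

    Assumption: the rate grows linearly during warmup and then stays at
    `target_rate`.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return target_rate
    return target_rate * (step + 1) / warmup_steps


# 1,000 samples, batch size 32, 3 epochs, 4 warmup steps.
total_updates = updates_per_epoch(1000, 32) * 3
rates = [learning_rate_at(step, 0.001, 4) for step in range(6)]
print(total_updates, rates)
```

More epochs or a smaller batch size both increase the number of updates, while the warmup only affects the first few of them.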
Where is model fine tuning result saved?
S3
What is a model fine tuning result?
A fine tuned model trained with our data
What is the pricing limitation of working with fine-tuned models?
You must purchase Provisioned Throughput to use them.
Instructions based fine tuning uses…?
Labeled examples that are [prompt-response] based.
How does continued pre-training work?
- Provide unlabeled data, e.g. any unstructured knowledge. Each record has only an “input” field.
Continued Pre-training is also called…
Domain Adaptation Fine-tuning. It makes the model an expert in a domain.
What are the 2 types of instruction-based fine-tuning?
- Single Turn Messaging
- Multi Turn Messaging => e.g. chatbot with multi turn conversation
What is Multi Turn Messaging schema?
{
  "system": "context",
  "messages": [
    {"role": "user/assistant", "content": "message"}
  ]
}
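A hedged sketch of how one multi-turn training record could be assembled as a line of JSON. The `system`, `role`, and `content` fields come from the card; the `messages` key holding the list of turns and the helper name are assumptions, so check the dataset format required by the model you fine-tune.

```python
import json


def make_multi_turn_record(system, turns):
    """Build one JSON line from a system context and (role, content) pairs."""
    allowed_roles = {"user", "assistant"}
    messages = []
    for role, content in turns:
        if role not in allowed_roles:
            raise ValueError(f"unexpected role: {role}")
        messages.append({"role": role, "content": content})
    # "messages" is an assumed key name for the list of turns.
    return json.dumps({"system": system, "messages": messages})


line = make_multi_turn_record(
    "You are a helpful support agent.",
    [
        ("user", "My order is late."),
        ("assistant", "Sorry about that. Can you share the order number?"),
        ("user", "It is 12345."),
        ("assistant", "Thanks, I have escalated it."),
    ],
)
print(line)
```

A training file would then contain one such line per conversation.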
Re-training vs instruction-based fine-tuning? (5 items)
- Re-training an FM requires a higher budget
- Instruction-based fine-tuning is usually cheaper as computations are less intense and the amount of data required is usually less
- It also requires experienced ML engineers to perform the task
- You must prepare the data, do the fine-tuning, evaluate the model
- Running a fine-tuned model is also more expensive (provisioned throughput)
What is Transfer Learning
Transfer Learning – the broader concept of reusing a pre-trained model to adapt it to a new related task
* Widely used for image classification
* And for NLP (models like BERT and GPT)
* Can appear in the exam as a general ML concept
* Fine-tuning is a specific kind of transfer learning
Name 4 use cases for fine-tuning
- A chatbot designed with a particular persona or tone, or geared towards a specific purpose (e.g., assisting customers, crafting advertisements)
- Training using more up-to-date information than what the language model previously accessed
- Training with exclusive data (e.g., your historical emails or messages, records from customer service interactions)
- Targeted use cases (categorization, assessing accuracy)
How does Amazon Bedrock evaluate a model?
Using automatic evaluation.
- It evaluates a model for quality control
- Built-in task types:
* Text summarization
* Question and answer
* Text classification
* Open-ended text generation…
- Bring your own prompt dataset or use built-in curated prompt datasets (Benchmark datasets - questions and answers)
- Scores are calculated automatically
Model scores are calculated using various statistical methods (e.g. BERTScore, F1…)
What is a Judge model?
In Bedrock, a model that compares the evaluated model’s answers against the benchmark answers and gives a grading score.
What is a Bias score?
Some benchmark datasets in Bedrock can quickly identify bias like discrimination against a specific group. they generate a Bias score.
What is the difference between human and automatic model evaluation?
In the human approach, instead of the judge model, groups of people evaluate the answers, e.g. with thumbs up or down or a score from 1 to 5, and that produces the grade.
Also, human evaluation supports custom task types that only humans can evaluate accurately.
What are automated metrics to evaluate a FM?
- ROUGE
- BLEU
- BERTScore
- Perplexity
What is ROUGE? and what are ROUGE-N and ROUGE-L?
Recall-Oriented Understudy for Gisting Evaluation => it’s an automatic FM evaluation metric
Used for evaluating automatic summarization and machine translation systems
* ROUGE-N – measures the number of matching n-grams between the reference and the generated text.
(A 1-gram is a single word, so ROUGE-1 counts how many words of the benchmark answer also appear in the model’s answer.)
* ROUGE-L – the longest common subsequence between the reference and the generated text,
i.e. the longest sequence of words (not necessarily consecutive, but still in order) that is shared between both.
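The two ROUGE variants can be sketched in a few lines. This is a simplified, recall-only version with whitespace tokenization (an assumption; real implementations also report precision and F-scores and normalize text more carefully).

```python
from collections import Counter


def ngrams(words, n):
    """All consecutive n-word tuples in a list of words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def rouge_n_recall(reference, generated, n=1):
    """Share of the reference n-grams that also occur in the generated text."""
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    gen_counts = Counter(ngrams(generated.lower().split(), n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, gen_counts[gram]) for gram, count in ref_counts.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence (in order, gaps allowed)."""
    previous = [0] * (len(b) + 1)
    for item_a in a:
        current = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_recall(reference, generated):
    """LCS length divided by the number of words in the reference."""
    ref_words = reference.lower().split()
    gen_words = generated.lower().split()
    if not ref_words:
        return 0.0
    return lcs_length(ref_words, gen_words) / len(ref_words)


reference = "the cat sat on the mat"
generated = "the cat lay on the mat"
print(rouge_n_recall(reference, generated, 1))  # 5 of 6 reference words matched
print(rouge_n_recall(reference, generated, 2))  # 3 of 5 reference bigrams matched
print(rouge_l_recall(reference, generated))     # LCS "the cat on the mat" has 5 words
```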
What is Gisting?
The word “gist” means the main point or essence of something.
In ML, gisting means using machine translation (MT) to quickly understand the general meaning or essence of a foreign text, without requiring a perfect translation.
Focus on meaning, not precision:
Gisting prioritizes understanding the core message over achieving a grammatically perfect translation.
What is BLEU?
Bilingual Evaluation Understudy => it’s an automatic FM evaluation metric
* Evaluates the quality of generated text, especially for translations
* Considers precision and penalizes too much brevity
* Looks at a combination of n-grams (1, 2, 3, 4)
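A simplified, single-reference BLEU sketch showing the two ingredients named above: clipped n-gram precision and the brevity penalty. It uses whitespace tokenization and no smoothing (assumptions), so any n-gram order with zero matches gives a score of 0; library implementations handle this more gracefully.

```python
import math
from collections import Counter


def clipped_precision(reference, candidate, n):
    """Share of candidate n-grams found in the reference, with counts clipped."""
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def bleu(reference, candidate, max_n=4):
    """Geometric mean of 1..max_n-gram precisions times the brevity penalty."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not cand:
        return 0.0
    precisions = [clipped_precision(ref, cand, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # Candidates shorter than the reference are penalized.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


reference = "the cat sat on the mat"
print(bleu(reference, reference))                # identical text scores 1.0
print(bleu(reference, "the cat sat", max_n=2))   # precise but too short, so penalized
```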
What is BERTScore?
BERT stands for Bidirectional Encoder Representations from Transformers => BERTScore is an automatic FM evaluation metric.
It measures the semantic similarity between the generated text and the reference text
* Uses pre-trained BERT models to compare the contextualized embeddings of both texts and computes the cosine similarity between them.
* Capable of capturing more nuance between the texts
What is Perplexity?
It’s an automatic FM evaluation metric.
It measures how well the model predicts the next token (lower is better)
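As a worked example, perplexity can be computed from the probabilities a model assigned to each actual next token: it is the exponential of the average negative log-probability. The probability lists below are made up for illustration.

```python
import math


def perplexity(token_probabilities):
    """exp of the average negative log-probability of the observed tokens."""
    if not token_probabilities:
        raise ValueError("need at least one token probability")
    average_nll = -sum(math.log(p) for p in token_probabilities) / len(token_probabilities)
    return math.exp(average_nll)


confident_model = [0.9, 0.8, 0.95, 0.85]  # hypothetical probabilities
unsure_model = [0.3, 0.2, 0.25, 0.1]      # hypothetical probabilities
print(perplexity(confident_model))
print(perplexity(unsure_model))
```

The model that gives higher probability to the tokens that actually occur gets the lower perplexity.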
What are Business Metrics to Evaluate a Model On?
- User Satisfaction – gather users’ feedback and assess their satisfaction with the model responses (e.g., user satisfaction for an ecommerce platform)
- Average Revenue Per User (ARPU) – average revenue per user attributed to the Gen-AI app (e.g., monitor ecommerce user base revenue)
- Cross-Domain Performance – measure the model’s ability to perform tasks across different domains (e.g., monitor multi-domain ecommerce platform)
- Conversion Rate – how often recommendations lead to desired outcomes such as purchases (e.g., optimizing ecommerce platform for higher conversion rate)
- Efficiency – evaluate the model’s efficiency in computation, resource utilization… (e.g., improve production line efficiency)
What are models evaluation task types?
- General Text Generation
- Text summarization
- Question and Answer
- Text classification
List a few model evaluation metrics?
- Toxicity: offensive and inappropriate content
- Accuracy
- Robustness
- Relevance
- Consistency
- Completeness
What does RAG stand for?
Retrieval-Augmented Generation
What does RAG do?
Allows an FM to reference a data source outside of its training data.
With RAG, Bedrock builds a knowledge base backed by a vector database.
What is creating vector embeddings?
Converting the customer’s data into vectors for the knowledge base. Bedrock takes care of this when it builds the knowledge base from the customer’s data source.
What is an augmented prompt?
When a user sends a “Query” to Bedrock, Bedrock sends a “Search” to the knowledge base and retrieves the “Retrieval Text”. It then sends the “Query” + “Retrieval Text”, together called the “Augmented Prompt”, to the Foundation Model to generate the final response.
The response includes references to the actual data source chunks.
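The flow above can be sketched with a toy retriever. Keyword overlap stands in for the vector search a real knowledge base performs, and the function names, chunk contents, and prompt layout are all hypothetical.

```python
def retrieve(query, chunks, top_k=2):
    """Return up to top_k (source, text) chunks sharing words with the query."""
    query_words = set(query.lower().split())
    scored = []
    for source, text in chunks:
        overlap = len(query_words & set(text.lower().split()))
        scored.append((overlap, source, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(source, text) for overlap, source, text in scored[:top_k] if overlap > 0]


def build_augmented_prompt(query, retrieved):
    """Combine the retrieved text and the query into one prompt for the FM."""
    context = "\n".join(f"[{source}] {text}" for source, text in retrieved)
    return f"Context:\n{context}\n\nQuery: {query}"


knowledge_base = [
    ("hr.txt", "employees get 25 days of annual leave"),
    ("it.txt", "reset your password in the portal"),
]
query = "how many days of annual leave"
print(build_augmented_prompt(query, retrieve(query, knowledge_base)))
```

Keeping the source name next to each chunk is what lets the final response point back to the data it used.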
What is RAG useful for?
Where real-time data needs to be fed into the foundation model.
List the Vector databases that Bedrock can use.
Amazon Aurora, Amazon OpenSearch Service, Redis, MongoDB, Pinecone
What is a Vector database?
A vector database is a specialized database designed to store and manage data as high-dimensional vectors, enabling efficient similarity searches and retrieval of data based on semantic meaning, rather than structured data organization.
Here’s a more detailed explanation:
Data Representation:
Instead of storing data in rows and columns like traditional databases, vector databases store data as mathematical vectors, which are numerical representations of data features.
Similarity Search:
The primary purpose of vector databases is to perform similarity searches, finding data points that are “close” to a given query vector based on their vector representations.
Applications:
Vector databases are used in various applications, including:
* Recommender Systems: Suggesting similar items or content to users.
* Semantic Search: Finding documents or data that are semantically similar to a query.
* Image and Audio Recognition: Matching images or audio clips based on their vector representations.
* Anomaly Detection: Identifying unusual patterns or outliers in data.
How is a custom data source converted into a vector database?
Using embeddings models such as Amazon Titan and Cohere. The embeddings model doesn’t need to be the same as the Foundation Model.
The documents, for example from S3, are split into “Document Chunks”; the embeddings model then converts the chunks to vectors and places these vectors in the vector database.
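A toy end-to-end sketch of that pipeline: chunk a document, embed each chunk, keep the vectors in a list that plays the role of the vector database, and answer a query with a nearest-neighbor lookup. The word-count "embedding" and the tiny vocabulary are stand-ins (assumptions) for a real embeddings model.

```python
import math


def chunk_document(text, chunk_size=7):
    """Split a document into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def toy_embedding(text, vocabulary):
    """Stand-in embedding: how often each vocabulary term occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_chunk(query, index, vocabulary):
    """Return the stored chunk whose vector is most similar to the query vector."""
    query_vector = toy_embedding(query, vocabulary)
    return max(index, key=lambda entry: cosine_similarity(query_vector, entry[1]))[0]


vocabulary = ["refund", "shipping", "password", "days"]
document = "refund requests are accepted within thirty days shipping takes five days"
# The "vector database": a list of (chunk text, vector) pairs.
index = [(chunk, toy_embedding(chunk, vocabulary)) for chunk in chunk_document(document)]
print(nearest_chunk("how long does shipping take", index, vocabulary))
```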
RAG Vector database types?
- Amazon OpenSearch Service – search & analytics database; real-time similarity queries, stores millions of vector embeddings, scalable index management, and fast nearest-neighbor (kNN) search capability
- Amazon DocumentDB [with MongoDB compatibility] – NoSQL database; real-time similarity queries, stores millions of vector embeddings
- Amazon Aurora – relational database, proprietary on AWS
- Amazon RDS for PostgreSQL – relational database, open-source
- Amazon Neptune – graph database
List Amazon Bedrock Data sources
- Amazon S3
- Confluence
- Microsoft SharePoint
- Salesforce
- Web pages (your website, your social media feed, etc…)
- More added over time…
What is Tokenization?
Converting raw text into a sequence of tokens
* Word-based tokenization: text is split into individual words
* Subword tokenization: some words can be split too (helpful for long words…)
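The difference between the two approaches can be sketched as follows. The greedy longest-match rule and the tiny vocabulary are assumptions for illustration; production tokenizers learn their subword vocabulary from large amounts of text.

```python
def word_tokenize(text):
    """Word-based tokenization: split on whitespace."""
    return text.lower().split()


def subword_tokenize(word, vocabulary):
    """Greedy longest-match split of one word into known subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is in the vocabulary.
        while end > start and word[start:end] not in vocabulary:
            end -= 1
        if end == start:
            end = start + 1  # unknown character becomes its own token
        pieces.append(word[start:end])
        start = end
    return pieces


vocabulary = {"un", "believ", "able", "token", "ization"}  # hypothetical vocabulary
print(word_tokenize("Tokenization is unbelievable"))
print(subword_tokenize("unbelievable", vocabulary))
print(subword_tokenize("tokenization", vocabulary))
```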
On which website can we try out tokenization?
https://platform.openai.com/tokenizer
What is a Context Window?
The number of tokens an LLM can process at once. It is primarily about the prompt and the information it contains. Models are now racing to offer the largest context window.
What are the pros and cons of a larger context window?
Pros: more information and coherence
Cons: more memory and processing power
What is the first factor in choosing a model?
Its context window
What is a vector?
An array of numerical values derived from text, images, or audio. The values are scores, one rating per dimension, such as semantic meaning, syntactic role, and sentiment. They can be positive or negative.
What is embedding?
Creating vectors out of text, images and videos
How are vectors created from a text?
text > tokens > each token gets a tokenID > the tokens are fed into an embedding model > each token is converted into a vector (an array of scores) > vectors are stored in a Vector db
How is an embedding model related to search engines?
It generates vectors whose scores are searchable using the nearest-neighbor capability of search engines such as OpenSearch.
Describe a couple of methods to visualize a multi-dimensional embedding.
- dimensionality reduction: visualize in 2D
- color visualization
What are Amazon Bedrock guardrails?
Control the interaction between users and Foundation Models (FMs)
* Filter undesirable and harmful content => Blocked topics list
* Remove Personally Identifiable Information (PII)
* Enhanced privacy
* Reduce hallucinations
* Ability to create multiple Guardrails and monitor and analyze user inputs that can violate the Guardrails
What are the bedrock guardrails parameters?
- the error message (could be different ones for prompt and response)
- harmful categories filter (boolean) - e.g. hate, sexual, insults, violence and misconduct
- prompt attacks filter (boolean): user inputs trying to override the system instructions.
- content filters
- custom word filters
- denied topics
- PII filters
- contextual grounding check (reduce hallucination): verify if the response is meaningful based on the knowledge provided
- Relevance check
What is Amazon Bedrock guardrails pricing model?
It is priced based on the number of text units processed, with content filters and denied topics costing $0.15 per 1,000 text units.
What are Amazon Bedrock Agents?
- Manage and carry out various multi-step tasks related to infrastructure provisioning, application deployment, and operational activities
- Task coordination: perform tasks in the correct order and ensure information is passed correctly between tasks
- Agents are configured to perform specific pre-defined action groups
- Integrate with other systems, services, databases and APIs to exchange data or initiate actions
- Leverage RAG to retrieve information when necessary
Given a task, how does a Bedrock agent work behind the scenes?
- The agent is assigned a task
- The agent sends the following to a Bedrock model:
  * Prompt
  * Instructions
  * Action groups and knowledge bases
  * Conversation history
  * Task
- The Bedrock model runs “Chain of thought” => a list of steps
- Each step can be calling an API from the action groups, executing a Lambda function, or searching a knowledge base.
- The result is sent back to the agent
- The agent sends the task and the result to another Bedrock model to generate the final refined response.
What is the Tracing feature of a Bedrock agent?
It gives us the list of steps generated by “Chain of thought”, so we can debug them.
What is Bedrock Model Invocation logging?
- All the calls to Bedrock models (including request and response, the model ID, number of tokens, the applied guardrails, the region, latency in ms, etc.) are logged and sent to CloudWatch or S3 or both.
- This can include the text, the images, and the embeddings.
- We can then also define alerts in CloudWatch based on log analytics.
- It should be enabled in Bedrock settings.
Give 5 examples of metrics that Bedrock sends to CloudWatch
- For the guardrails: “ContentFilteredCount”
- Invocations (the count)
- InvocationLatency
- OutputTokenCount
- InputTokenCount
What is Bedrock Pricing Model?
On-Demand:
- Pay-as-you-go (no commitment)
- Text Models – charged for every input/output token processed
- Embedding Models – charged for every input token processed
- Image Models – charged for every image generated
IMPORTANT - * Works with Base Models only
Batch:
- Multiple predictions at a time (output is a single file in Amazon S3)
- Can provide discounts of up to 50%
Provisioned Throughput:
- Purchase Model units for a certain time (1 month, 6 months…)
- Throughput – max. number of input/output tokens processed per minute
IMPORTANT - * Works with Base, Fine-tuned, and Custom Models
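A back-of-the-envelope calculation for the On-Demand and Batch modes. The per-1,000-token prices used here are hypothetical placeholders, not real Bedrock prices, and the 50% figure is the maximum discount mentioned above, so treat the batch number as a best case.

```python
def on_demand_text_cost(input_tokens, output_tokens, price_per_1k_input, price_per_1k_output):
    """On-Demand text model cost: every input and output token is charged."""
    return (input_tokens / 1000) * price_per_1k_input + (output_tokens / 1000) * price_per_1k_output


def batch_cost(on_demand_cost, discount=0.5):
    """Best-case Batch cost, assuming the full discount applies."""
    return on_demand_cost * (1 - discount)


# Hypothetical prices: $0.001 per 1K input tokens, $0.002 per 1K output tokens.
cost = on_demand_text_cost(2000, 500, 0.001, 0.002)
print(f"on-demand: ${cost:.4f}, batch (best case): ${batch_cost(cost):.4f}")
```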
List Model Improvement Techniques by their cost (from lowest to highest)
- Prompt Engineering
* No model training needed (no additional computation or fine-tuning)
- Retrieval Augmented Generation (RAG)
* Uses external knowledge (FM doesn’t need to “know everything”, less complex)
* No FM changes (no additional computation or fine-tuning)
- Instruction-based Fine-tuning
* FM is fine-tuned with specific instructions (requires additional computation)
- Domain Adaptation Fine-tuning
* Model is trained on a domain-specific dataset (requires intensive computation)
What are the Bedrock cost saving approaches?
- On-Demand – great for unpredictable workloads, no long-term commitment
- Batch – provides up to 50% discounts
- Provisioned Throughput – (usually) not a cost-saving measure, great to “reserve” capacity
- Temperature, Top K, Top P – They are model configurations, with no impact on pricing
- Model size – usually a smaller model will be cheaper (varies based on providers)
- Number of Input and Output Tokens – main driver of cost
What type of generative AI can recognize and interpret various forms of input data, such as text, images, and audio?
Multimodal model
You are developing a model and want to ensure the outputs are adapted to your users. Which method do you recommend?
Human evaluation
What is Prompt Engineering?
developing, designing, and optimizing prompts to
enhance the output of FMs for your needs
What are 4 elements that the Improved Prompting technique consists of?
- Instructions – a task for the model to do (description, how the model should perform)
- Context – external information to guide the model
- Input data – the input for which you want a response
- Output Indicator – the output type or format
NOTE - write the keyword in the prompt, for example: Context: xyz
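A minimal sketch of assembling a prompt from the four elements, with each keyword written out as the note suggests. The helper name and the sample texts are made up for illustration.

```python
def build_enhanced_prompt(instructions, context, input_data, output_indicator):
    """Join the four elements, each introduced by its keyword."""
    sections = [
        ("Instructions", instructions),
        ("Context", context),
        ("Input data", input_data),
        ("Output indicator", output_indicator),
    ]
    return "\n".join(f"{label}: {text}" for label, text in sections)


prompt = build_enhanced_prompt(
    instructions="Write a short product description.",
    context="The audience is first-time hikers.",
    input_data="Lightweight waterproof hiking boots.",
    output_indicator="Two sentences, friendly tone.",
)
print(prompt)
```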
What is negative prompting?
A technique where you explicitly instruct the model on what not to include or do in its response:
* Negative Prompting helps to:
* Avoid Unwanted Content – explicitly states what not to include, reducing the chances of irrelevant or inappropriate content
* Maintain Focus – helps the model stay on topic and not stray into areas that are not useful or desired
* Enhance Clarity – prevents the use of complex terminology or detailed data, making the output clearer and more accessible
What is the other name of Improved prompting?
Enhanced Prompting
What are the pros and cons of Enhanced Prompting?
Pros: Accuracy
Cons: Cost
Cost Implications:
Because the enhanced prompt is longer and contains more tokens, it will cost more to process, even if the response itself is not significantly longer.
Trade-offs:
While enhanced prompting can lead to better quality and more relevant outputs, the increased cost needs to be weighed against the value of the improved results.
Optimization:
Prompt engineers need to balance the need for detailed prompts with the cost of token usage, optimizing for both quality and efficiency.
What are the parameters in Prompt Performance Optimization?
The LLM parameters:
* System Prompts – how the model should behave and reply
* Temperature (0 to 1) = creativity
how likely the model is to choose less probable words. 0 means the response is always the same, namely the most probable answer.
* Low (ex: 0.2) – outputs are more conservative, repetitive, focused on most likely response
* High (ex: 1.0) – outputs are more diverse, creative, and unpredictable, maybe less coherent
* Top P (0 to 1) = consider the tokens with the highest probability scores until the sum (NOT the average) of their scores reaches the specified threshold value. (Top-p sampling is also called nucleus sampling.)
* Low P (ex: 0.25) – consider only the most likely words whose probabilities add up to 25%, which makes a more coherent response
* High P (ex: 0.99) – consider a broad range of possible words, possibly more creative and diverse output
* Top K – consider the tokens with the highest probabilities until the specified number of tokens is reached.
* Low K (ex: 10) – fewer candidate words, more coherent response
* High K (ex: 500) – more candidate words, more diverse and creative
* Length – maximum length of the answer
* Stop Sequences – tokens that signal the model to stop generating output
NOTE:
* the larger the number of candidate tokens = more diversity
* the higher probability =
What is the result of a low Top P and a high Temperature?
The model will only choose from the most likely words (low Top P), but won’t always go for the single most likely one (high Temperature).
It’s perfect for “creative” models, e.g., for writing fiction.
What is the other name of Top P?
nucleus sampling
What is the result of a high Top P and a low Top K?
With a high Top-P (nucleus sampling) and a low Top-K, the high Top-P would allow a large set of probable tokens, but the low Top-K limits the selection to only a small number of tokens, so the output tends to be more predictable and less diverse.
What is the best Temperature setting for generating creative text?
Higher Temperature values encourage the model to take more risks, producing more creative and diverse outputs.
How does Top-p differ from Top-k?
Top-p sampling dynamically selects tokens based on cumulative probability, adapting the number of tokens considered. Top-k sampling fixes the number of tokens to the top k most probable, regardless of their cumulative probability.
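The three settings can be compared on a toy next-token distribution. This is a sketch of the usual definitions, not any particular provider's implementation; the logits and probabilities are made up, and treating a temperature of 0 as "always pick the most probable token" follows the card above.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        best = max(logits, key=logits.get)
        return {token: (1.0 if token == best else 0.0) for token in logits}
    highest = max(logits.values())
    # Subtracting the highest score keeps exp() numerically stable.
    exps = {token: math.exp((score - highest) / temperature) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def top_k_filter(probabilities, k):
    """Keep a fixed number of tokens: the k most probable ones."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probabilities, p):
    """Keep the most probable tokens until their cumulative probability reaches p."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, probability in ranked:
        kept[token] = probability
        cumulative += probability
        if cumulative >= p:
            break
    return kept


probabilities = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(list(top_k_filter(probabilities, 2)))    # always exactly 2 candidates
print(list(top_p_filter(probabilities, 0.75))) # as many as needed to reach 75%
print(list(top_p_filter(probabilities, 0.9)))  # a higher threshold admits more tokens
```

The model then samples the next token from whatever candidates survive the filters.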
Can I use Temperature, Top-k, and Top-p together?
Yes, combining these parameters allows for finer control over the model’s output, but it’s essential to adjust them carefully to avoid unintended consequences.
Why is my model generating repetitive text?
If the randomness is too low (low Temperature, low Top-k/Top-p), the model may loop over high-probability tokens. Increasing the randomness can help introduce more variety
In an LLM, what does coherence mean?
“Coherence” refers to the logical flow, consistency, and clarity of the generated text, ensuring it makes sense as a whole and is easy for users to understand.
Which parameters impact a model’s latency?
- The model size
- The model type itself (Llama has a different performance than Claude)
- The number of tokens in the input (the bigger the slower)
- The number of tokens in the output (the bigger the slower)
What is the impact of Top P, Top K, Temperature model parameters on its latency?
Latency is not impacted by Top P, Top K, Temperature
What is Zero-Shot Prompting?
- Present a task to the model without providing examples or explicit training for that specific task
- You fully rely on the model’s general knowledge
- The larger and more capable the FM, the more likely you’ll get good results
What is Few-Shots Prompting?
- Provide examples of a task to the model to guide its output
- We provide a “few shots” to the model to perform the task
- If you provide one example only, this is also called “one-shot” or “single-shot”
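A small sketch of how such a prompt could be assembled. The `Input:`/`Output:` labels and the helper name are arbitrary choices for illustration; with an empty example list the same helper produces a zero-shot prompt, and with one example a one-shot prompt.

```python
def build_few_shot_prompt(task, examples, new_input):
    """Task description, then worked examples, then the input to complete."""
    lines = [task, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    lines.append(f"Input: {new_input}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Great product, works perfectly", "positive"),
     ("Broke after two days", "negative")],
    "Not bad for the price",
)
print(prompt)
```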
What is Chain of Thought Prompting?
- Divide the task into a sequence of reasoning steps, leading to more structure and coherence
- Using a sentence like “Think step by step” helps
- Helpful when solving a problem as a human usually requires several steps
- Can be combined with Zero-Shot or Few-Shots Prompting
What is a Prompt template? How can it help?
It simplifies and standardizes the process of generating prompts. The prompt has a defined format with variables that the user provides, and the response follows a defined format too.
It helps with:
* Processes user input text and output prompts from foundation models (FMs)
* Orchestrates between the FM, action groups, and knowledge bases
* Formats and returns responses to the user
You can also provide examples with few-shots prompting to improve the model performance
Prompt templates can be used with Bedrock Agents
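A generic illustration of the idea using Python's standard `string.Template`; this is not Bedrock's own template syntax, and the template text and variable names are made up.

```python
from string import Template

# A fixed prompt layout with two user-supplied variables.
SUMMARY_TEMPLATE = Template(
    "Instructions: Summarize the text in $max_sentences sentences.\n"
    "Text: $text\n"
    "Output format: plain text"
)


def render_prompt(template, **variables):
    """Fill in the template; a missing variable raises KeyError."""
    return template.substitute(**variables)


print(render_prompt(SUMMARY_TEMPLATE, max_sentences=2, text="Bedrock is a managed service."))
```

Because `substitute` fails on a missing variable, an incomplete prompt is caught before it is sent to a model.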
What is Prompt Template Injection?
Similar to SQL injection. The attacker passes values into the prompt template variables that override the purpose of the template, so the model does the harm the attacker intended.
How do we protect against prompt injections?
Add explicit instructions to ignore any unrelated or potentially malicious content.
What can Amazon Q business do?
- Fully managed Gen-AI assistant for your employees
- Based on your company’s knowledge and data
- Answer questions, provide summaries, generate content, automate tasks
- Perform routine actions (e.g., submit time-off requests, send meeting invites)
What does Amazon Q Business have behind the scenes?
- Built on Amazon Bedrock (but you can’t choose the underlying FM). It uses multiple FMs.
- 40+ data connectors (Fully managed RAGs) - including AWS services and external data sources.
- plugins - to interact with 3rd parties, e.g. create a ticket in JIRA
- we can create custom plugins too using APIs
How does Amazon Q Business interact with users?
It has a web application interface
How is Amazon Q Business secured?
Users are authenticated and authorized using IAM Identity Center, which can also be integrated with third-party IdPs
What are Amazon Q Business Admin Controls?
- Control and customize responses to your organizational needs
- Admin controls == Guardrails
- Block specific words or topics
- Respond only with internal information (vs using external knowledge)
- Global controls & topic-level controls (more granular rules)
What are Amazon Q apps?
as part of the Amazon Q business:
* Create Gen AI-powered apps without coding by using natural language
* Leverages your company’s internal data
* Possibility to leverage plugins (Jira, etc…)
What does Amazon Q Developer do?
- Answer questions about the AWS documentation and AWS service selection
- Answer questions about resources in your AWS account
- Suggest CLI (Command Line Interface) commands to run to make changes to your account
- Helps you do bill analysis, resolve errors, troubleshooting…
- AI code companion to help you code new applications (similar to GitHub Copilot)
- Supports many languages: Java, JavaScript, Python, TypeScript, C#…
- Real-time code suggestions and security scans
- Software agent to implement features, generate documentation, bootstrapping new projects
* IDE extension
Can Amazon Q access Cross-region data?
Yes, if enabled
Can Amazon Q modify resources in our AWS account?
No. It can generate CLI commands for us, and we then run the commands ourselves in CloudShell.
List a few AWS services that are integrated with Amazon Q?
QuickSight, EC2, AWS Chatbot, Glue
What is PartyRock?
An AWS GenAI app-building playground (powered by Amazon Bedrock)
* Allows you to experiment creating GenAI apps with various FMs (no coding or AWS account required)
* UI is similar to Amazon Q Apps (with less setup and no AWS account