AIF-C01 Flashcards

1
Q

You ONLY want to manage Applications and Data. Which type of Cloud Computing model should you use?
* On-premises
* Infrastructure as a Service (IaaS)
* Software as a Service (SaaS)
* Platform as a Service (PaaS)

A

Platform as a Service model

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Is Ec2 a PaaS or IaaS?

A

IaaS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Give an example of a PaaS?

A

AWS Beanstalk

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the pricing model of Cloud Computing?
* Discounts over time
* Pay-as-you-go pricing
* Pay once a year
* Flat-rate pricing

A

Pay as you go

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Which Global Infrastructure identity is composed of one or more discrete data centers with redundant power, networking, and connectivity, and are used to deploy infrastructure?
* Edge Locations
* Availability Zones
* Regions

A

Availability Zones

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Which of the following is NOT one of the Five Characteristics of Cloud Computing?
* Rapid elasticity and scalability
* Multi-tenancy and resource pooling
* Dedicated Support Agent to help you deploy applications
* On-demand self service

A

Dedicated Support Agent to help you deploy applications

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Which are the 3 pricing fundamentals of the AWS Cloud?
* Compute, Storage, and Data transfer in the AWS Cloud
* Compute, Networking, and Data transfer out of the AWS Cloud
* Compute, Storage, and Data transfer out of the AWS Cloud
* Storage, Functions, and Data transfer in the AWS Cloud

A

Compute, Storage, and data transfer out of the AWS Cloud are the 3 pricing fundamentals of the AWS Cloud.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Which of the following options is NOT a point of consideration when choosing an AWS Region?
* Compliance with data governance
* Latency
* Capacity availability
* Pricing

A

Capacity is unlimited in the cloud, you do not need to worry about it. The 4 points of considerations when choosing an AWS Region are: compliance with data governance and legal requirements, proximity to customers, available services and features within a Region, and pricing.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Which of the following is NOT an advantage of Cloud Computing?
* Trade capital expense (CAPEX) for operational expense (OPEX)
* Train your employees less
* Go global in minutes
* Stop spending money running and maintaining data centers

A

You must train your employees more so they can use the cloud effectively.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

AWS Regions are composed of?
* Two or more Edge Locations
* One or more discrete data centers
* Three or more Availability Zones

A

AWS Regions consist of multiple, isolated, and physically separate Availability Zones within a geographic area.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Which of the following services has a global scope?
* EC2
* IAM
* Lambda
* Rekognition

A

IAM is a global service (encompasses all regions).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Which of the following is the definition of Cloud Computing?
* Rapidly develop, test and launch software applications
* Automatic and quick ability to acquire resources as you need them and release resources when you no longer need them
* On-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user
* Change resource types when needed

A

On-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What defines the distribution of responsibilities for security in the AWS Cloud?
* AWS Pricing Fundamentals
* The Shared Responsibility Model
* AWS Acceptable Use Policy
* The AWS Management Console

A

The Shared Responsibility Model defines who is responsible for what in the AWS Cloud.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

A company would like to benefit from the advantages of the Public Cloud but would like to keep sensitive assets in its own infrastructure. Which deployment model should the company use?
* Private Cloud
* Public Cloud
* Hybrid Cloud

A

Using a Hybrid Cloud deployment model allows you to benefit from the flexibility, scalability and on-demand storage access while keeping security and performance of your own infrastructure.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is NOT authorized to do on AWS according to the AWS Acceptable Use Policy?
* Building a gaming application
* Deploying a website
* Run analytics on stolen content
* Backup your data

A
  • Run analytics on stolen content
    You can run analytics on AWS, but you cannot run analytics on fraudulent content. Refer to the AWS Acceptable Use Policy to see what is not authorized to do on AWS.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

GenAI is a subset of X which is a subset of Y which is a subset of Z.

A

X - Deep Learning
Y - Machine Learning
Z - Artificial Intelligince

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What does GenAI generate?

A

new data/content that is similar to the data that it was trained on like Text, images, Audio, Code, Video and etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the cost of generating a foundation model and why? who can do this?

A

Tens of Millions of dollars for training, so it only can be done by large companies who can afford it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Which famous GenAI models are open source?

A

Meta and Google BERT

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Which famous GenAI models are commercial and not open source?

A

OpenAI and Anthropic

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What does LLM stand for?

A

Large Language Models

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is LLM?

A

Type of GenAI that generates coherent human-like text.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is the most famous LLM?

A

ChatGPT

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is a prompt?

A

GenAI Model’s input from a user. The question that the user asks from GenAI model.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What does it mean that a prompt is non-deterministic?

A

2 users with the same prompt from the same GenAI model, may get different answers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Name a famous image generative AI method

A

Diffusion Models - e.g. Stable Diffusion

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

How is a diffusion model trained and generates?

A

trained by Forward diffusion process (by adding noise to a picture in multiple steps) and generates by reverse diffusion

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What is AWS bedrock?

A

A fully managed AWS service to build Gen AI applications

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Is my training data secure in Bedrock?

A

yes. it’s all within the same account and not leaving it. any Foundation Model used is a copy of the original model, trained by customer data and stored locally.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

What are the elements of Bedrock service?

A
  • Foundation Models
  • Interactive Playground for users
  • Knowledge Bases (RAG): to fetch data from external data sources to generate more relevant and accurate responses.
  • Fine-Tuning: Update the model with your data
  • Unified APIs across all the models used by GenAI applications.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Does enabling a model cost?

A

No, we only pay for using a model.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is Amazon Titan?

A
  • High-performing Foundation Models from AWS
  • Image, text, multimodal model choices via a fully-managed APIs
  • Can be customized with your own data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What is Amazon Titan Text Express model good for?

A
  • High-performance
    text model, +100
    languages
  • Content creation,
    classification,
    education…
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

What is Claude model from Anthropic (an AI leading company) good for?

A
  • High-capacity text
    generation, multilanguage
  • Analysis, forecasting,
    document
    comparison…
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

What is (Llama-2 70b-chat) model good for?

A
  • Large-scale tasks,
    dialogue, English
  • Text generation,
    customer service…
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

What is Stable Diffusion model from Stability.ai good for?

A

Image creation for
advertising, media…

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

How in aws console we can play with models and choose the right one?

A

Playground in console has compare models feature, that you can run a command across multi models and compare their result.
Compare, by the result content, pricing, response time, etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What is a Custom Model in Bedrock?

A

By choosing a base model, and tuning based on our own data, we can create a custom model.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

What are the options to customize a model in Bedrock?

A
  • Fine Tune (one off)
  • Continued Pre-training (ongoing)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

Which models can be fine tuned?

A

not all, usually open-sources, e.g. Amazon, Cohere and Meta

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

For Fine tuning a model, how the data is provided?

A
  • from S3
  • must be dataset format
  • use Sagemaker Ground Truth to create and label training datasets
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

What are Hyperparameters?

A

variables for the machine learning algorithms

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

What is Epochs Hyperparameter?

A

The total number of iterations of all the training data in one cycle for training the model.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

What is Batch Size Hyperparameter?

A

The number of samples proceeded before model parameters are updated.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

What is learning rate hyperparameter?

A

The rate at which model parameters are updated after each batch of training data. basically, how fast the model learns.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

What is Learning Rate warmup steps hyperparameter?

A

Number of iterations over which learning rate is gradually increased to the initial rate specified.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

Where is model fine tuning result saved?

A

S3

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

What is a model fine tuning result?

A

A fine tuned model trained with our data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

What is the pricing limitation of working with fine tuned models?

A

must provision throughput

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

Instructions based fine tuning uses…?

A

Labeled examples that are [prompt-response] based.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

How is continued pre-training?

A
  • Provide Unlabeled data. e.g. any unstructured knowledge! it has only “input” field.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

Continued Pre-training is also called ….

A

Domain Adoption Fine tuning. make the model expert in a domain.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

what are 2 types of instruction based fine tuning?

A
  • Single Turn Messaging
  • Multi Turn Messaging => e.g. chatbot with multi turn conversation
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

What is Multi Turn Messaging schema?

A

{
“system”: “context”,
[{
“role”: “user/assistant”,
“content”: “message”
}]
}

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

Re-training vs instruction based fine tuning? (5 items)

A
  • Re-training an FM requires a higher budget
  • Instruction-based fine-tuning is usually cheaper as computations are
    less intense and the amount of data required usually less
  • It also requires experienced ML engineers to perform the task
  • You must prepare the data, do the fine-tuning, evaluate the model
  • Running a fine-tuned model is also more expensive (provisioned
    throughput)
56
Q

What is Transfer Learning

A

Transfer Learning – the broader concept of reusing a pre-trained model to adapt it to a new related task
* Widely used for image classification
* And for NLP (models like BERT and GPT)
* Can appear in the exam as a general ML concept
* Fine-tuning is a specific kind of transfer learning

57
Q

Say 4 use cases for file-tuning

A
  • A chatbot designed with a particular persona or tone, or geared
    towards a specific purpose (e.g., assisting customers, crafting
    advertisements)
  • Training using more up-to-date information than what the language
    model previously accessed
  • Training with exclusive data (e.g., your historical emails or messages,
    records from customer service interactions)
  • Targeted use cases (categorization, assessing accuracy)
58
Q

How is Amazon Bedrock Evaluating a Model

A

Using automatic evaluation.
- it evaluate a model for quality control
- Built-in task types:
* Text summarization
* question and answer
* text classification
* open-ended text generation…
- Bring your own prompt dataset or use built-in curated prompt datasets (Benchmark datasets - questions and answers)
- Scores are calculated automatically
Model scores are calculated using various statistical methods (e.g. BERTScore, F1…)

59
Q

What is a Judge model

A

In bedrock, a model that evaluates the evaluating model’s answers vs benchmark answers, and gives a grading score.

60
Q

What is a Bias score?

A

Some benchmark datasets in Bedrock can quickly identify bias like discrimination against a specific group. they generate a Bias score.

61
Q

What is the diff between human and automatic model evaluation?

A

in Human approach, instead of the Judge model, groups of humans evaluate by giving thumbs up or down or score like 1 to 5, and generating a grade.
Also, in human evaluation, we can have custom tasks types that only humans can evaluate accurately.

62
Q

What are automated metrics to evaluate a FM?

A
  • ROUGE
  • BLEU
  • BERTScore
  • Perplexity
63
Q

What is ROUGE? and what are ROUGE-N and ROUGE-L?

A

Recall Oriented Understudying for Gisting Evaluation => it’s a FM auto evaluation metric

Evaluating automatic summarization and machine translation systems
* ROUGE-N – measure the number of matching n-grams between reference and generated text.
(1gram is a word - it means how many words the model’s answer is from the benchmark answer)
* ROUGE-L – longest common subsequence between reference and generated text.
i.e. the longest sequence of words (not necessarily consecutive, but still in order) that is shared between both.

64
Q

what is Gisting?

A

the word’s meaning= engage in chat or gossip.
in ML, it means using machine translation (MT) to quickly understand the general meaning or essence of foreign text, without requiring a perfect translation.

Focus on meaning, not precision:
Gisting prioritizes understanding the core message over achieving a grammatically perfect translation.

65
Q

What is BLEU?

A

Bilingual Evaluation Understudy => it’s a FM auto evaluation metric
* Evaluate the quality of generated text, especially for translations
* Considers both precision and penalizes too much brevity
* Looks at a combination of n-grams (1, 2, 3, 4)

66
Q

What is BERTScore?

A

Bidirectional Encoder Representations from Transformers => it’s a FM auto evaluation metric.
Semantic similarity between generated text
* Uses pre-trained BERT models to compare the contextualized embeddings of both texts and computes the cosine similarity between them.
* Capable of capturing more nuance between the texts

67
Q

What is Perplexity?

A

it’s a FM auto evaluation metric.
how well the model predicts the next token (lower is better)

68
Q

What are Business Metrics to Evaluate a Model On?

A
  • User Satisfaction – gather users’ feedbacks and assess their satisfaction with the model responses (e.g., user satisfaction for an ecommerce platform)
  • Average Revenue Per User (ARPU) – average revenue per user attributed to
    the Gen-AI app (e.g., monitor ecommerce user base revenue)
  • Cross-Domain Performance – measure the model’s ability to perform cross
    different domains tasks (e.g., monitor multi-domain ecommerce platform)
  • Conversion Rate – generate recommended desired outcomes such as purchases
    (e.g., optimizing ecommerce platform for higher conversion rate)
  • Efficiency – evaluate the model’s efficiency in computation, resource utilization…
    (e.g., improve production line efficiency)
69
Q

What are models evaluation task types?

A
  • General Text Generation
  • Text summarization
  • Question and Answer
  • Text classification
70
Q

List a few model evaluation metrics?

A
  • Toxity: offensive and inappropriate content
  • Accuracy
  • Robustness
  • Relevance
  • Consistency
  • Completeness
71
Q

What does RAG stand for?

A

Retrieval - Augmented Generation

72
Q

What does RAG do?

A

Allows a FM to reference a data source outside of training data.
the bedrock using RAG builds a knowledge base, backed by a vector database.

73
Q

What is creating vector embedding?

A

Bedrocks takes care of building the knowledge base based on customer data source.

74
Q

What is an augmented prompt?

A

When user sends a “Query” to Bedrock Prompt, it sends a “Search” to the knowledge base and Retrieves the “Retrieval Text”. then it sends the “Query”+”Retrieval Text” called the “Augmented Prompt” to the Foundation Model to generate the final response.
The response had reference to the actual data source chunks.

75
Q

What is RAG useful for?

A

Where realtime data is needed to be fed into the foundation model.

76
Q

List the Vector databases that Bedrock can use.

A

AWS Aurora, AWS OpenSearch, Redis, MangoDB, Pinecone

77
Q

What is a Vector database?

A

A vector database is a specialized database designed to store and manage data as high-dimensional vectors, enabling efficient similarity searches and retrieval of data based on semantic meaning, rather than structured data organization.

Here’s a more detailed explanation:

Data Representation:
Instead of storing data in rows and columns like traditional databases, vector databases store data as mathematical vectors, which are numerical representations of data features.

Similarity Search:
The primary purpose of vector databases is to perform similarity searches, finding data points that are “close” to a given query vector based on their vector representations.

Applications:
Vector databases are used in various applications, including:
* Recommender Systems: Suggesting similar items or content to users.
* Semantic Search: Finding documents or data that are semantically similar to a query.
* Image and Audio Recognition: Matching images or audio clips based on their vector representations.
* Anomaly Detection: Identifying unusual patterns or outliers in data.

78
Q

How a custom data source is converted to vector database.

A

Using the Embeddings Models like Amazon Titan and Cohere. The Embeddings models doesn’t need to be the same as the Foundation Model.

The documents, for example from S3, is sectioned into “Document Chunks”, then the embedding models convert them to vectors and place these vectors in the vector database.

79
Q

RAG Vector database types?

A
  • Amazon OpenSearch Service – search & analytics database real time similarity queries, store millions of vector embeddings scalable index management, and fast nearest-neighbor (kNN) high performance search capability
  • Amazon DocumentDB [with MongoDB compatibility] – NoSQL database
    real time similarity queries, store millions of vector embeddings
  • Amazon Aurora – relational database, proprietary on AWS
  • Amazon RDS for PostgreSQL – relational database, open-source
  • Amazon Neptune – graph database
80
Q

List Amazon Bedrock Data sources

A
  • Amazon S3
  • Confluence
  • Microsoft SharePoint
  • Salesforce
  • Web pages (your website, your social
    media feed, etc…)
  • More added over time…
81
Q

What is Tokenization?

A

Converting raw text into a sequence of tokens
* Word-based tokenization: text is split into individual words
* Subword tokenization: some words can be split too (helpful for long words…)

82
Q

In which website, we can experience tokenization?

A

https://platform.openai.com/tokenizer

83
Q

What is Context Window?

A

The number of tokens an LLM can
process at once, and is primarily about the prompt and the information it contains. it’s a race now between models to have the greatest context window.

84
Q

What are pros and cons of a larger context window?

A

Pros: more information and
coherence
Cons: more memory and processing
power

85
Q

What is the first factor is in choosing a model?

A

its context window

86
Q

What is a vector?

A

array of numerical values out of text, images or audio. These are scores, a rating for each dimension such as semantic meaning, syntactic role and sentiment. can be positive or negative.

87
Q

What is embedding?

A

Creating vectors out of text, images and videos

88
Q

How are vectors created from a text?

A

text > tokens > each token gets a tokenID > the tokens are fed into an embedding model > each token is converted into a vector (an array of scores) > vectors are stored in a Vector db

89
Q

How is embedding model related to search engines?

A

it generates vectors with scores that are searchable using the nearest neighbor capability of search engines like open search.

90
Q

Describe a couple of methods to visualize a multi dimentional embedding?

A
  • dimentionality reduction: visualize in 2D
  • color visualization
91
Q

What are Amazon Bedrock guardrails?

A

Control the interaction between users and Foundation Models (FMs)
* Filter undesirable and harmful content => Blocked topics list
* Remove Personally Identifiable Information (PII)
* Enhanced privacy
* Reduce hallucinations
* Ability to create multiple Guardrails and monitor and analyze user inputs that can
violate the Guardrails

92
Q

What are the bedrock guardrails parameters?

A
  • the error message (could be different ones for prompt and response)
  • harmful categories filter (boolean) - e.g. hate, sexual, insults, violence and misconduct
  • prompt attacks filter (boolean): user inputs trying to override the system instructions.
  • content filters
  • custom word filters
  • denied topics
  • PII filters
  • contextual grounding check (reduce hallucination): verify if the response is meaningful based on the knowledge provided
  • Relevance check
93
Q

What is Amazon Bedrock guardrails pricing model?

A

It is priced based on the number of text units processed, with content filters and denied topics costing $0.15 per 1,000 text units.

94
Q

What are Amazon Bedrock Agents?

A
  • Manage and carry out various multi-step tasks related to infrastructure
    provisioning, application deployment, and operational activities
  • Task coordination: perform tasks in the correct order and ensure
    information is passed correctly between tasks
  • Agents are configured to perform specific pre-defined action groups
  • Integrate with other systems, services, databases and API to exchange data
    or initiate actions
  • Leverage RAG to retrieve information when necessary
95
Q

Given a task, how does Bedrock agent work behind the scene?

A
  • assigned a task
  • The agent sends the following to a Bedrock model:
    • Prompt
    • Instructions
    • Action groups and knowledge bases
    • Conversation history
    • Task
  • The Bedrock model runs “Chain of thought” => a list of steps
  • Each step can be calling an API from the actions groups, executing a lambda or searching a knowledge base.
  • The result is sent back to the agent
  • The agent sends the Task and The Result to another Bedrock model to generate the final refined response.
96
Q

What is Tracing feature of Bedrock agent?

A

gives us a list of steps generated by “Chain of thought”, so we can debug them.

97
Q

What is Bedrock Model Invokation logging?

A
  • All the calls to Bedrock models (including request and response, the model Id, number of token, the applied guadrails, the region, latency in ms, etc.) are logged and sent to Cloudwatch or S3 or both.
  • This can include the Text, the Images and the Embeddings.
  • Then we also can define Alerts in Cloudwatch based on logs analytics.
  • It should be enabled in Bedrock settings.
98
Q

Give 5 examples of metrics that Bedrock sends to Cloudwatch

A
  • For the guardrails: “ContentFilteredCount”
  • Invocations (the count)
  • InvocationLatency
  • OutputTokenCount
  • InputTokenCount
99
Q

What is Bedrock Pricing Model?

A

On-Demand:

  • Pay-as-you-go (no commitment)
  • Text Models – charged for every input/output token processed
  • Embedding Models – charged for every input token processed
  • Image Models – charged for every image generated
    IMPORTANT - * Works with Base Models only

Batch:

  • Multiple predictions at a time (output is a single file in Amazon S3)
  • Can provide discounts of up to 50%

Provisioned Throughput:

  • Purchase Model units for a certain time (1 month, 6 months…)
  • Throughput – max. number of input/output tokens processed per minute
    IMPORTANT - * Works with Base, Fine-tuned, and Custom Models
100
Q

List Model Improvement Techniques by their cost

A
  1. Prompt Engineering
    * No model training needed (no additional computation or fine-tuning)
  2. Retrieval Augmented Generation (RAG)
    * Uses external knowledge (FM doesn’t need to ”know everything”, less complex)
    * No FM changes (no additional computation or fine-tuning)
  3. Instruction-based Fine-tuning
    * FM is fine-tuned with specific instructions (requires additional computation)
  4. Domain Adaptation Fine-tuning
    * Model is trained on a domain-specific dataset (requires intensive computation)
101
Q

What are the Bedrock cost saving approaches?

A
  • On-Demand – great for unpredictable workloads, no long-term commitment
  • Batch – provides up to 50% discounts
  • Provisioned Throughput – (usually) not a cost-saving measure, great to “reserve”
    capacity
  • Temperature, Top K, Top P – They are Model configurations - no impact on pricing
  • Model size – usually a smaller model will be cheaper (varies based on providers)
  • Number of Input and Output Tokens – main driver of cost
102
Q

What type of generative AI can recognize and interpret various forms of input data, such as text, images, and audio?

A

Multimodel model

103
Q

You are developing a model and want to ensure the outputs are adapted to your users. Which method do you recommend?

A

Human evaluation

104
Q

What is Prompt Engineering?

A

developing, designing, and optimizing prompts to
enhance the output of FMs for your needs

105
Q

What are 4 elements that the Improved Prompting technique consists of?

A
  • Instructions – a task for the model to do (description, how the model should perform)
  • Context – external information to guide the model
  • Input data – the input for which you want a response
  • Output Indicator – the output type or format

NOTE - write the keyword in the prompt, for example: Context: xyz

106
Q

What is negative prompting?

A

A technique where you explicitly instruct the model on what not to include or do in its response:
* Negative Prompting helps to:
* Avoid Unwanted Content – explicitly states what not to include, reducing the chances of irrelevant or inappropriate content
* Maintain Focus – helps the model stay on topic and not stray into areas that are not useful or desired
* Enhance Clarity – prevents the use of complex terminology or detailed data, making the output clearer and more accessible

107
Q

What is the other name of Improved prompting?

A

Enhanced Promping

108
Q

what are the pros and cons of Enhanced prompting?

A

Pros: Accuracy
Cons: Cost

Cost Implications:
Because the enhanced prompt is longer and contains more tokens, it will cost more to process, even if the response itself is not significantly longer.

Trade-offs:
While enhanced prompting can lead to better quality and more relevant outputs, the increased cost needs to be weighed against the value of the improved results.

Optimization:
Prompt engineers need to balance the need for detailed prompts with the cost of token usage, optimizing for both quality and efficiency.

109
Q

What are the parameters in Prompt Performance Optimization?

A

The LLM parameters:
* System Prompts – how the model should behave and reply
* Temperature (0 to 1) = creativity
how likely it is to choose less probable words. 0 means the response is always the same which is the most probable answer.
* Low (ex: 0.2) – outputs are more conservative, repetitive, focused on most likely response
* High (ex: 1.0) – outputs are more diverse, creative, and unpredictable, maybe less coherent
* Top P (0 to 1) = tokens with the highest probability scores until the sum (NOT AVE) of the scores reaches the specified threshold value. (Top-p sampling is also called nucleus sampling.)
* Low P (ex: 0.25) – consider the 25% most likely words, will make a more coherent response
* High P (ex: 0.99) – consider a broad range of possible words, possibly more creative and diverse output
* Top K - tokens with the highest probabilities until the specified number of tokens is reached.
* Low K (ex: 10) – more coherent response, less probable words
* High K (ex: 500) – more probable words, more diverse and creative
* Length – maximum length of the answer
* Stop Sequences – tokens that signal the model to stop generating output

NOTE:
* the more number of token = more diversity
* the higher probability =

110
Q

What is the result of a low Top P, and High Temp?

A

The model will only choose from the most likely words (low TopP), but won’t go for the most most likely (High Temp).
It’s perfect for “creative” models, e.g., for writing fiction.

111
Q

What is the other name of Top P?

A

nucleus sampling

112
Q

What is the result of a High Top P, and Low Top K?

A

With a high Top-P (nucleus sampling) and low Top-K, the model will focus on a larger set of probable tokens (due to high Top-P) but only consider the lowest number of tokens (due to low Top-K) in the selection process, potentially leading to more predictable and less diverse.

113
Q

What is the best Temperature setting for generating creative text?

A

Higher Temperature values encourage the model to take more risks, producing more creative and diverse outputs.

114
Q

How does Top-p differ from Top-k?

A

Top-p sampling dynamically selects tokens based on cumulative probability, adapting the number of tokens considered. Top-k sampling fixes the number of tokens to the top k most probable, regardless of their cumulative probability.

115
Q

Can I use Temperature, Top-k, and Top-p together?

A

Yes, combining these parameters allows for finer control over the model’s output, but it’s essential to adjust them carefully to avoid unintended consequences.

116
Q

Why is my model generating repetitive text?

A

If the randomness is too low (low Temperature, low Top-k/Top-p), the model may loop over high-probability tokens. Increasing the randomness can help introduce more variety

117
Q

in LLM, what does coherence mean?

A

“coherence” refers to the logical flow, consistency, and clarity of the generated text, ensuring it makes sense as a whole and is easy for users to understand.

118
Q

which parameters impact a model latency?

A
  • The model size
  • The model type itself (Llama has a different performance than Claude)
  • The number of tokens in the input (the bigger the slower)
  • The number of tokens in the output (the bigger the slower)
119
Q

What is the impact of Top P, Top K, Temperature model parameters on its latency?

A

Latency is not impacted by Top P, Top K, Temperature

120
Q

What is Zero-Shot Prompting?

A
  • Present a task to the model
    without providing examples or
    explicit training for that specific task
  • You fully rely on the model’s
    general knowledge
  • The larger and more capable the
    FM, the more likely you’ll get good
    results
121
Q

What is a Few-Shots Prompting?

A
  • Provide examples of a task to
    the model to guide its output
  • We provide a “few shots” to
    the model to perform the task
  • If you provide one example
    only, this is also called
    “one-shot” or “single-shot”
122
Q

What is a Chain of Thought Prompting?

A
  • Divide the task into a sequence of
    reasoning steps, leading to more
    structure and coherence
  • Using a sentence like “Think step
    by step” helps
  • Helpful when solving a problem as
    a human usually requires several
    steps
  • Can be combined with Zero-Shot
    or Few-Shots Prompting
123
Q

What is Prompt template? how can it help?

A

Simplify and standardize the process of
generating Prompts. the prompt will have a defined format with variables that user provides, then response will be a defined format too.

It Helps with:
* Processes user input text and output prompts from foundation models (FMs)
* Orchestrates between the FM, action groups, and knowledge bases
* Formats and returns responses to the user

You can also provide examples with few-shots prompting to improve the model performance

Prompt templates can be used with Bedrock Agents

124
Q

What is Prompt Template Injections?

A

Similar to SQL injection. The hacker passes values to prompt template variables which override the purpose of the template to do the harm that hacker intended.

125
Q

How do we protect against prompt injections?

A

Add explicit instructions to ignore any unrelated or potential
malicious content.

126
Q

What can Amazon Q business do?

A
  • Fully managed Gen-AI assistant for your employees
  • Based on your company’s knowledge and data
    • Answer questions, provide summaries, generate content, automate tasks
    • Perform routine actions (e.g., submit time-off requests, send meeting invites)
127
Q

What does Amazon Q Business have behind the scene?

A
  • Built on Amazon Bedrock (but you can’t choose the underlying FM). it uses multiple FMs.
  • 40+ data connectors (Fully managed RAGs) - including AWS services and external data sources.
  • plugins - to interact with 3rd parties, e.g. create a ticket in JIRA
    • we can create custom plugins too using APIs
128
Q

How does Amazon Q business interact with user?

A

It has a web application interface

129
Q

how is Amazon Q business secure?

A

User is authenticated and authorised using IAM Identity centre which can also be integrated with 3rd party IDPs

130
Q

What are Amazon Q business Admin Controls?

A
  • Controls and customize responses to your organizational needs
  • Admin controls == Guardrails
  • Block specific words or topics
  • Respond only with internal information (vs using external knowledge)
  • Global controls & topic-level controls (more granular rules)
131
Q

What are Amazon Q apps?

A

as part of the Amazon Q business:
* Create Gen AI-powered apps without coding by using natural language
* Leverages your company’s internal data
* Possibility to leverage plugins (Jira, etc…)

132
Q

What does Amazon Q developer do?

A
  • Answer questions about the AWS
    documentation and AWS service selection
  • Answer questions about resources in your AWS
    account
  • Suggest CLI (Command Line Interface) to run
    to make changes to your account
  • Helps you do bill analysis, resolve errors,
    troubleshooting…
  • AI code companion to help you
    code new applications (similar to
    GitHub Copilot)
    • Supports many languages: Java,
      JavaScript, Python, TypeScript, C#…
    • Real-time code suggestions and
      security scans
    • Software agent to implement
      features, generate documentation,
      bootstrapping new projects
      *IDE extension
133
Q

Can Amazon Q access Cross-region data?

A

yes if enabled

133
Q

Can Amazon Q modify resources in our aws account?

A

No, it can generates CLI commands for us, then we should run the command ourselves in Cloud shell.

134
Q

List a few AWS services that are integrated with Amazon Q?

A

Quicksight, EC2, AWS Chatbot, Glue

135
Q

What is Party Rock?

A

A AWS GenAI app-building playground (powered by Amazon Bedrock)
* Allows you to experiment creating GenAI apps with various FMs (no coding
or AWS account required)
* UI is similar to Amazon Q Apps (with less setup and no AWS account