Amazon Bedrock & Generative AI Flashcards

Question 1

Q

What is generative AI?

Answer

A

Generative AI takes a set of data (training data), learns patterns from that data, and uses those patterns to generate new content. The data can be in the form of:

Text
Image
Audio
Code
Video

Question 2

Q

What is a foundation model?

Answer

A

A foundation model is a machine learning (ML) model that’s trained on large amounts of data and can be used for many tasks.

Question 3

Q

In LLMs, what is a non-deterministic response?

Answer

A

A non-deterministic response means that the same question can produce different answers each time it is asked, even if they vary only slightly.
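One common source of this variation is sampling with a temperature setting. The sketch below is a toy illustration, not how any particular model is implemented: the scores in `logits` and the function name `sample_next_token` are made up for the example.

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Pick one token from a {token: score} dict using temperature sampling."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token (deterministic).
        return max(logits, key=logits.get)
    # Softmax with temperature: a higher temperature flattens the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]

# Hypothetical scores for the word that follows "The sky is".
logits = {"blue": 2.0, "clear": 1.5, "grey": 1.0}
rng = random.Random()  # unseeded, so repeated runs can differ

greedy = [sample_next_token(logits, 0, rng) for _ in range(5)]
sampled = [sample_next_token(logits, 1.0, rng) for _ in range(5)]
print(greedy)   # the same token every time
print(sampled)  # may vary from run to run
```

With a temperature of 0 the answer is always the same; with a higher temperature the same input can give different outputs.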

Question 4

Q

What is a Bedrock Agent?

Answer

A

Agents can do the following:

Manage and carry out various multi-step tasks related to infrastructure provisioning, application deployment and operational activities
Task Coordination: Perform tasks in the correct order and ensure information is passed correctly between tasks
Agents are configured to perform specific pre-defined action groups
Integrate with other systems, services, databases and APIs to exchange data or initiate actions
Leverage RAG to retrieve information when necessary

So an agent understands what it can do and will extract information from OpenAPI-defined APIs, AWS Lambda functions or knowledge bases to satisfy the tasks asked of it. Think of it like a personal shopper, in that it responds to your query and interrogates the data sources that are most applicable.
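The task coordination idea can be sketched in plain Python. This is a toy illustration only, not the Bedrock Agents API: the action functions, `ACTION_GROUPS`, and `run_agent` are hypothetical, and the plan is hard-coded, whereas a real agent would let the foundation model decide which actions to run.

```python
# Hypothetical actions: each one is a plain function that receives the shared context.
def lookup_order(context):
    # Stand-in for an API or AWS Lambda call that fetches data.
    return {"order_id": context["order_id"], "status": "shipped"}

def draft_reply(context):
    # Relies on information produced by the previous step.
    return f"Order {context['order_id']} is {context['status']}."

ACTION_GROUPS = {"lookup_order": lookup_order, "draft_reply": draft_reply}

def run_agent(plan, context):
    """Run the planned actions in order, passing results between steps."""
    for action_name in plan:
        result = ACTION_GROUPS[action_name](context)
        if isinstance(result, dict):
            context = {**context, **result}          # merge new facts into the context
        else:
            context = {**context, "answer": result}  # a string is treated as the answer
    return context

final = run_agent(["lookup_order", "draft_reply"], {"order_id": "A123"})
print(final["answer"])
```

The second action only works because the first one ran before it and passed its result along, which is the ordering and hand-over described above.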

Question 5

Q

What are the functions of Bedrock Guardrails?

Answer

A

Guardrails, as the name suggests, are a protection mechanism that allows the following:

Control the interaction between users and foundation models
Filter undesirable and harmful content
Remove PII
Enhance Privacy
Reduce Hallucinations
Create and monitor multiple guardrails

So in Amazon Bedrock you can set up a guardrail made up of several types of filter, such as content filters (harmful categories and prompt attacks), sensitive information filters and word filters.
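To get a feel for what a word filter and a sensitive information filter do, here is a minimal sketch in plain Python. It does not call Bedrock; `DENIED_WORDS`, `EMAIL_PATTERN`, and `apply_guardrail` are hypothetical names, and the email pattern is deliberately simple.

```python
import re

DENIED_WORDS = {"password"}                              # hypothetical word filter
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # simple stand-in for PII detection

def apply_guardrail(text):
    """Return (allowed, text): block denied words, mask email addresses."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & DENIED_WORDS:
        return False, "Sorry, this request cannot be processed."
    return True, EMAIL_PATTERN.sub("[EMAIL]", text)

print(apply_guardrail("Contact me at jane@example.com"))
print(apply_guardrail("Tell me the admin password"))
```

The first call is allowed but has the email address masked; the second is blocked outright by the word filter.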

Question 6

Q

What is RAG?

Answer

A

RAG stands for Retrieval-Augmented Generation. Some prompts can only be answered from a niche data set. A typical example is "Whose role in accounts is it to submit VAT returns?" Answering it relies on niche data that is not commonly available.

To get this to work we supply external data (for example, from S3) to build a knowledge base. When the prompt is submitted, relevant information is retrieved from the knowledge base and used to augment the prompt, which is then submitted to the foundation model to generate a response.
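The retrieve-then-augment flow can be sketched as follows. This is a simplified illustration under stated assumptions: the knowledge base contents are invented, retrieval is a naive word-overlap ranking rather than a vector search, and `retrieve` and `build_augmented_prompt` are hypothetical helper names.

```python
# Hypothetical knowledge base: in practice this would be built from documents in S3.
KNOWLEDGE_BASE = [
    "The finance controller submits VAT returns each quarter.",
    "The payroll clerk processes salaries on the 25th.",
    "The office manager orders stationery.",
]

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = set(question.lower().replace("?", "").split())

    def overlap(doc):
        return len(question_words & set(doc.lower().rstrip(".").split()))

    return sorted(documents, key=overlap, reverse=True)[:top_k]

def build_augmented_prompt(question, documents):
    """Put the retrieved context in front of the user's question."""
    context = "\n".join(retrieve(question, documents))
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {question}"

prompt = build_augmented_prompt("Who submits VAT returns?", KNOWLEDGE_BASE)
print(prompt)  # this text would then be sent to the foundation model
```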

Question 7

Q

What role does the vector database play in RAG?

Answer

A

The vector database stores the knowledge base content as vectors and is responsible for supplying the most relevant information from it. The two main ones are Amazon Aurora and Amazon OpenSearch. Others available are MongoDB, Redis and Pinecone.

The rough workflow is that the data in S3 is chunked, each chunk is converted into a vector (an embedding) by a model such as Amazon Titan or Cohere, and the vectors are then stored in the vector database.
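The chunk, embed, store and search workflow can be sketched with a toy example. Everything here is an assumption for illustration: `embed` is a word-count stand-in for a real embedding model such as Titan or Cohere, `VOCAB` is invented, and the "vector database" is just a Python list.

```python
import math

VOCAB = ["vat", "returns", "payroll", "salaries", "invoice"]  # toy vocabulary

def chunk(text, size=6):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: count of each vocabulary word in the text."""
    tokens = [w.strip(".,?").lower() for w in text.split()]
    return [tokens.count(term) for term in VOCAB]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

document = ("The finance controller submits VAT returns. "
            "The payroll clerk pays salaries monthly.")
vector_store = [(embed(c), c) for c in chunk(document)]  # in-memory "vector database"

query_vector = embed("Who handles VAT returns?")
best_chunk = max(vector_store, key=lambda item: cosine(query_vector, item[0]))[1]
print(best_chunk)
```

The query is embedded the same way as the chunks, and the chunk whose vector is closest to the query vector is the one handed back to augment the prompt.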

Question 8

Q

What data sources can be used to seed a knowledge base?

Answer

A

S3
Confluence
SharePoint
Salesforce
Web Pages

Question 9

Q

What are the basic setup steps for a knowledge base in RAG?

Answer

A

There is a chat setup, which works like a playground that you use to get a feel for what is involved. In this playground you can supply a data source and select a model and a template. The template is the format for the answer coming out of the knowledge base that will be sent to the foundation model.

You can then interactively ask questions so that you can see how it all works.

Question 10

Q

Name the four cost areas in Bedrock.

Answer

A

Prompt Engineering: Cheap, as there is no model retraining or fine-tuning
RAG: Some cost to support the external knowledge base
Instruction-Based Fine-Tuning: The FM is fine-tuned with specific instructions
Domain Adaptation Fine-Tuning: Most expensive option, as the model is trained on a domain-specific dataset, which requires intensive computation

The best strategy for cost savings in Bedrock is to minimise input tokens (prompt engineering) and output tokens (response brevity).
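The effect of token counts on cost can be worked through with a small calculation. This sketch assumes billing per 1,000 input tokens and per 1,000 output tokens; the prices and token counts are hypothetical and are not real Bedrock prices.

```python
def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost of one request when billing is per 1,000 tokens."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k

# Hypothetical prices, for illustration only.
PRICE_IN, PRICE_OUT = 0.5, 1.5

verbose = estimate_cost(2000, 1000, PRICE_IN, PRICE_OUT)  # long prompt, long answer
concise = estimate_cost(500, 200, PRICE_IN, PRICE_OUT)    # short prompt, brief answer
print(verbose, concise)
```

Trimming both the prompt and the response lowers the cost of every request, so the saving scales with the number of invocations.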

Question 11

Q

What two functions of CloudWatch are available to Bedrock?

Answer

A

Logging: Sends logs of all invocations to Amazon CloudWatch or S3. The logs can include text, images and embeddings.
Metrics: Metrics such as ContentFilteredCount are available to monitor whether guardrails are working. Alarms can be built on top of metrics.
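As a rough sketch of building an alarm on top of a metric, the function below only assembles the keyword arguments that CloudWatch's `put_metric_alarm` call accepts; it makes no AWS call itself. The alarm name, the model ID, the threshold, and the use of a `ModelId` dimension are assumptions for illustration and should be checked against the Bedrock metrics documentation.

```python
def content_filter_alarm_params(model_id, threshold=1):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"bedrock-content-filtered-{model_id}",     # hypothetical name
        "Namespace": "AWS/Bedrock",
        "MetricName": "ContentFilteredCount",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],  # assumed dimension
        "Statistic": "Sum",
        "Period": 300,                 # length of each evaluation window, in seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }

params = content_filter_alarm_params("example-model-id")
# With AWS credentials configured, something like
# boto3.client("cloudwatch").put_metric_alarm(**params) would create the alarm.
print(params["MetricName"])
```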

Question 12

Q

Is clustering supervised or unsupervised learning?

Answer

A

Unsupervised
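A tiny k-means example shows why: the algorithm is given only the points, never any labels, and finds the groups itself. This is a minimal one-dimensional sketch with made-up data and hand-picked starting centroids.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Tiny k-means on 1-D data: no labels are supplied, only the points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```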

Question 13

Q

Does high bias lead to overfitting or underfitting?

Answer

A

Underfitting

Question 14

Q

Does high variance lead to overfitting or underfitting?

Answer

A

Overfitting

Question 15

Q

What is the bias versus variance trade-off?

Answer

A

The bias versus variance trade-off refers to the challenge of balancing the error due to incorrect or overly simple assumptions in the model (bias) against the error due to the model’s sensitivity to its training data, which grows with model complexity (variance). High bias can cause underfitting and high variance can cause overfitting.
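Both failure modes can be seen in a small experiment. This is a sketch under stated assumptions: the data follows a made-up relationship y = 2x plus noise, the high-bias model always predicts the mean, and the high-variance model simply memorises the training set.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Noisy samples of the hypothetical true relationship y = 2x."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 1)) for x in xs]

train, test = make_data(30), make_data(30)

def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# High bias: ignores x and always predicts the mean training target.
mean_y = sum(y for _, y in train) / len(train)

def constant_model(x):
    return mean_y

# High variance: memorises the training set (1-nearest-neighbour lookup).
def memorising_model(x):
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

for name, model in [("constant", constant_model), ("memorising", memorising_model)]:
    print(name, round(mse(model, train), 2), round(mse(model, test), 2))
```

The constant model has a large error on both sets (underfitting), while the memorising model has zero training error but a higher error on unseen data (overfitting).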

Question 16

Q

What is the AI terminology hierarchy?

Answer

A

Artificial Intelligence > Machine Learning > Deep Learning > Generative AI, where each term is a subset of the one before it.

Question 17

Q

What is the difference between image processing and computer vision?

Answer

A

Image processing focuses on enhancing and manipulating images for visual quality, whereas computer vision involves interpreting and understanding the content of images to make decisions.

Question 18

Q

What is the difference in feature engineering between structured and unstructured data?

Answer

A

Feature engineering for structured data typically includes tasks like normalization, handling missing values, and encoding categorical variables. For unstructured data, such as text or images, feature engineering involves different tasks like tokenization (breaking down text into tokens), vectorization (converting text or images into numerical vectors), and extracting features that can represent the content meaningfully.
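The tasks named above can be sketched with a few small functions. These are simplified illustrations with made-up inputs; the function names are hypothetical and real projects would normally use a library for this.

```python
def min_max_normalise(values):
    """Scale numbers to the range 0..1 (structured data). Assumes values are not all equal."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

def one_hot(values):
    """Encode categories as 0/1 columns, one column per category (structured data)."""
    categories = sorted(set(values))
    return [[int(v == c) for c in categories] for v in values]

def tokenise(text):
    """Break text into lowercase word tokens (unstructured data)."""
    return text.lower().split()

def vectorise(tokens, vocabulary):
    """Bag-of-words vector: how often each vocabulary word appears."""
    return [tokens.count(word) for word in vocabulary]

print(min_max_normalise([10, 20, 30]))
print(one_hot(["red", "blue", "red"]))
tokens = tokenise("The cat sat on the mat")
print(vectorise(tokens, ["the", "cat", "dog"]))
```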

Question 19

Q

In Bedrock, which deployment type allows me to use a customised model?

Answer

A

Provisioned Throughput

Question 20

Q

Under what learning model do FMs create labels?

Answer

A

Foundation models use self-supervised learning to create labels from input data. In self-supervised learning, models are provided with vast amounts of raw, completely unlabeled data and then generate the labels themselves. This means no one has instructed or trained the model with labeled training data sets.
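One common way labels come out of raw text is next-word prediction: the label for each stretch of text is simply the word that follows it. The sketch below is a toy illustration with a made-up sentence; `make_training_pairs` and `context_size` are hypothetical names.

```python
def make_training_pairs(text, context_size=3):
    """Turn raw text into (context, label) pairs: the label is simply the next word."""
    words = text.split()
    return [
        (words[i:i + context_size], words[i + context_size])
        for i in range(len(words) - context_size)
    ]

raw_text = "the cat sat on the mat"
for context, label in make_training_pairs(raw_text):
    print(context, "->", label)
```

No person wrote any of these labels; they were all derived from the unlabeled text itself.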

Question 21

Q