Domain 3 Flashcards
T or F: LLMs are cost-effective and easy to maintain
False. The duration and cost of training a model are important considerations because training can be expensive in terms of hardware, storage, and more.
Name five considerations of using foundation models
Latency constraints, inference speed, real-time requirements, architecture, and complexity.
T or F: Accuracy is recommended with datasets that are not evenly distributed or imbalanced.
False. Accuracy is not recommended for datasets that are unevenly distributed or imbalanced.
Name some metrics you can use to evaluate model performance
Such metrics include accuracy, precision, recall, F1 score, root mean squared error (RMSE), mean average precision (MAP), and mean absolute error (MAE).
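For illustration, here is a minimal sketch of several of these metrics computed with scikit-learn (the toy labels and predictions are made up for the example):

```python
# Minimal sketch of common evaluation metrics using scikit-learn.
# The tiny label/prediction arrays below are made-up toy data.
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    mean_squared_error, mean_absolute_error,
)

# Classification metrics on toy binary labels
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))

# Regression metrics on toy continuous values
y_true_r = [3.0, 5.0, 2.5, 7.0]
y_pred_r = [2.8, 5.4, 2.0, 6.5]
print("RMSE:", mean_squared_error(y_true_r, y_pred_r) ** 0.5)
print("MAE :", mean_absolute_error(y_true_r, y_pred_r))
```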
One consideration is biases that might be present in the training data. It's important to understand how to mitigate risks, address ethical concerns, and make informed decisions about model selection and fine-tuning.
Another consideration is the availability and compatibility of the pre-trained model
You should check whether the model is compatible with your framework, language, and environment, and confirm that it has a license and documentation. You should also check whether the model has been updated and maintained regularly and whether it has any known issues or limitations.
interpretability
The ability to interpret and explain model outcomes is important. Transparency refers to interpretability: being able to explain mathematically, through coefficients and formulas, why a model makes a certain prediction. This interpretability is possible if the model is simple enough, but foundation models are not interpretable by design because they are extremely complex. If interpretability is a requirement, then pre-trained foundation models might not be the best choice.
How is explainability different from interpretability?
Explainability attempts to explain the black box by approximating it locally with a simpler model that is interpretable.
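As a concrete illustration, here is a minimal LIME-style sketch: it treats a complex model as a black box, perturbs the input around one point, and fits a weighted linear surrogate whose coefficients can be read as a local explanation. The random-forest "black box" and toy data are assumptions for the example:

```python
# LIME-style local surrogate: approximate a black-box model near one
# input with an interpretable (linear) model. Toy data for illustration.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

black_box = RandomForestRegressor(n_estimators=100).fit(X, y)  # not interpretable

x0 = X[0]                                        # instance to explain
Z = x0 + rng.normal(scale=0.3, size=(200, 3))    # perturb around x0
preds = black_box.predict(Z)                     # query the black box
weights = np.exp(-np.sum((Z - x0) ** 2, axis=1)) # closer samples matter more

surrogate = LinearRegression().fit(Z, preds, sample_weight=weights)
print("local feature attributions:", surrogate.coef_)
```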
Greater complexity might lead to enhanced performance…
…but it can increase costs. The more complicated the model is, the harder it is to explain its outputs. There are more considerations too, such as hardware constraints, maintenance updates, data privacy, transfer learning, and more.
What is inference?
Inference is where you process new data through the model to make predictions. It is the process of generating an output from an input that you provide to the model.
Amazon Bedrock foundation models support what inference parameters?
Temperature, Top-k, and Top-p control randomness and diversity in the response. Amazon Bedrock also supports parameters such as response length, penalties, and stop sequences to limit the length of the response.
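A minimal sketch of passing these parameters with boto3, assuming an Anthropic Claude model is enabled in the account (the model ID and parameter values are illustrative):

```python
# Sketch: passing inference parameters to Amazon Bedrock via boto3.
# Assumes AWS credentials are configured and the model is enabled.
import boto3, json

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 200,                 # response length limit
    "temperature": 0.5,                # randomness
    "top_k": 250,                      # cut-off on candidate tokens
    "top_p": 0.9,                      # probability-mass cut-off
    "stop_sequences": ["\n\nHuman:"],  # stop generation early
    "messages": [{"role": "user", "content": "Summarize what RAG is."}],
}

response = client.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # illustrative model ID
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["content"][0]["text"])
```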
What is a prerequisite to creating a vector database?
A vector database is filled with dense vectors by processing input data, generally text, with an ML model, generally an embedding model. So a machine learning model is a prerequisite for creating a vector database, along with the indexing technology itself. Vector databases are the factual reference of foundation model-based applications, helping the model retrieve trustworthy data. Foundation models use vector databases as an external data source to improve their capabilities in search, recommendation, and text-generation use cases. Vector databases add capabilities for efficient and fast lookup, and provide data management, fault tolerance, authentication and access control, and a query engine.
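As a sketch of that pipeline, the snippet below assumes Amazon Titan Embeddings is enabled in the account, embeds a few documents, and stands in a brute-force cosine-similarity search for the database's index:

```python
# Sketch: an embedding model is the prerequisite step that fills a
# vector database. Bedrock's Titan embedding model (assumed enabled)
# vectorizes text; brute-force cosine search stands in for the index.
import boto3, json
import numpy as np

client = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> np.ndarray:
    resp = client.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(resp["body"].read())["embedding"])

docs = ["Bedrock is a managed service.", "pgvector stores embeddings."]
index = np.stack([embed(d) for d in docs])        # the "vector database"

q = embed("Which service is managed?")
scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
print(docs[int(scores.argmax())])                 # best-matching document
```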
Knowledge Bases for Amazon Bedrock
give you the ability to collect data sources into a repository of information. This way, you can build an application that takes advantage of retrieval-augmented generation (RAG).
What two components does RAG have?
RAG combines two components: a retriever component, which searches through a knowledge base, and a generator component, which produces outputs based on the retrieved information. This combination helps the model access up-to-date and domain-specific knowledge beyond its training data.
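With Knowledge Bases for Amazon Bedrock, both components can be exercised in a single call. A minimal sketch, assuming a knowledge base already exists (the IDs and ARN are placeholders):

```python
# Sketch: RAG in one call with Knowledge Bases for Amazon Bedrock.
# The retriever searches the knowledge base; the generator (the model
# named by modelArn) produces the answer. IDs/ARN are placeholders.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)
print(response["output"]["text"])
```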
AWS services that help store embeddings within vector databases.
Examples include Amazon OpenSearch Service, Amazon Aurora, Redis, Amazon Neptune, Amazon DocumentDB with MongoDB compatibility, and Amazon RDS for PostgreSQL.
Amazon RDS for PostgreSQL also supports the pgvector extension
to store embeddings and perform efficient searches.
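A short sketch of what that looks like from Python, assuming the pgvector extension is available on the instance (connection details are placeholders):

```python
# Sketch: storing and searching embeddings with pgvector on
# RDS for PostgreSQL. Connection details are placeholders.
import psycopg2

conn = psycopg2.connect(host="my-rds-host", dbname="mydb",
                        user="myuser", password="...")  # placeholders
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS items (
        id bigserial PRIMARY KEY,
        content text,
        embedding vector(3)  -- toy dimension; real models use e.g. 1536
    );
""")
cur.execute("INSERT INTO items (content, embedding) VALUES (%s, %s::vector)",
            ("hello world", "[0.1, 0.2, 0.3]"))

# Nearest-neighbor search; <-> is pgvector's L2 distance operator
cur.execute("SELECT content FROM items ORDER BY embedding <-> %s::vector LIMIT 5",
            ("[0.1, 0.2, 0.25]",))
print(cur.fetchall())
conn.commit()
```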
Agents for Amazon Bedrock
is a fully managed AI capability from AWS that helps you build applications using foundation models. Agents can automatically break down tasks and generate the required orchestration logic or write custom code. Agents can also securely connect to your databases through APIs, ingest and structure data for machine consumption, and augment it with contextual details to produce more accurate responses and fulfill requests.
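A minimal sketch of invoking an existing agent with boto3 (the agent and alias IDs are placeholders):

```python
# Sketch: calling an agent with boto3. The agent and alias IDs are
# placeholders; the response streams back in chunks.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.invoke_agent(
    agentId="AGENT1234",        # placeholder
    agentAliasId="ALIAS1234",   # placeholder
    sessionId="session-1",
    inputText="Book a hotel room in Seattle for Friday.",
)
for event in response["completion"]:  # event stream of response chunks
    if "chunk" in event:
        print(event["chunk"]["bytes"].decode(), end="")
```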
Temperature:
Adjusts the randomness of the model’s response. A lower temperature results in more focused responses, while a higher temperature leads to more diverse outputs.
Top-k
Defines the cut-off for the number of words (tokens) the model chooses from for each completion, ordered by their probabilities. A lower Top-k value reduces the chance of an unusual word being selected.
Top-p
Top-p works similarly to Top-k: it is the percentage of most-likely candidates that the model considers for the next token.
Choose a lower value to decrease the size of the pool and limit the options to more likely outputs.
Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.
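To see how the three parameters interact, here is a minimal NumPy sketch of a single sampling step over made-up token probabilities:

```python
# Sketch: how temperature, top-k, and top-p shape one sampling step.
# The logits and vocabulary are made-up toy values.
import numpy as np

vocab = np.array(["the", "a", "cat", "dog", "xylophone"])
logits = np.array([2.0, 1.5, 1.0, 0.8, -1.0])

def sample(logits, temperature=1.0, top_k=5, top_p=1.0,
           rng=np.random.default_rng(0)):
    scaled = logits / temperature              # temperature: flatten/sharpen
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]            # tokens by descending probability

    keep = np.zeros_like(probs, dtype=bool)
    keep[order[:top_k]] = True                 # top-k: keep k most likely

    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1   # top-p: smallest set with mass >= p
    nucleus = np.zeros_like(probs, dtype=bool)
    nucleus[order[:cutoff]] = True

    probs = np.where(keep & nucleus, probs, 0.0)
    probs /= probs.sum()
    return vocab[rng.choice(len(vocab), p=probs)]

print(sample(logits, temperature=0.7, top_k=3, top_p=0.9))
```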
BERTScore is a metric developed to assess the quality of generated responses compared to a set of reference responses. It uses pre-trained models to calculate semantic similarity between the generated responses and reference answers: tokens in the candidate and reference are embedded with a pre-trained model such as BERT, matched by cosine similarity, and the matches are aggregated into precision, recall, and F1 scores.
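A sketch using the open-source bert-score package (assuming it is installed, e.g. via pip):

```python
# Sketch: scoring generated text against references with the
# open-source `bert-score` package (pip install bert-score).
from bert_score import score

candidates = ["The cat sat on the mat."]
references = ["A cat was sitting on the mat."]

P, R, F1 = score(candidates, references, lang="en")
print(f"precision={P.mean().item():.3f} "
      f"recall={R.mean().item():.3f} F1={F1.mean().item():.3f}")
```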
ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, is a set of metrics and a software package used for evaluating automatic summarization and machine translation in natural language processing. The metrics compare an automatically produced summary or translation against a reference (or a set of references), typically human-produced summaries or translations.
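A sketch using Google's open-source rouge-score package (assuming it is installed via pip):

```python
# Sketch: computing ROUGE with Google's `rouge-score` package
# (pip install rouge-score).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(
    "The quick brown fox jumps over the lazy dog.",  # reference
    "A quick brown fox jumped over a lazy dog.",     # generated summary
)
print(scores["rouge1"].fmeasure, scores["rougeL"].fmeasure)
```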
What is few-shot prompting?
It is when you provide a few examples to help LLMs better perform and calibrate their output to meet your expectations.
What is zero-shot prompting?
It is when no examples are provided in the prompt, for instance, a sentiment classification prompt that the model must answer without seeing any examples.
You can also use a prompt template.
Templates might include instructions, few-shot examples, and specific content and questions for different use cases.
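For example, a hypothetical template combining an instruction, few-shot examples, and a per-request question:

```python
# Sketch: a prompt template combining instructions, few-shot examples,
# and a per-request input. The examples are made up.
TEMPLATE = """Classify the sentiment of the review as Positive or Negative.

Review: "I loved this product!"        Sentiment: Positive
Review: "It broke after one day."      Sentiment: Negative

Review: "{review}"                     Sentiment:"""

prompt = TEMPLATE.format(review="Shipping was fast and the quality is great.")
print(prompt)
```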