Essentials of Prompt Engineering Flashcards
Elements of a prompt Instructions:
This is a task for the large language model to do. It provides a task description or instruction for how the model should perform.
Elements of a prompt Context:
This is external information to guide the model.
Elements of a prompt - Input data:
This is the input for which you want a response.
Elements of a prompt Output indicator
This is the output type or format.
Negative prompting
is used to guide the model away from producing certain types of content or exhibiting specific behaviors. It involves providing the model with examples or instructions about what it should not generate or do.
Good prompt broken down
Instructions: Given a list of customer orders and available inventory, determine which orders can be fulfilled and which items have to be restocked.
*
Context: This task is essential for inventory management and order fulfillment processes in ecommerce or retail businesses.
*
Input data:
Orders:
Order 1: Product A (5 units), Product B (3 units)
Order 2: Product C (2 units), Product B (2 units)
Inventory:
Product A: 8 units
Product B: 4 units
Product C: 1 unit
*
Output indicator: Fulfillment status:
Randomness and diversity
This is the most common category of inference parameter. Randomness and diversity parameters influence the variation in generated responses by limiting the outputs to more likely outcomes or by changing the shape of the probability distribution of outputs.
Three most common Randomness and Diversity parameters
Temperature, Top P, Top K
When interacting with FMs, you can often configure these to limit or influence the model response.
inference parameters
Two most common categories of inference parameters
Randomness and diversity
length
Temperature
This parameter controls the randomness or creativity of the model’s output. A higher temperature makes the output more diverse and unpredictable, and a lower temperature makes it more focused and predictable. Temperature is set between 0 and 1
Top p
is a setting that controls the diversity of the text by limiting the number of words that the model can choose from based on their probabilities. Top p is also set on a scale from 0 to 1.
Top K
Top k limits the number of words to the top k most probable words, regardless of their percent probabilities.
Low top K setting
With a low setting, like 10, the model will only consider the 10 most probable words for the next word in the sequence. This can help the output be more focused and coherent
High top K setting
This can lead to more diverse and creative output, because the model has a larger pool of potential words to choose from
In general low temperature, top p and top k result in
Less creative, more coherent, and repetitive responses and vice versa
The length inference parameter category refers to the settings that
control the maximum length of the generated output and specify the stop sequences that signal the end of the generation process.
The maximum length setting determines the
maximum number of tokens that the model can generate during the inference process. This parameter helps to prevent the model from generating excessive or infinite output, which could lead to resource exhaustion or undesirable behavior.
Stop sequences are
special tokens or sequences of tokens that signal the model to stop generating further output. When the model encounters a stop sequence during the inference process, it will terminate the generation regardless of the maximum length setting.
Stop sequences can be predefined or dynamically generated based on the input or the generated output itself. In some cases, multiple stop sequences can be specified, allowing the model to stop generation upon encountering any of the defined sequences.
Stop sequences are particularly useful in tasks where
the desired output length is variable or difficult to predict in advance. For example, in conversational artificial intelligence (AI) systems, the stop sequence could be an end-of-conversation token or a specific phrase that indicates the end of the response.
Zero-shot prompting is a technique where
a user presents a task to a generative model without providing any examples or explicit training for that specific task.
Improper settings in maximum length and stop sequences can lead to
incomplete outputs, or conversely, to excessive and potentially nonsensical generations.
Few-shot prompting is a technique that involves providing a language model with
contextual examples to guide its understanding and expected output for a specific task. In this approach, you supplement the prompt with sample inputs and their corresponding desired outputs, effectively giving the model a few shots or demonstrations to condition it for the requested task
When employing a few-shot prompting technique, consider the following tips:
Make sure to select examples that are representative of the task that you want the model to perform and cover a diverse range of inputs and outputs.
Experiment with the number of examples.
Chain-of-thought (CoT) prompting is a technique that
divides intricate reasoning tasks into smaller, intermediary steps. This approach can be employed using either zero-shot or few-shot prompting techniques. CoT prompts are tailored to specific problem types. To initiate the chain-of-thought reasoning process in a machine learning model, you can use the phrase “Think step by step.” It is recommended to use CoT prompting when the task requires multiple steps or a series of logical reasoning.
Poisoning refers to the
intentional introduction of malicious or biased data into the training dataset of a model. This can lead to the model producing biased, offensive, or harmful outputs, either intentionally or unintentionally.
Hijacking and prompt injection refer to the technique of
influencing the outputs of generative models by embedding specific instructions within the prompts themselves
Exposure refers to the risk of exposing
sensitive or confidential information to a generative model during training or inference. An FM can then inadvertently reveal this sensitive data from their training corpus, leading to potential data leaks or privacy violations
Prompt leaking refers to the
unintentional disclosure or leakage of the prompts or inputs (regardless of whether these are protected data or not) used within a model. Prompt leaking does not necessarily expose protected data. But it can expose other data used by the model, which can reveal information of how the model works and this can be used against it.
Jailbreaking attempts involve
crafting carefully constructed prompts or input sequences that aim to bypass or exploit vulnerabilities in the AI system’s filtering mechanisms or constraints. The goal is to “break out” of the intended model limitations.