ML and Gen AI Refresh Flashcards
What kind of DB does RAG use?
Vector database
What is a vector database?
It is a database that is designed to index, store, and query data in a vector format (e.g. an n-dimensional vector embedding).
How does a vector database work?
It works by querying for the k vectors that are closest to a given vector in terms of a distance metric like cosine similarity, dot product, etc. Instead of exact k-nearest neighbors (kNN), we typically use approximate nearest neighbors (ANN). This diminishes recall (it may drop some documents that are in fact similar, i.e. false negatives) but is more performant computationally.
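To make the distinction concrete, here is a minimal sketch of brute-force kNN with cosine similarity over made-up toy embeddings; a real vector database would replace this exact scan with an ANN index (e.g. HNSW) to trade a little recall for a lot of speed:

```python
import numpy as np

def knn_cosine(query: np.ndarray, vectors: np.ndarray, k: int = 3) -> np.ndarray:
    # Normalize so a plain dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = v @ q                   # similarity of the query to every stored vector
    return np.argsort(-sims)[:k]   # indices of the k most similar vectors

rng = np.random.default_rng(0)
store = rng.normal(size=(1_000, 384))    # pretend these are document embeddings
print(knn_cosine(rng.normal(size=384), store, k=5))
```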
Why would I want to optimize recall?
We optimize recall when false negatives are costly, e.g. telling someone they don't have an STD when they do.
Why would I want to optimize precision?
We optimize precision when false positives are costly, e.g. telling someone they have cancer when they don't.
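To make the trade-off concrete, here is a tiny sketch of both metrics computed from raw confusion counts (the counts themselves are made up):

```python
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)   # of everything we flagged, how much was actually positive?

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)   # of everything truly positive, how much did we catch?

print(precision(tp=80, fp=20))  # 0.8
print(recall(tp=80, fn=40))     # ~0.667
```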
What are some issues with vector queries?
There aren't many great algorithms for efficient kNN queries that guarantee finding the exact k nearest neighbors to a given vector. Hence we typically opt for ANN, which gives up some accuracy but is efficient.
What is zero-shot prompting?
Zero-shot prompting can be thought of like you're asking someone to solve a problem with no context and hoping they get the right answer.
What is few-shot prompting?
Few-shot prompting can be thought of like you're adding some context (examples) to help the person solve the problem.
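A hypothetical few-shot prompt, where the two labeled reviews are the "shots" that show the model the task before it sees the real input:

```python
# Hypothetical example prompt; the reviews and labels are made up.
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: "Loved every minute of it." -> positive
Review: "A total waste of two hours." -> negative
Review: "The acting was phenomenal." ->"""
```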
What are some limitations of few-shot prompting?
They're not great at dealing with complex reasoning tasks. In those cases we need more structure in the model's reasoning, such as Chain of Thought (CoT) prompting.
What is CoT Prompting?
CoT prompting is like a Q-A-Q-A answering technique that aims to get to the right answer by breaking the reasoning out into steps. For example, you could ask a simple math problem, get the answer back, and then ask a follow-up question that expands on the first; you get back the correct response, whereas asking that second question up front would likely have failed.
How does CoT come into play with zero-shot?
You can combine zero-shot prompting with CoT by simply adding "Let's think step by step" to the prompt.
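A hypothetical zero-shot CoT prompt; note there are no examples, just the step-by-step nudge appended at the end:

```python
# Hypothetical example; the question is a classic brainteaser, not from any API.
zero_shot_cot = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more than "
    "the ball. How much does the ball cost? Let's think step by step."
)
```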
What are chains in LangChain?
They are a sequence of calls, whether to an LLM, tool, or a data preprocessing step.
Why do we use chains?
Chains allow you to go beyond just a single API call to a language model and instead chain together multiple calls in a logical sequence.
Give me an example of the input for chains.
A prompt and a model (LLM).
Give me an example of the input for chain.run
query and text, where query is the base prompt and text is what flows into the chain (see the sketch below).
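A minimal sketch using LangChain's classic LLMChain/chain.run API (deprecated in newer releases, which favor LCEL pipelines like `prompt | llm`); the model choice, the template, and the inputs are placeholders:

```python
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI  # requires an OpenAI API key in the environment

prompt = PromptTemplate(
    input_variables=["query", "text"],
    template="{query}\n\n{text}",
)
chain = LLMChain(llm=OpenAI(), prompt=prompt)

# `query` is the base prompt, `text` is what flows into the chain.
result = chain.run(query="Summarize the following passage:", text="...")
```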
What is the main architecture powering Foundational Models
The Transformer architecture. Essentially, it provides the ability to perform parallel training of gigantic neural networks with billions of parameters.
What is an encoder-only architecture?
BERT is an example of this. Essentially it only contains the encoder piece and transforms the text into its vector representation.
Explain to me the transformer architecture
Reference: https://blue-season.github.io/transformer-in-5-minutes/
What is a decoder-only architecture?
GPT-3 is an example. It contains only the decoder. Decoder-only models extend an input text sequence by generating continuations: text completion and generation.
What is an encoder-decoder architecture?
Contains both. The decoder consumes the encoded embeddings to generate output text. This can be used for text-to-text tasks, e.g. translation.
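As a hedged illustration, each of the three architectures maps onto a familiar Hugging Face pipeline; the checkpoint names below are common public models, chosen here as assumptions:

```python
from transformers import pipeline

# Encoder-only (BERT): represent text / fill in masked tokens.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("Paris is the [MASK] of France.")[0]["token_str"])

# Decoder-only (GPT-style): extend an input sequence with a continuation.
gen = pipeline("text-generation", model="gpt2")
print(gen("Once upon a time", max_new_tokens=20)[0]["generated_text"])

# Encoder-decoder (T5): text-to-text tasks like translation.
translate = pipeline("translation_en_to_fr", model="t5-small")
print(translate("The cat sleeps.")[0]["translation_text"])
```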
What makes Foundational Models different?
Scale, architecture, pretraining, customization, versatility, infrastructure.
What are some different types of FMs?
Language, computer vision, generative, and multimodal models.
Walk me through a basic RAG architecture
Let's break out what is in RAG. At an extremely high level, RAG consists of four important things:
1. The question (user)
2. The external knowledge database (the library of current and or relevant knowledge)
3. The retriever (the librarian tasked with fetching documents related to the question and returning them to better help answer it)
4. A really smart, but also out-of-date or out-of-touch model (the LLM)
So here’s what happens:
1. The librarian takes the user's question and finds similar documents that are relevant to that question from the library.
2. Those similar documents are then added to the user's question to help the LLM answer it far more accurately.
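A toy end-to-end sketch of that flow; the embedding model, the two-document "library", and the final ask_llm call are all stand-ins for real components:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed public checkpoint
library = [
    "Hot chocolate: heat milk, whisk in cocoa and sugar.",
    "Sponge cake: cream butter and sugar, fold in flour, bake at 350F.",
]
doc_vecs = embedder.encode(library, normalize_embeddings=True)

def retrieve(question: str, k: int = 1) -> list[str]:
    # The "librarian": find the documents most similar to the question.
    q = embedder.encode([question], normalize_embeddings=True)[0]
    top = np.argsort(-(doc_vecs @ q))[:k]
    return [library[i] for i in top]

question = "How do I make hot chocolate?"
context = "\n".join(retrieve(question))
augmented_prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
# ask_llm(augmented_prompt)  # hypothetical call handing the stuffed prompt to the LLM
```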
What should all RAGs integrate? NNIC
Noise Robustness
* Ability to handle noisy or irrelevant data contained in the retrieved documents.
* Ability to be like, hey, these documents I pulled are totally irrelevant to the question the user is asking. E.g. I want to know how to make hot chocolate, and here are some documents about how to bake a cake :(.
Negative Rejection
* Reject answering when there is insufficient knowledge (i.e. the LLM gave a poor answer and/or our database returned poor documents because it has nothing to say about, for instance, a very nuanced question on how to formalize a university-grade class on underwater basket weaving).
Information Integration
* Ability to integrate information from multiple sources to answer more complex questions. E.g. think of our library not being limited to just English, but covering science, music, and dare I say, information even about yourself!
Counterfactual Robustness
* Ability to recognize when the retrieved documents contain known factual errors and avoid blindly repeating them.
What are some quality scores that should be used to assess our RAG?
Happy that you asked. Think of it like a little calf…or CAF.
I. Context Relevance
* Retrieved context NEEDS to be relevant for answering the user's question.
II. Answer Relevance
* The answer has to directly address the user's question. I.e. no asking "I want to bake a cake" and the output ending up being "Top 10 hot sexy things to do in Austin this Tuesday".
III. Faithfulness
* The answer must be faithful to the retrieved context. I.e. we ask which planet has the most moons, the retrieved context literally contains the exact answer, and BOOM, we still get the wrong answer. Wamp wamp.
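As a toy illustration only, here is a heuristic sketch that scores CAF with embedding similarity; real evaluation frameworks (e.g. RAGAS) use LLM judges rather than cosine similarity, so treat these formulas as assumptions for intuition:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed public checkpoint

def caf_scores(question: str, context: str, answer: str) -> dict:
    q, c, a = model.encode([question, context, answer])
    return {
        "context_relevance": float(util.cos_sim(q, c)),  # does the context fit the question?
        "answer_relevance": float(util.cos_sim(q, a)),   # does the answer address the question?
        "faithfulness": float(util.cos_sim(c, a)),       # is the answer grounded in the context?
    }
```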
What are the two primary MUST haves of a RAG?
Good retrieval makes good answers:
* Retrieval: the retriever must be able to find the most relevant documents for answering the user's question.
* Generation: the answer generator must be able to make good use of those documents.
Name some ways we can address the retrieval issue with RAGs?
- Chunk Size Optimization: chunking too small or too large may result in inaccurate answers.
- Structured Knowledge: enables recursive retrievals and query routing.
- Sliding Window Chunking: overlapping chunks help alleviate long documents.
- Metadata Attachments: enable more efficient search via filtering, like on keywords!
What is chunking in RAGs?
Chunking in Large Language Model (LLM) applications breaks down extensive texts into smaller, manageable segments.
What is Sliding Window Chunking as it relates to RAGs?
In this method, chunks have some overlap, ensuring that information at the edges of chunks is not lost. Sliding window chunking provides a balance between fixed-length and context-aware techniques.
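A minimal sliding-window chunker, using whitespace words as stand-in tokens; size and overlap are made-up tuning knobs:

```python
def sliding_window_chunks(text: str, size: int = 100, overlap: int = 20) -> list[str]:
    words = text.split()
    step = size - overlap          # how far the window slides each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):   # last window already covers the tail
            break
    return chunks

demo = "word " * 95
print(len(sliding_window_chunks(demo, size=40, overlap=10)))  # 3 overlapping chunks
```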
Name some ways a RAG can address the good answer generation issue!
Information Compression
* Reduces noise and helps alleviate context-window constraints
Generator Fine-Tuning
* Fine-tune the LLM to help ensure the retrieved docs are aligned with the LLM
Result Re-Rank
* the process of reordering an initial list of retrieved documents or passages to improve the ranking quality.
* Alleviates the lost-in-the-middle phenomenon in LLMs (see the sketch below)
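Here is the promised sketch of result re-ranking with a cross-encoder from sentence-transformers; the checkpoint name is a common public one, used here as an assumption:

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, docs: list[str]) -> list[str]:
    # Cross-encoders score each (query, doc) pair jointly: slower than
    # embedding search, but much better at judging relevance.
    scores = reranker.predict([(query, d) for d in docs])
    return [d for _, d in sorted(zip(scores, docs), reverse=True)]
```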
What is Knowledge Distillation composed of?
The Teacher Network: the big, power-hungry sensei that contains all this vast knowledge.
The Student Network: the eager young grasshopper, the faster, lighter-weight model we are going to train.
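A minimal sketch of the classic distillation loss in PyTorch, where the student mimics the teacher's temperature-softened output distribution (the temperature value is a made-up default):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      T: float = 2.0) -> torch.Tensor:
    # Soften both distributions with temperature T, then penalize the
    # KL divergence between them; T**2 rescales the gradients.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (T ** 2)
```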
What is the Lost in the Middle problem for LLMs?
It's where LLMs really underperform on certain tasks when relevant information sits in the middle of the prompt. The more we expand the size of the prompt, the more the information in the middle gets lost.
Just like humans, LLMs respond well to information at the beginning or end of a piece of content; information in the middle tends to get lost.
What is the context window?
The context window is the number of tokens the LLM can process at once in the input prompt.
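One hedged way to see how much of the window a prompt consumes is to count its tokens with tiktoken (the model name here is an assumption):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")  # assumed model name
prompt = "How many moons does Jupiter have?"
print(len(enc.encode(prompt)), "tokens")    # tokens consumed out of the context window
```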
Why do we use logistic regression?
Logistic regression is used if we want to calculate something discrete, like whether people like Troll 2.
It fits a squiggle (an S-shaped curve) to the data that tells us the predicted probability for a discrete variable on the y-axis.
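A toy sketch with scikit-learn: fit the squiggle to a made-up "hours of Troll 2 watched" feature and read off a predicted probability:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

hours = np.array([[0.5], [1.0], [2.0], [3.0], [4.0], [5.0]])  # made-up data
likes_troll2 = np.array([0, 0, 0, 1, 1, 1])                    # discrete outcome

model = LogisticRegression().fit(hours, likes_troll2)
print(model.predict_proba([[2.5]])[0, 1])  # predicted probability of liking Troll 2
```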
What are the assumptions of linear regression? LIHN
Linearity: Relationship between predictors and the response is linear.
Independence: All observations are independent of each other. E.g. observation 1 does not influence observation 2 (one data point's outcome doesn't affect another's).
Homoscedasticity: The spread of the residuals is constant across all levels of the predictors. Or the variance of our error terms are similar across the independent variables
Normality of Residuals: The residuals should follow a normal distribution. The residuals should form a bell curve when you plot them.
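A hedged sketch of quick assumption checks with made-up data: fit a line, then probe the residuals for normality and constant spread:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 * x + 1 + rng.normal(scale=1.0, size=100)  # linear signal + constant-variance noise

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

print(stats.shapiro(residuals))                  # normality of residuals (high p = looks normal)
print(np.corrcoef(x, np.abs(residuals))[0, 1])   # near 0 suggests homoscedasticity
```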
How would you explain the Bias v. Variance Tradeoff to a high school student!
OK KIDS! Let's talk about what the Bias vs. Variance Trade-Off is. To understand this we have to understand what Bias and Variance even are in relation to a model, so let's hop in!
Bias: Bias is actually exactly what it sounds like, BIAS! For example, let's say we are using a model and we make some "assumptions" about the data. Let's say, in our case, we assume everyone, at every weight, is going to be 3 inches tall. We absolutely 100% REFUSE to believe otherwise; well, that's BIAS! Now, we say in statistics that a model that is overly biased is underfitting the data. And when we think of something "fitting" the data, we think of how well our model, or in this case a line, matches up with all the points, a perfect fit being all points lying on the line. When we draw the line where every x value maps to a height of 3 inches, we are essentially drawing a completely horizontal line and missing all the data points! This is called underfitting.
Variance: Now variance is exactly related to what it sounds like, how much things vary! Imagine the model is now so worried about being too biased that it wants to ensure every possible solution and minuscule consideration is taken into account. Therefore, for all weights down to the ounce level, we try to make sure we get the exact possible height; say, for example, in the real world the average height is 3 inches for people between weights of 1-2 ounces. When we have a lot of variance, we're adding in all these possible heights for a ton of different values between 1-2 ounces. What happens is we start to lose sight of the "trend" in the answer, get too focused on the "exact" answer, and get lost in the weeds. A way to see this is to imagine drawing a squiggly line through all the data points. What this does is add a ton of COMPLEXITY into our model, meaning just that: it's super complex and gets lost in the weeds of the overall answer. When we have a high-variance or overly complex model, we often worry about the model overfitting the data, meaning that when the model sees new data it may not be able to get the correct answer because it was soooo focused on the data it saw before.
So this is all great, but what is the Bias vs. Variance Trade-Off?? Well, you actually already have the pieces to know what it is! The Bias vs. Variance trade-off is ensuring that the model we make is not too biased (leading to underfitting the data) and does not have too much variance (overfitting the data), so that we can find the sweet spot: a model that has reasonable assumptions about the data, but isn't so hyperfocused on being "overly" correct that it loses sight of the broader trend at large. You can think of it as: bias is being so concerned with one particular right answer that it's wrong, and variance is being so obsessed with all possible right answers that it's wrong.
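To see the trade-off in code, here is a toy sketch fitting the same noisy made-up data with a too-simple, a too-wiggly, and a balanced polynomial:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)  # made-up noisy data

flat_line = np.polyfit(x, y, deg=0)  # high bias: one horizontal line, underfits
squiggle = np.polyfit(x, y, deg=9)   # high variance: chases every point, overfits
balanced = np.polyfit(x, y, deg=3)   # the sweet spot in between
```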
What is Regularization?
Think of regularization like a spanker ready to punish your model when it does something as horrible as overfitting the data (gasp, bad model)!
Regularization spanks your model into shape by adding a penalty on the size of your model's coefficients. There are a few types (L1, L2, Elastic Net, etc.) and each has its own way of SPANKING, or, well, shrinking the coefficients toward zero to simplify your overly complex complicado model that loves to overfit.
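A hedged scikit-learn sketch of the two classic penalties; the data and the alpha penalty strengths are made up:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X[:, 0] * 3 + rng.normal(size=50)    # only the first feature truly matters

print(Ridge(alpha=1.0).fit(X, y).coef_)  # L2: shrinks all coefficients toward zero
print(Lasso(alpha=0.1).fit(X, y).coef_)  # L1: drives irrelevant ones exactly to zero
```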
When should I use regularization?
Use regularization when you think your model is being naughty and overfitting the training data, e.g. capturing noise instead of the underlying pattern (bad model!).