Fundamentals of ML and AI Flashcards
Artificial Intelligence
The theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
AI is a broad field that encompasses the development of intelligent systems capable of performing tasks that typically require human intelligence, such as perception, reasoning, learning, problem-solving, and decision-making. AI serves as an umbrella term for various techniques and approaches, including machine learning, deep learning, and generative AI, among others.
Machine Learning
The use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data.
ML is a type of AI for understanding and building methods that make it possible for machines to learn. These methods use data to improve computer performance on a set of tasks.
Deep Learning
A type of machine learning based on artificial neural networks in which multiple layers of processing are used to extract progressively higher level features from data.
Deep learning uses the concept of neurons and synapses similar to how our brain is wired. An example of a deep learning application is Amazon Rekognition, which can analyze millions of images and streaming and stored videos within seconds.
Generative AI
Generative artificial intelligence (AI) is a type of AI that can create new content, such as images, music, text, videos, and audio. It uses deep neural networks to learn from large datasets and produce new content that’s similar to the data it’s learned from.
Generative AI is a subset of deep learning because it builds on models created with deep learning and can adapt them to new tasks without retraining or fine-tuning.
Generative AI systems are capable of generating new data based on the patterns and structures learned from training data.
Sentience
Sentience is the ability to feel or perceive, allowing a being to think and experience emotions. This would necessarily include consciousness.
Sentience is feeling; sapience is thinking.
Sapience
Sapience is the capacity for intelligence, wisdom, and logic, along with the ability to solve problems, learn, and understand.
Sentience doesn’t even require self-awareness. Sapience, on the other hand, is often described as consciousness, or the ability to reason.
Sapience is generally the quality that would differentiate an intelligent species from animals.
Building a machine learning model involves…
Building a machine learning model involves data collection and preparation, selecting an appropriate algorithm, training the model on the prepared data, and evaluating its performance through testing and iteration.
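The four steps above can be sketched end to end on a toy problem. This is a minimal illustration only: the dataset (hours studied versus exam result), the fixed train/test split, and the threshold classifier are all assumptions chosen to keep the example small, not a recommended method.

```python
# Step 1: collect and prepare data (hypothetical: hours studied -> passed the exam?).
data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1), (2.5, 0), (7.5, 1)]
train, test = data[:6], data[6:]  # hold out the last two examples for testing

# Step 2: select an algorithm; here, a one-feature threshold classifier.
def fit_threshold(examples):
    """Place the threshold halfway between the mean input of each class."""
    negatives = [x for x, y in examples if y == 0]
    positives = [x for x, y in examples if y == 1]
    return (sum(negatives) / len(negatives) + sum(positives) / len(positives)) / 2

# Step 3: train the model on the prepared training data.
threshold = fit_threshold(train)

def predict(x):
    """Predict 1 (pass) when the input is above the learned threshold."""
    return 1 if x > threshold else 0

# Step 4: evaluate on the held-out test data.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(threshold, accuracy)
```

In practice, a poor evaluation result would send you back to an earlier step (more data, different features, or a different algorithm), which is the iteration the card refers to.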
Labeled data
Labeled data is a dataset where each instance or example is accompanied by a label or target variable that represents the desired output or classification. These labels are typically provided by human experts or obtained through a reliable process.
Example: In an image classification task, labeled data would consist of images along with their corresponding class labels (for example, cat, dog, car).
Unlabeled data
Unlabeled data is a dataset where the instances or examples do not have any associated labels or target variables. The data consists only of input features, without any corresponding output or classification.
Example: A collection of images without any labels or annotations.
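The difference between labeled and unlabeled data can be shown with a small sketch. The file names and class labels here are made up for illustration.

```python
# Labeled data: each example pairs an input with its target label.
# The file names and labels are hypothetical.
labeled_data = [
    ("img_001.jpg", "cat"),
    ("img_002.jpg", "dog"),
    ("img_003.jpg", "car"),
]

# Unlabeled data: the same kind of inputs, but with no target attached.
unlabeled_data = [features for features, _label in labeled_data]

# Supervised algorithms consume inputs and targets as separate sequences.
inputs = [x for x, _ in labeled_data]
targets = [y for _, y in labeled_data]

print(unlabeled_data)
print(targets)
```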
Structured data
Structured data refers to data that is organized and formatted in a predefined manner, typically in the form of tables or databases with rows and columns. This type of data is suitable for traditional machine learning algorithms that require well-defined features and labels. The following are types of structured data.
Tabular data: This includes data stored in spreadsheets, databases, or CSV files, with rows representing instances and columns representing features or attributes.
Time-series data: This type of data consists of sequences of values measured at successive points in time, such as stock prices, sensor readings, or weather data.
Unstructured data
Unstructured data is data that lacks a predefined structure or format, such as text, images, audio, and video. This type of data requires more advanced machine learning techniques to extract meaningful patterns and insights.
Text data: This includes documents, articles, social media posts, and other textual data.
Image data: This includes digital images, photographs, and video frames.
Supervised Learning
In supervised learning, the algorithms are trained on labeled data. The goal is to learn a mapping function that can predict the output for new, unseen input data.
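As a rough sketch of learning a mapping from labeled data, the example below uses a 1-nearest-neighbor rule, one of the simplest supervised methods. The measurements and labels are invented for illustration, and nearest neighbor is only one of many possible algorithms.

```python
import math

# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
training_data = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training_data, key=lambda pair: math.dist(pair[0], features))
    return nearest[1]

# Predictions for new, unseen inputs.
print(predict((24.0, 4.2)))   # cat
print(predict((58.0, 28.0)))  # dog
```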
Unsupervised Learning
Unsupervised learning refers to algorithms that learn from unlabeled data. The goal is to discover inherent patterns, structures, or relationships within the input data.
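One way to see pattern discovery without labels is clustering. The sketch below runs a plain k-means loop on one-dimensional data; the values and the starting centers are assumptions picked so the two groups are easy to see.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on 1-D data: assign each value to its nearest center, then re-average."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled measurements that happen to form two groups.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No labels were supplied; the grouping comes only from the structure of the input values.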
Reinforcement Learning
In reinforcement learning, the machine is given only a performance score as guidance. Feedback is provided in the form of rewards or penalties for its actions, and the machine learns from this feedback to improve its decision-making over time. This differs from semi-supervised learning, where only a portion of the training data is labeled.
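Learning from rewards can be illustrated with a two-armed bandit and an epsilon-greedy policy. This is a deliberately small sketch: the reward probabilities, the exploration rate, and the function name run_bandit are assumptions for this example, and real reinforcement learning problems usually also involve states and delayed rewards.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly pick the best-looking action, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)  # running average reward per action
    counts = [0] * len(reward_probs)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore
        else:
            action = max(range(len(estimates)), key=lambda a: estimates[a])  # exploit
        # The environment returns a reward of 1 or 0; the agent never sees reward_probs.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical environment: action 1 pays off far more often than action 0.
estimates, counts = run_bandit([0.2, 0.8])
print(estimates, counts)
```

The agent is never told which action is correct. It only receives rewards, and over time it shifts most of its choices toward the action with the higher payoff.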
Inferencing
Inferencing is the process of using what a trained model has learned to make predictions or decisions. It begins after the model has been trained.
Batch inferencing
Batch inferencing is when the computer takes a large amount of data, such as images or text, and analyzes it all at once to provide a set of results. This type of inferencing is often used for tasks like data analysis, where the speed of the decision-making process is not as crucial as the accuracy of the results.
Real-time inferencing
Real-time inferencing is when the computer has to make decisions quickly, in response to new information as it comes in. This is important for applications where immediate decision-making is critical, such as in chatbots or self-driving cars. The computer has to process the incoming data and make a decision almost instantaneously, without taking the time to analyze a large dataset.
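The contrast between batch and real-time inferencing can be sketched with the same model called in two ways. The predict function is a stand-in keyword rule, not a trained model, and all names are hypothetical.

```python
def predict(text):
    """Stand-in for a trained model: a hypothetical keyword rule, not a real classifier."""
    return "positive" if "good" in text.lower() else "negative"

def batch_inference(records):
    """Batch: score a whole collected dataset in one pass and return all results."""
    return [predict(r) for r in records]

def realtime_inference(stream):
    """Real-time: score each item as it arrives and yield the result immediately."""
    for record in stream:
        yield predict(record)

reviews = ["Good value", "Broke after a day", "Really good"]

# Batch: nothing is returned until every record has been processed.
batch_results = batch_inference(reviews)

# Real-time: the first answer is available before later items are even read.
first_live_result = next(realtime_inference(iter(reviews)))
print(batch_results, first_live_result)
```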
Artificial neural networks
Computational models that are designed to mimic the way the human brain processes information.
Neural networks have lots of tiny units called nodes that are connected together. These nodes are organized into layers. The layers include an input layer, one or more hidden layers, and an output layer.
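The layer structure can be made concrete with a forward pass through a tiny network: two inputs, one hidden layer of two nodes, and one output node. The weights and biases are hand-picked, hypothetical values; a real network would learn them during training.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: each node takes a weighted sum of its inputs, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Hypothetical, hand-picked parameters (not learned).
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]  # 2 inputs -> 2 hidden nodes
hidden_biases = [0.0, -0.1]
output_weights = [[1.0, -1.0]]              # 2 hidden nodes -> 1 output node
output_biases = [0.0]

inputs = [1.0, 2.0]                                     # input layer
hidden = layer(inputs, hidden_weights, hidden_biases)   # hidden layer
output = layer(hidden, output_weights, output_biases)   # output layer
print(hidden, output)
```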
Computer Vision
Computer vision is a field of artificial intelligence that makes it possible for computers to interpret and understand digital images and videos. Deep learning has revolutionized computer vision by providing powerful techniques for tasks such as image classification, object detection, and image segmentation.
Natural Language Processing (NLP)
Natural language processing (NLP) is a branch of artificial intelligence that deals with the interaction between computers and human languages. Deep learning has made significant strides in NLP, making possible tasks such as text classification, sentiment analysis, machine translation, and language generation.
Amazon Bedrock provides access to…
Amazon Bedrock provides access to a choice of high-performing FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon.
FM lifecycle
The foundation model lifecycle is a comprehensive process that involves several stages, each playing a crucial role in developing and deploying effective and reliable foundation models.
It’s important to note that the FM lifecycle is an iterative process, where lessons learned and insights gained from each stage can inform and improve subsequent iterations.
Data selection
Step 1 of the foundation model lifecycle:
Data selection
Unlabeled data can be used at scale for pre-training because it is much easier to obtain compared to labeled data. Unlabeled data includes raw data, such as images, text files, or videos, with no meaningful informative labels to provide context. FMs require training on massive datasets from diverse sources.
Pre-training
Step 2 of the foundation model lifecycle:
Pre-training
Although traditional ML models rely on supervised, unsupervised, or reinforcement learning patterns, FMs are typically pre-trained through self-supervised learning. With self-supervised learning, labeled examples are not required. Self-supervised learning makes use of the structure within the data to autogenerate labels.
During the initial pre-training stage, the FM’s algorithm can learn the meaning, context, and relationship of the words in the datasets. For example, the model might learn whether drink means beverage, the noun, or swallowing the liquid, the verb.
After the initial pre-training, the model can be further pre-trained on additional data. This is known as continuous pre-training. The goal is to expand the model’s knowledge base and improve its ability to understand and generalize across different domains or tasks.
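To make the self-supervised idea in this card concrete, the sketch below generates training labels directly from raw text by treating each next word as the target for the words before it. It assumes whitespace splitting and a fixed context size for simplicity; real FMs work on tokens and much longer contexts.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw text into (context, target) pairs; the target is simply the next word."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        pairs.append((words[i - context_size:i], words[i]))
    return pairs

# No human labeling is involved: the text itself supplies every target.
pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```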
Optimization
Step 3 of the foundation model lifecycle:
Optimization
Pre-trained language models can be optimized through techniques like prompt engineering, retrieval-augmented generation (RAG), and fine-tuning on task-specific data. These methods vary in complexity and cost.
Evaluation
Step 4 of the foundation model lifecycle:
Evaluation
Whether you fine-tune a model or use a pre-trained model off the shelf, the next logical step is to evaluate the model. An FM’s performance can be measured using appropriate metrics and benchmarks. Evaluation of model performance and its ability to meet business needs is important.
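As one example of a simple metric, the sketch below computes exact-match accuracy between model outputs and reference answers. The predictions and references are invented for illustration, and exact match is only one of many possible metrics; the right choice depends on the task and the business need.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal the reference after trimming and lowercasing."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and the same length")
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return matches / len(references)

# Hypothetical model outputs and reference answers.
predictions = ["Paris", "4", "blue whale"]
references = ["paris", "5", "Blue whale"]
score = exact_match_accuracy(predictions, references)
print(score)
```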
Deployment
Step 5 of the foundation model lifecycle:
Deployment
When the FM meets the desired performance criteria, it can be deployed in the target production environment. Deployment can involve integrating the model into applications, APIs, or other software systems.
Feedback and continuous improvement
Step 6 of the foundation model lifecycle:
Feedback and continuous improvement
After deployment, the model’s performance is continuously monitored, and feedback is collected from users, domain experts, or other stakeholders. This feedback, along with model monitoring data, is used to identify areas for improvement, detect potential biases or drift, and inform future iterations of the model. The feedback loop permits continuous enhancement of the foundation model through fine-tuning, continuous pre-training, or re-training, as needed.
Large Language Model (LLM)
A type of foundation model (FM).
Large language models (LLMs) can be based on a variety of architectures, but the most common architecture in today’s state-of-the-art models is the transformer architecture. Transformer-based LLMs are powerful models that can understand and generate human-like text. They are trained on vast amounts of text data from the internet, books, and other sources, and learn patterns and relationships between words and phrases.
LLMs use tokens, embeddings, and vectors to understand and generate text. The models can capture complex relationships in language, so they can generate coherent and contextually appropriate text, answer questions, summarize information, and even engage in creative writing.
Tokens
Tokens are the basic units of text that the model processes. Tokens can be words, parts of words, or individual characters like a period. Tokens also provide standardization of input data, which makes it easier for the model to process.
As an example, the sentence “A puppy is to dog as a kitten is to cat.” might be broken up into the following tokens: “A” “puppy” “is” “to” “dog” “as” “a” “kitten” “is” “to” “cat” “.”
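A minimal tokenizer for the example sentence can be sketched with a regular expression that treats each word and each punctuation mark as its own token. This is an assumption made for illustration; production LLMs typically use learned subword tokenizers, which split text differently.

```python
import re

def simple_tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenize("A puppy is to dog as a kitten is to cat.")

# Map each distinct token to an integer ID, which is the form a model actually consumes.
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
token_ids = [vocab[tok] for tok in tokens]

print(tokens)
print(token_ids)
```

Repeated tokens such as “is” and “to” receive the same ID each time they appear, which is the standardization the card describes.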
Embeddings and Vectors
Embeddings are numerical representations of tokens, where each token is assigned a vector (a list of numbers) that captures its meaning and relationships with other tokens. These vectors are learned during the training process and allow the model to understand the context and nuances of language.
For example, the embedding vector for the token “cat” might be close to the vectors for “feline” and “kitten” in the embedding space, indicating that they are semantically related. This way, the model can understand that “cat” is similar to “feline” and “kitten” without being explicitly programmed with those relationships.
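Closeness in the embedding space is commonly measured with cosine similarity. The three-dimensional vectors below are made up for illustration; real embeddings are learned during training and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
print(sim_cat_kitten, sim_cat_car)
```

With these example vectors, “cat” scores much closer to “kitten” than to “car”, mirroring the semantic relationship described above.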
Diffusion models
Diffusion is a deep learning architecture that starts with pure noise or random data. The model gradually adds more and more meaningful information to this noise until it ends up with a clear and coherent output, like an image or a piece of text. Diffusion models learn through a two-step process of forward diffusion and reverse diffusion.
Forward diffusion
In forward diffusion, the system gradually adds small amounts of noise to an input image until only noise is left.
Reverse diffusion
In the subsequent reverse diffusion step, the noisy image is gradually denoised until a new image is generated.
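The forward half of the process can be sketched in a few lines. The example assumes a constant noise rate (beta) per step and a four-value stand-in for an image; both are simplifications for illustration. Reverse diffusion is not shown because it requires a trained denoising network.

```python
import math
import random

def forward_diffusion(pixels, steps=50, beta=0.1, seed=0):
    """Repeatedly shrink the signal slightly and mix in Gaussian noise (simplified forward process)."""
    rng = random.Random(seed)
    x = list(pixels)
    history = [x]
    for _ in range(steps):
        x = [
            math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for v in x
        ]
        history.append(x)
    return history

image = [1.0, 0.5, -0.5, -1.0]  # hypothetical tiny "image" of 4 pixel values
history = forward_diffusion(image)

# Share of the original signal that survives after 50 steps with beta = 0.1.
signal_fraction = math.sqrt(1.0 - 0.1) ** 50
print(history[-1], signal_fraction)
```

After enough steps, almost none of the original image remains in the values, which is the "only noise is left" end state that reverse diffusion learns to undo.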
Multimodal models
Instead of just relying on a single type of input or output, like text or images, multimodal models can process and generate multiple modes of data simultaneously. For example, a multimodal model could take in an image and some text as input, and then generate a new image and a caption describing it as output.
These kinds of models learn how different modalities like images and text are connected and can influence each other. Multimodal models can be used for automating video captioning, creating graphics from text instructions, answering questions more intelligently by combining text and visual info, and even translating content while keeping relevant visuals.