AI Terms Flashcards
Algorithm
A set of step-by-step instructions a computer follows to solve a specific problem or perform a task; algorithms are what enable AI systems to learn and operate autonomously.
Artificial Intelligence (AI)
Branch of computer science focused on developing systems that mimic human intelligence and can perform tasks that typically require it.
AI Winter
A period of reduced funding and interest in artificial intelligence research. The field has experienced several hype cycles, each followed by disappointment and criticism, then funding cuts, then renewed interest years or even decades later.
Bard
A chatbot tool by Google based on the LaMDA large language model, facilitating dynamic conversations.
Chatbot
A computer program engaging in user conversations; AI-based chatbots use machine learning and natural language processing for dynamic interactions.
ChatGPT
A commercially available chatbot from OpenAI based on large language models like GPT-3.5 and GPT-4.
Continuous Active Learning (CAL)
AI application in which the system continually retrains and corrects itself as reviewers work, rather than requiring separate rounds of expert training; used in e-discovery’s TAR 2.0.
Conversational AI
Technology using data, machine learning, and natural language processing for human-like interactions, serving as the brain behind chatbots.
Deep Learning
Machine learning using neural networks to emulate the human brain, enabling data clustering and predictions through multiple layers of training.
Deep Fakes
Synthetic images, video, or audio in which a person’s likeness or voice is convincingly fabricated or altered; deep fakes can exist without generative AI.
Foundational Model
A large AI model trained on vast unlabeled data, capable of performing various tasks with minimal fine-tuning.
Garbage In, Garbage Out
Expression highlighting that an AI system’s outputs are only as good as the quality of the training data.
Generative AI
AI category, including large language models, capable of independently creating novel content based on training data, exemplified by zero-shot learning.
GPT (Generative Pre-trained Transformer)
Prefix for OpenAI’s large language models, with GPT-4 released in March 2023.
Graphics Processing Unit (GPU)
A type of efficient processor originally used to render graphics on a computer screen. GPUs are critical in the training of AI systems and large language models, which require significant processing power.
Hallucination
Occurs when an AI system confidently provides a false yet convincing answer to a query.
LaMDA
Language Model for Dialogue Applications, a large language model announced by Google in May 2021.
Large Language Model (LLM)
Deep learning model performing natural language tasks based on extensive training data.
LLaMA
Large Language Model Meta AI, released by Meta in February 2023.
Machine Learning
Broad AI branch focused on teaching systems tasks, concepts, or problem-solving by learning from data, in imitation of the way humans learn.
Model
AI tool making decisions similar to human experts based on a defined dataset.
Multimodal AI
AI processing various data types, such as text, images, video, and sound.
Natural Language Processing (NLP)
AI branch dealing with computers’ understanding of written and spoken language.
Neural Network
A machine learning architecture of interconnected layers of nodes that mimics the human brain; crucial for deep learning.
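For intuition, here is a minimal NumPy sketch of a two-layer network’s forward pass, with random stand-in weights taking the place of the parameters a real network would learn during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Non-linear activation applied between layers
    return np.maximum(0, x)

# Toy input: 3 features for a single example (values are arbitrary)
x = np.array([0.5, -1.2, 3.0])

# Random stand-in weights; a real network learns these during training
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 3 inputs -> 4 "neurons"
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # output layer: 4 -> 2 scores

hidden = relu(W1 @ x + b1)   # first layer of activations
output = W2 @ hidden + b2    # raw output scores
print(output)
```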
Parameters
Bits of knowledge in an AI model adjusted during training for desired outputs.
Prompt
Instruction for an AI model to generate a specific output.
Prompt Engineering
Identifying and using prompts to achieve desired outcomes from an AI tool.
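A small Python illustration of the idea: the same task can be phrased in different ways, and the wording constrains the output. The `ask_model()` call is a hypothetical stand-in for whatever AI tool receives the prompt.

```python
def build_prompt(text: str) -> str:
    # A prompt template that pins down the format and audience of the answer
    return (
        "Summarize the following passage in exactly three bullet points, "
        "using plain language a non-expert can follow.\n\n"
        f"Passage:\n{text}"
    )

prompt = build_prompt("Large language models are trained on vast amounts of text...")
# response = ask_model(prompt)  # hypothetical call to the AI tool being prompted
print(prompt)
```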
Reinforcement Learning
AI training technique based on trial and error with feedback from its actions.
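A minimal sketch of trial-and-error learning with feedback, using a toy two-armed bandit (the payoff numbers are invented for illustration): the agent tries actions, observes rewards, and gradually favors the action that pays off more often.

```python
import random

true_payoffs = [0.3, 0.7]   # hidden reward probabilities (toy example)
estimates = [0.0, 0.0]      # agent's learned value estimates
counts = [0, 0]

for step in range(1000):
    # Explore occasionally, otherwise exploit the best-looking action
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = max(range(2), key=lambda a: estimates[a])

    reward = 1 if random.random() < true_payoffs[action] else 0  # feedback
    counts[action] += 1
    # Incrementally update the running average reward for this action
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # should approach the true payoff probabilities
```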
Robotic Process Automation (RPA)
Business process automation distinct from AI, defining instructions for high-volume, repetitive tasks.
Robots.txt
A text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website.
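For illustration, Python’s standard-library robotparser can read such a file and answer whether a crawler may fetch a given URL; the rules and URLs below are invented examples.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt rules (invented for illustration)
rules = """
User-agent: *
Disallow: /private/
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("MyBot", "https://example.com/index.html"))  # True
print(parser.can_fetch("MyBot", "https://example.com/private/x"))   # False
```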
Self-Supervised Learning
Machine learning where a model generates its own data labels, training itself on unstructured data.
Semi-Supervised Learning
Machine learning with some labeled input data, combining supervised and unsupervised learning.
Shadow Library
Online databases that provide ready access to content that is normally obscured or otherwise not readily accessible. Such content may be inaccessible for a number of reasons, including paywalls, copyright controls, or other barriers placed upon it by its original owners.
Supervised Learning
Machine learning where a model is trained on labeled data with manual correction.
Token
In NLP, a unit of text, such as a word, subword, or punctuation mark, formed by breaking language into meaningful elements a model can process.
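A simplified sketch of tokenization in Python, splitting on words and punctuation; real LLM tokenizers instead split text into subword units learned from data (for example byte-pair encoding).

```python
import re

def simple_tokenize(text: str) -> list[str]:
    # Split into word and punctuation tokens (a simplification of the
    # subword tokenizers real language models use)
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("AI models don't read text; they read tokens."))
# ['AI', 'models', 'don', "'", 't', 'read', 'text', ';', 'they', 'read', 'tokens', '.']
```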
Unsupervised Learning
Machine learning detecting data patterns without explicit training on labeled data.
Web Scraping
Extracting data from websites for training AI models.
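A minimal standard-library sketch of the idea, fetching a page at a placeholder URL and collecting its links; a real scraper would also respect robots.txt, rate limits, and site terms.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag encountered in the page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)

# Placeholder URL for illustration only
html = urlopen("https://example.com/").read().decode("utf-8", errors="replace")
collector = LinkCollector()
collector.feed(html)
print(collector.links)
```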
Zero-Shot Learning
AI’s ability to respond to new questions or prompts not in its training data.
Robots.txt file
A text file that is part of a website’s code and instructs web robots (typically search engine robots) how to crawl or scrape its pages.
AI/Generative AI
Deep-learning models that can generate high-quality text, images, and other content based on the data they were trained on.
Machine Learning
Algorithms which use structured, labeled data to make predictions; the data requires pre-processing by a human engineer.
Deep Learning
Algorithms which can ingest and process unstructured data (versus pre-processed, structured data) by automating feature extraction through multiple layers of identification that mirror human neurons, allowing much larger datasets to be used.
LLMs
Large language models - large-scale deep learning models that learn the relationships between words and grammar in order to understand and generate text.
Foundation model
Non-task specific, general-use AI models such as GPT and DALL-E
Neural network
Subset of machine learning underpinning deep learning algorithms; it models the layers of activation that occur between neurons in the human brain to increase the accuracy of recognition and prediction.
Multimodal generative models
A generative AI model trained on both text and images that can generate a description of a given image, or a corresponding image from a text input.
Retrieval augmented generation
(RAG) - an AI framework for retrieving facts from an external knowledge base (such as the internet) to ground LLMs on accurate, up-to-date information
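A minimal sketch of the retrieval step, assuming a tiny in-memory “knowledge base” and naive word-overlap scoring; real RAG systems use vector embeddings and a search index, but the flow is the same: retrieve relevant facts, then place them in the prompt so the model is grounded in them.

```python
# Toy knowledge base (contents taken from elsewhere in this glossary)
documents = [
    "LaMDA is a large language model announced by Google in May 2021.",
    "GPT-4 was released by OpenAI in March 2023.",
    "Robots.txt tells web crawlers which pages they may crawl.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Score each document by how many query words it shares with it
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

query = "When was GPT-4 released?"
context = "\n".join(retrieve(query, documents))

# The retrieved facts are prepended so the model answers from them
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)
```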
Code executing models
Models that execute code as part of producing their output; code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.
Watermarking AI-generated content
“watermarking” synthetic output via alteration of metadata or pixels to indicate it is artificially generated
Training/fine-tuning/prompting
Improving the automated “learning” of deep learning models, for example with supervised task-specific layers or carefully crafted inputs, to promote desired model output.
Diffusion models (new developments - consistency models)
AI which generates an image by learning to reverse layers of Gaussian noise added to the images it was trained on, predicting a likely denoised image; consistency models are “one-step” models rather than the iterative denoising process traditional diffusion models use.
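A toy NumPy sketch of the two halves of the process, with a placeholder `predict_noise` function standing in for the trained network and all step counts and coefficients invented for illustration; a consistency model collapses the step-by-step reverse loop into a single call.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50                       # number of diffusion steps (toy value)
image = rng.random((8, 8))   # stand-in for a training image

# Forward process: repeatedly mix in small amounts of Gaussian noise
noisy = image.copy()
for t in range(T):
    noisy = 0.98 * noisy + 0.2 * rng.normal(size=noisy.shape)

def predict_noise(x, t):
    # Placeholder for a trained network that estimates the noise in x at step t
    return np.zeros_like(x)

# Reverse (sampling) process: start from pure noise and iteratively denoise
sample = rng.normal(size=(8, 8))
for t in reversed(range(T)):
    sample = sample - predict_noise(sample, t)  # one small denoising step

# A consistency model would instead map the noise to an image in one step.
```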
Red Teaming
Internal adversarial testing of AI models by simulating extreme “system failures” before they happen in real life.
Jailbreaking
Creating prompts designed to violate the content guidelines of the AI model and misuse it.
CPU
CPUs are general-purpose processors built for high-speed serial processing on a smaller number of cores.
GPU
GPUs are better suited to data processing because their architecture supports parallel processing across many more cores, each with lower individual clock speeds.
Edge AI
AI which is hosted “locally” for physical processing on/near the device rather than via the cloud
GAN
Generative adversarial network - an AI training framework that pits two models (a generator and a discriminator) against each other, each trying to “fool” the other, iteratively improving the realism of the generated output.
Transformer models
(e.g. GPT) - a deep learning architecture that lets models process an entire input in parallel, allowing near-real-time “translation” of queries.
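A minimal NumPy sketch of the scaled dot-product attention at the heart of transformer models, using random stand-in matrices; every token position attends to every other position at once, which is what allows a sequence to be processed in parallel.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores) @ V

rng = np.random.default_rng(0)
seq_len, d = 4, 8                        # 4 tokens, 8-dimensional vectors (toy sizes)
Q, K, V = (rng.normal(size=(seq_len, d)) for _ in range(3))
print(attention(Q, K, V).shape)          # (4, 8): one updated vector per token
```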
Moore’s law
Intel co-founder Gordon Moore’s prediction that the number of transistors on a chip would double every two years; also used more broadly to refer to the exponential advancement of compute technology.
Self-Supervised/Reinforcement Learning
Architectures that let deep learning models iteratively improve their recognition and prediction on unstructured data without human pre-processing.