AI & ML Flashcards
Artificial Intelligence
Development of intelligent systems capable of performing tasks that typically require human intelligence
Examples include perception, reasoning, learning, problem-solving, decision-making
Used for technologies like computer vision, facial recognition, fraud detection, and intelligent document processing
Machine Learning
Subset of AI focused on building methods that allow machines to learn from data; AI is the broader field
Data is leveraged to improve computer performance on a set of tasks
Makes predictions based on the data used to train the model; rules are not explicitly programmed
Neural Network
Method in AI where nodes are connected together and organized in layers, talking to each other by passing data to the next layer
Creates an adaptive system that computers use to learn from their mistakes and improve continuously
Consists of Input Layer, Hidden Layers, and Output Layer
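The layered structure above can be sketched as a forward pass in plain Python. This is a minimal illustration, not a trained model: the weights, biases, layer sizes, and the choice of sigmoid activation are all assumptions made for the example.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of all inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass data from the input layer through each following layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
# The numbers below are made up; a real network would learn them during training.
hidden_layer = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output_layer = ([[0.7, -0.5, 0.2]], [0.05])

prediction = forward([1.0, 2.0], [hidden_layer, output_layer])
```

Training would adjust the weights and biases to reduce prediction error; that step is omitted here.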
Deep Learning
Method in AI that teaches computers to process data in a way that is inspired by the human brain
Uses artificial neurons and synapses (weighted connections between neurons) to train a model; processes more complex patterns in the data than traditional ML
Used for Computer Vision and NLP; takes a large amount of input data and typically requires GPUs
Generative AI
Subset of Deep Learning used to generate new data similar to the data it was trained on, such as images, text, audio, video, and code
Unlabeled Data is used to pre-train a Foundation Model backed by a neural network; this model can then be adapted for more specific uses like text generation, info extraction, chatbots, and more
Training Data
Large dataset used to train ML models to process information and accurately predict outcomes; preparing it is the most critical stage in building a good model
Can be Structured or Unstructured; Labeled or Unlabeled
Labeled Data
ML data that includes both input features and corresponding output labels
Used for Supervised Learning, where the model is trained to map inputs to known outputs
For example, a dataset of animal images where each image is labeled with the corresponding animal type
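One possible way to represent such a dataset in code is a list of (input, label) pairs. The file names and labels below are hypothetical and only illustrate the shape of the data.

```python
# Hypothetical labeled dataset: each record pairs an input with its known output label.
labeled_data = [
    ("img_001.jpg", "cat"),
    ("img_002.jpg", "dog"),
    ("img_003.jpg", "cat"),
]

# Separate the inputs from the labels, as a supervised training routine would expect.
inputs = [image for image, _ in labeled_data]
labels = [label for _, label in labeled_data]
```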
Unlabeled Data
ML data that includes only input features without any output labels
For example, a collection of images without any associated labels
Used for Unsupervised Learning, where the model tries to find patterns or structures in the data
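As a sketch of finding structure without labels, the snippet below groups one-dimensional points with a simple k-means loop. k-means is chosen here only as an illustration (the card does not name an algorithm), and the data points and starting centers are made up.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around the nearest center, then move each center to its group's mean."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical unlabeled data: input values only, no output labels.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

No label ever tells the code which group a point belongs to; the grouping comes only from distances between the inputs.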
Structured Data
Data is organized in a structured format, often in rows and columns
Tabular Data is data arranged in a table with rows representing records and columns representing features
Time Series Data is a series of data points collected or recorded at successive points in time
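Both kinds of structured data can be sketched with the Python standard library. The column names, values, and timestamps below are hypothetical.

```python
import csv
import io

# Tabular data: rows are records, columns are features (hypothetical housing table).
csv_text = "id,sqft,price\n1,850,200000\n2,1200,310000\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
columns = list(rows[0].keys())

# csv reads every value as a string, so convert numeric columns explicitly.
prices = [int(row["price"]) for row in rows]

# Time series data: values recorded at successive points in time (hypothetical readings).
temperature_series = [
    ("2024-01-01T00:00", 3.2),
    ("2024-01-01T01:00", 2.9),
    ("2024-01-01T02:00", 2.5),
]
timestamps = [t for t, _ in temperature_series]
```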
Unstructured Data
Data that doesn’t follow a specific structure and is often text-heavy or multimedia content
Text Data is unstructured text such as articles, social media posts, or customer reviews
Image Data is data in the form of images, which can vary widely in format and content
Supervised Learning
ML method that learns a mapping function from labeled examples so it can predict the output for new, unseen input data
Needs Labeled Data; very powerful, but difficult to perform on millions of data points because each one must be labeled
Regression
Supervised Learning technique used to predict a numeric value based on input data
The output variable is continuous, meaning it can take any value within a range
Used when the goal is to predict a quantity or a real value; predicting house prices, stock prices, weather forecasting, etc.
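A minimal regression sketch, assuming a single input feature and a straight-line model fitted by ordinary least squares. The house sizes and prices are invented so that the relationship is exactly linear.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: house size in square meters, price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(sizes, prices)

# Predict a continuous value for a size that was not in the training data.
predicted_price = slope * 90.0 + intercept
```

The output is a number on a continuous scale, which is what separates regression from classification.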
Classification
Supervised Learning technique used to predict the categorical label of input data
Output variable is discrete, which means it falls into a specific category or class
Used for scenarios where decisions or predictions need to be made between distinct categories; fraud, image types, diagnostics
For example, classifying emails as spam or not spam, identifying animals, or assigning labels to movies
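A minimal classification sketch using a 1-nearest-neighbor rule, picked here only because it is short; the card does not prescribe an algorithm. The two numeric features per email and all of the training examples are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_neighbor(training_examples, query):
    """Return the label of the training example closest to the query."""
    _, label = min(
        training_examples,
        key=lambda example: squared_distance(example[0], query),
    )
    return label

# Hypothetical labeled emails: (features, label).
training_examples = [
    ((1.0, 0.0), "not_spam"),
    ((0.9, 0.1), "not_spam"),
    ((0.1, 0.9), "spam"),
    ((0.0, 1.0), "spam"),
]

first_label = nearest_neighbor(training_examples, (0.2, 0.8))
second_label = nearest_neighbor(training_examples, (0.8, 0.3))
```

The output is always one of the known categories, never a value in between.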
Training Set
Data set used to train the model
Typically, 60-80% of the dataset
For example, 800 labeled images from a dataset of 1000 images
Validation Set
Data set used to tune model hyperparameters and validate performance
Typically, 10-20% of the dataset
For example, 100 labeled images for hyperparameter tuning
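The 800 / 100 split of a 1000-image dataset can be sketched as below. The function name, the 80% and 10% fractions, the fixed seed, and the file names are assumptions for illustration; whatever is left over after the two splits is simply returned as a remainder.

```python
import random

def split_dataset(records, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the records, then slice them into training, validation, and remainder."""
    shuffled = list(records)
    # A fixed seed makes the shuffle repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    remainder = shuffled[n_train + n_val:]
    return train, validation, remainder

# Hypothetical dataset of 1000 labeled image files.
image_ids = [f"img_{i:04d}.jpg" for i in range(1000)]
train_set, validation_set, remainder = split_dataset(image_ids)
```

Shuffling before slicing keeps each split from inheriting any ordering in the original data, and the slices never overlap, so no image is used for both training and validation.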