Computer Vision Glossary Flashcards
Optical Character Recognition (OCR)
The technology that enables machines to convert images of text into machine-readable text data.
Azure AI Vision
A service on the Azure platform that provides AI-powered vision capabilities, including OCR.
Read API
The OCR engine within Azure AI Vision, used to extract text from images, PDFs, and TIFF files.
Machine Learning Model
An algorithm trained on data to recognize patterns and make predictions; in OCR, such models identify text elements in an image.
Bounding Box
A rectangular region that marks the location of an object within an image, described by its coordinate points.
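To illustrate "described by its coordinate points", the sketch below assumes a box arrives as a flat list of eight numbers (four x, y corner pairs), a layout that some OCR output uses, and reduces it to an axis-aligned rectangle. The function name and sample values are hypothetical.

```python
def to_rect(points):
    """Reduce a flat [x1, y1, x2, y2, x3, y3, x4, y4] corner list
    to an axis-aligned (left, top, width, height) rectangle."""
    xs = points[0::2]  # every even index is an x coordinate
    ys = points[1::2]  # every odd index is a y coordinate
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


# Hypothetical corner points for one detected word (slightly rotated).
corners = [10, 20, 110, 22, 109, 52, 9, 50]
rect = to_rect(corners)
print(rect)
```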
Vision Studio
A graphical user interface within Azure that allows users to access and experiment with AI vision capabilities without needing to code.
REST API
A standardized way to interact with web services using HTTP requests, used to programmatically access the Read API.
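As a minimal sketch of what "HTTP requests" means in practice, the code below builds (but does not send) a POST request using only the Python standard library. The resource name and key are placeholders, and the path and header name are assumptions based on the v3.2 Read API; check them against the documentation for the version you use.

```python
import json
import urllib.request

# Placeholders: substitute your own resource endpoint and key.
ENDPOINT = "https://example-resource.cognitiveservices.azure.com"
KEY = "<your-key>"


def build_read_request(image_url):
    """Build a POST request asking the service to read text from an image URL.

    The path and header name below are assumptions (v3.2 Read API).
    """
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ENDPOINT}/vision/v3.2/read/analyze",
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_read_request("https://example.com/receipt.png")
print(request.get_method(), request.full_url)
```

Sending the request would be a call to urllib.request.urlopen(request) with a real endpoint and key.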
SDK (Software Development Kit)
A set of tools and resources that developers can use to build applications, used for accessing the Read API through programming languages.
JSON
A lightweight, text-based data interchange format used to represent data structures, commonly used in APIs to return structured data.
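A short sketch of reading structured data from JSON with Python's standard json module. The response shape here is hypothetical, loosely modeled on OCR output, and is not the exact schema of any particular service.

```python
import json

# Hypothetical response text: a list of recognized lines with corner points.
response_text = """
{
  "lines": [
    {"text": "Hello", "boundingBox": [10, 20, 110, 22, 109, 52, 9, 50]},
    {"text": "world", "boundingBox": [10, 60, 120, 62, 119, 92, 9, 90]}
  ]
}
"""

data = json.loads(response_text)  # JSON text -> Python dicts and lists
texts = [line["text"] for line in data["lines"]]
print(" ".join(texts))
```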
Natural Language Processing (NLP)
A field of AI focused on enabling computers to understand, interpret, and generate human language.
Face Detection
The process of identifying the presence and location of human faces within an image or video.
Facial Analysis
The process of examining a detected face to derive additional information, such as head pose, the presence of glasses, or occlusion.
Facial Recognition
The process of identifying individuals from their facial features using trained models.
Azure AI Face Service
A Microsoft Azure service that provides pre-built algorithms for face detection, recognition, and analysis.
Accessories
Objects such as glasses, masks, or headwear that can be detected on a face.
Occlusion
The partial or full blocking of a face in an image by an object, which reduces the accuracy of detection and analysis.
Limited Access
A policy restricting access to advanced features of the Azure AI Face service, requiring approval from Microsoft.
Responsible AI
Microsoft’s approach to AI development, which includes ethical guidelines for the design and implementation of AI technologies.
Liveness Detection
The process of determining whether an input is a live person in front of the camera rather than a photo, video, or mask, to prevent spoofing.
Convolutional Neural Network (CNN)
A type of deep learning architecture commonly used in computer vision, which uses filters to extract feature maps from images.
Deep Learning
A subset of machine learning that uses neural networks with many layers to learn complex patterns from data.
Feature Map
An array of numeric values that results from applying a filter to an image, used in deep learning models.
Filter (Kernel)
A small array of numeric values (weights) slid across an image to perform convolutional filtering, producing a new value at each position.
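The relationship between a filter and its feature map can be sketched in plain Python. This is a minimal version assuming no padding and a stride of 1; like most CNN libraries, it does not flip the kernel. The image and kernel values are made up for illustration.

```python
def convolve(image, kernel):
    """Slide a square `kernel` over `image` (no padding, stride 1)
    and return the resulting feature map."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the k x k patch whose top-left corner is (r, c).
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Grayscale image, 4 rows by 5 columns: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]
# Vertical-edge filter: responds where values rise from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(convolve(image, kernel))  # [[27, 27, 0], [27, 27, 0]]
```

The feature map is large where the patch covers the dark-to-bright edge and zero where the patch is uniformly bright.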
Image Classification
The process of predicting the category or class of an image.
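A classifier typically produces one score per class, and the prediction is the class with the highest probability. The sketch below assumes the scores have already been computed by some model; the labels and scores are made up.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class labels and raw model scores for one image.
labels = ["apple", "banana", "orange"]
scores = [1.2, 3.4, 0.3]

probs = softmax(scores)
predicted = labels[probs.index(max(probs))]
print(predicted)  # banana
```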
Multi-Modal Model
An AI model trained on more than one type of data, such as images paired with text captions, so that it captures relationships between image features and text embeddings.
Object Detection
The process of locating specific objects within an image and classifying each one, typically reporting a bounding box per object.
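One common way to check how well a predicted location matches the true one is intersection over union (IoU). The sketch below assumes boxes are given as (left, top, right, bottom) tuples; that layout and the sample boxes are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


pred_box = (0, 0, 10, 10)
true_box = (5, 0, 15, 10)
print(iou(pred_box, true_box))  # about 0.33: the boxes share half of each area
```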
Pixel
A single point of color in a digital image, represented by numerical values.
Resolution
The dimensions of an image in pixels, usually given as width by height; higher resolution captures more detail.
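Pixels and resolution can be seen together in a tiny hand-written image. This sketch assumes an RGB image stored as rows of (red, green, blue) tuples with each channel in the range 0 to 255; real images are usually loaded with an imaging library instead.

```python
# A tiny RGB image: 2 rows of 3 pixels each.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]

height = len(image)     # number of rows
width = len(image[0])   # number of pixels per row
print(f"Resolution: {width} x {height} = {width * height} pixels")

# Read one pixel: row 0, column 1 is pure green here.
r, g, b = image[0][1]
print(r, g, b)
```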
Transformer
A type of neural network architecture commonly used in NLP that encodes words or tokens as vector-based embeddings and uses attention to relate them to one another.
Vector-based Embedding
An array of numeric values that represent semantic attributes of a word or token.
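Because embeddings are arrays of numbers, semantic closeness can be estimated with vector math. The sketch below uses cosine similarity on made-up three-dimensional vectors; real embeddings are learned by a model and typically have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings, for illustration only.
embeddings = {
    "dog": [10.0, 3.0, 2.0],
    "puppy": [9.0, 3.5, 2.5],
    "skateboard": [1.0, 8.0, -4.0],
}

near = cosine_similarity(embeddings["dog"], embeddings["puppy"])
far = cosine_similarity(embeddings["dog"], embeddings["skateboard"])
print(near > far)  # True: "dog" is closer to "puppy" than to "skateboard"
```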