Read text with the Computer Vision service
The ability for computer systems to process written or printed text is an area of artificial intelligence (AI) where computer vision intersects with natural language processing. You need computer vision capabilities to “read” the text, and then you need natural language processing capabilities to make sense of it.
OCR
The foundation of processing printed text is optical character recognition (OCR), in which a trained model recognizes individual shapes as letters, numerals, punctuation, or other elements of text.
Uses of OCR
- note taking
- digitizing forms, such as medical records or historical documents
- scanning printed or handwritten checks for bank deposits
Azure resources for Computer Vision
- Computer Vision: A specific resource for the Computer Vision service. Use this resource type if you don’t intend to use any other cognitive services, or if you want to track utilization and costs for your Computer Vision resource separately.
- Cognitive Services: A general cognitive services resource that includes Computer Vision along with many other cognitive services, such as Text Analytics, Translator Text, and others. Use this resource type if you plan to use multiple cognitive services and want to simplify administration and development.
Whichever type of resource you choose to create, it will provide two pieces of information that you will need to use it:
- A key that is used to authenticate client applications.
- An endpoint that provides the HTTP address at which your resource can be accessed.
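As a rough illustration of how the key and endpoint fit together, the sketch below builds (but does not send) an HTTP request for a client application. The route `/vision/v3.2/read/analyze`, the header name `Ocp-Apim-Subscription-Key`, and the JSON body shape are assumptions based on the v3.2 REST API and should be checked against the documentation for your resource. The function name `build_read_request` is hypothetical.

```python
import json
import urllib.request


def build_read_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build (without sending) a POST request that submits an image URL for reading."""
    # Assumed v3.2 route; verify against the API version your resource exposes.
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    # Assumed body shape: a JSON object pointing at a publicly reachable image.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        # The resource key authenticates the client application.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The request could then be sent with `urllib.request.urlopen(request)`. Keeping construction separate from sending makes it easy to inspect the URL and headers before any network call is made.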
The Read API
The Computer Vision service provides one application programming interface (API) that you can use to read text in images: the Read API.
Because the Read API can work with large documents, it runs asynchronously so that it does not block your application while it reads the content and returns the results.
To use the Read API, your application must use a three-step process:
- Submit an image to the API, and retrieve an operation ID in response.
- Use the operation ID to check on the status of the image analysis operation, and wait until it has completed.
- Retrieve the results of the operation.
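The three steps above can be sketched as a small polling loop. This is a minimal sketch, not the official client: the `submit` and `get_status` callables are hypothetical stand-ins for the HTTP calls, and the status strings (`"succeeded"`, `"failed"`) and the `analyzeResult` / `readResults` / `lines` / `text` layout are assumptions based on the v3.2 response format.

```python
import time
from typing import Callable, List


def read_text(
    image_ref: str,
    submit: Callable[[str], str],        # hypothetical: sends the image, returns an operation ID
    get_status: Callable[[str], dict],   # hypothetical: returns the parsed JSON for an operation ID
    poll_interval: float = 1.0,
    max_polls: int = 30,
) -> List[str]:
    """Submit an image, wait for the read operation to finish, and return the text lines."""
    # Step 1: submit the image and keep the operation ID from the response.
    operation_id = submit(image_ref)

    # Step 2: check the operation status until it completes (or we give up).
    for _ in range(max_polls):
        result = get_status(operation_id)
        status = result.get("status")
        if status == "succeeded":
            # Step 3: retrieve the results (assumed v3.2 JSON layout).
            pages = result["analyzeResult"]["readResults"]
            return [line["text"] for page in pages for line in page["lines"]]
        if status == "failed":
            raise RuntimeError("Read operation failed")
        time.sleep(poll_interval)

    raise TimeoutError("Read operation did not complete in time")
```

Passing the transport in as callables keeps the flow testable without a network connection; in a real application they would wrap the POST and GET requests made with your resource's key and endpoint.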