Fundamentals of optical character recognition: study guide flashcards

Question 1

Q

What is the primary intersection of AI fields that enables OCR capabilities?

Answer

A

OCR sits at the intersection of computer vision and natural language processing: computer vision capabilities ‘read’ the text, and natural language processing makes sense of it.

Question 2

Q

Describe the core function of machine learning models in the context of OCR.

Answer

A

Machine learning models in OCR are trained to recognize individual shapes as letters, numerals, punctuation, or other elements of text.

Question 3

Q

What is the Azure AI Vision service’s Read API and what is it optimized for?

Answer

A

The Read API is Azure AI Vision’s OCR engine that powers text extraction from images, PDFs, and TIFF files, and it is optimized for general, non-document images.

Question 4

Q

Explain the three-level hierarchy of results returned by the Read API.

Answer

A

The Read API returns results arranged into a hierarchy of pages, lines within pages, and words within lines, each with bounding box coordinates and text.
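The hierarchy described above can be sketched with a few plain Python data classes. This is only an illustration of the pages, lines, and words nesting: the class names, field names, and sample values are assumptions made for this example and are not the service's actual response types.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Word:
    text: str
    bounding_box: List[int]  # hypothetical layout: x1, y1, x2, y2, x3, y3, x4, y4


@dataclass
class Line:
    text: str
    bounding_box: List[int]
    words: List[Word] = field(default_factory=list)


@dataclass
class Page:
    number: int
    lines: List[Line] = field(default_factory=list)


def flatten_words(pages: List[Page]) -> List[Tuple[int, str, str]]:
    """Walk pages -> lines -> words and return (page number, line text, word text)."""
    return [
        (page.number, line.text, word.text)
        for page in pages
        for line in page.lines
        for word in line.words
    ]


# Hand-written sample data, not real service output.
sample = [
    Page(
        number=1,
        lines=[
            Line(
                text="Hello world",
                bounding_box=[10, 10, 120, 10, 120, 30, 10, 30],
                words=[
                    Word("Hello", [10, 10, 60, 10, 60, 30, 10, 30]),
                    Word("world", [70, 10, 120, 10, 120, 30, 70, 30]),
                ],
            )
        ],
    )
]

for row in flatten_words(sample):
    print(row)
```

Each level carries its own text and bounding box, so a caller can stop at whichever level of detail it needs, for example whole lines for display and single words for highlighting.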

Question 5

Q

What are the two resource types you can create in Azure for using Azure AI Vision?

Answer

A

The two resource types are a dedicated Azure AI Vision resource and a general Azure AI services resource, which includes Azure AI Vision along with other AI services.

Question 6

Q

What are three ways you can use the Azure AI Vision Read API?

Answer

A

You can use the Azure AI Vision Read API through Vision Studio, the REST API, or software development kits (SDKs) for languages such as Python, C#, and JavaScript.
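As a rough sketch of the REST route, the snippet below only builds a request object and does not send it. It assumes version 3.2 of the Read REST API (the `/vision/v3.2/read/analyze` path and the `Ocp-Apim-Subscription-Key` header); the endpoint, key, and image URL are placeholders, and `build_read_request` is a hypothetical helper name. Check the current service documentation before relying on any of these details.

```python
import json
import urllib.request


def build_read_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits an image URL for text extraction."""
    # Assumed path for Read API v3.2; other API versions use different paths.
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # key of your Azure resource
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace with your own resource endpoint and key.
request = build_read_request(
    "https://example-resource.cognitiveservices.azure.com/",
    "<your-key>",
    "https://example.com/sign.jpg",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) requires a real resource. The SDKs wrap this kind of call, and Vision Studio avoids code entirely.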

Question 7

Q

What is the primary benefit of using Azure AI Vision Studio for OCR tasks?

Answer

A

Azure AI Vision Studio provides a graphical user interface for using the Read API without requiring any coding to get started.

Question 8

Q

How can you access the OCR engine within Vision Studio?

Answer

A

The OCR engine can be accessed in Vision Studio by selecting ‘Optical Character Recognition’ and then the ‘Extract text from images’ tile.

Question 9

Q

What type of data format are the raw results of the OCR analysis returned in?

Answer

A

The raw results are returned in JSON format, which includes the detected text and the bounding box location of each piece of text on the page.
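To make the JSON shape concrete, here is a small sketch that parses a hand-written result and turns each bounding box into a simple rectangle. The key names (`pages`, `lines`, `boundingBox`) and the eight-number corner layout are assumptions chosen for illustration; the real response schema depends on the API version.

```python
import json

# Hand-written sample, not real service output.
raw = """
{
  "pages": [
    {
      "lines": [
        {
          "text": "OPEN 24 HOURS",
          "boundingBox": [20, 15, 200, 15, 200, 45, 20, 45]
        }
      ]
    }
  ]
}
"""


def to_rect(bounding_box):
    """Convert [x1, y1, ..., x4, y4] corner points to (left, top, right, bottom)."""
    xs = bounding_box[0::2]  # every even index is an x coordinate
    ys = bounding_box[1::2]  # every odd index is a y coordinate
    return (min(xs), min(ys), max(xs), max(ys))


result = json.loads(raw)
for page in result["pages"]:
    for line in page["lines"]:
        print(line["text"], to_rect(line["boundingBox"]))
```

Taking the minimum and maximum of the corner coordinates gives an axis-aligned rectangle, which is usually enough for drawing a highlight over the detected text even when the original box is slightly rotated.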

Question 10

Q

Besides speed and efficiency, what is one other key benefit of automating text processing using OCR?

Answer

A

Beyond improving speed and efficiency, automating text processing with OCR removes the need for manual data entry, freeing up resources for more important tasks.