Image Processing and OCR Flashcards
What does OCR stand for?
Optical Character Recognition
OCR converts what into what?
Image-based content into machine-readable text
What are the 3 OCR engines that come with all Grooper installs?
Tesseract OCR
Transym 4 OCR
Transym 5 OCR
Matrix matching and feature recognition are part of what phase of an OCR engine’s operation?
Character Recognition
Breaking up pixels into lines, words, and characters is part of what phase of an OCR engine’s operation?
Segmenting
Many OCR engines spell check OCR results to improve
their accuracy. This is part of what phase of an OCR
engine’s operation?
Post-Processing
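The spell-check idea behind Post-Processing can be sketched in a few lines. This is a minimal illustration using Python's standard difflib module, not how Grooper or any specific OCR engine implements it; the LEXICON list, the correct_word helper, and the 0.75 cutoff are all hypothetical choices made for the example.

```python
import difflib

# Hypothetical vocabulary; a real engine would use a full dictionary or lexicon.
LEXICON = ["invoice", "total", "amount", "date"]

def correct_word(word, lexicon=LEXICON, cutoff=0.75):
    """Return the closest lexicon entry, or the word unchanged if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word

# Simulated raw OCR output with typical misreads (dropped letter, 0 read for o).
raw = ["invoce", "t0tal", "xyz"]
print([correct_word(w) for w in raw])  # ['invoice', 'total', 'xyz']
```

Words with no close match are left alone, so the correction step cannot make an unknown word worse.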
In your own words, describe the Segmenting phase of an
OCR engine.
This is when the pixels on the image are broken up into lines, then individual
words, then characters.
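One simple way to picture Segmenting is a projection profile: scan a binarized image and split it wherever a fully blank row (or column) appears. The sketch below assumes the image is a list of rows of 0/1 values with 1 meaning ink; the segment_runs name and the tiny sample page are hypothetical, and real engines use far more robust methods.

```python
def segment_runs(image):
    """Return (start, end) index pairs (end exclusive) for each run of rows containing ink."""
    runs = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                    # a new run begins
        elif not has_ink and start is not None:
            runs.append((start, y))      # a blank row ends the run
            start = None
    if start is not None:
        runs.append((start, len(image)))
    return runs

page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]

lines = segment_runs(page)
print(lines)  # [(1, 3), (4, 5)]

# Transposing a line band and reusing the same function splits it on blank columns.
first_line = page[1:3]
columns = segment_runs(list(zip(*first_line)))
print(columns)  # [(0, 2), (3, 4)]
```

Splitting on rows finds lines; splitting each line on columns finds the gaps between characters or words.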
OCR engines that obtain results by comparing a grid of
pixels on an image to a grid of pixels of examples of
characters are performing….
Matrix Matching
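Matrix Matching can be illustrated by scoring a character's pixel grid against stored example grids and picking the best fit. This is a toy sketch, assuming 3x3 grids written as strings of 0 and 1; the TEMPLATES table and both function names are hypothetical and far smaller than anything a real engine uses.

```python
# Hypothetical 3x3 example grids for three characters (1 = ink, 0 = background).
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}

def match_score(glyph, template):
    """Fraction of grid cells where the glyph and the template agree."""
    cells = [(g, t) for grow, trow in zip(glyph, template) for g, t in zip(grow, trow)]
    return sum(g == t for g, t in cells) / len(cells)

def matrix_match(glyph, templates=TEMPLATES):
    """Return the character whose template agrees with the glyph on the most cells."""
    return max(templates, key=lambda ch: match_score(glyph, templates[ch]))

# An "L" with one pixel missing from its bottom-right corner.
noisy_l = ["100", "100", "110"]
print(matrix_match(noisy_l))  # L
```

The damaged glyph still agrees with the "L" template on 8 of 9 cells, so it wins despite the missing pixel.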
The Grooper activity that performs OCR is….
Recognize
What image processing operation is required for an OCR
engine to obtain results, either through Grooper’s image
processing suite or via the OCR engine itself?
Thresholding (or binarizing) the image
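Thresholding can be sketched as a single comparison per pixel. The example below assumes a grayscale image stored as rows of 0-255 values and uses a fixed cutoff of 128; the binarize name and the cutoff are hypothetical, and real tools often choose the threshold adaptively rather than fixing it.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel to 1 (ink) if darker than the threshold, else 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# A tiny grayscale patch: low values are dark, high values are light.
gray = [
    [250, 30],
    [120, 200],
]
print(binarize(gray))  # [[0, 1], [1, 0]]
```

After this step every pixel is either ink or background, which is the form the segmenting and recognition phases work on.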
Image processing in Grooper serves one (1) of three (3)
basic purposes. What are they?
Archival Adjustments, OCR Cleanup, and Layout Data Collection. (Archival
Adjustments ONLY pertain to permanent image processing via the Image
Processing activity.)
The Grooper activity that performs permanent image
processing is….
Image Processing
In your own words, what is the benefit of performing
temporary image processing? How do you perform
temporary image processing in Grooper?
Temporary image processing cleans up the image for OCR without making
permanent changes to the document itself. You assign a temporary IP Profile
on the OCR Profile, which is then executed by the Recognize activity.
List three (3) common IP Commands used during permanent image processing.
Auto Deskew, Auto Border Crop, Rotate
List three (3) common IP Commands used during temporary image processing.
Line Removal, Speck Removal, Negative Region Removal