Computer Vision Flashcards

Question 1

Q

Use this feature for general, unstructured documents with a smaller amount of text, or for images that contain text.

Answer

A

Azure AI Vision - OCR

Question 2

Q

Use this service to read small to large volumes of text from images and PDF documents.

Answer

A

Azure AI Document Intelligence

Question 3

Q

Which service do you use to read text from street signs, handwritten notes, and store signs?

Question 4

Q

Which service do you use to read receipts and invoices?

Answer

A

Document Intelligence

Question 5

Q

Which API is best for this scenario? You need to read a large number of files with high accuracy. The text consists of short sections of handwriting, some in English and some in multiple languages.

Answer

A

Image Analysis service OCR feature

Question 6

Q

At what levels of division are the OCR results returned?

Answer

A

Results contain blocks, lines, and words, as well as bounding boxes for each word and line.

Question 7

Q

You’ve scanned a letter into PDF format and need to extract the text it contains. What should you do?

Answer

A

The Document Intelligence API can be used to process PDF formatted files.

Question 8

Q

What features exist for prebuilt Document Intelligence models?

Answer

A

Text extraction
Key-value pairs
Entities
Selection marks
Tables
Fields

Question 9

Q

Which specific form types are available as prebuilt models in Document Intelligence?

Answer

A

Invoice
Receipt
W2
ID document (US driver's licenses and international passports)
Business card
Health insurance card

Question 10

Q

What generic prebuilt models exist in Document Intelligence?

Answer

A

Read model
General document model
Layout model

Question 11

Q

What features are available in the Read model in Document Intelligence?

Answer

A

Text extraction

Question 12

Q

What features are available in the General document model in Document Intelligence?

Answer

A

Text extraction
Key-value pairs
Entities
Selection marks
Tables

Question 13

Q

What features are available in the Layout model in Document Intelligence?

Answer

A

Text extraction
Selection marks
Tables

Question 14

Q

What features are available in the Invoice model in Document Intelligence?

Answer

A

Text extraction
Key-value pairs
Selection marks
Tables
Fields

Question 15

Q

What features are available in the Receipt model in Document Intelligence?

Answer

A

Text extraction
Key-value pairs
Fields

Question 16

Q

What features are available in the W2 model in Document Intelligence?

Answer

A

Text extraction
Key-value pairs
Selection marks
Tables
Fields

Question 17

Q

What features are available in the ID document model in Document Intelligence?

Answer

A

Text extraction
Key-value pairs
Fields

Question 18

Q

What features are available in the Business card model in Document Intelligence?

Answer

A

Text extraction
Key-value pairs
Fields

Question 19

Q

Which file formats can be consumed by prebuilt Document Intelligence models?

Answer

A

JPEG
PNG
BMP
TIFF
PDF

Question 20

Q

What file size requirements exist for Document Intelligence documents?

Answer

A

The file must be smaller than 500 MB for the standard tier, and 4 MB for the free tier.

Question 21

Q

What image size requirements exist for Document Intelligence documents?

Answer

A

Images must have dimensions between 50 x 50 pixels and 10,000 x 10,000 pixels.
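
The size limits on this card and the previous one can be turned into a simple pre-upload check. The following is a minimal sketch: the function name, the tier labels, and the choice of 1 MB = 1024 x 1024 bytes are assumptions made for illustration, not part of the service.

```python
MB = 1024 * 1024  # assumption: binary megabytes

# Limits as stated on the cards: file smaller than 500 MB (standard) or 4 MB (free),
# image sides between 50 and 10,000 pixels.
MAX_FILE_BYTES = {"standard": 500 * MB, "free": 4 * MB}
MIN_SIDE_PX, MAX_SIDE_PX = 50, 10_000


def check_image(file_bytes, width_px, height_px, tier="standard"):
    """Return a list of problems; an empty list means the image passes."""
    problems = []
    if file_bytes >= MAX_FILE_BYTES[tier]:
        problems.append("file too large for the " + tier + " tier")
    if not all(MIN_SIDE_PX <= side <= MAX_SIDE_PX for side in (width_px, height_px)):
        problems.append("image dimensions outside 50 x 50 to 10,000 x 10,000 pixels")
    return problems
```

An empty list means the file is worth submitting; anything else can be reported to the user before a request is made.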

Question 22

Q

What limitations exist for PDF files in Document Intelligence?

Answer

A

PDF documents must have dimensions less than 17 x 17 inches or A3 paper size.
PDF documents must not be protected with a password.

Question 23

Q

How many pages are allowed for PDF and TIFF files in Document Intelligence?

Answer

A

PDF and TIFF files can have any number of pages but, in the standard tier, only the first 2000 pages are analyzed. In the free tier, only the first two pages are analyzed.
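
The page limit amounts to taking the smaller of two numbers. A minimal sketch, with a hypothetical function name and tier labels:

```python
# Page limits as stated on the card.
PAGE_LIMIT = {"standard": 2000, "free": 2}


def pages_analyzed(total_pages, tier="standard"):
    """Number of pages that are actually analyzed for a document."""
    return min(total_pages, PAGE_LIMIT[tier])
```

For example, a 2,500-page PDF on the standard tier has only its first 2,000 pages analyzed, and a 10-page PDF on the free tier only its first two.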

Question 24

Q

How do you use the Document Intelligence service?

Answer

A

For custom applications, use the REST API.
To explore the models and how they behave with your forms visually, you can experiment in the Azure AI Document Intelligence Studio.

Question 25

Q

In the Read model in Document Intelligence, how can you select a page range for analysis?

Answer

A

Use the pages parameter
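
A value for the pages parameter is usually written as a comma-separated list of page numbers and ranges, such as 1-3,5. Treat that format as an assumption to confirm against the API reference. The sketch below uses a hypothetical helper, not part of any SDK, to expand such a value into the individual page numbers it selects.

```python
def expand_pages(spec):
    """Expand a pages value such as "1-3,5" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            if start > end:
                raise ValueError("range start is after range end: " + part)
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)
```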

Question 26

Q

What is the purpose of the Read model in Document Intelligence?

Answer

A

The Read model is ideal if you want to extract words and lines from documents with no fixed or predictable structure.

Question 27

Q

Which prebuilt Document Intelligence model supports Entity extraction?

Answer

A

General document model

Question 28

Q

What entity types can be detected in the General Document model?

Answer

A

Person. The name of a person.
PersonType. A job title or role.
Location. Buildings, geographical features, geopolitical entities.
Organization. Companies, government bodies, sports clubs, musical bands, and other groups.
Event. Social gatherings, historical events, anniversaries.
Product. Objects bought and sold.
Skill. A capability belonging to a person.
Address. Mailing address for a physical location.
Phone number. Dialing codes and numbers for mobile phones and landlines.
Email. Email addresses.
URL. Webpage addresses.
IP Address. Network addresses for computer hardware.
DateTime. Calendar dates and times of day.
Quantity. Numerical measurements with their units.

Question 29

Q

What is the purpose of the Layout model in Document Intelligence?

Answer

A

Use the Layout model when you need rich information about the structure of a document.

Question 30

Q

What fields can be identified by the ID Document model?

Answer

A

First and last names.
Personal information such as sex, date of birth, and nationality.
The country and region where the document was issued.
Unique numbers such as the document number and machine readable zone.
Endorsements, restrictions, and vehicle classifications.

Question 31

Q

What fields are included in the Business Card model?

Answer

A

First and last names.
Postal addresses.
Email and website addresses.
Various telephone numbers.

Question 32

Q

What fields are included in the W-2 model?

Answer

A

Information about the employer, such as their name and address.
Information about the employee, such as their name, address, and social security number.
Information about the taxes that the employee has paid.

Question 33

Q

What underlying form models exist for custom forms in Document Intelligence?

Answer

A

Custom template models
Custom neural models

Question 34

Q

Describe custom template models

Answer

A

Custom template models accurately extract labeled key-value pairs, selection marks, tables, regions, and signatures from documents.
Training only takes a few minutes, and more than 100 languages are supported.

Question 35

Q

Describe custom neural models

Answer

A

Custom neural models are deep learned models that combine layout and language features to accurately extract labeled fields from documents.
This model is best for semi-structured or unstructured documents.

Question 36

Q

What is included in a successful response to a call to the Document Intelligence API?

Answer

A

A successful JSON response contains an analyzeResult object, which holds the extracted content and an array of pages with information about the document content.
Fields in each page include pageNumber, angle, width, height, and words.
Each entry in words has the word itself in the content field, a polygon array, and a confidence score.
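
To make the response shape concrete, here is a minimal sketch that walks a hand-written dictionary shaped like the description above. The sample values are invented for illustration and are not real service output; the helper name and the 0.8 threshold are also assumptions.

```python
# Illustrative data only: the field names follow the card, the values are made up.
sample_response = {
    "analyzeResult": {
        "content": "Hello world",
        "pages": [
            {
                "pageNumber": 1,
                "angle": 0.0,
                "width": 8.5,
                "height": 11.0,
                "words": [
                    {"content": "Hello", "polygon": [1, 1, 2, 1, 2, 1.5, 1, 1.5], "confidence": 0.99},
                    {"content": "world", "polygon": [2.2, 1, 3.2, 1, 3.2, 1.5, 2.2, 1.5], "confidence": 0.62},
                ],
            }
        ],
    }
}


def low_confidence_words(response, threshold=0.8):
    """Collect (page number, word, confidence) for words below the threshold."""
    flagged = []
    for page in response["analyzeResult"]["pages"]:
        for word in page.get("words", []):
            if word["confidence"] < threshold:
                flagged.append((page["pageNumber"], word["content"], word["confidence"]))
    return flagged
```

Flagging low-confidence words this way is one option for deciding which extracted values need a human check.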

Question 37

Q

How can you improve confidence scores in Document Intelligence?

Answer

A

If confidence values are low, make sure the form you’re analyzing has a similar appearance to the forms in the training set. If the form appearance varies, consider training more than one model, with each model focused on one form format.

Question 38

Q

What projects does the Azure Document Intelligence Studio support?

Answer

A

Document analysis models
Read: Extract printed and handwritten text lines, words, locations, and detected languages from documents and images.
Layout: Extract text, tables, selection marks, and structure information from documents (PDF and TIFF) and images (JPG, PNG, and BMP).
General Documents: Extract key-value pairs, selection marks, and entities from documents.
Prebuilt models
Custom models

Question 39

Q

What do you need to analyze a document in the Document Intelligence Studio?

Answer

A

Azure Document Intelligence or Azure AI service endpoint and key

Question 40

Q

What steps do you take to create and use a custom model in Document Intelligence Studio?

Answer

A

Create an Azure Document Intelligence or Azure AI Services resource
Collect at least 5-6 sample forms for training and upload them to your storage account container.
Configure cross-origin resource sharing (CORS). CORS enables Azure Document Intelligence Studio to store labeled files in your storage container.
Create a custom model project in Azure Document Intelligence Studio. You’ll need to provide configurations linking your storage container and Azure Document Intelligence or Azure AI Service resource to the project.
Use Azure Document Intelligence Studio to apply labels to text.
Train your model. Once the model is trained, you’ll receive a Model ID and Average Accuracy for tags.
Test your model by analyzing a new form that wasn’t used in training.

Question 41

Q

A person plans to use an Azure Document Intelligence prebuilt invoice model. To extract document data using the model, what are two calls they need to make to the API?

Answer

A

The Analyze Invoice function starts the form analysis and returns a result ID, which they can pass in a subsequent call to the Get Analyze Invoice Result function to retrieve the results.
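
The two-call pattern is a start-then-poll loop. The sketch below shows only the control flow, and everything in it is hypothetical: the two callables stand in for the real start and get-result calls, and the status strings "succeeded" and "failed" are assumed values to check against the API reference.

```python
import time


def analyze_and_wait(start_analysis, get_result, poll_seconds=1.0, max_polls=30):
    """Start an analysis, then poll with the returned ID until it finishes."""
    result_id = start_analysis()          # first call: returns a result ID
    for _ in range(max_polls):
        result = get_result(result_id)    # second call: fetch status and results
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("analysis did not finish after " + str(max_polls) + " polls")
```

Passing the two calls in as functions keeps the loop independent of whether the application uses the REST API or an SDK.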

Question 42

Q

A person needs to build an application that submits expense claims and extracts the merchant, date, and total from scanned receipts. What’s the best way to do this?

Answer

A

Use Azure Document Intelligence’s prebuilt receipt model. It can extract the required fields even when the scanned receipts use different names for them.

Question 43

Q

A person is building a custom model with Azure Document Intelligence services. What is required to train a model?

Answer

A

Along with the form to analyze, JSON files need to be provided.

Question 44

Q

You want to use the Azure AI Vision Analyze Image function to generate an appropriate caption for an image. Which visual feature should you specify?

Answer

A

To generate a caption, include the Description visual feature in your analysis.

Question 45

Q

What is the purpose of the Azure AI Vision service?

Answer

A

The Azure AI Vision service is designed to help you extract information from images through various functionalities.

Question 46

Q

In the Azure AI Vision service, what features are available in the VisualFeatures enum?

Answer

A

VisualFeatures.TAGS: Identifies tags about the image, including objects, scenery, setting, and actions
VisualFeatures.OBJECTS: Returns the bounding box for each detected object
VisualFeatures.CAPTION: Generates a caption of the image in natural language
VisualFeatures.DENSE_CAPTIONS: Generates more detailed captions for the objects detected
VisualFeatures.PEOPLE: Returns the bounding box for detected people
VisualFeatures.SMART_CROPS: Returns the bounding box of the specified aspect ratio for the area of interest
VisualFeatures.READ: Extracts readable text

Question 47

Q

What functionality exists in Azure Video Indexer?

Answer

A

Facial recognition - detecting the presence of individual people in the image. This requires Limited Access approval.
Optical character recognition - reading text in the video.
Speech transcription - creating a text transcript of spoken dialog in the video.
Topics - identification of key topics discussed in the video.
Sentiment - analysis of how positive or negative segments within the video are.
Labels - label tags that identify key objects or themes throughout the video.
Content moderation - detection of adult or violent themes in the video.
Scene segmentation - a breakdown of the video into its constituent scenes.

Question 48

Q

What extensions can be made to Azure Video Indexer for custom insights?

Answer

A

People. Add images of the faces of people you want to recognize in videos, and train a model. Video Indexer will then recognize these people in all of your videos.
Note: This only works after Limited Access approval, in line with Microsoft's Responsible AI standard.

Language. If your organization uses specific terminology that may not be in common usage, you can train a custom model to detect and transcribe it.
Brands. You can train a model to recognize specific names as brands, for example to identify products, projects, or companies that are relevant to your business.

Question 49

Q

You want Azure Video Indexer to analyze a video. What must you do first?

Answer

A

Upload the video to Azure Video Indexer and index it.

Question 50

Q

You want Azure Video Indexer to recognize brands in videos recorded from conference calls. What should you do?

Answer

A

Edit the Brands model to show brands suggested by Bing, and add any new brands you want to detect.

Question 51

Q

What resources need to be provisioned to use the Azure AI Custom Vision service?

Answer

A

A training resource (used to train your models). This can be:
An Azure AI Services resource.
An Azure AI Custom Vision (Training) resource.
A prediction resource, used by client applications to get predictions from your model. This can be:
An Azure AI Services resource.
An Azure AI Custom Vision (Prediction) resource.

Question 52

Q

Explain multiclass classification

Answer

A

There are multiple classes in the image dataset, but each image can belong to only one class.

Question 53

Q

Explain multilabel classification

Answer

A

An image might be associated with multiple labels.
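
The difference between multiclass and multilabel classification shows up in how per-tag scores become a prediction. A minimal sketch, assuming a hypothetical dictionary of tag probabilities and an illustrative 0.5 threshold:

```python
def multiclass_prediction(scores):
    """Multiclass: exactly one class per image, so take the highest score."""
    return max(scores, key=scores.get)


def multilabel_prediction(scores, threshold=0.5):
    """Multilabel: keep every label whose score reaches the threshold."""
    return sorted(tag for tag, p in scores.items() if p >= threshold)
```

With scores of 0.9 for "cat", 0.8 for "dog", and 0.1 for "bird", the multiclass result is "cat" alone, while the multilabel result is both "cat" and "dog".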

Question 54

Q

What steps can be performed in the Azure AI Custom Vision portal?

Answer

A

Create an image classification project for your model and associate it with a training resource.
Upload images, assigning class label tags to them.
Review and edit tagged images.
Train and evaluate a classification model.
Test a trained model.
Publish a trained model to a prediction resource.

Question 55

Q

You want to train a model that can categorize an image as “cat” or “dog” based on its subject. What kind of Azure AI Custom Vision project should you create?

Answer

A

To train a model that classifies an image using a single tag, use an Image classification (multiclass) project.

Question 56

Q

Which kinds of Azure resource can you use to host a trained Azure AI Custom Vision model?

Answer

A

You can publish a trained Azure AI Custom Vision model to either an Azure AI Custom Vision (Prediction) resource, or an Azure AI Services resource.

Question 57

Q

What features are available in the Face service within Azure AI Vision?

Answer

A

Face detection (with bounding box).
Comprehensive facial feature analysis (including head pose, presence of spectacles, blur, facial landmarks, occlusion and others).
Face comparison and verification.
Facial recognition.
Facial landmark location.
Facial liveness. Liveness can be used to determine whether the input video is a real stream or a fake.

Question 58

Q

How can you detect and analyze faces with the Azure AI Vision service?

Answer

A

Call the Analyze Image function (SDK or equivalent REST method), specifying People as one of the visual features to be returned.

Question 59

Q

What attributes can be returned in the Facial Attribute analysis?

Answer

A

Head pose (pitch, roll, and yaw orientation in 3D space)
Glasses (NoGlasses, ReadingGlasses, Sunglasses, or Swimming Goggles)
Blur (low, medium, or high)
Exposure (underExposure, goodExposure, or overExposure)
Noise (visual noise in the image)
Occlusion (objects obscuring the face)
Accessories (glasses, headwear, mask)
QualityForRecognition (low, medium, or high)

Question 60

Q

How can the Face service be provisioned?

Answer

A

You can provision Face as a single-service resource, or you can use the Face API in a multi-service Azure AI Services resource.

Question 61

Q

Describe the Face detection process

Answer

A

When a face is detected by the Face service, a unique ID is assigned to it and retained in the service resource for 24 hours. The ID is a GUID, with no indication of the individual’s identity other than their facial features.

While the detected face ID is cached, subsequent images can be used to compare the new faces to the cached identity and determine if they are similar (in other words, they share similar facial features) or to verify that the same person appears in two images.
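
The 24-hour retention window can be expressed as a simple time comparison. This is a minimal sketch with a hypothetical helper name; it models only the time limit described above, not any call to the Face service.

```python
from datetime import datetime, timedelta

FACE_ID_LIFETIME = timedelta(hours=24)  # retention period stated on the card


def face_id_is_cached(detected_at, now):
    """True while a detected face ID is still within its 24-hour window."""
    return timedelta(0) <= now - detected_at < FACE_ID_LIFETIME
```

A face detected 10 minutes ago is still cached and can be compared directly; one detected 25 hours ago is not.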

Question 62

Q

What steps do you take to train a facial recognition model in the Face service?

Answer

A

Create a Person Group that defines the set of individuals you want to identify (for example, employees).
Add a Person to the Person Group for each individual you want to identify.
Add detected faces from multiple images to each person, preferably in various poses. The IDs of these faces will no longer expire after 24 hours (so they’re now referred to as persisted faces).
Train the model.

Question 63

Q

You need to create a facial recognition solution to identify named employees. Which service should you use?

Answer

A

Use the Face service to create a facial recognition solution.

Question 64

Q

You need to verify that the person in a photo taken at hospital reception is the same person in a photo taken at a ward entrance 10 minutes later. What should you do?

Answer

A

Verify the face in the ward photo by comparing it to the detected face ID from the reception photo. This is the most efficient approach, as the photos were taken within 24 hours.

Answer 64

A

Use the interactive interface in the Azure AI Custom Vision portal. This interface automatically suggests regions that contain objects, to which you can assign tags or adjust by dragging the bounding box to enclose the object you want to label.

Answer 65

A

The values are the proportion of the full image size. Consider the example:
Left: 0.1
Top: 0.5
Width: 0.5
Height: 0.25
This defines a box in which the left is located 0.1 (one tenth) from the left edge of the image, and the top is 0.5 (half the image height) from the top. The box is half the width and a quarter of the height of the overall image.
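
Converting these proportional values to pixels is a matter of multiplying by the image size. A minimal sketch, assuming a hypothetical 1000 x 800 pixel image for the usage note and a helper name chosen for illustration:

```python
def to_pixels(left, top, width, height, image_width, image_height):
    """Convert a proportional bounding box to pixel (left, top, width, height)."""
    return (
        round(left * image_width),
        round(top * image_height),
        round(width * image_width),
        round(height * image_height),
    )
```

For the example box on a 1000 x 800 image, the left edge is 100 pixels in, the top is 400 pixels down, and the box measures 500 x 200 pixels.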

Answer 66

A

To take advantage of the smart labeler, tag some images and train an initial model. Subsequently, the portal will suggest tags for new images.

Answer 67

A

Call the ImageAnalysis function (REST API or equivalent SDK method), passing the image URL or binary data, and optionally specifying a gender-neutral caption or the language the text is written in (with a default value of en for English).

result = client.analyze(
    image_url=,
    visual_features=[VisualFeatures.READ]
)

https:///computervision/imageanalysis:analyze?features=read&…


Concepts related to Azure Computer Vision resources
