Computer Vision Flashcards

Concepts related to Azure Computer Vision resources

1
Q

Use this feature for general, unstructured documents with smaller amounts of text, or for images that contain text.

A

Azure AI Vision - OCR

2
Q

Use this service to read small to large volumes of text from images and PDF documents.

A

Azure AI Document Intelligence

3
Q

Which service do you use to read text from street signs, handwritten notes, and store signs?

A

OCR

4
Q

Which service do you use to read receipts and invoices?

A

Azure AI Document Intelligence

5
Q

Which API would be best for this scenario? You need to read a large number of files with high accuracy. The text consists of short sections of handwritten text, some in English and some in other languages.

A

Image Analysis service OCR feature

6
Q

At what levels of division are OCR results returned?

A

Results contain blocks, lines, and words, as well as bounding boxes for each line and word.

7
Q

You’ve scanned a letter into PDF format and need to extract the text it contains. What should you do?

A

The Document Intelligence API can be used to process PDF formatted files.

8
Q

What features exist for prebuilt Document Intelligence models?

A

Text extraction
Key-value pairs
Entities
Selection marks
Tables
Fields

9
Q

What specific forms exist as prebuilt models in Document Intelligence?

A

Invoice
Receipt
W2
ID document model (US driver's licenses and international passports)
Business card
Health insurance card

10
Q

What generic prebuilt models exist in Document Intelligence?

A

Read model
General document model
Layout model

11
Q

What features are available in the Read model in Document Intelligence?

A

Text extraction

12
Q

What features are available in the General document model in Document Intelligence?

A

Text extraction
Key-value pairs
Entities
Selection marks
Tables

13
Q

What features are available in the Layout model in Document Intelligence?

A

Text extraction
Selection marks
Tables

14
Q

What features are available in the Invoice model in Document Intelligence?

A

Text extraction
Key-value pairs
Selection marks
Tables
Fields

15
Q

What features are available in the Receipt model in Document Intelligence?

A

Text extraction
Key-value pairs
Fields

16
Q

What features are available in the W2 model in Document Intelligence?

A

Text extraction
Key-value pairs
Selection marks
Tables
Fields

17
Q

What features are available in the ID document model in Document Intelligence?

A

Text extraction
Key-value pairs
Fields

18
Q

What features are available in the Business card model in Document Intelligence?

A

Text extraction
Key-value pairs
Fields

19
Q

Which file formats can be consumed by prebuilt Document Intelligence models?

A

JPEG
PNG
BMP
TIFF
PDF

20
Q

What file size requirements exist for Document Intelligence documents?

A

The file must be smaller than 500 MB for the standard tier, and 4 MB for the free tier.

21
Q

What image size requirements exist for Document Intelligence documents?

A

Images must have dimensions between 50 x 50 pixels and 10,000 x 10,000 pixels.
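These limits lend themselves to a quick pre-submission check. A minimal sketch, assuming a hypothetical helper named check_image_dimensions (not part of any Azure SDK):

```python
def check_image_dimensions(width: int, height: int) -> bool:
    """Return True if an image fits Document Intelligence's limits:
    between 50 x 50 and 10,000 x 10,000 pixels."""
    return 50 <= width <= 10_000 and 50 <= height <= 10_000

# A 40 x 400 pixel image is too narrow; a 1920 x 1080 image is fine.
print(check_image_dimensions(40, 400))    # False
print(check_image_dimensions(1920, 1080)) # True
```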

22
Q

What limitations exist for PDF files in Document Intelligence?

A

PDF documents must have dimensions of 17 x 17 inches or less, corresponding to A3 paper size.
PDF documents must not be protected with a password.

23
Q

How many pages are allowed for PDF and TIFF files in Document Intelligence?

A

PDF and TIFF files can have any number of pages but, in the standard tier, only the first 2000 pages are analyzed. In the free tier, only the first two pages are analyzed.
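The tier rule above can be sketched as a tiny helper; pages_analyzed is a hypothetical name, not an SDK function:

```python
def pages_analyzed(total_pages: int, tier: str = "standard") -> int:
    """Pages actually analyzed for a PDF or TIFF file, by pricing tier."""
    limit = 2000 if tier == "standard" else 2  # free tier analyzes only 2 pages
    return min(total_pages, limit)

print(pages_analyzed(3500))          # 2000
print(pages_analyzed(3500, "free"))  # 2
```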

24
Q

How do you use the Document Intelligence service?

A

For custom applications, use the REST API.
To explore the models and how they behave with your forms visually, you can experiment in the Azure AI Document Intelligence Studio.

25
Q

In the Read model in Document Intelligence, how can you select a page range for analysis?

A

Use the pages parameter

26
Q

What is the purpose of the Read model in Document Intelligence?

A

The read model is ideal if you want to extract words and lines from documents with no fixed or predictable structure.

27
Q

Which prebuilt Document Intelligence model supports Entity extraction?

A

The general document model

28
Q

What entity types can be detected in the General Document model?

A

Person. The name of a person.
PersonType. A job title or role.
Location. Buildings, geographical features, geopolitical entities.
Organization. Companies, government bodies, sports clubs, musical bands, and other groups.
Event. Social gatherings, historical events, anniversaries.
Product. Objects bought and sold.
Skill. A capability belonging to a person.
Address. Mailing address for a physical location.
Phone number. Dialing codes and numbers for mobile phones and landlines.
Email. Email addresses.
URL. Webpage addresses.
IP Address. Network addresses for computer hardware.
DateTime. Calendar dates and times of day.
Quantity. Numerical measurements with their units.

29
Q

What is the purpose of the Layout model in Document Intelligence?

A

Use when you need rich information about the structure of a document.

30
Q

What fields can be identified by the ID Document model?

A

First and last names.
Personal information such as sex, date of birth, and nationality.
The country and region where the document was issued.
Unique numbers such as the document number and machine readable zone.
Endorsements, restrictions, and vehicle classifications.

31
Q

What fields are included in the Business Card model?

A

First and last names.
Postal addresses.
Email and website addresses.
Various telephone numbers.

32
Q

What fields are included in the W-2 model?

A

Information about the employer, such as their name and address.
Information about the employee, such as their name, address, and social security number.
Information about the taxes that the employee has paid.

33
Q

What underlying form models exist for custom forms in Document Intelligence?

A

Custom template models
Custom neural models

34
Q

Describe custom template models

A

Custom template models accurately extract labeled key-value pairs, selection marks, tables, regions, and signatures from documents.
Training only takes a few minutes, and more than 100 languages are supported.

35
Q

Describe custom neural models

A

Custom neural models are deep learned models that combine layout and language features to accurately extract labeled fields from documents.
This model is best for semi-structured or unstructured documents.

36
Q

What is included in a successful response to a call to the Document Intelligence API?

A

A successful JSON response contains an analyzeResult object, which holds the extracted content and an array of pages with information about the document content.
Some fields in pages include pageNumber, angle, width, height, and words.
Each entry in words holds the recognized word in its content field, a polygon array, and a confidence score.
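A sketch of walking such a response; the JSON below is a simplified, hypothetical fragment shaped like the fields described above, not real service output:

```python
import json

# Hypothetical, simplified fragment of an analyzeResult response.
response = json.loads("""
{
  "analyzeResult": {
    "content": "Hello world",
    "pages": [
      {"pageNumber": 1, "angle": 0.0, "width": 8.5, "height": 11.0,
       "words": [
         {"content": "Hello", "polygon": [1.0, 1.0, 2.0, 1.0, 2.0, 1.2, 1.0, 1.2], "confidence": 0.99},
         {"content": "world", "polygon": [2.1, 1.0, 3.0, 1.0, 3.0, 1.2, 2.1, 1.2], "confidence": 0.95}
       ]}
    ]
  }
}
""")

# Collect each recognized word with its confidence score.
extracted = [(w["content"], w["confidence"])
             for page in response["analyzeResult"]["pages"]
             for w in page["words"]]
print(extracted)  # [('Hello', 0.99), ('world', 0.95)]
```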

37
Q

How can you improve confidence scores in Document Intelligence?

A

You want to make sure that the form you’re analyzing has a similar appearance to forms in the training set if the confidence values are low. If the form appearance varies, consider training more than one model, with each model focused on one form format.

38
Q

What projects does the Azure Document Intelligence Studio support?

A

Document analysis models
Read: Extract printed and handwritten text lines, words, locations, and detected languages from documents and images.
Layout: Extract text, tables, selection marks, and structure information from documents (PDF and TIFF) and images (JPG, PNG, and BMP).
General Documents: Extract key-value pairs, selection marks, and entities from documents.
Prebuilt models
Custom models

39
Q

What do you need to analyze a document in the Document Intelligence Studio?

A

Azure Document Intelligence or Azure AI service endpoint and key

40
Q

What steps do you take to create and use a custom model in Document Intelligence Studio?

A

Create an Azure Document Intelligence or Azure AI Services resource
Collect at least 5-6 sample forms for training and upload them to your storage account container.
Configure cross-origin resource sharing (CORS). CORS enables Azure Document Intelligence Studio to store labeled files in your storage container.
Create a custom model project in Azure Document Intelligence Studio. You’ll need to provide configurations linking your storage container and Azure Document Intelligence or Azure AI Service resource to the project.
Use Azure Document Intelligence Studio to apply labels to text.
Train your model. Once the model is trained, you’ll receive a Model ID and Average Accuracy for tags.
Test your model by analyzing a new form that wasn’t used in training.

41
Q

A person plans to use an Azure Document Intelligence prebuilt invoice model. To extract document data using the model, what are two calls they need to make to the API?

A

The Analyze Invoice function starts the form analysis and returns a result ID, which they can pass in a subsequent call to the Get Analyze Invoice Result function to retrieve the results.
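This analyze-then-poll pattern can be sketched as URL construction; the endpoint value, URL path, and API version below are illustrative assumptions, not the exact service contract:

```python
def analyze_url(endpoint: str, model_id: str, api_version: str) -> str:
    # First call: POST here to start analysis; the response carries a
    # result ID (via the Operation-Location header) to poll with.
    return (f"{endpoint}/documentintelligence/documentModels/{model_id}"
            f":analyze?api-version={api_version}")

def result_url(endpoint: str, model_id: str, result_id: str, api_version: str) -> str:
    # Second call: GET here until the status is "succeeded", then read analyzeResult.
    return (f"{endpoint}/documentintelligence/documentModels/{model_id}"
            f"/analyzeResults/{result_id}?api-version={api_version}")

# Demo with placeholder values (endpoint and version are illustrative).
print(analyze_url("https://example.cognitiveservices.azure.com",
                  "prebuilt-invoice", "<api-version>"))
```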

42
Q

A person needs to build an application that submits expense claims and extracts the merchant, date, and total from scanned receipts. What’s the best way to do this?

A

Use Azure Document Intelligence's prebuilt receipt model. It can intelligently extract the required fields even if the scanned receipts have different names in them.

43
Q

A person is building a custom model with Azure Document Intelligence services. What is required to train a model?

A

Along with the sample forms to analyze, JSON files containing the labeling information need to be provided.

44
Q

You want to use the Azure AI Vision Analyze Image function to generate an appropriate caption for an image. Which visual feature should you specify?

A

To generate a caption, include the Description visual feature in your analysis.

45
Q

What is the purpose of the Azure AI Vision service?

A

The Azure AI Vision service is designed to help you extract information from images through various functionalities.

46
Q

In the AI Vision service, what features are available in the VisualFeatures enum?

A

VisualFeatures.TAGS: Identifies tags about the image, including objects, scenery, setting, and actions
VisualFeatures.OBJECTS: Returns the bounding box for each detected object
VisualFeatures.CAPTION: Generates a caption of the image in natural language
VisualFeatures.DENSE_CAPTIONS: Generates more detailed captions for the objects detected
VisualFeatures.PEOPLE: Returns the bounding box for detected people
VisualFeatures.SMART_CROPS: Returns the bounding box of the specified aspect ratio for the area of interest
VisualFeatures.READ: Extracts readable text
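For illustration, the names above can be mirrored locally to show how several features are combined in one request; the real enum lives in the azure-ai-vision-imageanalysis SDK, and the lowercase REST values here are an assumption:

```python
from enum import Enum

# Local mirror of the feature names listed above (illustration only).
class VisualFeatures(Enum):
    TAGS = "tags"
    OBJECTS = "objects"
    CAPTION = "caption"
    DENSE_CAPTIONS = "denseCaptions"
    PEOPLE = "people"
    SMART_CROPS = "smartCrops"
    READ = "read"

# Several features can be requested in one analysis call, e.g.:
requested = [VisualFeatures.CAPTION, VisualFeatures.READ]
print(",".join(f.value for f in requested))  # caption,read
```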

47
Q

What functionality exists in Azure Video Indexer?

A

Facial recognition - detecting the presence of individual people in the video. This requires Limited Access approval.
Optical character recognition - reading text in the video.
Speech transcription - creating a text transcript of spoken dialog in the video.
Topics - identification of key topics discussed in the video.
Sentiment - analysis of how positive or negative segments within the video are.
Labels - label tags that identify key objects or themes throughout the video.
Content moderation - detection of adult or violent themes in the video.
Scene segmentation - a breakdown of the video into its constituent scenes.

48
Q

What extensions can be made to Azure Video Indexer for custom insights?

A

People. Add images of the faces of people you want to recognize in videos, and train a model. Video Indexer will then recognize these people in all of your videos.
Note: this only works after Limited Access approval, adhering to Microsoft's Responsible AI standard.

Language. If your organization uses specific terminology that may not be in common usage, you can train a custom model to detect and transcribe it.
Brands. You can train a model to recognize specific names as brands, for example to identify products, projects, or companies that are relevant to your business.

49
Q

You want Azure Video Indexer to analyze a video. What must you do first?

A

Upload the video to Azure Video Indexer and index it.

50
Q

You want Azure Video Indexer to recognize brands in videos recorded from conference calls. What should you do?

A

Edit the Brands model to show brands suggested by Bing, and add any new brands you want to detect.

51
Q

What resources need to be provisioned to use the AI Custom Vision service?

A

A training resource (used to train your models). This can be:
An Azure AI Services resource.
An Azure AI Custom Vision (Training) resource.
A prediction resource, used by client applications to get predictions from your model. This can be:
An Azure AI Services resource.
An Azure AI Custom Vision (Prediction) resource.

52
Q

Explain multiclass classification

A

There are multiple classes in the image dataset, but each image can belong to only one class.

53
Q

Explain multilabel classification

A

An image might be associated with multiple labels.

54
Q

What steps can be performed in the Azure AI Custom Vision portal?

A

Create an image classification project for your model and associate it with a training resource.
Upload images, assigning class label tags to them.
Review and edit tagged images.
Train and evaluate a classification model.
Test a trained model.
Publish a trained model to a prediction resource.

55
Q

You want to train a model that can categorize an image as “cat” or “dog” based on its subject. What kind of Azure AI Custom Vision project should you create?

A

To train a model that classifies an image using a single tag, use an Image classification (multiclass) project.

56
Q

Which of the following kinds of Azure resource can you use to host a trained Azure AI Custom Vision model?

A

You can publish a trained Azure AI Custom Vision model to either an Azure AI Custom Vision (Prediction) resource, or an Azure AI Services resource.

57
Q

What features are available in the Face service within Azure AI Vision?

A

Face detection (with bounding box).
Comprehensive facial feature analysis (including head pose, presence of spectacles, blur, facial landmarks, occlusion and others).
Face comparison and verification.
Facial recognition.
Facial landmark location.
Facial liveness (determining whether the input video is a real stream or a fake).

58
Q

How can you detect and analyze faces with the Azure AI Vision service?

A

Call the Analyze Image function (SDK or equivalent REST method), specifying People as one of the visual features to be returned.

59
Q

What attributes can be returned in the Facial Attribute analysis?

A

Head pose (pitch, roll, and yaw orientation in 3D space)
Glasses (NoGlasses, ReadingGlasses, Sunglasses, or Swimming Goggles)
Blur (low, medium, or high)
Exposure (underExposure, goodExposure, or overExposure)
Noise (visual noise in the image)
Occlusion (objects obscuring the face)
Accessories (glasses, headwear, mask)
QualityForRecognition (low, medium, or high)

60
Q

How can the Face service be provisioned?

A

You can provision Face as a single-service resource, or you can use the Face API in a multi-service Azure AI Services resource.

61
Q

Describe the Face detection process

A

When a face is detected by the Face service, a unique ID is assigned to it and retained in the service resource for 24 hours. The ID is a GUID, with no indication of the individual’s identity other than their facial features.

While the detected face ID is cached, subsequent images can be used to compare the new faces to the cached identity and determine if they are similar (in other words, they share similar facial features) or to verify that the same person appears in two images.

62
Q

What steps do you take to train a facial recognition model in the Face service?

A

Create a Person Group that defines the set of individuals you want to identify (for example, employees).
Add a Person to the Person Group for each individual you want to identify.
Add detected faces from multiple images to each person, preferably in various poses. The IDs of these faces will no longer expire after 24 hours (so they’re now referred to as persisted faces).
Train the model.

63
Q

You need to create a facial recognition solution to identify named employees. Which service should you use?

A

Use the Face service to create a facial recognition solution.

64
Q

You need to verify that the person in a photo taken at hospital reception is the same person in a photo taken at a ward entrance 10 minutes later. What should you do?

A

Verify the face in the ward photo by comparing it to the detected face ID from the reception photo. This is the most efficient approach, as the photos were taken within 24 hours of each other.

65
Q

What is the easiest option for labeling images for object detection?

A

Use the interactive interface in the Azure AI Custom Vision portal. This interface automatically suggests regions that contain objects, to which you can assign tags or adjust by dragging the bounding box to enclose the object you want to label.

66
Q

What do the bounding box measurement units quantify in image labeling?

A

The values are the proportion of the full image size. Consider the example:
Left: 0.1
Top: 0.5
Width: 0.5
Height: 0.25
This defines a box in which the left is located 0.1 (one tenth) from the left edge of the image, and the top is 0.5 (half the image height) from the top. The box is half the width and a quarter of the height of the overall image.
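The proportional arithmetic above can be sketched directly; to_pixels is a hypothetical helper, not an SDK function:

```python
def to_pixels(left, top, width, height, img_w, img_h):
    """Convert proportional bounding-box values (0..1) to pixel values."""
    return (left * img_w, top * img_h, width * img_w, height * img_h)

# The example box applied to an 800 x 600 pixel image:
print(to_pixels(0.1, 0.5, 0.5, 0.25, 800, 600))  # (80.0, 300.0, 400.0, 150.0)
```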

67
Q

What must you do before taking advantage of the smart labeler tool when creating an object detection model?

A

To take advantage of the smart labeler, tag some images and train an initial model. Subsequently, the portal will suggest tags for new images.

68
Q

How can you use the Read API for OCR?

A

Call the ImageAnalysis function (REST API or equivalent SDK method), passing the image URL or binary data, and optionally specifying a gender-neutral caption or the language the text is written in (with a default value of en for English).

result = client.analyze(
    image_url="<image-url>",  # placeholder for the image to analyze
    visual_features=[VisualFeatures.READ]
)

https://<endpoint>/computervision/imageanalysis:analyze?features=read&…