03-Computer Vision Flashcards

Question

What are the 5 Vision services?

Answer 1

1. Computer Vision 2. Custom Vision 3. Face 4. Form Recognizer 5. Video Indexer

Answer 2

Analyze content in images and video

Answer 3

Customize image recognition to fit your business needs

Answer 4

Detect and identify people and emotions in images

Answer 5

Extract text, key-value pairs, and tables from documents

Answer 6

Analyze the visual and audio channels of a video, and index its content

Answer 7

1. Key to authenticate client applications | 2. Endpoint that provides the HTTP address at which your resource can be accessed

Answer 8

1. Custom Vision portal | 2. Custom Vision service programming language-specific software development kits (SDKs) - for programmers

Answer 9

1. Project ID: Unique ID of the Custom Vision project you created to train the model 2. Model Name: Name you assigned to the model during publishing 3. Prediction endpoint: HTTP address of the endpoint for the prediction resource to which you published the model (not the training resource) 4. Prediction key: Authentication key for the prediction resource to which you published the model (not the training resource)

Answer 10

Face detection involves identifying regions of an image that contain a human face, typically by returning bounding box coordinates that form a rectangle around the face
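A minimal sketch of how such a bounding box might be represented, assuming pixel coordinates given as a left/top origin plus width and height. The class and field names here are hypothetical, not taken from any SDK:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceBox:
    """Rectangle around a detected face, in pixel coordinates (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int

    def corners(self) -> tuple[tuple[int, int], tuple[int, int]]:
        """Return the (top-left, bottom-right) corner points of the rectangle."""
        return (self.left, self.top), (self.left + self.width, self.top + self.height)


# Illustrative values only; a real service would supply these numbers.
box = FaceBox(left=40, top=60, width=100, height=120)
top_left, bottom_right = box.corners()
```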

Answer 11

Uses algorithms to return information such as facial landmarks (nose, eyes, eyebrows, lips, etc.). These landmarks can be used to train an ML model from which you can infer information about a person, such as their age or perceived emotional state

Answer 12

Identify known individuals from their facial features

Answer 13

1. Security 2. Social media 3. Intelligent monitoring 4. Advertising 5. Missing persons 6. Identity validation

Answer 14

1. Image format should be JPEG, PNG, GIF, or BMP 2. File size is 4 MB or smaller 3. Face size ranges from 36 x 36 to 4096 x 4096 pixels. Smaller or larger faces will not be detected 4. Other issues, such as extreme face angles or occlusion (objects blocking the face, such as sunglasses or a hand), can also impair detection. Best results are obtained when the faces are full-frontal or as near as possible to full-frontal
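The first three constraints can be checked before an image is submitted. The sketch below is only an illustration: the limits are copied from the card above, and the function and constant names are hypothetical.

```python
ALLOWED_FORMATS = {"jpeg", "png", "gif", "bmp"}
MAX_FILE_BYTES = 4 * 1024 * 1024  # 4 MB, as stated on the card
MIN_FACE_PX = 36                  # smallest detectable face side
MAX_FACE_PX = 4096                # largest detectable face side


def check_image(image_format: str, file_bytes: int) -> list[str]:
    """Return a list of problems with the image; an empty list means none were found."""
    problems = []
    if image_format.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {image_format}")
    if file_bytes > MAX_FILE_BYTES:
        problems.append(f"file too large: {file_bytes} bytes")
    return problems


def face_size_ok(width: int, height: int) -> bool:
    """True if both sides of the face region fall inside the detectable range."""
    return all(MIN_FACE_PX <= side <= MAX_FACE_PX for side in (width, height))
```

The fourth item (angle and occlusion) cannot be checked this way, since it depends on the image content itself.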

Answer 15

1. Smoothing - turn it off because the potential blur between frames tends to reduce the clarity of the image in individual frames 2. Shutter speed - a faster speed improves the clarity of the images in each frame because motion is reduced 3. Shutter angle - use a lower shutter angle to produce clearer frames, resulting in better clarity for recognition

Answer 16

Computer systems get the ability to process written or printed text. Computer vision is used to "read" the text, and natural language processing is used to make sense of it

Answer 17

Model trained to recognize individual shapes as letters, numerals, punctuation, or other elements of text.

Answer 18

1. Note taking 2. Digitizing forms, such as medical records or historical documents 3. Scanning printed or handwritten checks for bank deposits

Answer 19

Form processing capabilities that you can use to automate the processing of data in documents such as forms, invoices, and receipts. Combines optical character recognition (OCR) with predictive models that can interpret form data by 1. Matching field names to values 2. Processing tables of data 3. Identifying specific types of field, such as dates, telephone numbers, addresses, totals, and others

Answer 20

1. Custom models | 2. A pre-built receipt model

Answer 21

Custom models enable you to extract key/value pairs and table data from forms. Custom models are trained using your own data, which helps to tailor this model to your specific forms.

Answer 22

Model is provided out-of-the-box. Trained to recognize and extract data from sales receipts

Answer 23

Just an array of pixel values. These numerical values can be used as FEATURES to train ML models that make predictions about the image and its contents
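As a small illustration, assuming a tiny grayscale image stored as nested lists (real images are much larger and usually have three color channels), the pixel values can be flattened and scaled into a feature vector:

```python
# A 3x3 grayscale image: each value is a pixel intensity from 0 (black) to 255 (white).
image = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]


def to_features(pixels: list[list[int]]) -> list[float]:
    """Flatten the rows into one vector and scale each value to the range 0..1."""
    return [value / 255 for row in pixels for value in row]


features = to_features(image)  # 9 numbers, one per pixel
```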

Answer 24

1. Celebrities - service includes a model that has been trained to identify thousands of well-known celebrities 2. Landmarks - service identifies famous landmarks

Answer 25

1. Detect image types, for example clip art images or line drawings 2. Detect image color schemes - specifically, identify the dominant foreground, background, and overall colors in an image 3. Generate thumbnails - create small versions of images 4. Moderate content - detect images that contain adult content or depict violent, gory scenes

Answer 26

Deep learning techniques that make use of convolutional neural networks (CNNs) to uncover patterns in the pixels that correspond to particular classes

Answer 27

No, because the Custom Vision cognitive service encapsulates common techniques used to train image classification models

Answer 28

1. Product identification - perform visual searches for specific products in online searches or even in-store using a mobile device 2. Disaster investigation - evaluate key infrastructure for major disaster preparation efforts, e.g. aerial surveillance may show bridges and classify them as such 3. Medical diagnosis - evaluating images from X-ray or MRI devices could quickly classify specific issues found as cancerous tumors, or many other medical conditions related to medical imaging diagnosis.

Answer 29

What percentage of the class predictions made by the model are correct? For example, if the model predicts that 10 images are oranges, and 8 of them actually are oranges, then the precision is 0.8 (80%)

Answer 30

What percentage of the actual instances of a class did the model correctly identify? For example, if there are 10 images of apples, and the model found 7 of them, then the recall is 0.7 (70%)
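The oranges and apples examples from the precision and recall cards can be reproduced with a few lines of arithmetic. This is only a sketch using the standard definitions; the function names are illustrative.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of the model's predictions for a class that were correct."""
    predicted = true_positives + false_positives
    return true_positives / predicted if predicted else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of the actual instances of a class that the model found."""
    actual = true_positives + false_negatives
    return true_positives / actual if actual else 0.0


# 10 images predicted as oranges, 8 of them really are oranges.
orange_precision = precision(true_positives=8, false_positives=2)

# 10 images of apples exist, the model found 7 of them.
apple_recall = recall(true_positives=7, false_negatives=3)
```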

Answer 31

Overall metric that takes into account both precision and recall

Answer 32

1. Project ID: the unique ID of the Custom Vision project you created to train the model 2. Model name: the name you assigned to the model during publishing 3. Prediction endpoint: the HTTP address of the endpoint for the prediction resource to which you published the model (not the training resource) 4. Prediction key: the authentication key for the prediction resource to which you published the model (not the training resource)

Answer 33

ML based on computer vision in which a model is trained to categorize images based on the primary subject matter they contain.

Answer 34

Goes further than image classification to classify individual objects within the image, and to return the coordinates of a bounding box that indicates the object's location

Answer 35

1. Evaluate the safety of a building by looking for fire extinguishers or other emergency equipment 2. Create software for self-driving cars or vehicles with lane assist capabilities 3. Medical imaging such as an MRI or x-rays that can detect known objects for medical diagnosis

Answer 36

It suggests classes and bounding boxes for images you add to the training dataset

Answer 37

Overall metric that takes into account both precision and recall across all classes in object detection

Answer 38

1. Security - facial recognition can be used in building security applications, and increasingly it is used in smartphone operating systems for unlocking devices 2. Social media - automatically tag known friends in photographs 3. Advertising - help direct advertisements to an appropriate demographic audience 4. Missing persons - identify if a missing person is in the image frame 5. Identity validation - useful at port-of-entry kiosks where a person holds a special entry permit

Answer 39

1. Face detection 2. Face verification 3. Find similar faces 4. Group faces based on similarities 5. Identify people

Answer 40

1. Age 2. Blur 3. Emotion 4. Exposure 5. Facial hair 6. Glasses 7. Hair 8. Head pose 9. Makeup 10. Noise 11. Occlusion 12. Smile

Answer 41

AI system not only reads text characters, but also uses a semantic model to interpret what the text is about

Answer 42

1. note taking 2. digitizing forms, such as medical records or historical documents 3. scanning printed or handwritten checks for bank deposits

Answer 43

Quick extraction of small amounts of text in images. Operates synchronously to provide immediate results that can recognize text in numerous languages

Answer 44

1. Regions in the image that contain text 2. Lines of text in each region 3. Words in each line of text. Also returns bounding box coordinates that define a rectangle to indicate the location in the image where the region, line, or word appears

Answer 45

Better suited than the OCR API, which can produce false positives when an image is considered text-dominant. Uses the latest recognition models and is optimized for images that have a lot of text or a lot of visual noise

Answer 46

1. Submit image to API and retrieve operation ID in response 2. Use operation ID to check on the status of the image analysis operation, and wait until it has completed 3. Retrieve the results of the operation
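The three steps follow a common submit-then-poll pattern. The sketch below shows that pattern only: the three callables stand in for real service calls, and the status strings are assumptions, not a documented contract.

```python
import time
from typing import Callable


def read_text(submit: Callable[[bytes], str],
              get_status: Callable[[str], str],
              get_result: Callable[[str], dict],
              image: bytes,
              poll_seconds: float = 1.0,
              max_polls: int = 30) -> dict:
    """Run the submit / poll / retrieve sequence using caller-supplied functions."""
    operation_id = submit(image)                  # step 1: submit, get operation ID
    for _ in range(max_polls):                    # step 2: wait for completion
        status = get_status(operation_id)
        if status == "succeeded":                 # assumed status value
            return get_result(operation_id)       # step 3: retrieve the results
        if status == "failed":                    # assumed status value
            raise RuntimeError(f"operation {operation_id} failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"operation {operation_id} did not finish in time")


# Stand-in service used only to exercise the loop; no network calls are made.
_statuses = iter(["notStarted", "running", "succeeded"])
result = read_text(
    submit=lambda image: "op-1",
    get_status=lambda op_id: next(_statuses),
    get_result=lambda op_id: {"operation": op_id, "text": "hello"},
    image=b"",
    poll_seconds=0,
)
```

Passing the service calls in as functions keeps the polling logic testable without a live resource.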

Answer 47

Into a hierarchy: 1. Pages - one for each page of text, including information about the page size and orientation 2. Lines - the lines of text on a page 3. Words - the words in a line of text. Each line and word includes bounding box coordinates indicating its position on the page
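One way to picture this hierarchy is as nested records. The types below are hypothetical and only mirror the structure described on the card; a real response may use different field names.

```python
from dataclasses import dataclass, field

Box = list[float]  # bounding box coordinates; the exact layout is an assumption


@dataclass
class Word:
    text: str
    bounding_box: Box


@dataclass
class Line:
    text: str
    bounding_box: Box
    words: list[Word] = field(default_factory=list)


@dataclass
class Page:
    width: float
    height: float
    orientation: float
    lines: list[Line] = field(default_factory=list)


def all_words(pages: list[Page]) -> list[str]:
    """Walk pages -> lines -> words and collect the word text in reading order."""
    return [word.text for page in pages for line in page.lines for word in line.words]


# Illustrative data only.
page = Page(width=8.5, height=11.0, orientation=0.0, lines=[
    Line("Hello world", [0, 0, 2, 0, 2, 1, 0, 1], words=[
        Word("Hello", [0, 0, 1, 0, 1, 1, 0, 1]),
        Word("world", [1, 0, 2, 0, 2, 1, 1, 1]),
    ]),
])
words = all_words([page])
```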

Answer 48

Intelligent form processing capabilities that you can use to automate the processing of data in documents such as forms, invoices, and receipts

Answer 49

1. Pre-built receipt model - provided out-of-the-box and is trained to recognize and extract data from sales receipts 2. Custom models - extract key/value pairs and table data from forms. Custom models are trained using your own data, which helps to tailor this model to your specific forms. Starting with only 5 samples of your forms, you can train the custom model. After the first training exercise, you can evaluate the results and consider if you need to add more samples and re-train.

(73 cards)