03-Computer Vision Flashcards

1
Q

What is Computer Vision

A

AI systems that can “see” the world and make sense of it

2
Q

What is Image Classification

A

Training an ML model to classify images based on their contents.

3
Q

What is Object Detection

A

Training an ML model to classify individual objects within an image and identify their location with a bounding box

4
Q

What is Semantic Segmentation

A

Advanced ML technique in which INDIVIDUAL PIXELS in the image are CLASSIFIED according to the object to which they belong
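
To make the per-pixel idea concrete, here is a minimal sketch of what a segmentation result could look like. The mask values and the class names (background, cat, dog) are made up for illustration; they do not come from any particular service.

```python
from collections import Counter

# Hypothetical class IDs: 0 = background, 1 = cat, 2 = dog.
CLASS_NAMES = {0: "background", 1: "cat", 2: "dog"}

# A segmentation mask has the same shape as the image.
# Each entry is the class assigned to the pixel at that position.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [2, 2, 0, 0],
]

def pixels_per_class(mask):
    """Count how many pixels were assigned to each class name."""
    counts = Counter(class_id for row in mask for class_id in row)
    return {CLASS_NAMES[class_id]: n for class_id, n in counts.items()}

counts = pixels_per_class(mask)
print(counts)
```

Unlike a bounding box, the mask follows the exact outline of each object, because every pixel carries its own label.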

5
Q

What is Image analysis

A

Combine ML models with advanced image analysis techniques to extract information from images, including “tags” that could help catalog the image or even descriptive captions that summarize the scene shown in the image

6
Q

What is Face detection, analysis, and recognition

A

Specialized form of object detection that locates human faces in an image. It can be combined with classification and facial geometry analysis techniques to infer details such as age and emotional state, and even to recognize individuals based on their facial features

7
Q

Optical character recognition (OCR)

A

Technique to detect and read text in images.

8
Q

What 4 things does Cognitive Services include

A

Cognitive Services includes

  1. Decision
  2. Language
  3. Speech
  4. Vision
9
Q

What are 4 Decision services

A
  1. Anomaly Detector
  2. Content Moderator
  3. Metrics Advisor (Preview)
  4. Personalizer
10
Q

What is Anomaly Detector

A

Identify potential problems early on

11
Q

What is Content Moderator

A

Detect potentially offensive or unwanted content

12
Q

What is Metrics Advisor

A

Monitor metrics and diagnose issues

13
Q

What is Personalizer

A

Create rich, personalized experiences for every user

14
Q

What are 5 Language services

A
  1. Immersive Reader
  2. Language Understanding
  3. QnA Maker
  4. Text Analytics
  5. Translator
15
Q

What are 4 Speech services

A
  1. Speech to Text
  2. Text to Speech
  3. Speech Translation
  4. Speaker Recognition (Preview)
16
Q

What is Immersive Reader

A

Helps readers of all abilities comprehend text using audio and visual cues

17
Q

What is Language Understanding

A

Build natural language understanding into apps, bots, and IoT devices

18
Q

What is QnA Maker

A

Create a conversational question and answer layer over your data

19
Q

What is Text Analytics

A

Detect sentiment, key phrases, and named entities

20
Q

What is Translator

A

Detect and translate more than 90 supported languages

21
Q

What is Speech to Text

A

Transcribe audible speech into readable, searchable text

22
Q

What is Text to Speech

A

Convert text to life-like speech for more natural interfaces

23
Q

What is Speech Translation

A

Integrate real-time speech translation into your apps

24
Q

What is Speaker Recognition

A

Identify and verify the people speaking based on audio

25
What are 5 Vision services
1. Computer Vision 2. Custom Vision 3. Face 4. Form Recognizer 5. Video Indexer
26
What is Computer Vision
Analyze content in images and video
27
What is Custom Vision
Customize image recognition to fit your business needs
28
What is Face
Detect and identify people and emotions in images
29
What is Form Recognizer
Extract text, key-value pairs, and tables from documents
30
What is Video Indexer
Analyze the visual and audio channels of a video, and index its content
31
What two pieces of information do you need to use a cognitive service
1. Key to authenticate client applications | 2. Endpoint that provides the HTTP address at which your resource can be accessed
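
A minimal sketch of how the key and endpoint fit together in a REST call. The resource name, key, image URL, and the `vision/v3.2/analyze` path are placeholder assumptions for illustration; the request is only built here, not sent. `Ocp-Apim-Subscription-Key` is the header in which the key is passed.

```python
import urllib.request

# Placeholder values: the real ones come from your own Azure resource.
ENDPOINT = "https://my-resource.cognitiveservices.azure.com"
KEY = "0123456789abcdef"

def build_request(endpoint, key, path, body):
    """Build (but do not send) an HTTP POST request to a cognitive service."""
    # The endpoint is the base HTTP address; the path selects an operation.
    url = endpoint.rstrip("/") + "/" + path.lstrip("/")
    headers = {
        # The key authenticates the client application.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# The operation path below is an assumed example, not taken from the cards.
request = build_request(
    ENDPOINT, KEY, "vision/v3.2/analyze",
    b'{"url": "https://example.com/image.jpg"}',
)
print(request.full_url)
```

Keeping the key and endpoint in configuration rather than in code makes it easier to rotate keys or point the same client at a different resource.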
32
2 places to train image resource and label them
1. Custom Vision portal | 2. Custom Vision service programming language-specific software development kits (SDKs) - for programmers
33
What do Client Application developers need to use your model
1. Project ID: Unique ID of Custom Vision project you created to train the model 2. Model Name: Name you assigned to the model during publishing 3. Prediction endpoint: HTTP address of the endpoints for the prediction resource to which you published the model (not the training resource) 4. Prediction key: authentication key for the prediction resource to which you published the model (not the training resource)
34
What is Face detection
Face detection involves identifying regions of an image that contain a human face, typically by returning bounding box coordinates that form a rectangle around the face
35
What is Facial analysis
Uses algorithms to return information such as facial landmarks (nose, eyes, eyebrows, lips, etc). These landmarks can be used as features to train an ML model from which you can infer information about a person, such as their age or perceived emotional state
36
What is Facial recognition
Identify known individuals from their facial features
37
Uses for facial detection, analysis, and recognition
1. Security 2. Social media 3. Intelligent monitoring 4. Advertising 5. Missing persons 6. Identity validation
38
How to improve accuracy of detection in images
1. Image format should be JPEG, PNG, GIF, or BMP 2. File size is 4 MB or smaller 3. Face size ranges from 36 x 36 to 4096 x 4096 pixels. Smaller or larger faces will not be detected 4. Other issues such as extreme face angles or occlusion (objects blocking the face, such as sunglasses or a hand). Best results are obtained when the faces are full-frontal or as near as possible to full-frontal
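
The first three constraints can be checked before an image is ever submitted. The sketch below is a hypothetical pre-flight helper, not part of the Face service; it assumes "4 MB" means 4 x 1024 x 1024 bytes and that the face size is already known in pixels.

```python
ALLOWED_FORMATS = {"JPEG", "PNG", "GIF", "BMP"}
MAX_FILE_BYTES = 4 * 1024 * 1024   # assumption: 4 MB counted in binary units
MIN_FACE_PX, MAX_FACE_PX = 36, 4096

def image_problems(image_format, file_bytes, face_width, face_height):
    """Return a list of reasons why detection is likely to fail (empty if none)."""
    problems = []
    if image_format.upper() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {image_format}")
    if file_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 4 MB")
    face_ok = (MIN_FACE_PX <= face_width <= MAX_FACE_PX
               and MIN_FACE_PX <= face_height <= MAX_FACE_PX)
    if not face_ok:
        problems.append("face size is outside 36 x 36 to 4096 x 4096")
    return problems

print(image_problems("jpeg", 1_000_000, 120, 120))        # no problems
print(image_problems("tiff", 5 * 1024 * 1024, 20, 20))    # three problems
```

The fourth point (face angle and occlusion) cannot be checked this way; it depends on the content of the image itself.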
39
How to improve detection using video feeds
1. Smoothing - turn it off because the potential blur between frames tends to reduce clarity of the image in individual frames 2. Shutter speed - faster speed improves clarity of the images in each frame because the motion is reduced 3. Shutter angle - use a lower shutter angle to produce clearer frames, resulting in better clarity for recognition
40
What happens when you intersect computer vision with natural language processing
Computer systems get the ability to process written or printed text. Computer vision is used to "read" the text, and natural language processing to make sense of it
41
What is optical character recognition (OCR)
Model trained to recognize individual shapes as letters, numerals, punctuation, or other elements of text.
42
How is OCR beneficial
1. Note taking 2. Digitizing forms, such as medical records or historical documents 3. Scanning printed or handwritten checks for bank deposits
43
What is Form Recognizer
Form processing capabilities that you can use to automate the processing of data in documents such as forms, invoices, and receipts. Combines optical character recognition (OCR) with predictive models that can interpret form data by 1. Matching field names to values 2. Processing tables of data 3. Identifying specific types of field, such as dates, telephone numbers, addresses, totals, and others
44
How does Form Recognizer support automated document processing
1. Custom models | 2. A pre-built receipt model
45
What are Custom models
Custom models enable you to extract key/value pairs and table data from forms. Custom models are trained using your own data, which helps to tailor this model to your specific forms.
46
What is pre-built receipt model
Model is provided out-of-the-box. Trained to recognize and extract data from sales receipts
47
What is an image to an AI application
Just an array of pixel values. These numerical values can be used as FEATURES to train ML models that make predictions about the image and its contents
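
As a small illustration of pixels as features, the sketch below treats a made-up 3 x 3 grayscale image as a grid of intensity values and flattens it into a feature vector. Scaling to the range 0 to 1 is a common preprocessing choice, assumed here rather than taken from the card.

```python
# A tiny 3 x 3 grayscale "image": each value is a pixel intensity
# (0 = black, 255 = white). The values form a plus sign.
image = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]

def to_features(pixels):
    """Flatten a 2D pixel grid into a 1D feature vector scaled to 0..1."""
    return [value / 255 for row in pixels for value in row]

features = to_features(image)
print(len(features))   # one feature per pixel
print(features)
```

A color image works the same way, except that each pixel contributes three values (red, green, blue) instead of one.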
48
Which two specialized domain models does Computer Vision service support
1. Celebrities - service includes a model that has been trained to identify thousands of well-known celebrities 2. Landmarks - service identifies famous landmarks
49
What other capabilities does Computer Vision provide
1. Detect image types, i.e. clip art images or line drawings 2. Detect image color schemes - specifically, identify the dominant foreground, background, and overall colors in an image 3. Generate thumbnails - create small versions of images 4. Moderate content - detect images that contain adult content or depict violent, gory scenes
50
What are most modern image classification solutions based on
Deep learning techniques that make use of convolutional neural networks (CNNs) to uncover patterns in the pixels that correspond to particular classes
51
Do you need to know deep learning techniques to train and publish your model as a software service
No. The Custom Vision cognitive service encapsulates the common techniques used to train image classification models
52
What are some potential uses for image classification
1. Product identification - perform visual searches for specific products in online searches or even, in-store using a mobile device 2. Disaster investigation - evaluate key infrastructure for major disaster preparation efforts, i.e. aerial surveillance may show bridges and classify them as such 3. Medical diagnosis - evaluating images from X-ray or MRI devices could quickly classify specific issues found as cancerous tumors, or many other medical conditions related to medical imaging diagnosis.
53
What is Precision
What percentage of the class predictions made by the model are correct? If the model predicts that 10 images are oranges, of which eight were actually oranges, then the precision is 0.8 (80%)
54
What is Recall
Of the images that actually belong to a class, what percentage did the model correctly identify? For example, if there are 10 images of apples, and the model found 7 of them, then the recall is 0.7 (70%)
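
The two examples on the precision and recall cards can be reproduced with the standard formulas. The function names are illustrative; the counts come straight from the cards.

```python
def precision(true_positives, false_positives):
    """Of everything the model predicted as the class, how much was right?"""
    return true_positives / (true_positives + false_positives)

def recall(true_positives, false_negatives):
    """Of everything that really is the class, how much did the model find?"""
    return true_positives / (true_positives + false_negatives)

# Oranges: 10 images predicted as oranges, 8 of them really are oranges,
# so 8 true positives and 2 false positives.
orange_precision = precision(true_positives=8, false_positives=2)

# Apples: 10 real apple images, the model found 7 of them,
# so 7 true positives and 3 false negatives.
apple_recall = recall(true_positives=7, false_negatives=3)

print(orange_precision)  # 0.8
print(apple_recall)      # 0.7
```

Precision is penalized by false positives and recall by false negatives, which is why the two numbers can differ widely for the same model.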
55
What is Average Precision (AP)
Overall metric that takes into account both precision and recall
56
What do client application developers need to use your classification model
1. Project ID - unique ID of the Custom Vision project you create to train the model 2. Model name: the name you assigned to the model during publishing 3. Prediction endpoint: the HTTP address of the endpoints for the prediction resource to which you published the model (not the training resource) 4. Prediction key: the authentication key for the prediction resource to which you published the model (not the training resource)
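
A sketch of how a client application might hold these four values. The class and function names are hypothetical, and the URL layout and `Prediction-Key` header are assumptions modeled on the v3.0 prediction REST API; the Custom Vision portal shows the exact prediction URL for a published model, so check it there.

```python
from dataclasses import dataclass

@dataclass
class PredictionSettings:
    """The four values a client needs to call a published model."""
    project_id: str
    model_name: str
    prediction_endpoint: str
    prediction_key: str

def classify_url(settings):
    """Build the image classification URL (assumed v3.0 layout)."""
    base = settings.prediction_endpoint.rstrip("/")
    return (f"{base}/customvision/v3.0/Prediction/{settings.project_id}"
            f"/classify/iterations/{settings.model_name}/image")

def auth_headers(settings):
    """Headers for sending raw image bytes with the prediction key."""
    return {
        "Prediction-Key": settings.prediction_key,
        "Content-Type": "application/octet-stream",
    }

# Placeholder values for illustration only.
settings = PredictionSettings(
    project_id="00000000-0000-0000-0000-000000000000",
    model_name="fruit-classifier",
    prediction_endpoint="https://my-prediction-resource.cognitiveservices.azure.com/",
    prediction_key="0123456789abcdef",
)
print(classify_url(settings))
```

Note that both the endpoint and the key belong to the prediction resource, not the training resource.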
57
What is Image classification
ML based on computer vision in which a model is trained to categorize images based on the primary subject matter they contain.
58
What is Object detection
Goes further than image classification to classify individual objects within the image, and to return the coordinates of a bounding box that indicates the object's location
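
A minimal sketch of what one detected object could look like in code. The `Detection` type is hypothetical, and it assumes the box is reported as fractions of the image size (left, top, width, height), which then have to be scaled to pixels before drawing.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One detected object: a class label plus where it is in the image."""
    label: str
    probability: float
    left: float     # assumed to be fractions of image width/height (0..1)
    top: float
    width: float
    height: float

def to_pixels(detection, image_width, image_height):
    """Convert a fractional bounding box to pixel (x, y, width, height)."""
    return (
        round(detection.left * image_width),
        round(detection.top * image_height),
        round(detection.width * image_width),
        round(detection.height * image_height),
    )

apple = Detection("apple", 0.92, left=0.25, top=0.5, width=0.5, height=0.25)
box = to_pixels(apple, image_width=640, image_height=480)
print(apple.label, box)  # apple (160, 240, 320, 120)
```

An image classifier would return only the label and probability for the whole image; the four box values are what object detection adds.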
59
What are some sample applications of object detection
1. Evaluate the safety of a building by looking for fire extinguishers or other emergency equipment 2. Create software for self-driving cars or vehicles with lane assist capabilities 3. Analyze medical imaging, such as MRI scans or X-rays, to detect known objects for medical diagnosis
60
What is smart tagging
It suggests classes and bounding boxes for images you add to the training dataset
61
Mean Average Precision (mAP)
Overall metric that takes into account both precision and recall across all classes in object detection
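
In its usual definition, mAP is simply the mean of the per-class average precision values. The sketch below assumes the per-class AP scores have already been computed; the class names and scores are made up.

```python
def mean_average_precision(ap_per_class):
    """Average the per-class AP scores into a single mAP value."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# Hypothetical AP scores for a three-class object detection model.
ap_scores = {"apple": 0.5, "banana": 0.75, "orange": 1.0}

map_score = mean_average_precision(ap_scores)
print(map_score)  # 0.75
```

Because every class counts equally, one poorly detected class pulls the mAP down even if the others score well.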
62
What are uses of face detection and analysis
1. Security - facial recognition can be used in building security applications, and increasingly it is used in smartphone operating systems for unlocking devices 2. Social media - automatically tag known friends in photographs 3. Advertising - help direct advertisements to an appropriate demographic audience 4. Missing persons - identify if a missing person is in the image frame 5. Identity validation - ports of entry kiosks where a person holds a special entry permit
63
What functions does Face support
1. Face detection 2. Face verification 3. Find similar faces 4. Group faces based on similarities 5. Identify people
64
What attributes can face return
1. Age 2. Blur 3. Emotion 4. Exposure 5. Facial hair 6. Glasses 7. Hair 8. Head pose 9. Makeup 10. Noise 11. Occlusion 12. Smile
65
What is machine reading comprehension (MRC)
An AI system that not only reads text characters, but also uses a semantic model to interpret what the text is about
66
What are uses of optical character recognition (OCR) technologies
1. note taking 2. digitizing forms, such as medical records or historical documents 3. scanning printed or handwritten checks for bank deposits
67
What is OCR API good for
Quick extraction of small amounts of text in images. Operates synchronously to provide immediate results that can recognize text in numerous languages
68
What does the OCR API return when processing an image
1. Regions in the image that contain text 2. Lines of text in each region 3. Words in each line of text. Also returns bounding box coordinates that define a rectangle to indicate the location in the image where the region, line, or word appears
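
The region, line, and word nesting can be walked with two loops. The sample result below is hand-written to mirror that hierarchy; the field names (`regions`, `lines`, `words`, `boundingBox`, `text`) and the box format are assumptions for illustration rather than a captured response.

```python
# Hand-written sample shaped like the hierarchy on this card:
# regions contain lines, lines contain words, each with a bounding box.
sample_result = {
    "regions": [
        {
            "boundingBox": "20,30,200,60",
            "lines": [
                {
                    "boundingBox": "20,30,200,25",
                    "words": [
                        {"boundingBox": "20,30,90,25", "text": "Hello"},
                        {"boundingBox": "120,30,100,25", "text": "world"},
                    ],
                },
                {
                    "boundingBox": "20,65,150,25",
                    "words": [
                        {"boundingBox": "20,65,150,25", "text": "OCR"},
                    ],
                },
            ],
        }
    ]
}

def extract_lines(result):
    """Join the words of every line in every region into plain strings."""
    lines = []
    for region in result["regions"]:
        for line in region["lines"]:
            lines.append(" ".join(word["text"] for word in line["words"]))
    return lines

text_lines = extract_lines(sample_result)
print(text_lines)  # ['Hello world', 'OCR']
```

The bounding boxes are ignored here; an application that highlights text in the original image would read them at whichever level it needs.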
69
What is Read API
Better suited than the OCR API to text-heavy images: the OCR API can return false positives when an image is considered text-dominant. The Read API uses the latest recognition models and is optimized for images that have a lot of text or a lot of visual noise
70
What 3-step process must your application do to use Read API
1. Submit image to API and retrieve operation ID in response 2. Use operation ID to check on the status of the image analysis operation, and wait until it has completed 3. Retrieve the results of the operation
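
The three steps form a submit, poll, retrieve pattern. The sketch below runs against a fake in-memory service so that it works offline; `FakeReadService`, its method names, and the status strings are all hypothetical stand-ins, not the real client API.

```python
import time

class FakeReadService:
    """Offline stand-in for the real service (hypothetical, for illustration)."""

    def __init__(self, polls_until_done=2):
        self._polls_until_done = polls_until_done
        self._remaining = {}

    def submit(self, image_bytes):
        operation_id = f"op-{len(self._remaining) + 1}"
        self._remaining[operation_id] = self._polls_until_done
        return operation_id

    def status(self, operation_id):
        if self._remaining[operation_id] > 0:
            self._remaining[operation_id] -= 1
            return "running"
        return "succeeded"

    def results(self, operation_id):
        return {"pages": [{"lines": [{"text": "Hello world"}]}]}

def read_text(service, image_bytes, poll_seconds=0.01, max_polls=50):
    # Step 1: submit the image and keep the operation ID from the response.
    operation_id = service.submit(image_bytes)
    # Step 2: check the status until the analysis has completed.
    for _ in range(max_polls):
        if service.status(operation_id) == "succeeded":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"operation {operation_id} did not finish")
    # Step 3: retrieve the results of the operation.
    return service.results(operation_id)

result = read_text(FakeReadService(), b"fake image bytes")
print(result["pages"][0]["lines"][0]["text"])  # Hello world
```

The polling loop is what makes the Read API asynchronous: the caller is free to set its own wait interval and give up after a limit, instead of blocking on a single long request.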
71
How are results from the Read API arranged
Into a hierarchy 1. Pages - one for each page of text, including information about the page size and orientation 2. Lines - the lines of text on a page 3. Words - words in a line of text. Each line and word includes bounding box coordinates indicating its position on the page
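
One way to picture this hierarchy is as nested record types. The classes and field names below are hypothetical, chosen to match the card; the bounding box is assumed to be a flat list of corner coordinates.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Word:
    text: str
    bounding_box: List[float]   # assumed: corner coordinates on the page

@dataclass
class Line:
    text: str
    bounding_box: List[float]
    words: List[Word] = field(default_factory=list)

@dataclass
class Page:
    width: float
    height: float
    orientation_degrees: float
    lines: List[Line] = field(default_factory=list)

def page_text(page):
    """Return the text of a page, one line of text per output line."""
    return "\n".join(line.text for line in page.lines)

page = Page(
    width=600, height=800, orientation_degrees=0.0,
    lines=[
        Line("Hello world", [10, 10, 200, 10, 200, 40, 10, 40], [
            Word("Hello", [10, 10, 90, 10, 90, 40, 10, 40]),
            Word("world", [110, 10, 200, 10, 200, 40, 110, 40]),
        ]),
    ],
)
print(page_text(page))  # Hello world
```

Page size and orientation live on the page, while position information is repeated at the line and word levels so that either can be located on its own.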
72
What does the Form Recognizer in Azure provide
Intelligent form processing capabilities that you can use to automate the processing of data in documents such as forms, invoices, and receipts
73
How does Form Recognizer support automated document processing
1. Pre-built receipt model - provided out-of-the-box and is trained to recognize and extract data from sales receipts 2. Custom models - extract key/value pairs and table data from forms. Custom models are trained using your own data, which helps to tailor this model to your specific forms. Starting with only 5 samples of your forms, you can train the custom model. After the first training exercise, you can evaluate the results and consider if you need to add more samples and re-train.