4. Choosing the Right ML Infrastructure Flashcards
ML models for video and images require a large number of computations for each loop of training. What mathematical calculations are involved?
Matrix multiplications, additions, subtractions, and differentiation (computing gradients)
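To make those calculations concrete, here is a minimal sketch of one training loop for a tiny linear model, written in plain Python so each operation is visible. The model, the data, the learning rate, and the helper names (matmul, transpose, train_step) are all illustrative assumptions; real image and video models run the same kinds of operations at far larger scale on accelerators.

```python
# One gradient-descent step for a tiny linear model y_hat = X @ w.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Swap the rows and columns of matrix a."""
    return [list(row) for row in zip(*a)]

def train_step(X, y, w, lr):
    """Run one step of gradient descent on mean squared error."""
    n = len(X)
    y_hat = matmul(X, w)                                # matrix multiplication
    err = [[y_hat[i][0] - y[i][0]] for i in range(n)]   # subtraction
    loss = sum(e[0] ** 2 for e in err) / n              # additions
    # Derivative of the loss with respect to w: (2 / n) * X^T @ err
    grad = [[2.0 / n * g[0]] for g in matmul(transpose(X), err)]
    new_w = [[w[j][0] - lr * grad[j][0]] for j in range(len(w))]
    return new_w, loss

X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # first column acts as a bias term
y = [[3.0], [5.0], [7.0]]                 # generated from y = 1 + 2x
w = [[0.0], [0.0]]
losses = []
for _ in range(2000):
    w, loss = train_step(X, y, w, lr=0.05)
    losses.append(loss)
```

After the loop, w should be close to the bias 1 and slope 2 that generated the data.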
When you have an ML problem to solve, say image classification, you have three ways of approaching it:
Pretrained (fastest to develop and least expertise needed)
AutoML (in between)
Custom (most flexible, but expertise needed)
Pretrained models are already deployed and can be readily used via …
APIs
What is the biggest advantage of using pretrained models?
Ease of use and speed
How can developers use pre-trained models?
Through the CLI or the Python, Java, or Node.js SDKs.
Are pre-trained models serverless?
Yes
What is the biggest disadvantage of using pretrained models?
Less customizable
What can Vertex AI AutoML do for you?
Build your own model using your own data.
AutoML chooses the best ML algorithm, and the only thing that it needs is the data. What do you need to do?
Format the data and work on quality control
Unlike with pretrained models, with AutoML you have to provision cloud resources for training the model and deploying it on instances. What do you have to decide on?
Number of hours of instance time
Devices the models need to deploy on (cloud, phone, IoT)
What options do you have if pre-trained models and AutoML do not fit your needs?
Use custom models in Vertex AI
What pre-trained models does Google have?
Vision AI
Video AI
Natural Language AI
Translation AI
Speech‐to‐Text
Text‐to‐Speech
What solutions does Google have in addition to pretrained models?
Document AI
Contact Center AI
What can Vision AI do for you?
Perform image classification, detect objects and faces, and read handwriting (through optical character recognition).
What steps does Vision AI perform when classifying an image?
Detect objects in the photo
Get a set of labels for your image (e.g., table, plant, chair)
Get the dominant colors for the image
Categorize (e.g., Adult, Spoof, Medical, Violence)
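The four steps above map onto feature types that can be requested in a single call. The following is a minimal sketch, assuming the Cloud Vision REST endpoint https://vision.googleapis.com/v1/images:annotate and its JSON request shape. It only builds the request body; authentication and the HTTP call are left out, and the helper name build_annotate_request and the placeholder image bytes are hypothetical.

```python
import base64
import json

# Feature type names from the Cloud Vision REST API, matched to the steps above.
FEATURES = [
    {"type": "OBJECT_LOCALIZATION", "maxResults": 10},  # detect objects
    {"type": "LABEL_DETECTION", "maxResults": 10},      # labels such as table, plant, chair
    {"type": "IMAGE_PROPERTIES"},                       # dominant colors
    {"type": "SAFE_SEARCH_DETECTION"},                  # adult, spoof, medical, violence
]

def build_annotate_request(image_bytes):
    """Build the JSON body for a POST to the images:annotate endpoint."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": FEATURES,
            }
        ]
    }

# Placeholder bytes stand in for a real image file read in binary mode.
body = build_annotate_request(b"placeholder image bytes")
payload = json.dumps(body, indent=2)
```

Sending payload with a valid access token would return one response per request entry, with a separate section for each requested feature.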
What can Video AI do?
Recognize objects, places, and actions in videos.
Give 3 use cases for Video AI.
Build a video recommendation system
Create an index of your video archives
Map advertisements to your content
What does Natural Language AI do?
Provides insights from unstructured text using pretrained machine learning models. Features include entity extraction, sentiment analysis, syntax analysis, and general categorization.
What does entity extraction do?
Identifies entities such as the names of people, organizations, products, events, locations, and so on.
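As a rough illustration of what entity extraction returns, the sketch below groups entities by type. The sample dictionary is hand-written to resemble an analyzeEntities response (name, type, salience); it is not real API output, and the helper group_entities and the salience cutoff are hypothetical.

```python
from collections import defaultdict

# Hand-written sample shaped like an analyzeEntities response (not real output).
response = {
    "entities": [
        {"name": "Alice Chen", "type": "PERSON", "salience": 0.52},
        {"name": "Example Corp", "type": "ORGANIZATION", "salience": 0.31},
        {"name": "Lisbon", "type": "LOCATION", "salience": 0.12},
        {"name": "DevFest", "type": "EVENT", "salience": 0.05},
    ]
}

def group_entities(resp, min_salience=0.0):
    """Group entity names by type, most salient first, dropping weak matches."""
    grouped = defaultdict(list)
    ranked = sorted(resp["entities"], key=lambda e: e["salience"], reverse=True)
    for entity in ranked:
        if entity["salience"] >= min_salience:
            grouped[entity["type"]].append(entity["name"])
    return dict(grouped)

by_type = group_entities(response, min_salience=0.1)
```

Here salience is treated as a measure of how central an entity is to the text, so the cutoff filters out passing mentions.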
What does sentiment analysis do?
Provides a positive, negative, or neutral score, along with a magnitude, for each sentence, each entity, and the whole text.
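Score and magnitude are easier to reason about with an example. This is a minimal sketch that assumes the score ranges from -1 to 1 and the magnitude is non-negative, and that the response has a documentSentiment entry plus a list of sentences. The sample response is hand-written, and the cutoffs in label are hypothetical values to tune for your own data.

```python
def label(score, magnitude, score_cutoff=0.25, magnitude_cutoff=1.0):
    """Turn a score in [-1, 1] and a magnitude >= 0 into a rough label."""
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # Near-zero score: a large magnitude suggests strong feelings that cancel out.
    return "mixed" if magnitude >= magnitude_cutoff else "neutral"

# Hand-written sample shaped like an analyzeSentiment response (not real output).
response = {
    "documentSentiment": {"score": 0.0, "magnitude": 1.7},
    "sentences": [
        {"text": {"content": "The camera is superb."},
         "sentiment": {"score": 0.9, "magnitude": 0.9}},
        {"text": {"content": "The battery is awful."},
         "sentiment": {"score": -0.8, "magnitude": 0.8}},
        {"text": {"content": "It ships in a box."},
         "sentiment": {"score": 0.0, "magnitude": 0.0}},
    ],
}

doc = response["documentSentiment"]
document_label = label(doc["score"], doc["magnitude"])
sentence_labels = [
    (s["text"]["content"],
     label(s["sentiment"]["score"], s["sentiment"]["magnitude"]))
    for s in response["sentences"]
]
```

The document as a whole scores near zero even though two sentences are strongly opinionated, which is why the magnitude is worth checking alongside the score.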
What does syntax analysis do?
Identifies the part of speech, the dependencies between words, the lemma, and the morphology of the text.
Give 2 use cases for Natural Language AI.
Measure the customer sentiment toward a particular product.
Use the Healthcare Natural Language API to understand details specific to healthcare text, such as clinical notes or healthcare research documents.
Translation AI has 2 levels, Basic and Advanced. What are the main differences?
The Advanced version can use a glossary (a dictionary of terms mapped from the source language to the target language) and can also translate entire documents (PDFs, DOCs, etc.).
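A glossary is attached per request. The sketch below builds a request body for the Advanced (v3) translateText method, assuming its JSON fields are contents, sourceLanguageCode, targetLanguageCode, mimeType, and glossaryConfig. The project, location, and glossary IDs are hypothetical placeholders, as is the helper name build_translate_request; no request is sent.

```python
import json

def build_translate_request(texts, source, target, glossary_name=None):
    """Build a JSON body for translateText, optionally with a glossary."""
    body = {
        "contents": list(texts),
        "sourceLanguageCode": source,
        "targetLanguageCode": target,
        "mimeType": "text/plain",
    }
    if glossary_name is not None:
        # A glossary is referenced by its full resource name.
        body["glossaryConfig"] = {"glossary": glossary_name}
    return body

# Hypothetical resource name; replace every segment with your own values.
GLOSSARY = "projects/my-project/locations/us-central1/glossaries/my-glossary"

plain = build_translate_request(["Open the dashboard."], "en", "de")
with_glossary = build_translate_request(["Open the dashboard."], "en", "de", GLOSSARY)
payload = json.dumps(with_glossary, indent=2)
```

Without glossary_name the body is an ordinary translation request; with it, the terms in the glossary take precedence over the model's default wording.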
What does the Media Translation API do?
Translates speech audio in a source language directly into text in a target language, in real time.
What is a use case for the Speech‐to‐Text service, which converts recorded or streaming audio into text?
Creating subtitles for video recordings and for streaming video.
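To show how a transcript becomes subtitles, here is a minimal sketch that converts timed words into SRT cues. It assumes word-level timestamps arrive as strings such as "1.250s" in startTime and endTime fields, as Speech-to-Text can return when word time offsets are enabled. The sample words are hand-written, and the helper names and the max_words grouping rule are hypothetical.

```python
def to_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def words_to_srt(words, max_words=7):
    """Group timed words into numbered SRT cues of at most max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = float(chunk[0]["startTime"].rstrip("s"))
        end = float(chunk[-1]["endTime"].rstrip("s"))
        text = " ".join(w["word"] for w in chunk)
        cues.append(
            f"{len(cues) + 1}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{text}\n"
        )
    return "\n".join(cues)

# Hand-written sample of timed words (not real API output).
words = [
    {"word": "Welcome", "startTime": "0s", "endTime": "0.400s"},
    {"word": "to", "startTime": "0.400s", "endTime": "0.500s"},
    {"word": "the", "startTime": "0.500s", "endTime": "0.600s"},
    {"word": "demo", "startTime": "0.600s", "endTime": "1.250s"},
]
srt = words_to_srt(words, max_words=2)
```

Grouping by a fixed word count keeps the example short; a production subtitle tool would more likely break cues on pauses or punctuation.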