AI Services Flashcards
Rekognition
- Pre-trained deep learning API: you submit an image and the API returns its analysis (e.g. a classification). Can also be called from a Python script (boto3) or a Lambda function
1. Object and scene detection
2. Image moderation
3. Facial analysis (estimated age range, gender, expression, etc.)
4. Celebrity recognition
5. Face comparison
6. Text in image
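A minimal boto3 sketch of the first feature above (object and scene detection). The function name, bucket and key are hypothetical; the client is passed in so the function can be tried with a stub before pointing it at a real account.

```python
def detect_labels_in_s3_image(rekognition, bucket, key,
                              min_confidence=80.0, max_labels=10):
    """Return (label name, confidence) pairs for an image stored in S3.

    `rekognition` is expected to be a boto3 Rekognition client,
    e.g. boto3.client("rekognition").
    """
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"])
            for label in response["Labels"]]


# Usage (needs AWS credentials; bucket and key are placeholders):
#   import boto3
#   labels = detect_labels_in_s3_image(
#       boto3.client("rekognition"), "my-bucket", "photos/dog.jpg")
```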
Rekognition: potential use cases
- Create a filter to prevent inappropriate images being sent via a messaging platform. This could include nudity or offensive text
- Enhance metadata catalogue of an image library to include the number of people in each image
- Scan an image library to detect instances of famous people
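The first two use cases above could be sketched roughly as follows. Function names and thresholds are illustrative assumptions. Counting detected faces is only an approximation of "number of people", and filtering offensive text would need a separate detect_text call plus your own word list (not shown).

```python
def check_image(rekognition, image_bytes, min_confidence=60.0):
    """Return (allowed, flagged_label_names) for raw image bytes.

    An image is treated as allowed only when Rekognition returns
    no moderation labels at or above `min_confidence`.
    """
    response = rekognition.detect_moderation_labels(
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    flagged = [label["Name"] for label in response["ModerationLabels"]]
    return (not flagged), flagged


def count_faces(rekognition, bucket, key):
    """Count faces detected in an S3 image (a proxy for people count)."""
    response = rekognition.detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return len(response["FaceDetails"])
```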
Rekognition Video
- Image and video analysis, pre-trained deep learning API
- Process stored videos in S3, or analyse streaming video (e.g. a live camera feed)
- Could stream video content through Kinesis Video Streams into Rekognition Video
- Detect people of interest in a live video stream, or alternatively detect offensive content within videos uploaded to a social media platform
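For the stored-video case (offensive content in uploaded videos), a rough boto3 sketch is below. Video analysis is asynchronous: you start a job, then fetch results once it finishes. The polling loop is a simplification chosen for brevity, and only the first page of results is read; the function name and defaults are assumptions.

```python
import time


def moderate_stored_video(rekognition, bucket, key,
                          poll_seconds=5, max_polls=60):
    """Return (timestamp_ms, label name) pairs for a video stored in S3."""
    job = rekognition.start_content_moderation(
        Video={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    for _ in range(max_polls):
        result = rekognition.get_content_moderation(JobId=job["JobId"])
        if result["JobStatus"] != "IN_PROGRESS":
            break  # job has either succeeded or failed
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("moderation job did not finish in time")

    if result["JobStatus"] != "SUCCEEDED":
        raise RuntimeError("moderation job ended with status "
                           + result["JobStatus"])

    # First page only; longer videos may return a NextToken for more results.
    return [(item["Timestamp"], item["ModerationLabel"]["Name"])
            for item in result["ModerationLabels"]]
```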
Polly
- Text to speech service
- Multiple languages supported, male and female voices, custom lexicons
- Speech Synthesis Markup Language (SSML): lets you further control how the Polly speech is returned (e.g. whispering)
- Lexicon: define how your own specific words/phrases are pronounced
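A small sketch of the SSML whispering example with boto3. The helper names and the default voice "Joanna" are assumptions; as far as I know the whispered effect works with standard voices but not neural ones, so check the voice you pick.

```python
from xml.sax.saxutils import escape


def whispered_ssml(text):
    """Wrap plain text in SSML that asks Polly to whisper it."""
    # escape() protects characters such as & and < that would break the XML.
    return ('<speak><amazon:effect name="whispered">'
            + escape(text)
            + '</amazon:effect></speak>')


def synthesize_to_mp3(polly, ssml, voice_id="Joanna"):
    """Return MP3 bytes for an SSML document.

    `polly` is expected to be a boto3 Polly client,
    e.g. boto3.client("polly").
    """
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return response["AudioStream"].read()
```

Custom lexicons that have already been uploaded can be applied by also passing LexiconNames to synthesize_speech.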
Polly: use cases
- Create accessibility tools to “read” web content to blind people
- Provide automatically generated announcements via a PA system
- Create an automated voice response solution (commonly called interactive voice response, IVR) for a telephony system (e.g. when you ring a bank and an automatic voice responds)
Translate
- Text translation service
- Batch or real-time
- Supports several languages
- Supports custom terminology
- Use cases:
- Enhance online customer chat applications to translate conversations in real-time
- Batch translate documents within a multilingual company
- Create a news publishing solution to convert posted stories to multiple languages
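A rough sketch of the real-time chat use case with boto3. The function name is hypothetical; "auto" asks Translate to detect the source language itself, and any terminology names passed in are assumed to have been uploaded already.

```python
def translate_message(translate, text, target_language,
                      source_language="auto", terminology_names=None):
    """Translate one message and return the translated text.

    `translate` is expected to be a boto3 Translate client,
    e.g. boto3.client("translate").
    """
    kwargs = {
        "Text": text,
        "SourceLanguageCode": source_language,
        "TargetLanguageCode": target_language,
    }
    if terminology_names:
        # Apply previously uploaded custom terminology.
        kwargs["TerminologyNames"] = list(terminology_names)
    response = translate.translate_text(**kwargs)
    return response["TranslatedText"]
```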
Transcribe
- Automatic speech recognition
- Supports many languages as well as custom vocab
- Accepts real-time audio (e.g. from a microphone) or an uploaded pre-recorded file
- Custom vocab: industry-specific words can be uploaded
- Use cases:
- Create a call centre monitoring solution that integrates with other services to analyse caller sentiment
- Create a solution to enable text search of media with spoken words
- Provide a closed captioning solution for online video training
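A sketch of the pre-recorded file path with boto3, optionally using a custom vocabulary. The function name, job name and default language are placeholders. The job runs asynchronously, so the returned status is normally still in progress and has to be checked again later.

```python
def start_transcription(transcribe, job_name, media_uri,
                        language_code="en-GB", vocabulary_name=None):
    """Start a transcription job and return its initial status.

    `transcribe` is expected to be a boto3 Transcribe client,
    e.g. boto3.client("transcribe"); `media_uri` points at the
    audio file in S3.
    """
    kwargs = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
    }
    if vocabulary_name:
        # Use a custom vocabulary that was created beforehand.
        kwargs["Settings"] = {"VocabularyName": vocabulary_name}
    response = transcribe.start_transcription_job(**kwargs)
    return response["TranscriptionJob"]["TranscriptionJobStatus"]
```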
Comprehend
- Text analysis, NLP
- Key phrase extraction
- Sentiment analysis
- Entity recognition
- Syntax analysis (verbs, nouns, word types)
- Medical version (Comprehend Medical)
- Custom entities (product numbers etc)
- Language detection
- Custom classification (i.e. feed in pre-classified documents)
- Topic modelling
- Multiple language support
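Two of the features above (language detection and sentiment analysis) could be combined roughly like this with boto3. The function name and return shape are assumptions; sentiment analysis supports fewer languages than detection, so a real version would handle an unsupported language code.

```python
def analyse_message(comprehend, text):
    """Detect the dominant language, then the sentiment in that language.

    `comprehend` is expected to be a boto3 Comprehend client,
    e.g. boto3.client("comprehend").
    """
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    # Pick the candidate language with the highest confidence score.
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    sentiment = comprehend.detect_sentiment(Text=text,
                                            LanguageCode=language_code)
    return {
        "language": language_code,
        "sentiment": sentiment["Sentiment"],
        "scores": sentiment["SentimentScore"],
    }
```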
Comprehend use cases
- Perform customer sentiment analysis on inbound messages to the support system
- Create a system to label unstructured (clinical) data to assist in research and analysis
- Determine the topics from transcribed audio recordings of company meetings
Amazon Lex
- Built on the same technology that powers Alexa
- Conversation interface service, chatbot
- Voice enabled or text
- Automatic speech recognition
- Natural language understanding
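A minimal sketch of sending text to a bot, assuming a Lex V2 bot that has already been built and given an alias. The function name is hypothetical and all IDs are placeholders supplied by the caller.

```python
def send_text_to_bot(lex_runtime, bot_id, bot_alias_id, session_id, text,
                     locale_id="en_US"):
    """Send one user utterance to a Lex V2 bot and return its text replies.

    `lex_runtime` is expected to be a boto3 Lex V2 runtime client,
    e.g. boto3.client("lexv2-runtime").
    """
    response = lex_runtime.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,  # keeps the conversation state per user
        text=text,
    )
    # A turn may produce no messages, or messages without text content.
    return [message["content"]
            for message in response.get("messages", [])
            if "content" in message]
```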
Lex use cases
- Create a chatbot that handles customer support requests directly on the product page of a website
- Create an automated receptionist that directs people as they enter a building
- Provide an interactive voice interface to […] application
Service chaining - scenario 1:
- Lambda function 1 retrieves text data from S3
- Passes it to Translate, which returns the translation to Lambda function 1
- Lambda function 1 re-directs results to Comprehend for analysis
- Analysis results are sent back to same Lambda function
The issue with this service chain is that a single Lambda function is handling too many responsibilities. Best practice is for one Lambda function to handle one task
Service chaining - scenario 2:
- Same architecture as scenario 1, except we have Lambda function 1 between S3 and Translate, and Lambda function 2 taking the output of Lambda function 1 and passing on to Comprehend, then onwards
The issue with this service chain is that it is not best practice to have separate Lambda functions that are dependent on each other.
Service chaining - scenario 3:
- Lambda function 1 retrieves data from S3 and passes it on to an AWS Step Functions state machine (the “Step Function”).
- Within the Step Function, Lambda function 2 deals with Translate and Lambda function 3 deals with Comprehend.
- The Step Function, not the functions themselves, coordinates the hand-off between Lambda functions 2 and 3
This architecture is best practice because the Step Function controls the “state” of the workflow (i.e. once one Lambda function finishes, the Step Function then triggers the next Lambda function)
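Scenario 3 could be expressed as an Amazon States Language definition along these lines. This is a sketch only: the state names and the builder function are assumptions, and the Lambda ARNs are placeholders passed in by the caller.

```python
import json


def build_translate_then_analyse_definition(translate_lambda_arn,
                                            comprehend_lambda_arn):
    """Build a two-step state machine: translate, then analyse."""
    return {
        "Comment": "Translate text, then analyse it with Comprehend",
        "StartAt": "TranslateText",
        "States": {
            # Lambda function 2: calls Translate.
            "TranslateText": {
                "Type": "Task",
                "Resource": translate_lambda_arn,
                "Next": "AnalyseText",
            },
            # Lambda function 3: calls Comprehend on the translated text.
            "AnalyseText": {
                "Type": "Task",
                "Resource": comprehend_lambda_arn,
                "End": True,
            },
        },
    }


def definition_as_json(definition):
    """Serialise the definition in the form Step Functions expects."""
    return json.dumps(definition, indent=2)
```

The JSON string is what would be passed as the definition when creating the state machine; neither Lambda function needs to know the other exists.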
Step Functions with stream data
We can use Wait and Choice states within the Step Function to “check in” with Transcribe (or another service) and confirm its status before passing data on to Comprehend
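The check-in loop might look like the following Amazon States Language sketch. It assumes a hypothetical status-checking Lambda function that returns an object with a "status" field mirroring the Transcribe job status; state names, ARNs and the wait time are placeholders.

```python
def build_transcribe_polling_definition(start_arn, status_arn,
                                        comprehend_arn, wait_seconds=30):
    """Build a state machine that polls Transcribe before calling Comprehend."""
    return {
        "StartAt": "StartTranscription",
        "States": {
            "StartTranscription": {
                "Type": "Task", "Resource": start_arn,
                "Next": "WaitForTranscribe",
            },
            # Wait state: pause before checking in again.
            "WaitForTranscribe": {
                "Type": "Wait", "Seconds": wait_seconds,
                "Next": "CheckStatus",
            },
            "CheckStatus": {
                "Type": "Task", "Resource": status_arn,
                "Next": "IsTranscriptionDone",
            },
            # Choice state: branch on the status the checker returned.
            "IsTranscriptionDone": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "COMPLETED",
                     "Next": "AnalyseTranscript"},
                    {"Variable": "$.status", "StringEquals": "FAILED",
                     "Next": "TranscriptionFailed"},
                ],
                "Default": "WaitForTranscribe",  # still running: loop again
            },
            "AnalyseTranscript": {
                "Type": "Task", "Resource": comprehend_arn, "End": True,
            },
            "TranscriptionFailed": {
                "Type": "Fail", "Error": "TranscriptionFailed",
                "Cause": "Transcribe reported a failed job",
            },
        },
    }
```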