Modeling 3 Flashcards
Amazon Comprehend
Higher-level AI/ML services beyond SageMaker
it performs NLP and text analytics
Amazon Comprehend input
social media, emails, web pages, documents, transcripts, medical records (Comprehend Medical)
What does Amazon Comprehend extract?
entities, key phrases, sentiment, language, syntax, topics, document classification
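A minimal sketch of how the synchronous extractions above might be called through boto3. The helper name `analyze_text` and the optional `client` argument are illustrative assumptions, not part of the flashcards; topics and document classification run as separate jobs and are not shown here.

```python
def analyze_text(text, client=None, language_code="en"):
    """Run several Comprehend detectors on one piece of text."""
    if client is None:
        import boto3  # imported lazily so a stub client can be passed in tests
        client = boto3.client("comprehend")

    # Language detection is the only call here that takes no LanguageCode.
    languages = client.detect_dominant_language(Text=text)["Languages"]
    entities = client.detect_entities(Text=text, LanguageCode=language_code)["Entities"]
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)["KeyPhrases"]
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)["Sentiment"]
    tokens = client.detect_syntax(Text=text, LanguageCode=language_code)["SyntaxTokens"]

    return {
        "languages": [lang["LanguageCode"] for lang in languages],
        "entities": [(e["Type"], e["Text"]) for e in entities],
        "key_phrases": [p["Text"] for p in phrases],
        "sentiment": sentiment,
        "part_of_speech": [(t["Text"], t["PartOfSpeech"]["Tag"]) for t in tokens],
    }
```

Calling `analyze_text("some text")` with no client needs AWS credentials and a region configured for boto3.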
Can you train Amazon Comprehend on your own data?
yes, you can train it on your own data
you can also use the out-of-the-box models
Amazon Translate
uses deep learning to translate text
Can you define custom terminology for Amazon Translate?
yes you can
using CSV or TMX format
it’s appropriate for proper names, brand names, etc.
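A sketch of the CSV route for custom terminology, assuming a header row of language codes followed by one row per term. The helper names and the default `en` to `fr` pair are illustrative; the boto3 calls used are `import_terminology` and `translate_text`.

```python
import csv
import io


def build_terminology_csv(source_lang, target_lang, terms):
    """Return CSV bytes: a header row of language codes, then one row per term."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([source_lang, target_lang])
    for source_term, target_term in terms.items():
        writer.writerow([source_term, target_term])
    return buf.getvalue().encode("utf-8")


def translate_with_terminology(text, name, csv_bytes, client=None,
                               source_lang="en", target_lang="fr"):
    """Upload the terminology, then translate text with it applied."""
    if client is None:
        import boto3  # lazy import so a stub client can be used instead
        client = boto3.client("translate")
    client.import_terminology(
        Name=name,
        MergeStrategy="OVERWRITE",
        TerminologyData={"File": csv_bytes, "Format": "CSV"},
    )
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode=target_lang,
        TerminologyNames=[name],
    )
    return response["TranslatedText"]
```

Mapping a brand name to itself, as in `{"Amazon Polly": "Amazon Polly"}`, is one way to keep it from being translated.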
Amazon Transcribe
Speech to text
Does Amazon Transcribe support streaming audio?
yes it does
over HTTP/2 or WebSocket
you specify the language
- originally French, English, and Spanish only; more languages have been added since
Amazon Transcribe input
FLAC
MP3
MP4
WAV
Does Amazon Transcribe do speaker identification?
yes it does
specify how many speakers there are and it will do the rest
Does Amazon Transcribe do channel identification?
yes
e.g. two callers on separate channels can be transcribed separately
the transcripts are merged based on the timing of utterances
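Speaker identification and channel identification are both switched on through the `Settings` argument of a batch transcription job. The sketch below only builds the request; the function names are hypothetical, and treating the two modes as mutually exclusive is an assumption to check against the API reference.

```python
def transcription_job_params(job_name, media_uri, media_format="mp3",
                             language_code="en-US", max_speakers=None,
                             channel_identification=False):
    """Build keyword arguments for Transcribe's start_transcription_job."""
    if max_speakers is not None and channel_identification:
        # Assumption: one request uses either speaker labels or channels.
        raise ValueError("choose speaker labels or channel identification, not both")
    params = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }
    if max_speakers is not None:
        # Tell the service how many speakers to expect; it labels the rest.
        params["Settings"] = {"ShowSpeakerLabels": True,
                              "MaxSpeakerLabels": max_speakers}
    elif channel_identification:
        # Each audio channel is transcribed separately, then merged by time.
        params["Settings"] = {"ChannelIdentification": True}
    return params


def start_job(params, client=None):
    """Submit the job; needs AWS credentials unless a stub client is passed."""
    if client is None:
        import boto3
        client = boto3.client("transcribe")
    return client.start_transcription_job(**params)
```

For a two-person call recorded on one channel, `transcription_job_params("call-1", "s3://my-bucket/call.mp3", max_speakers=2)` would be the starting point; the bucket and job name here are placeholders.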
Does Amazon Transcribe support custom vocabularies?
yes it does
give it a list of
special words, names, acronyms
it can also use vocabulary tables that describe how a term sounds
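A sketch of the list form of a custom vocabulary, passed to boto3's `create_vocabulary`. The helper names are hypothetical, and joining multi-word terms with hyphens is an assumption about the expected phrase format that should be checked against the Transcribe documentation.

```python
def vocabulary_params(name, phrases, language_code="en-US"):
    """Build keyword arguments for Transcribe's create_vocabulary."""
    # Assumption: multi-word phrases are written with hyphens between words.
    cleaned = ["-".join(p.split()) for p in phrases if p.strip()]
    return {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "Phrases": cleaned,
    }


def create_vocabulary(params, client=None):
    """Create the vocabulary; needs AWS credentials unless a stub is passed."""
    if client is None:
        import boto3
        client = boto3.client("transcribe")
    return client.create_vocabulary(**params)
```

Once the vocabulary is ready, a transcription job can refer to it by name through `Settings={"VocabularyName": ...}`.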
Amazon Polly
Neural text-to-speech, many voices & languages
supports:
- Lexicons
- SSML
- Speech Marks
Does Amazon Polly handle Lexicons?
yes it does
e.g. map "W3C" to "World Wide Web Consortium"
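Polly lexicons are written in the W3C Pronunciation Lexicon Specification (PLS) XML format. The sketch below builds a small lexicon that aliases one written form to its spoken form and then uses it for synthesis; the helper names and the `Joanna` voice are illustrative choices.

```python
from xml.sax.saxutils import escape

PLS_NAMESPACE = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(aliases, lang="en-US"):
    """Return PLS XML mapping each written form (grapheme) to a spoken alias."""
    lexemes = "".join(
        "  <lexeme>\n"
        f"    <grapheme>{escape(grapheme)}</grapheme>\n"
        f"    <alias>{escape(alias)}</alias>\n"
        "  </lexeme>\n"
        for grapheme, alias in aliases.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NAMESPACE}" '
        f'alphabet="ipa" xml:lang="{lang}">\n'
        f"{lexemes}"
        "</lexicon>\n"
    )


def speak_with_lexicon(text, lexicon_name, lexicon_xml, client=None,
                       voice_id="Joanna"):
    """Store the lexicon, then synthesize text with it applied."""
    if client is None:
        import boto3  # lazy import so a stub client can be used instead
        client = boto3.client("polly")
    client.put_lexicon(Name=lexicon_name, Content=lexicon_xml)
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        LexiconNames=[lexicon_name],
    )
    return response["AudioStream"].read()
```

With `build_lexicon({"W3C": "World Wide Web Consortium"})` stored under a short alphanumeric name, the text "W3C" would be read out in its expanded form.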
SSML
Speech Synthesis Markup Language
alternative to plain text
gives control over emphasis, pronunciation, breathing, whispering, speech rate, pitch, pauses
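A short SSML sketch touching several of the controls listed above: speech rate and pitch, a pause, emphasis, a breath, and whispering. The helper names are hypothetical. Some of these tags are, as far as documented, not available on neural voices, so the synthesis call below requests the standard engine as an assumption.

```python
from xml.sax.saxutils import escape


def build_ssml(intro, important, aside):
    """Wrap three pieces of text in SSML that varies how each is spoken."""
    return (
        "<speak>"
        f'<prosody rate="slow" pitch="low">{escape(intro)}</prosody>'
        '<break time="500ms"/>'
        f'<emphasis level="strong">{escape(important)}</emphasis>'
        '<amazon:breath duration="medium" volume="x-loud"/>'
        f'<amazon:effect name="whispered">{escape(aside)}</amazon:effect>'
        "</speak>"
    )


def synthesize_ssml(ssml, client=None, voice_id="Joanna"):
    """Send SSML instead of plain text by setting TextType to 'ssml'."""
    if client is None:
        import boto3  # lazy import so a stub client can be used instead
        client = boto3.client("polly")
    response = client.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine="standard",
    )
    return response["AudioStream"].read()
```

Escaping the inserted text matters because characters such as `&` and `<` would otherwise make the SSML invalid.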
Polly Speech Marks
can encode when a sentence or word starts and ends in the audio stream
useful for lip-syncing animation
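Speech marks come back as line-delimited JSON when `OutputFormat` is `json`. The sketch below parses them and derives per-word timings for something like lip-sync; it assumes each mark carries `time` (milliseconds into the audio), `type`, and `value` fields, and it approximates a word's end by the start of the next word. The helper names are hypothetical.

```python
import json


def parse_speech_marks(raw):
    """Parse line-delimited JSON speech marks into a list of dicts."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_timings(marks):
    """Return (word, start_ms, end_ms); end is the next word's start time."""
    words = [m for m in marks if m["type"] == "word"]
    timings = []
    for current, following in zip(words, words[1:] + [None]):
        # The last word has no successor, so its end time is left unknown.
        end = following["time"] if following else None
        timings.append((current["value"], current["time"], end))
    return timings


def fetch_speech_marks(text, client=None, voice_id="Joanna"):
    """Request sentence and word marks instead of audio."""
    if client is None:
        import boto3  # lazy import so a stub client can be used instead
        client = boto3.client("polly")
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="json",
        VoiceId=voice_id,
        SpeechMarkTypes=["sentence", "word"],
    )
    return parse_speech_marks(response["AudioStream"].read())
```

The audio itself would come from a second `synthesize_speech` call with an audio output format and the same text and voice.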
Amazon Rekognition
Computer vision
Object and scene detection
- can use a collection of known faces
Image moderation
Facial analysis
Celebrity recognition
Face comparison
Text in image
Video analysis
- objects
- people
- celebrities marked on timeline
- people pathing
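A sketch of working with a collection of known faces: one helper enrolls a labeled face, the other searches the collection for the best match to a new image. The helper names, the 90 percent threshold, and the use of `ExternalImageId` as a person label are assumptions for illustration; the collection itself would be created beforehand with `create_collection`.

```python
def enroll_face(collection_id, image, label, client=None):
    """Add the face found in `image` to the collection under `label`."""
    if client is None:
        import boto3  # lazy import so a stub client can be used instead
        client = boto3.client("rekognition")
    response = client.index_faces(
        CollectionId=collection_id,
        Image=image,
        ExternalImageId=label,
        MaxFaces=1,
    )
    return [record["Face"]["FaceId"] for record in response["FaceRecords"]]


def best_face_match(collection_id, image, client=None, threshold=90.0):
    """Return (label, similarity) for the closest known face, or None."""
    if client is None:
        import boto3
        client = boto3.client("rekognition")
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image=image,
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    if not matches:
        return None
    top = matches[0]
    return top["Face"].get("ExternalImageId"), top["Similarity"]
```

Here `image` is a dict such as `{"S3Object": {"Bucket": ..., "Name": ...}}` or `{"Bytes": ...}`.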
Amazon Rekognition input
Video
- Kinesis Video Streams (H.264 encoded, 5-30 FPS; favor resolution over frame rate)
Image
- S3
- image bytes sent as part of the request
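The two image input routes map onto two shapes of the `Image` argument in boto3. The sketch below shows both and passes either one to `detect_labels` for object and scene detection; the helper names and default thresholds are illustrative.

```python
def image_from_s3(bucket, key):
    """Reference an image already stored in S3."""
    return {"S3Object": {"Bucket": bucket, "Name": key}}


def image_from_bytes(data):
    """Send the raw image bytes inside the request itself."""
    return {"Bytes": data}


def detect_labels(image, client=None, max_labels=10, min_confidence=80.0):
    """Return (label name, confidence) pairs for objects and scenes."""
    if client is None:
        import boto3  # lazy import so a stub client can be used instead
        client = boto3.client("rekognition")
    response = client.detect_labels(
        Image=image,
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

For a local file, `image_from_bytes(open(path, "rb").read())` would be the input; for a stored image, `image_from_s3(bucket, key)`.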