04-Natural Language Processing Flashcards
What is Natural Language Processing
Area of AI that deals with creating software that understands written and spoken language
What software can Natural Language Processing allow you to create
- Analyze text to extract key phrases and recognize entities (such as places, dates, or people)
- Perform sentiment analysis to determine how positive or negative the language used in a document is
- Interpret spoken language, and synthesize speech responses
- Automatically translate spoken or written phrases between languages
- Interpret commands and determine appropriate actions
Which cognitive services build natural language processing solutions
- Text analytics
- Translator text
- Speech
- Language Understanding
What is Text analytics
Analyzes text and extracts key phrases, detects entities (such as places, dates, and people), and evaluates sentiment (how positive or negative a document is)
What is Translator text
Translate between more than 60 languages
What is Speech service
RECOGNIZE and SYNTHESIZE speech, and TRANSLATE spoken languages
What is Language Understanding
Train a language model that can understand spoken or text-based commands
What will Language detection detect
- Language name, e.g. “English”
- ISO 639-1 language code, e.g. “en”
- Score indicating level of confidence in the language detection
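A hedged sketch of the kind of result language detection returns; the field names (detectedLanguage, iso6391Name, confidenceScore) follow the shape of recent Text Analytics responses but may differ by API version, so treat this as an illustration only:

```python
# Hypothetical language detection result: language name, ISO 639-1 code,
# and a confidence score. The exact JSON shape varies by API version.
detection_result = {
    "detectedLanguage": {
        "name": "English",
        "iso6391Name": "en",
        "confidenceScore": 0.99,
    }
}

# Pull out the three pieces of information the card lists.
lang = detection_result["detectedLanguage"]
print(f"{lang['name']} ({lang['iso6391Name']}): {lang['confidenceScore']}")
```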
What score is assigned to text that is ambiguous or has mixed language content
NaN
How does sentiment analysis evaluate text and return sentiment scores and labels
Uses a pre-built ML classification model. Returns a sentiment score in the range of 0 to 1, with 1 being most positive and 0 being most negative.
What score will sentiment analysis return if you pass French text but tell the service the language code is en for English
The service will return a score of precisely 0.5
What is Key Phrase Extraction
Evaluates text and identifies the key phrases that capture its main points.
What is Entity Recognition
An item of a particular type or category, and in some cases subtype, such as Person, Location, Organization, Quantity, etc.
Service also supports entity linking to help disambiguate entities by linking to a specific reference.
What is Speech recognition
Ability to detect and interpret spoken input
What is Speech synthesis
Ability to generate spoken output
What APIs are offered through the Speech cognitive service
- Speech-to-Text
- Text-to-Speech
What can you use speech-to-text for
Perform real-time or batch transcription of audio into text format. The audio source can be a real-time stream or an audio file
What model is used by the speech-to-text API
A model based on the Universal Language Model trained by Microsoft.
Optimized for conversational and dictation scenarios. Users can also create and train their own custom models, including acoustic, language, and pronunciation models.
What is real-time transcription
Used for presentations, demos, or any other scenario where a person is speaking.
Application needs to listen for incoming audio from a microphone or other audio input source such as audio file.
Application code streams the audio to the service, which returns the transcribed text.
What is batch transcription
Previously recorded and stored audio files are transcribed.
Can have audio stored. You can point to audio files with shared access signature (SAS) URI and asynchronously receive transcription results.
Runs in an asynchronous manner because batch jobs are scheduled on a best-effort basis. A job will normally execute within minutes of the request, but there is no estimate for when it changes into the running state
What is text-to-speech API
Enables you to convert TEXT input to audible SPEECH, which can either be played directly through computer speaker or written to an audio file
What is Speech synthesis voices
Pre-defined voices with support for multiple languages and regional pronunciation.
Includes standard voices as well as neural voices that leverage neural networks to overcome common limitations in speech synthesis with regard to intonation, resulting in a more natural-sounding voice.
What is a neural network
A set of algorithms that tries to recognize relationships in a set of data by imitating the way the human brain works. In this sense, neural networks refer to systems of neurons, either organic or artificial in nature.
What is machine translation
Automated translation to convert one language to another. This enables collaboration with people of other cultures and geographic locations.
What is literal translation
Each word is translated to the corresponding word in the target language
What is semantic
Relating to meaning in language or logic
What is text translation used for
Used to translate documents from one language to another, translate email communications that come from foreign governments, and even provide the ability to translate web pages on the Internet
What is speech translation used for
Used to translate between spoken languages, sometimes directly (speech-to-speech translation) and sometimes by translating to an intermediary text format (speech-to-text translation)
What Translation services does Azure provide
- Translator Text - text-to-text translation
- Speech - speech-to-text and speech-to-speech translation
What service does Translator Text use
Uses a Neural Machine Translation (NMT) model for translation, which analyzes the semantic context of the text and renders a more accurate and complete translation as a result
How do you specify language you are translating from and to in the Text Translator service
Use the ISO 639-1 language code, e.g. en for English, fr for French.
How do you specify cultural variant of language
Use the ISO 3166-1 culture code, e.g. en-US for US English, en-GB for British English.
What are optional configurations for Translator Text API
- Profanity filtering
- Selective translation - tag content so it isn’t translated
What APIs does Speech service include
- Speech-to-text
- Text-to-speech
- Speech translation
What are 3 core concepts of Language Understanding
- Utterances
- Entities
- Intents
What are utterances
What a user might say, e.g. “Switch the fan on”
What are entities
An item to which an utterance refers, e.g. “fan” in “Switch the FAN on”
What are intents
The purpose, or goal, expressed in a user’s utterance, e.g. “TurnOn” for “Switch the FAN on”
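The three concepts can be illustrated with a toy keyword matcher. This is not the Language Understanding service itself; the DEVICES list and the intent rules are invented for the sketch:

```python
# Toy illustration of utterances, entities, and intents: a keyword-based
# matcher that maps an utterance to an intent and extracts a device entity.
DEVICES = {"fan", "light", "lamp"}

def predict(utterance: str):
    # Normalize the utterance into lowercase words without end punctuation.
    words = utterance.lower().rstrip(".!?").split()
    # Intent: the purpose expressed in the utterance.
    intent = "None"  # fallback, like the None intent
    if "on" in words:
        intent = "TurnOn"
    elif "off" in words:
        intent = "TurnOff"
    # Entities: items the utterance refers to.
    entities = [w for w in words if w in DEVICES]
    return intent, entities

print(predict("Switch the fan on"))  # ('TurnOn', ['fan'])
```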
When should you use the None intent
To handle utterances that do not map to any of the intents you have created. It is considered a fallback, and is used to provide a generic response to users when their requests don’t match any other intent.
What two main tasks are involved with creating a language understanding application
- Define entities, intents, and utterances with which to train the language model. Referred to as AUTHORING the model.
- Publish the model so the client applications can use it for intent and entity PREDICTION based on user input
What is the Language Understanding portal
Web-based interface for creating and managing Language Understanding applications
What are 4 types of entities
- Machine Learned - entities that are learned by your model during training from context in the sample utterances you provide
- List - entities that are defined as a hierarchy of lists and sublists, e.g. a device list might include sublists for light and fan. For each list entry, you can specify synonyms, such as lamp for light
- RegEx - a regular expression that describes a pattern, e.g. [0-9]{3}-[0-9]{3}-[0-9]{4} for phone numbers in the form 555-123-4567
- Pattern.any - entities that are used with patterns to define complex entities that may be hard to extract from sample utterances
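The RegEx entity type can be sketched directly in Python, using the exact phone-number pattern from the list above:

```python
import re

# The RegEx entity pattern from the card, used to pull phone-number
# entities out of free text.
PHONE = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")

text = "Call me at 555-123-4567 or 555-987-6543 tomorrow."
print(PHONE.findall(text))  # ['555-123-4567', '555-987-6543']
```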
What is training the model
The process of using sample utterances to teach your model to match natural language expressions that a user might say to probable intents and entities.
Training and testing is an iterative process.
What is predicting
After training and testing, publish Language Understanding application to prediction resource.
Predictions are returned to client application
What are some techniques to build software to analyze text
- Statistical analysis
- Extending frequency analysis to multi-term phrases
- Apply stemming or lemmatization algorithms
- Apply linguistic structure rules to analyze sentences
- Encode words or terms as numeric features that can be used to train a ML model
- Create vectorized models to capture semantic relationships
What is statistical analysis of terms used in text
Remove common “stop words”, e.g. “the” or “a”. Then perform FREQUENCY ANALYSIS of the remaining words (how often each word appears). This provides clues about the main subject of the text.
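A minimal sketch of this statistical analysis in Python; the stop-word list is a small invented sample, not a standard list:

```python
from collections import Counter

# Drop stop words, then count how often each remaining word appears.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to"}

def frequency_analysis(text: str) -> Counter:
    # Strip simple punctuation and lowercase each word before counting.
    words = [w.strip(".,!?").lower() for w in text.split()]
    return Counter(w for w in words if w and w not in STOP_WORDS)

counts = frequency_analysis("The fox and the dog. The fox ran.")
print(counts.most_common(1))  # 'fox' appears most often
```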
What does extending frequency analysis to multi-term phrases involve
These phrases are known as N-grams (a two-word phrase is a bi-gram, a three-word phrase is a tri-gram, etc).
The frequency analysis is then applied to these phrases as well as to single words.
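N-gram extraction can be sketched as a sliding window of n consecutive words:

```python
# Build N-grams by sliding a window of n words across the text.
def ngrams(text: str, n: int):
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Bi-grams (n=2) of a short sentence.
print(ngrams("natural language processing is fun", 2))
# ['natural language', 'language processing', 'processing is', 'is fun']
```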
What is stemming or lemmatization algorithms
These NORMALIZE WORDS before counting them, e.g. “power”, “powered”, and “powerful” are interpreted as being the same word.
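A toy suffix-stripping “stemmer” illustrates the idea. This is not a real algorithm such as Porter’s; the suffix list is invented for the example:

```python
# Naive stemmer: strip a few common suffixes so related word forms
# collapse to the same stem before counting.
def stem(word: str) -> str:
    for suffix in ("ful", "ing", "ed", "s"):
        # Only strip when a reasonably long stem remains.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)]
    return word

print([stem(w) for w in ["power", "powered", "powerful"]])
# all three normalize to 'power'
```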
What does applying linguistic structure rules to analyze sentences involve
For example, break down sentence into tree-like structure such as noun phrase, which itself contains nouns, verbs, adjectives, and so on
What does encoding words or terms as numeric features to train ML model involve
This technique is often used to perform SENTIMENT ANALYSIS, in which a document is classified as positive or negative.
For example, classify a text document based on the terms it contains.
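A minimal bag-of-words encoding shows how documents become numeric feature vectors that a classifier could be trained on; the example documents are invented:

```python
# Each document becomes a vector of term counts over a shared vocabulary.
docs = ["great movie great acting", "terrible movie"]
vocab = sorted({w for d in docs for w in d.split()})

def vectorize(doc: str) -> list:
    words = doc.split()
    # One count per vocabulary term, in sorted vocabulary order.
    return [words.count(term) for term in vocab]

vectors = [vectorize(d) for d in docs]
print(vocab)     # ['acting', 'great', 'movie', 'terrible']
print(vectors)   # [[1, 2, 1, 0], [0, 0, 1, 1]]
```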
What does creating vectorized models to capture semantic relationship between words involve
Capture semantic relationship between words by assigning them to locations in n-dimensional space.
For example, assign values to words “flower” and “plant” that locate them close to one another, while “skateboard” might be given a value that positions it much further away
How does Text Analytics cognitive service simplify application development
It uses pre-trained models that can
- Determine the language of a document or text (e.g. French or English)
- Perform sentiment analysis to determine how positive or negative the language in the text is
- Extract key phrases from text that might indicate its main talking points
- Identify and categorize entities in the text. Entities can be people, places, organizations, or even everyday items such as dates, times, quantities, etc
How can Text Analytics be applied
- Social media feed analyzer to detect sentiment around a political campaign or a product in market
- Document search application that extracts key phrases to help summarize the main subject matter of documents in a catalog.
- Extract brand information or company names from documents or other text for identification purposes
What will Language detection capability of Text Analytics return
- Language name, e.g. “English”
- ISO 639-1 language code, e.g. “en”
- Score indicating level of confidence in the language detection
For AI to accept vocal commands and provide spoken resources, what capabilities must it support
- Speech recognition - ability to detect and interpret spoken input
- Speech synthesis - ability to generate spoken output
How can software analyze speech patterns to determine recognizable patterns that are mapped to words
Uses
- An acoustic model that converts audio signal into phonemes (representations of specific sounds)
- A language model that maps phonemes to words, usually using a statistical algorithm that predicts the most probable sequence of words based on phonemes
What are phonemes
Representations of specific sounds.
A phoneme is the smallest unit of sound in speech. When teaching reading, children learn which letters represent those sounds. For example, the word “hat” has 3 phonemes: “h”, “a”, and “t”.
What are recognized words that are typically converted to text used for
- Provide closed captions for recorded or live videos
- Create transcript of a phone call or meeting
- Automated note dictation
- Determine intended user input for further processing
How is speech synthesis the reverse of speech recognition
Concerned with vocalizing data, usually by converting text to speech. Speech synthesis solution typically requires
- Text to be spoken
- A voice to be used to vocalize the speech
How does a system synthesize speech
- Tokenizes the text to break it into individual words
- Assigns phonetic sounds to each word
- Breaks the phonetic transcription into prosodic units (such as phrases, clauses, or sentences) to create phonemes that will be converted to audio format.
- The phonemes are then synthesized as audio by applying a voice, which determines parameters such as pitch and timbre, and by generating an audio waveform that can be output to a speaker or written to a file
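The first two steps above can be sketched as a toy text-to-speech front end. The phoneme dictionary is invented for illustration; real systems use pronunciation lexicons and trained models:

```python
# Toy TTS front end: tokenize the text, then assign phonetic symbols
# to each word. Unknown words fall back to their letters.
PHONEME_DICT = {
    "hat": ["h", "a", "t"],
    "the": ["dh", "ah"],
}

def to_phonemes(text: str):
    tokens = text.lower().split()                          # step 1: tokenize
    return [PHONEME_DICT.get(t, list(t)) for t in tokens]  # step 2: phonetics

print(to_phonemes("the hat"))  # [['dh', 'ah'], ['h', 'a', 't']]
```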
What are prosodic units
Phrases, clauses, or sentences
How can you use output of speech synthesis
- Generate spoken responses to user input
- Create voice menus for telephone systems
- Read email or text messages aloud in hands-free scenarios
- Broadcast announcements in public locations, such as railway stations or airports
What cognitive service supports key phrase extraction
The Text Analytics cognitive service
What does Translator Text API offer to fine-tune results
- Profanity filtering
- Selective translation
What does Speech service APIs include
- Speech-to-text
- Text-to-speech
- Speech Translation
Can Speech service translate from audio source to text
Yes, it can
What does Translator Text Support
Text-to-text translation
What does Speech service support
Enables speech-to-text and speech-to-speech translation