Identify features of common NLP Workload Scenarios Flashcards
Natural Language Processing (NLP)
For computer systems to interpret the subject of a text in a way similar to how humans do, they use natural language processing (NLP), an area within AI that deals with understanding written or spoken language and responding in kind.
-Text analysis describes NLP processes that extract information from unstructured text.
Natural language processing might be used to create:
-A social media feed analyzer that detects sentiment for a product marketing campaign.
-A document search application that summarizes documents in a catalog.
-An application that extracts brands and company names from text.
Text Analysis - Techniques
Tokenization
The first step in analyzing a corpus is to break it down into tokens. For the sake of simplicity, you can think of each distinct word in the training text as a token.
The phrase “we choose to go to the moon” can be represented by the tokens [1,2,3,4,3,5,6].
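As a minimal sketch of this idea, the snippet below assigns an integer ID to each distinct word in order of first appearance. The function name `tokenize` and the whitespace-based splitting are assumptions made for illustration; real tokenizers handle punctuation, casing, and sub-word units in more sophisticated ways.

```python
def tokenize(text):
    """Map each distinct word to an integer ID, in order of first appearance."""
    vocabulary = {}
    token_ids = []
    for word in text.lower().split():
        if word not in vocabulary:
            # IDs start at 1, matching the example in the text.
            vocabulary[word] = len(vocabulary) + 1
        token_ids.append(vocabulary[word])
    return token_ids, vocabulary


token_ids, vocabulary = tokenize("we choose to go to the moon")
print(token_ids)  # [1, 2, 3, 4, 3, 5, 6]
```

Note that the repeated word "to" reuses token 3, which is why the sequence has seven entries but only six distinct IDs.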
Frequency Analysis
After tokenizing the words, you can perform some analysis to count the number of occurrences of each token. The most commonly used words (other than stop words such as “a”, “the”, and so on) can often provide a clue as to the main subject of a text corpus.
From this information, we can easily surmise that the text is primarily concerned with space travel and going to the moon.
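Frequency analysis can be sketched with Python's standard library. The stop-word set and the sample text below are small illustrative assumptions, not a complete stop-word list or the corpus the flashcard refers to.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word set; real lists are much longer.
STOP_WORDS = {"a", "the", "to", "we", "in", "this", "and", "of"}


def top_terms(text, n=3):
    """Count word occurrences, ignoring stop words, and return the n most common."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(n)


sample = "We choose to go to the moon. We choose to go to the moon in this decade."
print(top_terms(sample))
```

With the stop words removed, the remaining high-frequency terms ("choose", "go", "moon") hint at the subject of the text.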
Machine learning for Text Classification
Another useful text analysis technique is to use a classification algorithm, such as logistic regression, to train a machine learning model that classifies text based on a known set of categorizations.
A common application of this technique is to train a model that classifies text as positive or negative in order to perform sentiment analysis or opinion mining.
With enough labeled reviews, you can train a classification model using the tokenized text as features and the sentiment (0 or 1) as a label.
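The following is a rough, dependency-free sketch of that workflow: word counts serve as features and a small logistic regression is fitted with gradient descent. The four reviews, the helper names, and the hyperparameters are all made up for illustration; in practice you would likely use a machine learning library and far more data.

```python
import math


def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(term) for term in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, vocab, epochs=200, lr=0.5):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(samples, labels):
            x = featurize(text, vocab)
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(text, vocab, weights, bias):
    """Return 1 (positive) or 0 (negative)."""
    x = featurize(text, vocab)
    score = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
    return 1 if score >= 0.5 else 0


# Hypothetical labeled reviews: 1 = positive, 0 = negative.
reviews = [
    "great food and great service",
    "terrible food",
    "great place",
    "terrible slow service",
]
labels = [1, 0, 1, 0]
vocab = sorted({word for review in reviews for word in review.lower().split()})
weights, bias = train(reviews, labels, vocab)
print([predict(review, vocab, weights, bias) for review in reviews])
```

Words that appear only in positive reviews (such as "great") end up with positive weights, and words that appear only in negative reviews (such as "terrible") end up with negative weights.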
Semantic Language model
A semantic language model is a technique that uses the semantic structure of an utterance to better rank the likelihood of the words composing the sentence.
Common NLP tasks supported by language models
-Text analysis, such as extracting key terms or identifying named entities in text.
-Sentiment analysis and opinion mining to categorize text as positive or negative.
-Machine translation, in which text is automatically translated from one language to another.
-Summarization, in which the main points of a large body of text are summarized.
-Conversational AI solutions such as bots or digital assistants can interpret natural language input and return an appropriate response.
Azure AI Language
Azure AI Language is a cloud-based service that includes features for understanding and analyzing text. You can use a language resource for authoring and prediction.
-Named entity recognition identifies people, places, events, and more.
-Entity linking identifies known entities together with a link to Wikipedia
-Personally identifiable information (PII) detection identifies personally sensitive information
-Language detection
-Sentiment analysis and opinion mining
-Summarization
-Key phrase extraction
-Conversational language understanding
-Question Answering
Language Studio - a web-based interface for creating and managing Conversational Language Understanding applications.
You can easily create a user support bot solution on Microsoft Azure using a combination of two core services:
-Azure AI Language: includes a custom question answering feature that enables you to create a knowledge base of question and answer pairs that can be queried using natural language input.
-Azure AI Bot Service: provides a framework for developing, publishing, and managing bots on Azure.
The automatic bot creation functionality enables you to create a bot for your deployed knowledge base and publish it as an Azure AI Bot Service application with just a few clicks.
- You can use Azure AI Language Studio to create, train, publish, and manage question answering projects.
- To create a project, you must first provision a Language resource in your Azure subscription.
- After creating a set of question-and-answer pairs, you must save it.
- After you’ve created and deployed a knowledge base, you can deliver it to users through a bot.
- When your bot is ready to be delivered to users, you can connect it to multiple channels.
You can import question and answer pairs from an existing FAQ document into a question answering knowledge base.
Conversational Language
To work with conversational language understanding, you need to take into account three core concepts:
-An utterance is something a user might say, which your application must interpret. For example, when using a home automation system, a user might use the following utterance: “Switch the fan on.”
-An entity is an item to which an utterance refers. For example, “fan”.
-An intent represents the purpose, or goal, expressed in a user’s utterance. For example, in “Switch the fan on,” the intent is to turn a device on. The intent encapsulates the task and the entity specifies the item to which the intent is applied. In a question about the time in a city, for instance, the task is getting the time and the entity is the city.
The None intent is considered a fallback, and is typically used to provide a generic response to users when their requests don’t match any other intent.
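To make the three concepts and the None fallback concrete, here is a toy sketch. It uses hand-written keyword rules, which is an assumption made purely for illustration: a real conversational language understanding model is trained from labeled example utterances rather than keyword lists. The intent names `TurnOn` and `TurnOff` and the device list are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    utterance: str
    intent: str
    entities: list = field(default_factory=list)


# Hypothetical intents, each defined by words that must all be present.
INTENT_KEYWORDS = {
    "TurnOn": ("switch", "on"),
    "TurnOff": ("switch", "off"),
}
# Hypothetical entity values for a home automation scenario.
KNOWN_DEVICES = ("fan", "light")


def interpret(utterance):
    """Return the matched intent and entities, falling back to the None intent."""
    words = utterance.lower().strip(".!?").split()
    intent = "None"  # fallback when no other intent matches
    for name, required in INTENT_KEYWORDS.items():
        if all(word in words for word in required):
            intent = name
            break
    entities = [word for word in words if word in KNOWN_DEVICES]
    return Prediction(utterance, intent, entities)


print(interpret("Switch the fan on."))
print(interpret("What is the weather like?"))
```

The first call yields the `TurnOn` intent with the entity `fan`; the second matches nothing and falls back to `None`, at which point an application would typically return a generic response.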
You have published your conversational language understanding application. What information does a client application developer need to get predictions from it?
-The endpoint and key for the application’s prediction resource
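The sketch below shows where the endpoint and key fit when a client prepares a prediction request. It only builds the request object and does not send it. The URL path, the `api-version` value, the body layout, and the project and deployment names are assumptions based on the general shape of the service's REST interface; check them against the current API reference before relying on them.

```python
import json
import urllib.request


def build_prediction_request(endpoint, key, project, deployment, utterance):
    """Prepare (but do not send) a prediction request for a deployed project."""
    # Assumed path and API version; verify against the service documentation.
    url = endpoint.rstrip("/") + "/language/:analyze-conversations?api-version=2023-04-01"
    body = {
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {"id": "1", "participantId": "user", "text": utterance}
        },
        "parameters": {"projectName": project, "deploymentName": deployment},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # the resource key authenticates the call
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own endpoint, key, project, and deployment.
request = build_prediction_request(
    "https://example-resource.cognitiveservices.azure.com",
    "<your-key>",
    "HomeAutomation",
    "production",
    "Switch the fan on.",
)
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out here so the example runs without a real resource.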
AI Speech capabilities
AI speech capabilities enable us to manage home and auto systems with voice instructions, get answers from computers for spoken questions, generate captions from audio, and much more.
To enable this kind of interaction, the AI system must support two capabilities:
Speech recognition - the ability to detect and interpret spoken input.
-An acoustic model that converts the audio signal into phonemes (representations of specific sounds).
-A language model that maps phonemes to words, usually using a statistical algorithm that predicts the most probable sequence of words based on the phonemes.
Speech synthesis - the ability to generate spoken output.
-Generating spoken responses to user input
Azure AI Speech
Azure AI Speech provides speech to text and text to speech capabilities through speech recognition and synthesis.
You can use prebuilt and custom Speech service models for a variety of tasks, from transcribing audio to text with high accuracy, to identifying speakers in conversations, creating custom voices, and more.
Azure offers both speech recognition and speech synthesis capabilities through Azure AI Speech service, which includes the following:
-The Speech to text API: You can use Azure AI Speech to text API to perform real-time or batch transcription of audio into a text format.
-The Text to speech API: Enables you to convert text input to audible speech, which can either be played directly through a computer speaker or written to an audio file.
Document Intelligence
Document intelligence describes AI capabilities that support processing text and making sense of information in text.
-It automates the process of extracting, understanding, and saving the data in text.
-Relies on machine learning models that are trained to recognize data in text.
-The ability to extract text, layout, and key-value pairs is known as document analysis.
Azure AI Document Intelligence
Azure AI Document Intelligence supports features that can analyze documents and forms with prebuilt and custom models.
Azure AI Document Intelligence consists of features grouped by model type:
-Prebuilt models - pretrained models that have been built to process common document types such as invoices, business cards, ID documents, and more. These models are designed to recognize and extract specific fields that are important for each document type.
-Custom models - can be trained to identify specific fields that are not included in the existing pretrained models.
-Document analysis - general document analysis that returns structured data representations, including regions of interest and their inter-relationships.
-The merchant name and address can be identified using the receipt model.
-The receipt analyzer model is available as a service when you create an Azure AI Document Intelligence resource.
Azure AI Search
Azure AI Search provides the infrastructure and tools to create search solutions that extract data from various structured, semi-structured, and unstructured documents.
-Uses image processing, content extraction, and natural language processing to perform knowledge mining of documents.
-Provides a programmable search engine built on Apache Lucene
-99.9% uptime SLA
Azure AI Search comes with the following features:
-Data from any source
-Supporting both simple query and full Lucene query syntax
-AI powered search
-Linguistic analysis for 56 languages
-Geo-search filtering based on proximity
-Configurable user experience
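As a hedged illustration of the simple and full Lucene query syntax listed above, the sketch below builds two search request bodies. The `search`, `queryType`, and `top` properties follow the general shape of the search REST API as I understand it; the helper name and the `category` field are hypothetical, and nothing is sent to a service.

```python
import json


def build_search_body(search_text, use_full_lucene=False, top=10):
    """Build a search request body using simple or full Lucene query syntax."""
    return {
        "search": search_text,
        "queryType": "full" if use_full_lucene else "simple",
        "top": top,
    }


# Simple syntax: plain terms.
simple_body = build_search_body("budget hotel")

# Full Lucene syntax: a fielded term (hypothetical "category" field) and a fuzzy term.
lucene_body = build_search_body("category:budget AND hotel~1", use_full_lucene=True)

print(json.dumps(simple_body, indent=2))
print(json.dumps(lucene_body, indent=2))
```

The full syntax enables operators such as fielded and fuzzy search, while the simple syntax is usually enough for free-text queries.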