B04 Sentiment Analysis Flashcards
What is Sentiment Analysis?
Sentiment analysis (or opinion mining) is the process of extracting an author’s emotional intent from text.
Sentiment analysis involves finding opinions not only about an entity as a whole but also about its individual attributes, and then summarizing them.
Sentiment Analysis
Individual Attributes
aspects
Sentiment Analysis
The person making the opinion
Opinion holder
Sentiment Analysis
The nature of the sentiment expressed
orientation or polarity
Sentiment Analysis
The entity or aspect that the opinion is expressed about
Opinion target
Some questions addressed by sentiment analysis
1. Does a piece of text represent a positive or a negative sentiment?
2. What entities are being discussed, and are they discussed in a positive or negative way?
3. What attributes of the entity are discussed, and what sentiments are expressed about them?
4. What do people think about this candidate or issue?
Challenges of Sentiment Analysis
- Cultural and demographic differences between authors.
- Distinguishing the sentiment expressed about each specific feature.
- Quantifying the hundreds of emotional states which are part of the human condition.
The Plutchik emotion framework classifies 8 evolutionary emotions
Anger, Fear, Anticipation, Surprise, Joy, Sadness, Trust, Disgust
Document Polarity
- Instead of trying to predict emotional states, an easier approach is to simply state whether a document is positive or negative.
- This is referred to as the polarity of a document.
- The approaches to calculating polarity vary and can either be fairly straightforward or rather sophisticated.
Opinion Words
A straightforward approach to calculating polarity involves the use of certain words that are associated with a particular emotional state.
Some examples of opinion words
Opinion words are often adjectives and adverbs (e.g. “good”, “bad”, “excellent”, etc.), although nouns (e.g., “trash”) or verbs (e.g., “annoy”) are also sometimes used.
Opinion or Sentiment Lexicon
- A collection of opinion words along with their polarity forms what is known as an opinion or sentiment lexicon.
- Sentiment lexicons are created either through crowdsourcing or by the labor of an author, and then validated by crowdsourcing or research.
- We can calculate the polarity of a document by simply adding up positive words and subtracting negative words.
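The counting approach described above can be sketched in a few lines of Python. This is a minimal illustration only: `TOY_LEXICON`, `tokenize`, and `document_polarity` are hypothetical names, and the six-word lexicon is invented for the example rather than taken from any published lexicon.

```python
import re

# Hypothetical toy lexicon for illustration; real lexicons hold thousands of words.
TOY_LEXICON = {
    "good": "positive",
    "excellent": "positive",
    "love": "positive",
    "bad": "negative",
    "trash": "negative",
    "annoy": "negative",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def document_polarity(text, lexicon=TOY_LEXICON):
    """Return (number of positive words) minus (number of negative words)."""
    score = 0
    for word in tokenize(text):
        label = lexicon.get(word)
        if label == "positive":
            score += 1
        elif label == "negative":
            score -= 1
    return score

print(document_polarity("The plot was good and the acting was excellent."))
print(document_polarity("Bad pacing, and the ending was trash."))
```

Words missing from the lexicon contribute nothing, and a phrase such as "not good" still counts as positive because each word is scored on its own.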
Bing
Categorizes words in a binary fashion into positive and negative categories.
AFINN
Assigns words with a score that runs between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment.
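With a scored lexicon, a document's polarity can be taken as the sum of its word scores. The sketch below assumes a hypothetical `TOY_SCORES` dictionary whose values only imitate the -5 to 5 range; they are not the real AFINN scores.

```python
import re

# Hypothetical scores in the -5..5 style; NOT the real AFINN values.
TOY_SCORES = {"excellent": 3, "good": 2, "bad": -2, "awful": -3}

def scored_polarity(text, scores=TOY_SCORES):
    """Sum the scores of all words found in the lexicon; unknown words add 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(scores.get(word, 0) for word in words)

print(scored_polarity("Good food but awful service."))
```

Compared with a binary lexicon, the magnitude of each score lets a strongly negative word outweigh a mildly positive one.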
NRC
Categorizes words into categories of positive, negative, anger, anticipation, disgust, fear, joy, sadness, surprise, and trust.
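Because a word can belong to several categories at once, a category lexicon is naturally used to count category hits rather than to produce a single score. A minimal sketch, assuming a hypothetical `TOY_CATEGORIES` mapping whose assignments are invented for illustration and are not the real NRC entries:

```python
import re
from collections import Counter

# Hypothetical word-to-categories mapping; NOT the real NRC assignments.
TOY_CATEGORIES = {
    "happy": ["positive", "joy", "trust"],
    "afraid": ["negative", "fear"],
    "shock": ["negative", "surprise"],
}

def category_counts(text, lexicon=TOY_CATEGORIES):
    """Count how many times each category is hit by the words in the text."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", text.lower()):
        counts.update(lexicon.get(word, []))
    return counts

print(category_counts("Happy at first, then afraid."))
```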
Loughran
Categorizes words into categories of negative, litigious, positive, uncertainty, constraining and superfluous.
What does the get_sentiments() function do?
- Returns a specific sentiment lexicon in tidy format.
- Values for lexicon are either “afinn”, “bing”, “nrc”, or “loughran”.
- Besides calling get_sentiments(), we can also refer to the sentiments dataset directly.
Note with sentiments that:
- Not every English word is represented in the lexicons.
- The lexicons score single words and do not take into account qualifiers before a word, for example “no good” or “not true”.
- The size of the text analyzed can have an impact on the results.
Sentence Level Sentiment Analysis
Takes valence shifters into consideration in an effort to do more accurate sentiment analysis.
What are valence shifters?
Valence shifters are words that have an impact on the overall polarity of a message.
The 4 categories of valence shifters:
- negators
- amplifiers
- de-amplifiers
- adversative conjunctions
Negators
Flip the sign of the polarized word.
“I do not love apple pie.”
Amplifier
Increases or intensifies the impact of a polarized word.
“I really love apple pie.”
De-amplifier
Reduces the impact of a polarized word.
“I hardly like apple pie.”
Adversative Conjunction
Overrules the previous clause containing a polarized word.
“I would love to bake apple pie but it’s not worth it.”
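The four kinds of valence shifter can be illustrated with a deliberately simplified scorer. Everything here is an assumption made for the example: the word lists, the polarity values, the two-word look-back window, the multipliers 2.0 and 0.5, and the rule that only the clause after an adversative conjunction is scored. It is not how sentimentr computes its scores.

```python
import re

# Hypothetical word lists and weights, chosen only for this illustration.
POLARITY = {"love": 1.0, "like": 0.5, "worth": 0.5, "hate": -1.0}
NEGATORS = {"not", "no", "never"}
AMPLIFIERS = {"really", "very"}
DEAMPLIFIERS = {"hardly", "barely"}
ADVERSATIVES = {"but", "however"}

def score_sentence(sentence, window=2):
    """Score one sentence, adjusting polarized words by nearby valence shifters."""
    words = re.findall(r"[a-z']+", sentence.lower())

    # Adversative conjunction: keep only the clause after the last one,
    # so the later clause overrules the earlier clause.
    for i in range(len(words) - 1, -1, -1):
        if words[i] in ADVERSATIVES:
            words = words[i + 1:]
            break

    total = 0.0
    for i, word in enumerate(words):
        if word not in POLARITY:
            continue
        value = POLARITY[word]
        # Look at the few words just before the polarized word.
        for prev in words[max(0, i - window):i]:
            if prev in NEGATORS:
                value = -value   # negator flips the sign
            elif prev in AMPLIFIERS:
                value *= 2.0     # amplifier strengthens the word
            elif prev in DEAMPLIFIERS:
                value *= 0.5     # de-amplifier weakens the word
        total += value
    return total

for s in [
    "I do not love apple pie.",
    "I really love apple pie.",
    "I hardly like apple pie.",
    "I would love to bake apple pie but it's not worth it.",
]:
    print(s, score_sentence(s))
```

In the last sentence the positive "love" is discarded because it sits before "but", and the remaining "not worth" is scored as negative.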
What does the sentimentr package do?
- Designed to quickly calculate text polarity sentiment at the sentence level.
- Optionally allows for aggregation by rows or grouping variable(s).
What does the get_sentences() function do?
- Performs sentence boundary disambiguation.
- Returns a list of vectors of sentences.
What does the sentiment() function do?
- Approximates the sentiment (polarity) of text by sentence.
- Several polarity and valence shifter dictionaries can be used.
- Returns a data table of element_id, sentence_id, word_count, and sentiment.
What does the sentiment_by() function do?
- Approximates the sentiment (polarity) of text by group.
- Returns a data table of element_id, word_count, sd and ave_sentiment.
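The split, score, and aggregate flow described on these cards can be mimicked in plain Python to show how the columns relate to each other. This is a rough sketch under stated assumptions: `split_sentences`, `sentiment_rows`, and `sentiment_by_element` are hypothetical helpers, the sentence splitter is a naive regular expression, and each sentence score is a raw sum over a toy lexicon rather than the package's own formula.

```python
import re
from statistics import mean, stdev

# Hypothetical toy polarity values for illustration only.
POLARITY = {"love": 1.0, "great": 1.0, "bad": -1.0, "hate": -1.0}

def split_sentences(text):
    """Naive sentence split: break after ., ! or ? followed by whitespace."""
    return [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]

def sentiment_rows(texts):
    """One row per sentence: element_id, sentence_id, word_count, sentiment."""
    rows = []
    for element_id, text in enumerate(texts, start=1):
        for sentence_id, sentence in enumerate(split_sentences(text), start=1):
            words = re.findall(r"[a-z']+", sentence.lower())
            rows.append({
                "element_id": element_id,
                "sentence_id": sentence_id,
                "word_count": len(words),
                "sentiment": sum(POLARITY.get(w, 0.0) for w in words),
            })
    return rows

def sentiment_by_element(rows):
    """One row per element: element_id, word_count, sd, ave_sentiment."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["element_id"], []).append(row)
    summary = []
    for element_id, group in grouped.items():
        scores = [r["sentiment"] for r in group]
        summary.append({
            "element_id": element_id,
            "word_count": sum(r["word_count"] for r in group),
            # Standard deviation is undefined for a single sentence.
            "sd": stdev(scores) if len(scores) > 1 else None,
            "ave_sentiment": mean(scores),
        })
    return summary

rows = sentiment_rows(["I love this. It is bad.", "Great!"])
for row in sentiment_by_element(rows):
    print(row)
```

The first text averages to a neutral score even though its two sentences are strongly opposed, which is why the standard deviation is reported alongside the average.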