Sentiment Analysis Flashcards
What is the goal of sentiment analysis?
extract emotions, sentiments, and opinions
expressed by humans in texts
use the information for business or intelligence purposes
sentiment analysis is essentially opinion mining
subjective analysis and data
a thought, belief, or judgement about someone or something
often the first step for sentiment analysis
Bing Liu’s Model for Sentiment Analysis
an opinion is a quintuple (o, f, so, h, t)
- o: the target object of opinion aka entity
- f: a feature of the object aka aspect
- so: the sentiment orientation (positive, negative, objective, or a numerical value)
- h: the sentiment holder
- t: the time
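As an illustration, the quintuple can be held in a small record type. This is a minimal sketch: the class name `Opinion` and the sample values are hypothetical and not part of Liu's model.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    """One opinion as the quintuple (o, f, so, h, t)."""
    o: str   # target object of the opinion (entity)
    f: str   # feature of the object (aspect)
    so: str  # sentiment orientation: positive, negative, objective, or a number
    h: str   # sentiment holder
    t: str   # time the opinion was expressed


# Hypothetical example values, for illustration only.
example = Opinion(o="Phone X", f="battery", so="positive",
                  h="reviewer_1", t="2024-01-15")
print(example.f, example.so)
```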
lexicon-based binary model
use a lexicon of opinion words, each tagged with a polarity
lexicon
a list of words and expressions
SENTENCE/DOCUMENT LEVEL
rule-based subjectivity classifier
rule-based sentiment classifier
(1) text is subjective if it contains at least ‘n’ words from the emotion lexicon (‘n’ is fixed by an expert), else objective
(2) applied to subjective text only: count the number of positive and negative words/phrases in the text
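The two steps can be sketched as follows, assuming step (2) runs only on text that step (1) marked subjective. The toy lexicon, the threshold value, and the function names are assumptions made for illustration. Labelling a tie as neutral is also an assumption.

```python
import re

# Toy emotion lexicon (assumption): word -> polarity (+1 positive, -1 negative).
EMOTION_LEXICON = {"love": 1, "great": 1, "hate": -1, "terrible": -1}
N_THRESHOLD = 1  # the expert-chosen 'n' (assumed value)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_subjective(text, n=N_THRESHOLD):
    """Step (1): subjective if at least n tokens are in the emotion lexicon."""
    hits = sum(1 for w in tokenize(text) if w in EMOTION_LEXICON)
    return hits >= n


def classify(text):
    """Step (2): count positive and negative words, subjective text only."""
    if not is_subjective(text):
        return "objective"
    tokens = tokenize(text)
    pos = sum(1 for w in tokens if EMOTION_LEXICON.get(w) == 1)
    neg = sum(1 for w in tokens if EMOTION_LEXICON.get(w) == -1)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # tie handling is an assumption


print(classify("I love this great phone"))
print(classify("The phone weighs 200 grams"))
```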
FEATURE LEVEL: rule-based sentiment classifier
assume feature can be identified in a previous step
identify emotion associated with those features
count the negative and positive emotion words/phrases, as listed in the lexicon
negative if more negative than positive, positive if more positive than negative, otherwise neutral
feature-based rule-based sentiment classifier
input: an (f, S) pair where f is a product feature and S is a sentence containing the feature
output: a label: negative, positive, or neutral
Protocol: consider S = w1 .. wn, the sentence containing f, with n as its length
-> select the emotion words wi in S
-> assign an orientation to each of these words
negative = -1 // positive = 1 // neutral = 0
-> sum up the orientations and assign a label to (f, S) accordingly
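The protocol above can be sketched as follows. The orientation lexicon and the function name `classify_feature` are hypothetical. The sketch also assumes every emotion word in S refers to f, which is a simplification.

```python
import re

# Toy orientation lexicon (assumption): -1 negative, 1 positive.
# Words not in the lexicon count as neutral (0).
ORIENTATION = {"good": 1, "amazing": 1, "poor": -1, "bad": -1}


def classify_feature(f, S):
    """Label the pair (f, S) as negative, positive, or neutral."""
    if f.lower() not in S.lower():
        raise ValueError("sentence S must contain feature f")
    words = re.findall(r"[a-z']+", S.lower())
    # Sum the orientations of the emotion words found in S.
    total = sum(ORIENTATION.get(w, 0) for w in words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


print(classify_feature("battery", "The battery is amazing"))
print(classify_feature("camera", "The camera is good but bad in low light"))
```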
Limitations for Binary Lexicon Based Model
certain words are context-independent, but others are context-dependent (e.g. "small" is positive when it describes consumption); the model also has to deal with negations and intensifiers
Rules-based lexicon analysis: gradable
use a range of sentiment values instead of a binary system, and apply the following rules to adjust the emotional weights: negation, capitalization, intensifier, diminisher, exclamation and emotion rules
the valence of the text is the sum of the adjusted emotion weights
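A rough sketch of how such rules might adjust the weights before summing. All weights, factors, word lists, and the two-token look-back window are assumed values chosen for illustration. Only the negation, intensifier, diminisher, capitalization, and exclamation rules are covered.

```python
import re

# All values below are assumptions chosen for illustration.
WEIGHTS = {"good": 2.0, "great": 3.0, "bad": -2.0, "awful": -3.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
DIMINISHERS = {"slightly": 0.5, "somewhat": 0.5}
CAPS_FACTOR = 1.5          # boost for an emotion word written in ALL CAPS
EXCLAMATION_FACTOR = 1.25  # boost when the text contains "!"


def valence(text):
    """Sum the rule-adjusted weights of the emotion words in text."""
    tokens = re.findall(r"[A-Za-z']+", text)
    total = 0.0
    for i, tok in enumerate(tokens):
        word = tok.lower()
        if word not in WEIGHTS:
            continue
        weight = WEIGHTS[word]
        if tok.isupper() and len(tok) > 1:       # capitalization rule
            weight *= CAPS_FACTOR
        for prev in tokens[max(0, i - 2):i]:     # look back two tokens
            prev = prev.lower()
            weight *= INTENSIFIERS.get(prev, 1.0)  # intensifier rule
            weight *= DIMINISHERS.get(prev, 1.0)   # diminisher rule
            if prev in NEGATIONS:                  # negation rule
                weight = -weight
        total += weight
    if "!" in text:                              # exclamation rule
        total *= EXCLAMATION_FACTOR
    return total


print(valence("not very good"))
print(valence("slightly bad"))
```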
Naive Bayes - Corpus-based
assign the sentiment or class having the highest posterior probability
Laplace smoothing
where p(t|s) = (count(t,s) + 1) / (count(s) + |V|)
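A minimal sketch of the classifier with Laplace smoothing. It assumes count(t,s) is the number of times term t occurs in training texts of sentiment s, count(s) is the total number of term occurrences in that class, and |V| is the vocabulary size. The training data and function names are hypothetical, and skipping terms outside the vocabulary is an assumption.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (tokens, sentiment) pairs.
DOCS = [
    (["great", "phone"], "positive"),
    (["great", "battery"], "positive"),
    (["bad", "battery"], "negative"),
]


def train(docs):
    """Collect class frequencies, per-class term counts, and the vocabulary."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for tokens, s in docs:
        class_docs[s] += 1
        term_counts[s].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab


def p_term(t, s, term_counts, vocab):
    """Laplace-smoothed p(t|s) = (count(t,s) + 1) / (count(s) + |V|)."""
    count_ts = term_counts[s][t]
    count_s = sum(term_counts[s].values())
    return (count_ts + 1) / (count_s + len(vocab))


def predict(tokens, model):
    """Return the sentiment with the highest (log) posterior probability."""
    class_docs, term_counts, vocab = model
    n_docs = sum(class_docs.values())
    best, best_score = None, -math.inf
    for s in class_docs:
        score = math.log(class_docs[s] / n_docs)  # log prior
        for t in tokens:
            if t in vocab:  # terms outside the vocabulary are skipped
                score += math.log(p_term(t, s, term_counts, vocab))
        if score > best_score:
            best, best_score = s, score
    return best


model = train(DOCS)
print(predict(["great"], model))
print(predict(["bad"], model))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied.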