Sentiment Analysis: Sentiment and Rhetoric Flashcards

Question 1

Q

Sentiment Analysis (SA)

Answer

A

Computational study of opinions, sentiments, and emotions expressed in text.
A kind of semantic analysis: feeling, emotion, judgment in language.
-Came into play with the rise of user-generated content and social media
-Reviews are the most common use case
-Current state of the art focuses on feature-level objects

Question 2

Q

Sentiment Scoring

Answer

A

The most basic scoring is positive/negative, but that doesn’t say how something is positive or negative. It is “table stakes”: what’s needed to get into the game.

The simplest scheme scores from -1 to 1.
0 can mean neutral, or it can mean no polarity was detected, because the system only knows how to detect positive or negative. Sometimes the score is a net over a document, so positives and negatives average out. Always find out what 0 means.
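
A minimal sketch of why 0 is ambiguous, assuming per-sentence scores in [-1, 1] and using None as a hypothetical marker for "no polarity detected" (the function name and the None convention are illustrative, not from any particular library):

```python
from statistics import mean

def document_score(sentence_scores):
    """Average per-sentence polarity scores; None marks 'no polarity detected'."""
    detected = [s for s in sentence_scores if s is not None]
    if not detected:
        return None  # nothing detected: kept distinct from a true 0.0
    return mean(detected)

# Strong praise and a strong complaint net out to 0.0 over the document.
mixed = document_score([1.0, -1.0])
# No clues found at all: reported as None rather than 0.0.
undetected = document_score([None, None])
```

A system that reports both cases as 0 hides the difference, which is why it is worth asking what 0 means.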

Question 3

Q

Two approaches to Sentiment

Answer

A

- Supervised ML
- Unsupervised sentiment lexical knowledge base, built on valence scores (a signed weight for how positive or negative a clue is).

Question 4

Q

Supervised ML

Answer

A

Apply classification to sentences or documents.

Binary classifier: SGD using sklearn, which with hinge loss is a linear SVM implementation.
Requires training data to build the model.

Pros

Can be ready quickly if you have a lot of training data
Don’t need to develop a coded vocabulary with valence score

Cons

It’s opaque, not explainable (not XAI)
It’s only as granular as the training data

Process

Establish training set
Normalize text (expand contractions, spelling corrections, etc.)
Extract feature vectors. You might decide to stem, but with sentiment we sometimes decide not to: past tense can be negative where present tense is positive (“worked fine” vs. “working fine”). You also may not want to remove stop words.
Train a binary classifier. Use SVM/SGD.
After QA, decide if more training data is needed.
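
The process above can be sketched in plain Python. This is a hand-rolled stand-in for a library classifier such as sklearn's SGD classifier, written only to show the mechanics (hinge loss, one update per example). The training sentences, learning rate, and function names are assumptions for illustration.

```python
from collections import defaultdict

def featurize(text):
    """Bag-of-words counts; no stemming and no stop-word removal."""
    counts = defaultdict(int)
    for token in text.lower().split():
        counts[token] += 1
    return counts

def train_hinge_sgd(examples, epochs=20, lr=0.5):
    """Linear classifier trained by SGD on hinge loss. Labels are +1 / -1."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = featurize(text)
            score = sum(weights[t] * c for t, c in feats.items()) + bias
            if label * score < 1:  # inside the margin: take a step
                for t, c in feats.items():
                    weights[t] += lr * label * c
                bias += lr * label
    return weights, bias

def predict(model, text):
    """Return +1 (positive) or -1 (negative) for a piece of text."""
    weights, bias = model
    score = sum(weights.get(t, 0.0) * c for t, c in featurize(text).items()) + bias
    return 1 if score >= 0 else -1

# Hypothetical training set: +1 = positive, -1 = negative.
TRAIN = [
    ("love it works fine", 1),
    ("great screen", 1),
    ("hate it broke fast", -1),
    ("awful screen", -1),
]
model = train_hinge_sgd(TRAIN)
```

Words that appear in both classes ("it", "screen") end up with weights near zero, so the classifier leans on the words that actually separate the classes. The learned weights are still just numbers, which is the opacity noted under Cons.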

Question 5

Q

Unsupervised sentiment lexical knowledge base

Answer

A

The biggest choice is which sentiment lexicon to use.

There are many out there. A lot of people use AFINN (“Affective lexicon by Finn Nielsen”), 2,477 clues, or Liu’s lexicon, 6,800 clues.
Pick up clues and put them together to measure overall sentiment
MPQA (“Multi-Perspective Question Answering”) subjectivity lexicon: 8222 clues.
SentiWordNet: labels all 100k+ WordNet synsets; it was created by a machine.
VADER (Valence Aware Dictionary and sEntiment Reasoner): 7,500 clues. Rule-based framework built for social media. Scores for words, emoticons, slang.
Pattern library lexicon: 3,000 clues, but mostly adjectives, hand-coded with valences. It’s great when words are mapped to WordNet. Generally works pretty well. Specializes in the area of mood.
Custom lexicon: as many clues as you want. This is the best option because every domain is different.

Question 6

Q

Does size matter with unsupervised sentiment kbs?

Answer

A

Not necessarily: some of the smaller ones outperform larger ones. What matters more is whether the domain is similar and how carefully the lexicon was constructed.

Pros

Does not require training data
Very explainable (XAI)

Cons

Needs a coded vocabulary (lexical KB)
Can be cumbersome to maintain in the face of new tropes (words or figures of speech)

Process

Establish valence-weighted vocabularies
Normalize text.
Extract feature vectors.
Execute a scoring algorithm: essentially, add up the sentiment in each chunk of text.
QA, tweak vocabulary and rerun until it passes. Might have to adjust weights.
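
A minimal sketch of the scoring step, assuming a tiny hypothetical lexicon (the words and weights below are made up for illustration; a real one would come from AFINN, VADER, or a custom vocabulary):

```python
import re

# Hypothetical valence-weighted vocabulary.
LEXICON = {"great": 0.8, "love": 0.9, "fine": 0.3,
           "slow": -0.4, "broke": -0.7, "terrible": -0.9}

def normalize(text):
    """Lowercase and split into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def score_chunk(text, lexicon=LEXICON):
    """Sum the valences of matched clues; also return the clues that fired."""
    hits = [(tok, lexicon[tok]) for tok in normalize(text) if tok in lexicon]
    total = sum(valence for _, valence in hits)
    return total, hits

score, hits = score_chunk("Great screen, but the phone broke.")
# hits lists exactly which clues produced the score
```

Returning the matched clues alongside the score is what keeps this approach explainable: every number can be traced back to vocabulary entries, and QA can adjust a weight and rerun.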

Question 7

Q

More advanced techniques (hard)

Answer

A

Determining referents and/or topics to which sentiment attaches
Classifying into more categories than positive/negative
Picking up on non-sentiment vocabulary differences that align with sentiment around a topic - rhetoric analysis

Question 8

Q

Straightforward approach with a chunker

Answer

A

Run a chunker and send NP-chunks and VP-chunks, instead of sentences, into a sentiment analyzer. You can then presume that the main noun in a noun phrase is the object of the sentiment. This will be correct a lot of the time.

It doesn’t work well for negation or double negation. Part of text normalization is to rip apart and transform negation, so that “not unappealing” becomes “appealing.” Nullifiers such as “hardly” need handling too.

But don’t be too worried: with big enough numbers you might not need to worry about negation at all (just ignore it) and still get a directionally correct result.

Another vulnerability
-Some sentiment attached outside the NP-chunk instead of within it
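
A minimal sketch of the negation and nullifier handling mentioned above, assuming a hypothetical lexicon and word lists. The rule here (a negator flips the next clue, a nullifier cancels it) is one simple choice, not a standard algorithm.

```python
NEGATORS = {"not", "never", "no"}
NULLIFIERS = {"hardly", "barely"}
# Hypothetical valences for illustration.
LEXICON = {"appealing": 0.5, "unappealing": -0.5, "good": 0.5}

def score_with_negation(tokens, lexicon=LEXICON):
    """Sum clue valences, letting negators and nullifiers modify the next clue."""
    total = 0.0
    modifier = 1.0
    for tok in tokens:
        if tok in NEGATORS:
            modifier = -modifier   # "not not good" flips twice
        elif tok in NULLIFIERS:
            modifier = 0.0         # "hardly good" contributes nothing
        elif tok in lexicon:
            total += modifier * lexicon[tok]
            modifier = 1.0         # modifier is consumed by the clue
    return total
```

With these rules "not unappealing" scores the same as "appealing". The modifier stays pending until the next clue, however far away, so a real system would want a limit on scope.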

Question 9

Q

Dependency parser

Answer

A

Run a dependency parser
Follow dependency paths from a sentiment trigger until an object is found
Most of the time gets to the target of the sentiment
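
A minimal sketch of following dependency paths from a trigger to a target. The parse below is written by hand as an assumption of what a dependency parser might return for "The battery life is terrible"; the tuple layout and function name are hypothetical.

```python
from collections import deque

# (index, word, part of speech, head index); the root points at itself.
PARSE = [
    (0, "The", "DET", 2),
    (1, "battery", "NOUN", 2),
    (2, "life", "NOUN", 3),
    (3, "is", "AUX", 3),
    (4, "terrible", "ADJ", 3),
]

def find_target(parse, trigger_index):
    """Breadth-first walk over dependency edges from the trigger to the nearest noun."""
    neighbors = {i: set() for i, *_ in parse}
    for i, _, _, head in parse:
        if i != head:
            neighbors[i].add(head)
            neighbors[head].add(i)
    seen, queue = {trigger_index}, deque([trigger_index])
    while queue:
        current = queue.popleft()
        _, word, pos, _ = parse[current]
        if current != trigger_index and pos in {"NOUN", "PROPN"}:
            return word
        for nxt in sorted(neighbors[current]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # no noun reachable from the trigger

target = find_target(PARSE, 4)  # start from "terrible"
```

Here the walk goes from "terrible" up to "is" and down to "life", the head noun of the subject.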

Question 10

Q

Hierarchical Sentiment Scoring

Answer

A

Dimensionality of sentiment: What kind of positive or negative sentiment?
Build a taxonomy of different types of sentiment. Put positive/negative at the top, then break each of those down.

Question 11

Q

Typologies of Emotion

Answer

A

William James, in psychology, tried to break emotion down into 4 emotions (1890)
Watt Smith (2015): 154 emotions
As time goes by there are more emotions
Sentiment is broader than emotion
Shaver has ~135, organized in a hierarchy under 6 primary emotions.
Positive emotions: love, joy, surprise
Negative emotions: anger, sadness, fear
2nd level has 25 or 30 emotions
Can build another level until all 135 are used
-Ready-made starter vocabulary
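
A minimal sketch of a sentiment taxonomy as a nested structure. The top two levels follow the card; the single entry under each primary emotion is an illustrative placeholder, not Shaver's actual list.

```python
# polarity -> primary emotion -> finer-grained emotions (placeholders)
TAXONOMY = {
    "positive": {"love": ["affection"], "joy": ["cheerfulness"], "surprise": ["amazement"]},
    "negative": {"anger": ["irritation"], "sadness": ["disappointment"], "fear": ["nervousness"]},
}

def path_to(emotion, taxonomy=TAXONOMY):
    """Return the path from polarity down to the given emotion, or None if unknown."""
    for polarity, primaries in taxonomy.items():
        for primary, finer in primaries.items():
            if emotion == primary:
                return [polarity, primary]
            if emotion in finer:
                return [polarity, primary, emotion]
    return None
```

A clue tagged with a fine-grained emotion can then be rolled up to its primary emotion or to plain positive/negative, depending on how much detail a report needs.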

Question 12

Q

Hybrid Approach (recommended approach)

Answer

A

Semiautomated feature engineering and sentiment lexicography

A way to bootstrap manually edited lexicons.
Differential frequency analysis: compare vocabulary in negative-sentiment text vs. positive-sentiment text.
User reviews as training data (4 or 5 stars is good, so positive; 1 or 2 is bad, so negative; ignore 3s because of ambiguity).
The lexicographer is engineering features when putting them into the lexicon. We can semi-automate this with a bootstrap: extract features by differential frequency as candidate clues, then hand them over to a person. The person could stick a label on each one while reviewing.
The machine could automatically suggest a weight.
The lexicographer can assign dimensions to these things. Not a blank slate.
Maintains explainable AI because we can point back to the vocabulary that was built.
Saves time when building a custom lexicon. A lot less manual labor than if it was 100% manual.
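
A minimal sketch of the bootstrap: differential frequency over star-rated reviews, producing candidate clues with a suggested weight for a person to review. The reviews are made up, and the add-one smoothed log ratio is one assumed choice of weight, not a prescribed formula.

```python
import math
from collections import Counter

def candidate_clues(reviews, min_count=2):
    """reviews: iterable of (stars, text). Returns [(word, suggested_weight), ...]."""
    pos, neg = Counter(), Counter()
    for stars, text in reviews:
        if stars == 3:
            continue  # ambiguous ratings are ignored
        bucket = pos if stars >= 4 else neg
        bucket.update(set(text.lower().split()))  # count each word once per review
    clues = []
    for word in set(pos) | set(neg):
        if pos[word] + neg[word] < min_count:
            continue  # too rare to suggest
        # Add-one smoothed log ratio: > 0 leans positive, < 0 leans negative.
        weight = math.log((pos[word] + 1) / (neg[word] + 1))
        clues.append((word, weight))
    return sorted(clues, key=lambda c: (-abs(c[1]), c[0]))

# Hypothetical reviews for illustration.
REVIEWS = [
    (5, "great phone great screen"),
    (4, "great battery"),
    (1, "phone broke fast"),
    (2, "screen broke"),
    (3, "okay phone"),
]
suggestions = candidate_clues(REVIEWS)
```

Words used equally in good and bad reviews ("phone", "screen") get a weight of 0 and sort to the bottom, so the lexicographer sees the most discriminating candidates first.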

Question 13

Q

Sentiment and Insight

Answer

A

Actionable Insight: What did consumers hate about the product? Need to give a user friendly presentation that non-technical people can understand.

Product
-Pull out themes, pull out sentences and highlight the trigger that matches the clue
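
A minimal sketch of highlighting triggers for a non-technical reader, assuming the clue set comes from whatever lexicon is in use. The [[...]] markers and the function name are arbitrary choices for illustration.

```python
import re

def highlight(sentence, clues):
    """Wrap each clue found in the sentence in [[...]] so a reader sees why it was flagged."""
    if not clues:
        return sentence
    # Longest clues first so multi-word clues win over their parts.
    ordered = sorted(clues, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, ordered)) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "[[" + m.group(0) + "]]", sentence)

marked = highlight("The phone broke in two days.", {"broke", "terrible"})
```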

Question 14

Q

Rhetoric and Sentiment

Answer

A

Words that don’t necessarily show up in a sentiment lexicon, but have certain connotations.
“pro-life” vs. “pro-choice”
“second amendment” vs. “assault weapons”
“Eastwood” vs. “Mr. Eastwood”

Map and correlate these to sentiments. For example, people who liked American Sniper referred to Clint Eastwood as “Mr. Eastwood” instead of “Eastwood.”

Question 15

Q

Opinion

Answer

A

A quintuple: the target object, a feature of the object, the sentiment of the opinion holder on the feature or the object, the opinion holder (who is giving the opinion), and the time the opinion is expressed.
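
A minimal sketch of the quintuple as a record type. The field names and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Opinion:
    target: str       # the object being discussed
    feature: str      # the feature of the object the opinion is about
    sentiment: float  # polarity score, e.g. in [-1, 1]
    holder: str       # who is giving the opinion
    time: date        # when the opinion was expressed

# Made-up example values.
op = Opinion("phone", "battery", -0.8, "reviewer_42", date(2020, 1, 15))
```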

Question 16

Q

SA requires

Answer

A

Named Entity Extraction, Information Extraction, Sentiment determination, Information/Data extraction

Question 17

Q

Facts can have sentiment

Answer

A

“The phone broke in two days”

(17 cards)