Lecture 5 Flashcards
Semantic Analysis
Syntactic analysis
- determines the syntactic category of the words
- decides phrase structure – how words are grouped
- assigns structural analysis to a sentence
Semantic analysis
- creates a representation of the meaning of a sentence
Clearly syntactic structure affects meaning (e.g. word order, phrase
attachment)
- “The man with the telescope watched Mary.”
- “Mary watched the man with the telescope.”
But meaning can also determine syntactic structure
Recall that lexicalized statistical parsing used head word affinities (probabilities) to help choose between parses.
Tasks for Semantic Processing - 1
Decide if one sentence is a paraphrase of another (two way).
Your marks on the tests were excellent.
You scored very high on the exams.
Tasks for Semantic Processing - 2
Entailment: decide if the truth of one sentence implies the truth of
another (one way).
“John lives in Toronto.” implies “John’s residence is in Canada.”
A semantic system
consists of different types of building blocks: entities, concepts, relations, and
predicates.
A semantic representation
shows how to put together blocks of a semantic system to describe a situation or
“semantic world”
Enables reasoning about that
semantic world
Semantic Representations
To link the surface, linguistic elements to
the non-linguistic knowledge of the world
Many words, few concepts
Semantic Representations
To represent the variety at the lexical
level at a unified conceptual level
* Unambiguous representations;
canonical forms
Semantic Representations
Structures composed from a set of symbols
* All languages have a predicate-argument structure
* These structures correspond to relationships that hold among the concepts underlying the constituent words and phrases of a sentence, and then across sentences
Semantics that words (or base noun
phrases) represent – the objects
Entities
– individuals such as a particular person, location or product
- John F. Kennedy; Washington, D.C.; Cocoa Puffs
Semantics that words (or base noun
phrases) represent – the objects
Concepts
– the general category of
individuals such as
- person, city, breakfast cereal
Semantics indicated by verbs, prepositional phrases and other structures
Relations between entities and concepts
* John F. Kennedy “is-a” person
Semantics indicated by verbs, prepositional phrases and other structures
Relations between entities or between
concepts
* Hierarchy of specific to more general
concepts
* Wide variety of other relations (e.g.,
people are related to organizations,
locations are related to people, etc)
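A minimal sketch of how entities, concepts and "is-a" relations can be stored and queried. The dictionary below is a toy knowledge base invented for illustration (only the three entities and their concepts come from these notes; the more general concepts "location" and "product" are assumptions).

```python
# Toy knowledge base: each key "is-a" its value. Illustrative only.
IS_A = {
    "John F. Kennedy": "person",          # entity -> concept
    "Washington, D.C.": "city",           # entity -> concept
    "Cocoa Puffs": "breakfast cereal",    # entity -> concept
    "city": "location",                   # concept -> more general concept (assumed)
    "breakfast cereal": "product",        # concept -> more general concept (assumed)
}

def ancestors(node):
    """Follow is-a links upward and return every more general concept."""
    chain = []
    while node in IS_A:
        node = IS_A[node]
        chain.append(node)
    return chain

def is_a(node, concept):
    """True if `concept` is reachable from `node` through is-a links."""
    return concept in ancestors(node)

print(ancestors("Cocoa Puffs"))              # ['breakfast cereal', 'product']
print(is_a("Washington, D.C.", "location"))  # True
print(is_a("John F. Kennedy", "location"))   # False
```

Walking the hierarchy from specific to general is what lets a system reason from a fact about an entity to a fact about its category.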
Semantics indicated by verbs, prepositional phrases and other structures
Predicates representing verb structures,
sometimes called events
* Semantic roles, case grammar
* Can also be used for relations
between objects
Semantic Representations
Some representation approaches:
* First Order Logic
* Semantic Nets
* Conceptual Dependency
* Frames
* Rule-Based
* Conceptual Graphs
Semantics of events in sentences
In a sentence, a verb and its semantic roles form a proposition; the verb can be called the predicate and the roles are known as arguments.
Syntactic structure is not the same as semantic structure
Syntactic similarities hide semantic dissimilarities
* We baked every Saturday morning.
* The pie baked to a golden brown.
* This oven bakes evenly.
The three subject NPs play very different roles with respect to “bake”
Fillmore, Charles (1968) “The Case for Case.”
* A response to Chomsky’s disregard for any semantics
* “A semantically justified syntactic theory”
Some of Fillmore’s original roles are still in use as general role descriptors
Agentive (A) - the instigator of the action, an animate being
* John opened the door.
* The door was opened by John.
Instrumental (I) - the thing used to perform the action, an inanimate object
* The key opened the door.
* John opened the door with the key.
Locative (L) - the location or spatial orientation of the state or action of the verb
* It’s windy in Chicago.
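A small sketch of the door examples as propositions, to show that the syntactic subject and the semantic role vary independently. The `Proposition` class and the `role_of_subject` helper are hypothetical names made up for this illustration; "Objective" is Fillmore's case for the thing affected by the action, which is not among the roles listed above.

```python
from dataclasses import dataclass

@dataclass
class Proposition:
    predicate: str   # lemmatized verb
    roles: dict      # role name -> filler text
    subject: str     # surface syntactic subject of the sentence

PROPS = [
    # John opened the door.
    Proposition("open", {"Agentive": "John", "Objective": "the door"}, "John"),
    # The door was opened by John.  (passive: same roles, different subject)
    Proposition("open", {"Agentive": "John", "Objective": "the door"}, "the door"),
    # The key opened the door.
    Proposition("open", {"Instrumental": "the key", "Objective": "the door"}, "the key"),
    # John opened the door with the key.
    Proposition("open", {"Agentive": "John", "Objective": "the door",
                         "Instrumental": "the key"}, "John"),
]

def role_of_subject(prop):
    """Return the semantic role filled by the syntactic subject, if any."""
    for role, filler in prop.roles.items():
        if filler == prop.subject:
            return role
    return None

for p in PROPS:
    print(p.subject, "->", role_of_subject(p))
```

The active and passive sentences get identical role dictionaries, which is the sense in which the semantic representation is more canonical than the syntax.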
Verb-specific Roles
General thematic roles don’t fit many verbs and their arguments
* Verb-specific roles are proposed in
treebanks
* PropBank annotates the verbs of
Penn Treebank
* FrameNet annotates the British
National Corpus
Automatic Semantic Role Labelling (SRL)
Define an algorithm that will process text and recognize roles for each
verb
* Task: given a verb in a sentence, find and label all arguments
Automatic Semantic Role Labelling (SRL)
A machine learning classification task: for each constituent in the
parse tree of the sentence, classify the argument role it has for the
verb
- For each constituent, define features of semantic roles
- Each feature describes some aspect of a text phrase that can help determine its semantic role with respect to a verb, e.g., the verb, POS tags, its position in the parse tree, etc.
- Machine Learning process:
- Train a classifier on a Treebank annotated with semantic roles (PropBank or FrameNet)
- Then classify syntactic phrases as to their roles
Parse Tree Constituents
- Each noun phrase is a candidate for role labeling based on its function relative to its head verb (note explore has Arg0 at a distance.)
- Define features from sentence processed into parse tree with Part-of-Speech tags on words
Standard Features of an Argument Structure that Supports Role Labeling
PREDICATE: The predicate verb from the training data, usually stemmed or lemmatized
* “face” and “explore”
Standard Features of an Argument Structure that Supports Role Labeling
PHRASE TYPE: The phrase label of the argument candidate, e.g., NP; for single words, the POS tag
Standard Features of an Argument Structure that Supports Role Labeling
POSITION: Whether the argument candidate is before or after the predicate.
Standard Features of an Argument Structure that Supports Role Labeling
VOICE: Whether the predicate is in active or passive voice (passive voice is recognized if a past participle verb is preceded nearby by a form of the verb “be”)
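The passive-voice heuristic can be sketched as below, assuming the sentence is already tagged with Penn Treebank POS tags (VBN = past participle). The window of three tokens for "nearby" and the list of "be" forms are assumptions chosen for the example.

```python
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

def is_passive(tagged, verb_index, window=3):
    """True if the verb is a past participle with a form of "be" shortly before it.

    tagged: list of (word, POS tag) pairs; verb_index: position of the predicate.
    """
    _, tag = tagged[verb_index]
    if tag != "VBN":
        return False
    start = max(0, verb_index - window)
    return any(word.lower() in BE_FORMS for word, _ in tagged[start:verb_index])

passive = [("The", "DT"), ("door", "NN"), ("was", "VBD"),
           ("opened", "VBN"), ("by", "IN"), ("John", "NNP")]
active = [("John", "NNP"), ("opened", "VBD"), ("the", "DT"), ("door", "NN")]
perfect = [("John", "NNP"), ("has", "VBZ"), ("opened", "VBN"),
           ("the", "DT"), ("door", "NN")]

print(is_passive(passive, 3))   # True
print(is_passive(active, 1))    # False
print(is_passive(perfect, 2))   # False: participle, but no form of "be"
```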
Standard Features of an Argument Structure that Supports Role Labeling
SUBCATEGORY: The phrase labels of the children of the predicate’s parent in the syntax tree
* Subcat of “faces”: VP -> VBZ NP
Standard Features of an Argument Structure that Supports Role Labeling
PATH: The syntactic path through the parse tree from the argument constituent to the predicate.
* Arg0 for “faces”: NP -> S -> VP -> VBZ
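A sketch of how the PATH and SUBCATEGORY features could be read off a parse tree. The tree is a hypothetical parse of "The company faces lawsuits" (the lecture's own example sentence is not reproduced in these notes), stored as nested lists, and nodes are addressed by tuples of child indices; both choices are assumptions made for the example.

```python
# Each node is [label, child, child, ...]; a leaf is [POS tag, word].
TREE = ["S",
        ["NP", ["DT", "The"], ["NN", "company"]],
        ["VP", ["VBZ", "faces"],
               ["NP", ["NNS", "lawsuits"]]]]

def node_at(tree, address):
    """Return the node reached by following child indices from the root."""
    for index in address:
        tree = tree[index]
    return tree

def path_feature(tree, arg_addr, pred_addr):
    """Labels from the argument up to the lowest common ancestor, then down to the predicate."""
    common = 0  # length of the shared address prefix = depth of the common ancestor
    while (common < min(len(arg_addr), len(pred_addr))
           and arg_addr[common] == pred_addr[common]):
        common += 1
    up = [node_at(tree, arg_addr[:d])[0]
          for d in range(len(arg_addr), common - 1, -1)]
    down = [node_at(tree, pred_addr[:d])[0]
            for d in range(common + 1, len(pred_addr) + 1)]
    return " -> ".join(up + down)

def subcat_feature(tree, pred_addr):
    """Phrase-structure rule that expands the predicate's parent."""
    parent = node_at(tree, pred_addr[:-1])
    return parent[0] + " -> " + " ".join(child[0] for child in parent[1:])

SUBJECT_NP, OBJECT_NP, PREDICATE = (1,), (2, 2), (2, 1)
print(path_feature(TREE, SUBJECT_NP, PREDICATE))  # NP -> S -> VP -> VBZ
print(path_feature(TREE, OBJECT_NP, PREDICATE))   # NP -> VP -> VBZ
print(subcat_feature(TREE, PREDICATE))            # VP -> VBZ NP
```

The subject and object NPs have the same phrase type but different paths, which is why PATH is informative for telling Arg0 from Arg1.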
Standard Features of an Argument Structure that Supports Role Labeling
HEAD WORD: The head word of the argument constituent
* Main noun of NP (noun phrase)
* Main preposition of PP (prepositional phrase)
* A related feature: the part-of-speech tag of that head word
Standard Features of an Argument Structure that Supports Role Labeling
There are additional features such as:
* Temporal Cue Words: Special words occurring in ArgM-TMP phrases.
* Governing Category: The phrase label of the parent of the argument candidate
Automatic SRL – Constraints and Challenges
The labeling classifier outputs a probability for each label for each constituent
Automatic SRL – Constraints and Challenges
Use the classifier’s label probabilities together with constraints to assign a label
* Two constituents cannot have the same argument label
* A constituent cannot have more than one label
* If two constituents have (different) labels, they cannot overlap
* No argument can overlap the predicate
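One simple way to combine label probabilities with these constraints is a greedy pass from the most to the least confident candidate. This is only a sketch of that one strategy, not necessarily the method used in the lecture; the spans, probabilities and the 0.5 threshold are invented for a hypothetical sentence "The company faces lawsuits".

```python
def overlaps(a, b):
    """Half-open token spans (start, end) overlap if they share a token."""
    return a[0] < b[1] and b[0] < a[1]

def decode(candidates, predicate_span, threshold=0.5):
    """Greedily assign labels, highest probability first, respecting the constraints."""
    scored = sorted(
        ((prob, span, label)
         for span, probs in candidates.items()
         for label, prob in probs.items()),
        reverse=True,
    )
    assigned = {}
    for prob, span, label in scored:
        if prob < threshold:
            break                               # everything left is less confident
        if overlaps(span, predicate_span):
            continue                            # no argument may overlap the predicate
        if label in assigned.values():
            continue                            # each argument label is used once
        if any(overlaps(span, other) for other in assigned):
            continue                            # no overlap; also blocks a second label
        assigned[span] = label
    return assigned

# Tokens: 0 The, 1 company, 2 faces, 3 lawsuits. Probabilities are made up.
CANDIDATES = {
    (0, 2): {"Arg0": 0.90, "Arg1": 0.05},   # "The company"
    (1, 2): {"Arg0": 0.60},                 # "company"
    (3, 4): {"Arg0": 0.70, "Arg1": 0.65},   # "lawsuits"
    (2, 4): {"Arg1": 0.80},                 # "faces lawsuits" (contains the verb)
}
print(decode(CANDIDATES, predicate_span=(2, 3)))
# {(0, 2): 'Arg0', (3, 4): 'Arg1'}
```

Greedy decoding is easy to follow but not guaranteed to find the globally best assignment; exact methods search over all assignments that satisfy the constraints.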
Automatic SRL – Constraints and Challenges
For each verb in a sentence, the number of constituents in the parse tree is large compared to the number of semantic roles
* There can be hundreds of constituents eligible for a role label
* Leads to the problem of too many “negative” examples
Sentiment Analysis - Affective States
Emotion: brief organically synchronized … evaluation of a major event
* angry, sad, joyful, fearful, ashamed, proud, elated
Mood: diffuse non-caused low-intensity long-duration change in subjective feeling
* cheerful, gloomy, irritable, listless, depressed, buoyant
Sentiment Analysis - Affective States
Interpersonal stances: affective stance toward another person in a specific interaction
* friendly, flirtatious, distant, cold, warm,
supportive, contemptuous
Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons
* liking, loving, hating, valuing, desiring
Personality traits: stable personality dispositions and typical behavior tendencies
* nervous, anxious, reckless, morose, hostile, jealous
Sentiment Analysis
Sentiment analysis is the detection of attitudes: “enduring, affectively colored beliefs, dispositions towards objects or persons”
Sentiment Analysis - Challenges
Word sense ambiguity - Words can carry sentiment, which is useful information for the sentiment analysis task, but the same word can have different meanings in different contexts
Sentiment Analysis - Challenges
Subtlety, sarcasm or metaphor
Sentiment Analysis - Challenges
Thwarted expectations and ordering effects - a lot of good words set up an expectation that is then negated.
Sentiment Analysis - Challenges
Domain adaptation - Certain sentiment-related indicators seem domain-dependent; sentiment classifiers (especially those created via supervised
learning) have been shown to often be domain dependent
Sentiment Polarity Classification
Treat as a document classification task
* Positive, negative, and (possibly) neutral
* Sentiment words are often more important than topic words, e.g., great, excellent, horrible, bad, worst
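A minimal sketch of polarity classification as document classification, using a bag-of-words Naive Bayes classifier with add-one smoothing. Naive Bayes is chosen here only because it is short to write down; the notes do not name a specific classifier, and the four training documents are invented.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label). Collect the counts Naive Bayes needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:                     # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy training set.
TRAIN = [
    ("great acting and an excellent script".split(), "pos"),
    ("excellent cast , great fun".split(), "pos"),
    ("horrible plot and bad acting".split(), "neg"),
    ("the worst script , bad pacing".split(), "neg"),
]
model = train(TRAIN)
print(predict(model, "an excellent film".split()))   # pos
print(predict(model, "bad and horrible".split()))    # neg
```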
Sentiment Polarity Classification - Steps
Step 1 – Cleaning and Tokenization
* For text from web, deal with HTML
and XML markup
* Or Twitter mark-up (names, hash
tags)
* Capitalization (preserve for words in
all caps)
* Emoticons/emojis
* Useful code for Twitter and other social media text:
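The resource the slide pointed to is not reproduced in these notes. As a stand-in, here is a rough sketch of the cleaning steps listed above using only the Python standard library; the regular expressions are simplified assumptions and cover only a few emoticons.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")            # HTML/XML tags
TOKEN_RE = re.compile(
    r"[:;=][-']?[)(DPp]"                   # a few western emoticons such as :) ;-( :D
    r"|[@#]\w+"                            # Twitter user names and hashtags
    r"|\w+(?:'\w+)?"                       # words, optionally with an internal apostrophe
    r"|[^\w\s]"                            # any other single symbol, including most emoji
)

def tokenize(text):
    """Strip markup, decode entities, tokenize, and lowercase all but ALL-CAPS tokens."""
    text = html.unescape(TAG_RE.sub(" ", text))
    tokens = TOKEN_RE.findall(text)
    return [t if (t.isupper() and len(t) > 1) else t.lower() for t in tokens]

print(tokenize("<b>LOVED</b> it &amp; @Bob agrees :) #Best"))
# ['LOVED', 'it', '&', '@bob', 'agrees', ':)', '#best']
```

Keeping all-caps words intact preserves emphasis ("LOVED"), which can itself be a sentiment cue.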
Sentiment Polarity Classification - Steps
Step 2 - Extracting Features
Which words to use? (adjectives only, or all words)
* Using all words turns out to work better in many cases
Good to have syntax too.
* Counts of POS tags to characterize
text
* Constituent or dependency parses
* Particularly at phrase level to find
dependencies of opinion words
* Also for finding the scope of
negation
Handling negation is important
* Typical approach
1. Look for a “prototype” negation word (a negation cue word) such as not, no, or never
2. Add a “negated context” to the
features
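The two steps above can be sketched as follows. Ending the negated context at the next punctuation mark, and treating words ending in "n't" as cues, are common simplifications assumed here rather than details given in the notes; the `NOT_` prefix is just one way to encode the negated context as a feature.

```python
NEGATION_CUES = {"not", "no", "never"}
CLAUSE_END = {".", ",", ";", ":", "!", "?"}

def mark_negation(tokens):
    """Prefix NOT_ to every token between a negation cue and the next punctuation mark."""
    out, negated = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False                  # punctuation closes the negated context
            out.append(tok)
        elif tok.lower() in NEGATION_CUES or tok.lower().endswith("n't"):
            negated = True                   # cue word opens a negated context
            out.append(tok)
        else:
            out.append("NOT_" + tok if negated else tok)
    return out

print(mark_negation("i did not like this movie , but the cast was great".split()))
# ['i', 'did', 'not', 'NOT_like', 'NOT_this', 'NOT_movie', ',', 'but', 'the', 'cast', 'was', 'great']
```

With this encoding "like" and "NOT_like" become separate features, so a classifier can learn opposite weights for them.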
Sentiment Lexicons
Sentiment lexicons are lists of words and phrases that are commonly used to express positive or negative sentiments
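A sketch of the simplest way to use such a lexicon: sum the polarity of every word found in it. The six entries and their weights (2 for strong, 1 for weak) are invented for illustration and are not taken from any published lexicon.

```python
# Toy lexicon: word -> signed strength. Entries and weights are made up.
TOY_LEXICON = {
    "excellent": 2, "great": 2, "good": 1,
    "bad": -1, "horrible": -2, "worst": -2,
}

def lexicon_score(tokens):
    """Sum the lexicon values of all tokens; unknown words count as 0."""
    return sum(TOY_LEXICON.get(tok.lower(), 0) for tok in tokens)

def polarity(tokens):
    """Map the summed score to a polarity label."""
    score = lexicon_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("Your marks on the tests were excellent".split()))  # positive
print(polarity("The worst plot and bad acting".split()))           # negative
print(polarity("The film opens on Friday".split()))                # neutral
```

A lexicon score can be used on its own as a baseline or added as a feature to a trained classifier.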
MPQA Subjectivity Lexicon
Subjectivity Lexicon from the MPQA project with Jan Wiebe
* Gives a list of 8,000+ words that have been judged to be
weakly or strongly positive, negative or neutral in subjectivity
LIWC – Linguistic Inquiry and Word Count
Text analysis software based on dictionaries of word dimensions
* Dimensions can be syntactic, e.g., pronouns, past-tense verbs
* Dimensions can be semantic, e.g., social words, affect, cognitive mechanisms
ANEW
Affective Norms for English Words
* Provides a set of emotional ratings for a large number of words in the
English language
ANEW
Participants gave graded reactions from 1-9 on three dimensions
* Valence: good/bad
* Arousal: active/passive
* Dominance: strong/weak
Lexical Semantics - Lexicons
– list of words (or lexemes or stems) with basic info
Lexical Semantics - Dictionaries
– a lexicon with definitions for each word sense
* Most are now available online
Lexical Semantics - Thesauruses
– add synonyms/antonyms for each word sense
* WordNet
Lexical Semantics - Semantic networks
– add more semantic relations, including semantic categories
* WordNet, EuroWordNet
Lexical Semantics - Ontologies
– add rules about entities, concepts and relations, semantic categories
* UMLS
Lexical Semantics - Semantic Lexicon
– Lexicon where each word is assigned to a semantic class
* LIWC, ANEW, Subjectivity Lexicon