Syntactic Analysis: Parsing Flashcards

1
Q

Shallow parsing, aka light parsing, aka chunking

A
  • Parsing depth runs from shallowest to deepest
  • The shallowest level goes from a sentence to POS tags only
  • Shallow parsing (chunking) is the middle level
  • The deepest level is a full grammar tree: a complicated tree that shows how some chunks relate to other chunks
  • A full parse tree first recognizes every phrase (phrases can be made out of other phrases), then the POS tags
  • A shallow parse tree breaks out (chunks) the main noun phrases, verb phrases, and prepositional phrases
  • A full parse tree can be run in Python
  • POS tagging is a prerequisite for shallow parsing
2
Q

Why chunk?

A
  • A full parse takes longer
  • A full parse might not be needed
  • A full parse might not be accurate on User Generated Content
  • POS tagging doesn’t group words together, so sometimes we need the information about noun phrases, prepositional phrases, etc.
  • Going from POS tags to chunks takes just a little more time, so why not do it? You’re almost there anyway.
  • Most often this means NP chunking
3
Q

User Generated Content (UGC)

A

-Text written by non-professionals, e.g., Twitter, Facebook (FB), reviews, Reddit, etc.

4
Q

How to chunk?

A

-Create rules with regular expressions (RegEx) that match obvious POS tag patterns
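A minimal sketch of rule-based chunking, assuming the sentence is already POS-tagged with Penn Treebank tags. The sentence, the `NP_PATTERN` rule, and the `np_chunks` helper are all illustrative assumptions, not a standard rule set.

```python
import re

# Hypothetical pre-tagged sentence; a real pipeline would get these
# (word, tag) pairs from a POS tagger.
tagged = [("the", "DT"), ("little", "JJ"), ("dog", "NN"), ("barked", "VBD"),
          ("at", "IN"), ("the", "DT"), ("cat", "NN")]

# Assumed NP rule: optional determiner, any adjectives, one or more nouns.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ[RS]?>)*(?:<NN[A-Z]*>)+")

def np_chunks(tagged_tokens):
    """Return the noun-phrase chunks matched by NP_PATTERN, as lists of words."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN>...".
    tag_string = "".join(f"<{tag}>" for _, tag in tagged_tokens)
    chunks = []
    for match in NP_PATTERN.finditer(tag_string):
        # Every tag opens with "<", so counting "<" maps character
        # offsets back to token indices.
        start = tag_string.count("<", 0, match.start())
        end = start + match.group().count("<")
        chunks.append([word for word, _ in tagged_tokens[start:end]])
    return chunks

print(np_chunks(tagged))
```

The rule works on the tags, not the words, which is why POS tagging has to come first.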

5
Q

Chinking

A
  • Chinks are the parts we don’t want included in a chunk
  • An NP chunker pulls out only the NPs; everything else is a chink
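One way to picture chinking is to start from the whole sentence and cut it wherever a chink appears. This is a simplified sketch: the `chink` helper, the sample sentence, and the choice of verbs and prepositions as chink tags are assumptions for illustration.

```python
def chink(tagged_tokens, chink_tags):
    """Split a tagged sentence into chunks by dropping tokens whose tag is a chink."""
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag in chink_tags:
            # A chink ends the chunk being built (if any) and is itself discarded.
            if current:
                chunks.append(current)
                current = []
        else:
            current.append(word)
    if current:
        chunks.append(current)
    return chunks

# Hypothetical pre-tagged sentence (Penn Treebank tags).
sentence = [("the", "DT"), ("little", "JJ"), ("dog", "NN"), ("barked", "VBD"),
            ("at", "IN"), ("the", "DT"), ("cat", "NN")]

# Treat verbs and prepositions as chinks; what remains are the NP chunks.
print(chink(sentence, {"VBD", "IN"}))
```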

6
Q

Classifier-Based Chunkers

A

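-TiMBL (Tilburg Memory-Based Learner): a memory-based (nearest-neighbor) learning classifier that also includes IGTree, a decision-tree approximation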

7
Q

IOB Standard Annotation

A

Every word is a token, and each token gets one tag:
I: Inside a chunk
O: Outside a chunk (a chink)
B: Begins a new chunk

When annotating, I and B are followed by the type of phrase, e.g., B-NP, I-NP, B-PP, etc.
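A small sketch of how chunk boundaries turn into IOB tags. The `to_iob` helper, the sample sentence, and the span format `(start, end, label)` with an exclusive end are assumptions made for this example.

```python
def to_iob(tagged_tokens, chunk_spans):
    """Attach an IOB tag to every (word, POS) pair.

    chunk_spans: non-overlapping (start, end, label) triples, end exclusive.
    """
    iob = ["O"] * len(tagged_tokens)          # default: outside any chunk
    for start, end, label in chunk_spans:
        iob[start] = f"B-{label}"             # first token begins the chunk
        for i in range(start + 1, end):
            iob[i] = f"I-{label}"             # remaining tokens are inside it
    return [(word, pos, tag) for (word, pos), tag in zip(tagged_tokens, iob)]

# Hypothetical pre-tagged sentence with two NP chunks and one PP chunk.
sentence = [("the", "DT"), ("little", "JJ"), ("dog", "NN"), ("barked", "VBD"),
            ("at", "IN"), ("the", "DT"), ("cat", "NN")]
spans = [(0, 3, "NP"), (4, 5, "PP"), (5, 7, "NP")]

for row in to_iob(sentence, spans):
    print(row)
```

Here "barked" is tagged O only because this example annotates no VP chunks.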

8
Q

Uses for chunking

A

-Named Entity Recognition
-Pull out NP chunks

9
Q

Full Grammar Parsing

A

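Two types:

1) Constituency parse: a parse tree (drawn like a dendrogram). The leaves are the words, the penultimate level is the POS tags, and the interior nodes (the middle layers) are phrases.
2) Dependency parse: a DAG (Directed Acyclic Graph) that links words directly to the words they depend on.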
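A rough sketch of the layers in a constituency parse, using nested tuples as a stand-in for a parser's output. The tree, the tuple encoding, and the helpers `pos_level` and `phrases` are assumptions for illustration only.

```python
# Constituency parse as nested tuples: (label, child, child, ...).
# A node whose only child is a string is a POS tag sitting over a word.
tree = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBD", "chased"),
               ("NP", ("DT", "the"), ("NN", "cat"))))

def is_pos_node(node):
    """True for a penultimate-level node: a POS tag over a single word."""
    return len(node) == 2 and isinstance(node[1], str)

def pos_level(node):
    """Collect (word, POS) pairs from the penultimate level, left to right."""
    if is_pos_node(node):
        return [(node[1], node[0])]
    pairs = []
    for child in node[1:]:
        pairs.extend(pos_level(child))
    return pairs

def phrases(node):
    """Collect the interior phrase labels (everything above the POS level)."""
    if is_pos_node(node):
        return []
    found = [node[0]]
    for child in node[1:]:
        found.extend(phrases(child))
    return found

print(pos_level(tree))
print(phrases(tree))
```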

10
Q

Creating parse trees

A

Constituency parsers (90% of applications use these)

  • CYK algorithm
  • Stanford parser
  • Link grammar parser
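As an illustration of the CYK idea, here is a minimal recognizer sketch. It only answers whether a sentence can be derived, it does not build the tree, and the toy grammar (in Chomsky normal form) and the name `cyk_recognize` are made up for this example.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY = {("NP", "VP"): {"S"}, ("DT", "NN"): {"NP"}, ("VBD", "NP"): {"VP"}}
LEXICAL = {"the": {"DT"}, "dog": {"NN"}, "cat": {"NN"}, "chased": {"VBD"}}

def cyk_recognize(words, start="S"):
    """Return True if the toy grammar can derive `words` from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] (inclusive).
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICAL.get(word, ()))
    # Fill in longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY.get((left, right), set())
    return start in table[0][n - 1]

print(cyk_recognize("the dog chased the cat".split()))
```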

Dependency parser

  • MSTParser
  • MaltParser
11
Q

Uses for parse trees

A

How to choose a dependency parser vs. a constituency parser?

Type of language

  • Constituency parsers suit languages with strict word order rules, like English and German.
  • Dependency parsers are easier to engineer for languages with freer word order, like Czech.

Type of application

  • Thematic extraction or text mining (titles, summaries): constituency parser.
  • Question answering: dependency parser, because it tells us the object of a verb.
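A sketch of why dependency output is handy for question answering, assuming the parse is already available as arcs. The arcs below are hand-written, and the relation names `nsubj` and `dobj` and the helper `dependents_of` are assumptions; label sets differ between parsers.

```python
# Hypothetical dependency parse of "the dog chased the cat", written by hand
# as (head_index, relation, dependent_index) arcs.
words = ["the", "dog", "chased", "the", "cat"]
arcs = [(1, "det", 0), (2, "nsubj", 1), (2, "dobj", 4), (4, "det", 3)]

def dependents_of(head_word, relation, words, arcs):
    """Return the words attached to `head_word` by the given relation."""
    return [words[dep] for head, rel, dep in arcs
            if words[head] == head_word and rel == relation]

# "What did the dog chase?" -> look up the direct object of the verb.
print(dependents_of("chased", "dobj", words, arcs))
# "Who chased the cat?" -> look up the subject of the verb.
print(dependents_of("chased", "nsubj", words, arcs))
```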
12
Q

Combine lexical and syntactic analysis

A

Get themes

  • WordNet: get noun word senses, synonyms, hyponyms
  • WordNet: get verb senses, synonyms, troponyms
  • Use a parser to pull out the noun phrases and verb phrases that contain the words pulled from WordNet

Improving Sentiment Analysis
-Vocabulary based

13
Q

Valence

A

A number, usually from -1 to 1, that indicates how emotionally charged a word is. 0 is neutral, -1 is most negative, and 1 is most positive.

Add up the valences of the words in a text to find its emotional charge.

How does a parser help with this?
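The adding-up step can be sketched as follows. The lexicon and its values are invented for illustration, and `emotional_charge` is a hypothetical helper; a real system would load a published valence lexicon.

```python
# Hypothetical valence lexicon; the numbers are made up for illustration.
VALENCE = {"great": 0.8, "good": 0.5, "bad": -0.5, "terrible": -0.9}

def emotional_charge(tokens, lexicon=VALENCE):
    """Sum word valences; words missing from the lexicon count as neutral (0)."""
    return sum(lexicon.get(token.lower(), 0.0) for token in tokens)

review = "The food was great but the service was terrible".split()
print(round(emotional_charge(review), 2))
```

Note that a plain sum treats the text as a bag of words and ignores how the words relate to each other.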
