Syntactic Parsing Flashcards

1
Q

What is syntactic parsing?

A

It is the process of assigning a syntactic structure to a sentence

2
Q

What are two parsing structures?

A

Constituency structures and Dependency structures

3
Q

What is the biggest challenge to syntactic parsing?

A

Ambiguity - specifically structural ambiguity

4
Q

What is structural ambiguity?

A

It is where multiple parse trees are possible in a grammar for the same sentence

5
Q

What is attachment ambiguity?

A

It is where a constituent could be attached at multiple places in a parse tree (e.g. in "I saw the man with the telescope", the PP "with the telescope" can attach to either the verb or the noun)

6
Q

What is coordination ambiguity?

A

It is where phrases can be conjoined in multiple ways (e.g. "old men and women" can group as [old men] and [women], or as old [men and women])

7
Q

What is syntactic disambiguation?

A

It is choosing the correct parse

8
Q

What is useful to address ambiguity?

A

Dynamic Programming

9
Q

What algorithm is a classic dynamic programming approach to parsing?

A

The CKY algorithm

10
Q

What is dynamic programming parsing also known as?

A

Chart Parsing

11
Q

What does CKY require grammars to be?

A

In Chomsky Normal Form (CNF)

12
Q

What are the rules of Chomsky Normal Form?

A

The right-hand side of every rule must be either (i) two non-terminal symbols, or (ii) a single terminal symbol
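
For illustration, here is a minimal sketch of a toy CNF grammar represented as Python dictionaries (the rule fragment is hypothetical, not from the source):

```python
# A toy grammar in Chomsky Normal Form (hypothetical fragment).
# Every rule is either A -> B C (two non-terminals)
# or A -> w (a single terminal).
binary_rules = {
    ("Verb", "NP"): ["S", "VP", "X2"],  # e.g. VP -> Verb NP
    ("Det", "Noun"): ["NP"],            # NP -> Det Noun
}
lexical_rules = {
    "book": ["Verb", "Noun"],
    "the": ["Det"],
    "flight": ["Noun"],
}
```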

13
Q

How do we encode a parse tree in CKY?

A

We use a 2D matrix called a parse table. The indices before and after each token are called fenceposts

14
Q

What does each cell represent in a parse table?

A

An entry for the pair (i, j):

i is the start fencepost index of the span

j is the end fencepost index of the span

n is the length of the sentence

span (i, j) is a constituent phrase covering j - i tokens

span (0, n) is the whole sentence
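
As a concrete sketch (the sentence is just an illustrative example), the fenceposts for a five-token sentence sit before, between, and after the tokens:

```python
# Fenceposts: 0 book 1 the 2 flight 3 through 4 Houston 5
tokens = ["book", "the", "flight", "through", "Houston"]
n = len(tokens)
# span (1, 3) is a constituent with j - i = 2 tokens
assert tokens[1:3] == ["the", "flight"]
# span (0, n) is the whole sentence
assert tokens[0:n] == tokens
```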

15
Q

In a parse table, in what order do we move?

A

You move left to right, bottom to top

16
Q

With the aid of the image, explain how you perform CKY parsing manually.

A

You fill the table from left to right, bottom to top. For each cell [i,j], you consider every split point k between i and j, and check whether the grammar has a production whose right-hand side is a label found in cell [i,k] followed by a label found in cell [k,j]; if so, the production's left-hand side is added to cell [i,j]. For example, in cell [0,3] we can produce a Verb (seen in cell [0,1]) followed by an NP (seen in cell [1,3]) by using a production from S, VP or X2.
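
Tying the last few cards together, here is a minimal sketch of a CKY recognizer in Python, assuming the toy grammar format from the CNF card (the grammar format and function name are illustrative, not from the source):

```python
from collections import defaultdict

def cky_parse(tokens, lexical_rules, binary_rules):
    """Fill a CKY table with every label derivable for each span."""
    n = len(tokens)
    table = defaultdict(set)  # table[(i, j)] = labels spanning fenceposts i..j
    # Length-1 spans come from lexical rules A -> w.
    for i, word in enumerate(tokens):
        table[(i, i + 1)].update(lexical_rules.get(word, []))
    # Fill longer spans left to right, bottom to top.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):       # split point inside the span
                for b in table[(i, k)]:     # label covering (i, k)
                    for c in table[(k, j)]: # label covering (k, j)
                        for a in binary_rules.get((b, c), []):
                            table[(i, j)].add(a)
    return table
```

The sentence is accepted if the start symbol S appears in table[(0, n)].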

17
Q

Does the CKY algorithm help to disambiguate possible parse trees?

A

No - the table is populated with all possible parses; it does not choose the best one

18
Q

How does neural CKY work?

A

It learns to classify constituency parse labels for text spans: given a particular span, the model can be trained to produce the correct constituency parse label

19
Q

What is the input for a Neural CKY model?

A

We take our sentence and tokenize it into words, pass the words into the WordPiece tokenizer to get subwords, and pass these to the BERT embedding layer to encode them. We then use either the first or the last subword embedding of each word to get back to one embedding per word
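
A minimal sketch of this input pipeline with the Hugging Face transformers library (the model name and the choice of first-subword pooling are assumptions):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

words = ["book", "the", "flight", "through", "Houston"]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).last_hidden_state[0]  # (num_subwords, dim)

# Map each word to its first subword embedding (the last would also work).
first_subword = {}
for sub_idx, w_idx in enumerate(enc.word_ids()):  # None for [CLS]/[SEP]
    if w_idx is not None and w_idx not in first_subword:
        first_subword[w_idx] = sub_idx
word_embs = torch.stack([hidden[first_subword[i]] for i in range(len(words))])
```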

20
Q

How are the word embeddings used in a Neural CKY model?

A

The word embeddings are passed into a post-processing layer, a deep learning stack (often a transformer) with an MLP on top, which acts as a classifier mapping each span to a label

21
Q

How can embeddings per word be used to represent a span?

A

Directional vectors: the embedding at the start fencepost is subtracted from the embedding at the end fencepost to give a forward vector, and vice versa to give a backward vector; the two are then concatenated to represent the span. The backward vector uses indices offset by + 1 to account for the fact that each fencepost occurs between two words
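
A minimal sketch of this span representation (the indexing convention, with forward states at fenceposts 0..n and backward states at 1..n+1, is an assumption):

```python
import torch

def span_vector(fwd, bwd, i, j):
    """Represent span (i, j) by concatenated fencepost differences.

    fwd[k] / bwd[k]: forward / backward encoder states at position k.
    The backward difference is offset by +1 because each fencepost
    falls between two words.
    """
    forward = fwd[j] - fwd[i]           # left-to-right direction
    backward = bwd[i + 1] - bwd[j + 1]  # right-to-left direction
    return torch.cat([forward, backward], dim=-1)
```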

22
Q

When we have the span vectors for Neural CKY, what do we then do?

A

We compute a score by passing the span vector through an MLP with a non-linear activation (e.g. ReLU); the MLP's output layer has dimensions equal to the number of non-terminal labels
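
A sketch of the scoring MLP in PyTorch (the hidden size and label count are placeholders):

```python
import torch.nn as nn

span_dim = 1024  # placeholder: size of the concatenated span vector
num_labels = 27  # placeholder: number of non-terminal labels

label_scorer = nn.Sequential(
    nn.Linear(span_dim, 250),    # hidden layer
    nn.ReLU(),                   # non-linear activation
    nn.Linear(250, num_labels),  # one score per non-terminal label
)
# scores = label_scorer(span_vec)  # shape: (num_labels,)
```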

23
Q

What does the activation function of the MLP produce in Neural CKY?

A

A distribution across all the possible labels, indicating which label is the most likely one

24
Q

How do we compute a score for the entire parse tree?

A

For each candidate tree, we sum the individual scores of the labeled spans it contains, giving a score for each possible parse tree
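
In symbols (a standard way to write this, where s(i, j, l) is the score the model assigns to label l for span (i, j)):

```latex
s(T) = \sum_{(i, j, l) \in T} s(i, j, l),
\qquad
\hat{T} = \arg\max_{T} s(T)
```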

25
Q

How is the best parse tree chosen?

A

The argmax function, applied over the scores of all candidate trees

26
Q

How frequently does argmax find a complete tree in practice?

A

A complete tree is found about 95 percent of the time