Dependency Parsing Flashcards

1
Q

What is a dependency grammar?

A

A dependency grammar describes sentence structure directly in terms of grammatical relations between words, relations that have been defined by linguists.

2
Q

What type of word order do dependency grammars have?

A

They can handle free word order, since structure is encoded by relations between words rather than by word position

3
Q

What do the edges and nodes represent in a dependency grammar?

A

Nodes represent words, and edge labels represent the grammatical relations between them

4
Q

What do grammatical relations connect?

A

They connect head and dependent words

5
Q

What is the type of a grammatical relation called?

A

Its grammatical function

6
Q

What type of graph is a dependency tree?

A

It is a directed graph, since each arc points from a head word to a dependent word; a node can have at most one arc coming into it

7
Q

What is a dependency tree made up of?

A

Vertices = nodes = words or sometimes stems/affixes

Arcs = Grammatical function relationships

8
Q

How many incoming arcs does the root node have?

A

None

9
Q

How many arcs does each vertex have coming into it?

A

Exactly one (the root is the exception, with none)

10
Q

How many vertices can the root node reach through some path?

A

The root node has a path to every vertex

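The tree constraints on the last few cards (the root has no incoming arc, every other vertex has exactly one, and the root has a path to every vertex) can be sketched as a quick check. The dict-of-heads encoding here is an illustrative assumption, not a standard format:

```python
def is_valid_dependency_tree(heads):
    # heads maps each word index (1..n) to its head; 0 is the ROOT node.
    # The mapping itself encodes "exactly one incoming arc per vertex",
    # and ROOT has no entry, so it has no incoming arc.
    children = {}
    for dep, head in heads.items():
        children.setdefault(head, []).append(dep)
    # Check that ROOT has a path to every vertex.
    reached, stack = set(), [0]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in reached:
                reached.add(child)
                stack.append(child)
    return reached == set(heads)

# "Book me a flight": book <- ROOT; me, flight <- book; a <- flight
print(is_valid_dependency_tree({1: 0, 2: 1, 4: 1, 3: 4}))  # True
print(is_valid_dependency_tree({1: 0, 2: 3, 3: 2}))        # False (cycle)
```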
11
Q

What does it mean if an arc is projective?

A

We say an arc from a head to a dependent is projective if there is a path from the head to every word that lies between the head and the dependent in the sentence.

12
Q

In the image, is the arc connecting flight and was projective? Why?

A

The arc is not projective: although we can go from ‘flight’ to ‘was’, and from ‘was’ to ‘which’, there is no path from ‘flight’ to the intervening words ‘this’ or ‘morning’

13
Q

In the image below, is the arc from ‘was’ to ‘late’ projective?

A

It is projective: we can go from ‘was’ to ‘late’, and the only word in between, ‘already’, is reachable directly from ‘late’

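The projectivity test from the last three cards can be sketched as code: an arc is projective when the head dominates every word strictly between itself and the dependent. The head-index encoding and the example sentence indices are assumptions for illustration:

```python
def is_projective_arc(heads, head, dep):
    # heads maps each word index (1..n) to its head index; 0 is ROOT.
    def dominates(ancestor, node):
        # Follow head pointers upward until we hit `ancestor` or ROOT.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    lo, hi = sorted((head, dep))
    # Projective: the head reaches every word strictly between head and dep.
    return all(dominates(head, k) for k in range(lo + 1, hi))

# 1 JetBlue 2 canceled 3 our 4 flight 5 this 6 morning 7 which 8 was
# 9 already 10 late   (indices and heads are an illustrative assumption)
heads = {1: 2, 2: 0, 3: 4, 4: 2, 5: 6, 6: 2, 7: 8, 8: 4, 9: 10, 10: 8}
print(is_projective_arc(heads, 4, 8))   # flight -> was: False
print(is_projective_arc(heads, 8, 10))  # was -> late: True
```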
14
Q

What do older parsing algorithms assume?

A

They assume that trees are always projective. Dependency trees converted from English phrase-structure treebanks are guaranteed to be projective, but in other languages this is not the case, and treebanks can often include non-projective trees

15
Q

What Treebanks are used that include dependency graphs?

A

The Penn Treebank and the OntoNotes dataset

16
Q

How can we move from Parse-Structures (Constituency Grammars) to Dependency Structures?

A

We can identify head-dependent structures using head rules, and then we can connect children to heads using a dependency relation

17
Q

What are some problems that arise from translating parse-structures to dependency structures?

A

We cannot represent non-projective structures (as there are no examples in phrase-structured treebanks)

There is a lack of structure in flat noun phrases

18
Q

When would you prefer a dependency graph over a constituent based structure?

A

For training relation-extraction models (clause argument detection), since clauses need subject and object grammatical relations

19
Q

What is transition based dependency parsing?

A

These are based on classic shift-reduce parsing. The basic idea: we have a CFG, a stack, and a list of words. We put the words in an input buffer, shift words onto the stack, match the words on the stack against the grammar, and replace sequences that match a rule's right-hand side with the rule's left-hand side.

E.g. if a determiner and a nominal are on the stack and the grammar has the rule NP → Det Nom, we replace them with NP

20
Q

What is the Arc Standard approach?

A

It is a simple and effective transition based dependency parsing approach

21
Q

What is an Oracle?

A

It is a model that provides the correct transition operator (add left arc, add right arc, shift, or finished) for a given configuration state when doing the Arc Standard approach to dependency parsing

22
Q

What happens when we perform a left arc?

A

We connect the 1st word of the stack (head) with the 2nd word (dep)

23
Q

What happens when we perform a right arc?

A

We connect the 2nd word of the stack (head) with the 1st word (dep)

24
Q

What happens when we perform a shift?

A

We move the next word in the input buffer into the stack

25
Q

Explain what is happening in the image

A

It shows the Arc Standard approach to dependency parsing. We start with the root node on the stack, consult the oracle and shift the first word onto the stack, then consult it again and shift the second word. The oracle then says to perform a right arc from ‘book’ to ‘me’. We consult the oracle again, and it says to shift three times. We then perform a left arc, which connects ‘flight’ to ‘morning’. This continues until the oracle states that we are done.

26
Q

When we perform a left or right arc, which word is removed from the stack?

A

The dependent (the word the arc points to) is removed from the stack; the head (the word the arc starts from) remains on the stack.
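The last several cards can be tied together in a minimal arc-standard sketch. The scripted oracle below stands in for a trained model (an illustrative assumption), and relation labels are omitted for brevity:

```python
def arc_standard_parse(words, oracle):
    # `oracle(stack, buffer, arcs)` returns the next transition.
    stack = ["ROOT"]
    buffer = list(words)
    arcs = []  # (head, dependent) pairs
    while len(stack) > 1 or buffer:
        action = oracle(stack, buffer, arcs)
        if action == "SHIFT":
            # Move the next buffer word onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LEFTARC":
            # Top of stack is head, 2nd word is dependent; remove the dep.
            arcs.append((stack[-1], stack[-2]))
            del stack[-2]
        elif action == "RIGHTARC":
            # 2nd word on stack is head, top is dependent; remove the dep.
            arcs.append((stack[-2], stack[-1]))
            stack.pop()
    return arcs

# A scripted "oracle" for "book me" -> book heads me, ROOT heads book.
script = iter(["SHIFT", "SHIFT", "RIGHTARC", "RIGHTARC"])
print(arc_standard_parse(["book", "me"], lambda s, b, a: next(script)))
# [('book', 'me'), ('ROOT', 'book')]
```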

27
Q

What are some problems with the arc standard approach?

A

It is a greedy algorithm, so other viable parse trees may be ignored

It assumes that the Oracle is 100 percent accurate, which is not true in practice

A separate action is needed for each relation type, meaning that a large action space needs to be processed

28
Q

What are some alternatives to the Arc Standard approach?

A

The Arc Eager approach and beam search

29
Q

How is the Oracle created?

A

It is trained on treebanks using a supervised ML approach. We compile training examples of configuration–transition pairs: a configuration is a stack, word list, and the set of relations built so far; a transition is the action type. Feature templates are used to build the model's inputs

30
Q

What features can be used when creating an Oracle?

A

Lemmas, POS, wordforms, word embeddings, dependency relations

31
Q

What is an issue with picking lots of features for Oracle training?

A

It can slow the machine learning training down

32
Q

When we have an appropriate feature template for an Oracle, what do we do next?

A

We apply it to the training input to create the inputs for the ML algorithm. X is the value set for the feature types; Y is the action taken for the same input in the reference parse tree. We then apply an ML algorithm such as LR, SVM, or a deep learning model
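The pipeline on this card can be sketched with a toy feature template. The feature names and the `<empty>` padding token are illustrative assumptions, not the exact template used in practice:

```python
def extract_features(stack, buffer):
    # A toy template: wordforms of the top two stack items and the
    # first buffer word; real templates add POS tags, lemmas, etc.
    return {
        "s1.w": stack[-1] if stack else "<empty>",
        "s2.w": stack[-2] if len(stack) > 1 else "<empty>",
        "b1.w": buffer[0] if buffer else "<empty>",
    }

# One (configuration, transition) training pair becomes one (X, y) row:
X = extract_features(["ROOT", "book"], ["me", "a", "flight"])
y = "SHIFT"  # the action taken at this point in the reference parse
print(X)  # {'s1.w': 'book', 's2.w': 'ROOT', 'b1.w': 'me'}
```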

33
Q

What is Graph-Based dependency parsing?

A

It is a different way of doing dependency parsing, in which we encode the possible trees as a directed graph. Graph-based parsers allow for the production of non-projective trees and can handle long-distance dependencies, because they score entire trees

34
Q

What is the edge-factored approach?

A

It is a graph-based approach to dependency parsing. The score of a tree, which we want to maximise, is the sum of the scores of its edges. Given an input sentence and a number of possible parse trees, we find the best parse tree by taking the argmax over all possible trees for that sentence.

35
Q

What type of tree do we need to perform edge-factored dependency parsing?

A

The parse must be a spanning tree, in which every vertex (node) except the root has exactly one incoming edge; we search for the maximum spanning tree

36
Q

What type of selection do we use with edge-factored DP and what else do we perform?

A

We use greedy selection to choose the most probable edge weight, and we perform cleanup to avoid cycles by adjusting weights using heuristics

37
Q

Explain what the image shows.

A

It shows edge-factored dependency parsing. We start at the root and take the most probable edge, with weight 12; then from the node we have reached, ‘book’, we again choose the most probable edge, with weight 7, which takes us to ‘flight’. If we continued greedily from ‘flight’, we would return to a node we have already visited and create a cycle, therefore a cleanup needs to be performed.

38
Q

How is cleanup performed?

A

It is performed by subtracting the max edge weight (incoming) from all edge weights (incoming)

We then collapse nodes in a cycle into a single node

This process is then repeated recursively
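The cleanup steps above (greedy selection of the best incoming edge, max-weight subtraction, and cycle detection before collapsing) can be sketched roughly as follows. The edge-dict encoding and weights are illustrative assumptions, and the full recursive collapse (Chu-Liu/Edmonds) is omitted:

```python
def greedy_pick_and_normalize(weights):
    # weights[(h, d)] = score of an arc from head h to dependent d (0 = ROOT).
    deps = {d for (_, d) in weights}
    best = {}
    for d in deps:
        incoming = {h: w for (h, dd), w in weights.items() if dd == d}
        best[d] = max(incoming, key=incoming.get)
        # Subtract the max incoming weight, so the chosen edge becomes 0.
        for h, w in incoming.items():
            weights[(h, d)] = w - incoming[best[d]]
    # Detect a cycle among the greedy picks (it would be collapsed next).
    for start in deps:
        seen, node = [], start
        while node in best and node not in seen:
            seen.append(node)
            node = best[node]
        if node in seen:
            return best, sorted(seen[seen.index(node):])
    return best, None

# Two words that prefer each other as heads produce a cycle to collapse.
best, cycle = greedy_pick_and_normalize(
    {(0, 1): 5, (2, 1): 8, (0, 2): 1, (1, 2): 7})
print(cycle)  # [1, 2]
```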

39
Q

Explain the following image

A

This image shows how cleanup is performed in edge-factored dependency parsing. Here, we look at each node and its incoming edges and subtract the maximum incoming weight from all of the incoming edges. We then check whether there are any cycles, which in the top-right graph there are. We therefore collapse the cycle into a single node, called tf, and adjust the edges so that we have a valid graph. This is repeated for as long as cycles remain

40
Q

How do we train a supervised ML approach to parse the edge factored graphs?

A

We apply a feature template to the training examples to produce X. We then produce Y, which is the edge probabilities or predictions. We have the ground truths and use a standard supervised learning algorithm, as we did for the transition-based approach (Arc Standard)

41
Q

What does Dozat’s model do for GB-DP?

A

It uses an LSTM model with features from word, POS, and character embeddings. The character embeddings help handle rare words that are not in the training vocabulary

42
Q

How is Dependency Parsing evaluated?

A

Exact Match (is it perfect? conservative and minor errors cause failure)

Labelled Attachment Score (TP = correct assignment of word → head + dep relation)

Unlabelled Attachment Score (TP = correct assignment of word → head)

Label Accuracy Score (percentage of words with correct edge label)
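The attachment scores on this card can be sketched as code; the (head, label) dict encoding is an illustrative assumption:

```python
def attachment_scores(gold, pred):
    # gold and pred map each word index to a (head, label) pair.
    n = len(gold)
    return {
        # UAS: head correct, label ignored.
        "UAS": sum(gold[w][0] == pred[w][0] for w in gold) / n,
        # LAS: both head and label correct.
        "LAS": sum(gold[w] == pred[w] for w in gold) / n,
        # Label accuracy: label correct, head ignored.
        "LabelAcc": sum(gold[w][1] == pred[w][1] for w in gold) / n,
    }

gold = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "obj")}
pred = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "iobj")}
scores = attachment_scores(gold, pred)
# All heads correct (UAS = 1.0), but one label wrong (LAS = 2/3).
print(scores["UAS"], scores["LAS"])
```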