Chapter 10 - Recursive-Descent Parsing Flashcards
What are the two strategies for parsing?
Top-Down parser - attempts to carry out a derivation matching the input starting from the start symbol i.e. it constructs a tree from the root downwards in preorder
Bottom-up parser - Tries to reconstruct the parse tree from the leaves upwards by using the productions backwards
What is recursive-descent parsing?
It is a way in which to implement top-down parsing
How does recursive-descent parsing work?
Lets the input ‘guide’ the search. Adopts the following ideas:
Each parser tries to derive a prefix of the input according to the productions for the nonterminal
Each parser returns the remaining suffix if successful allowing this to be passed to the next parser for analysis
When parsing a word using a production, how does it work with recursive-descent parsing?
Basically, it finds the first production which is equal to the 1st character, and then ‘extracts’ that character, and returns the rest of the word. It then goes back to the start symbol again, and once again looks for a production which produces the 1st letter of the word. If it fails to find a production which can do that, then it ‘fails’.
How would a ‘look-ahead’ work within a parser?
This gives the parser a better view of the rest of the word. By looking ahead at the next symbol, if the production has multiple choices, then it can select a production based on the current letter being parsed, as well as the next symbol specified by the lookahead, leading to better, more informed choices.
How do you deal with the issue of picking the wrong production too early?
Full-backtracking is an option, but it is computationally expensive and produces unhelpful errors.
Predictive parsing - the grammar must adhere to certain conditions, but we will thereby know if the parser will work at time of construction
What happens if you feed a left-recursive grammar into a recursive-descent parser?
The parser will forever loop, as it will consume no input due to the left recursion when invoking a non-terminal.
What is predictive parsing?
Predictive parsing is when all parsing decisions are made based on a lookahead of limited length, typically one symbol.
The idea consists of:
Compute the set of terminal symbols that can start strings derived from each alternative, the first set.
If there is a choice between two or more alternatives, insist that the first sets for those are disjoint (grammar restriction)
The right choice can now be made simply by determining to which alternative’s first set the next input symbol belongs
The branches must be mutually exclusive
What is first and follow sets?
First sets are the combination, or result of a production i.e.
S -> ABC
A -> aA | epsilon
B -> b | epsilon
C -> c | d
First(C) = {c, d}
First(B) = {b}
First(A) = {a}
First(S) = first(ABC) = First(A) U First(B) U First(C) = {a, b, c, d}
What are nullable nonterminals?
Nullable nonterminals are nonterminals that can lead to epsilon i.e. the empty word/character
Note - anything that can eventually lead to epsilon counts as a nullable nonterminal i.e.
S -> B
B -> epsilon
Both B and S are considered nullable nonterminals here, thus the set size is equal to 2
However, if S -> AB
A -> a
B -> epsilon
Only B will be considered nullable, as both nonterminals for S have to be nullable for it to be considered nullable
How do you compute first sets?
You first compute the first sets for nullable nonterminals. Then, after doing that for every nullable nonterminal, you compute the first set for the other nonterminals.
How do you compute follow sets?
You take every single other production and figure out what it’s values are. That is then the follow set for the production you are looking at, and you disregard what it can reach. You then move on, and continue to ignore the value for the first production, or any production before the one you’re currently looking at. Example:
S -> ABC
A -> aA | epsilon
B -> b | epsilon
C -> c | d
The follow sets are as follows:
Follow(S) = {$} i.e. nothing
Follow(A) = {b, c, d}
Follow(B) = {c, d}
Follow(C) = {$} i.e. nothing