Parsing Flashcards
What is parsing?
The task of converting a program or other complex text into a parse tree
How do we write code that can convert a grammar into a parser?
- Identify the terminals of the grammar (the concrete syntax of the language)
- Ensure that the resulting string of terminals is induced by the grammar (could have been generated)
What are the two approaches to implementation?
Recursive descent
Table driven
Describe the pipeline from source text to a parse tree
Source text
Tokeniser
Parser
Parse tree
What is tokenisation?
The process that turns a sequence of characters into a symbol
What are some advantages of describing the tokens using a FSA?
Small, tight, fast, low memory footprint
Only looking for simple things, so no back-tracking or other complicated operations
Describe the approach of tokenisation using a FSA
Tokenise the input source, generating a stream of tokens for the grammar
Some tokens carry values with them
FSA keeps track of the characters it uses
Attaches them to the recognised symbol
Describe recursive descent parsing
Treat each production as a small recogniser that looks at the token stream and tries to recognise itself
‘Try and see’ method (back up the token stream)
Benefits of recursive descent parsing
Works with pretty much any grammar
Simple
Disadvantages of recursive descent parsing
Deep stack of recursive calls
Slow
Potentially a lot of back tracking
Describe table driven parsing
Start with the tokens
See which productions could have given rise to them
Build the tree as a constraints problem
Advantages of table driven parsing
Smaller and faster than recursive descent
Little more sensitive to features of the
grammar
No need for back tracking
Disadvantages of table driven parsing
Less clear
Can be tripped up by ambiguous grammars
What are the two approaches to tokenisation?
Sub grammars
Finite state automata
Advantages of using sub grammars for tokenisation
Just one thing to remember
Simply structure
Designed for pattern recognition