Compilers Flashcards
SDT: Synthetized and Inherited attributes
Synthetized:
Are evaluated in rules where the associated symbol is on the left side of the production.
(Once all the symbols in the production have been evaluated).
Example on simple declarations:
T -> int {T.type = integer} | float {T.type = real};
Inherited:
Are evaluated where the associated symbols is on the right side (A production is evaluating symbol by symbol)
Example on simple declarations:
D -> T L {L.inh = T.type}
L -> id {addtype(L.inh, id.entry)}
L -> L1, id {L1.inh = L.inh; addtype(L.inh, id.entry);}
SDD
Syntax-Directed Definition is a context-free grammar in which:
- each symbol can have set of attributes
- each production can have set of semantic rules
- evaluating attributes, interacting with the symbol table, writing lines of intermediate code to buffer …
SDT: Evaluation orders for SDD’s
- An attribute at a node cannot be evaluated until all attributes upon which depends are evaluated.
- the dependency relations in a parse tree define a dependency graph
- any topological sort of the dependency graph is an allowable order of evaluation for an SDD
- any DAG has at least one topological sort
- checking the dependency graph for cycles is NP-hard
- it is possible to define classed of SDD’s in such a way that
- cycles are not allowed (S-attributed: no inherited attributes)
- translation is performed in conntection with top-down or bottom-up parsing, without explicitly creating the tree nodes (L-attributed)
- it is possible to define classed of SDD’s in such a way that
SDT Schemes
Syntax-directed translation can be performed:
- createing a parse tree
- visiting the parse tree and evaluating an SDD accordint to a topological sort of the dependency graph
A syntax-directed Translation Scheme in an SDD with action of each semantic rule embedded at some positions in the right side of the associated production:
- an SDT implementation executes each action as soon as all the grammar symbols to the left of the actions are processed
- an SDT having all actions to the right ends of the production is called postfix SDT.
- the action a in the rule A → X {a} Y should be performed:
- as soon as this occurrence of X appears on top of the stack if the parser is bottom-up
- if the parser is top-down:
- if Y is non-terminal: just before attempting to expand this occurrence of Y
- if Y terminal just before checking fro Y on the input
S-attributes SDD
- A syntax-directed definition is S-attributed if every attribute is synthetized.
- all semantic ruels use only attributes of symbols in the right side of the associated productions
- S-attributed definitions can be evaluated in any bottom-up order.
- The evaluation order of function post_order(rootNode) corresponds to the the order in which the bottom-up parser creates the node.
S-attributed SDD’s conversion to postfix SDT
- S-attributed SDD’s can be converted to postfix SDT’s by placing each action at the right end of the associated production
- actions in postifix SDT can be exectued by a bottom-up parser with analog reductions
- Stack implementation of postfix SDT:
- synthetized attributes can be placed onto the parser stack together with grammar symbols.
- when a handle β is on top of the stack, all of its synthetized attributes have been evaluated
- when a reduction of β occurs, the associated action can be executed
- synthetized attributes can be placed onto the parser stack together with grammar symbols.
L-attributed SDDs
- an SDD is L-attribyted if any production A -> X1 X2 … Xn has:
- synthetized attribytes
- inherithed attribytes Xi.a computed in terms of inhertihed attribytes associated with A
- and inherited or synthetized attributes associated with symbols X1 X2 … Xi-1 located at the left (L) of Xi
- conversion to SDT:
- action that compute an inherited attribute for a symbol should be performed before the occurrence of that symbol
- action that compute synthetized attribytes should be performed at the end of the production.
Bottom-up evaluation of L-attributed defintions
a bottom-up parser is aware of the production it is using only when it performs a reduction, therefore can be executed only at the end of the production
But, action that compute inherited attriutes are not placed at the end of productions… we need to transform L-attribyted defintion into equivalent defintion where all actions are executed at the end of the production.
Inheriting attributes on the parser stack:
- A -> X { Y.i = Xs} Y where:
- X.s is synthetized
- Y.i is inherited defined by a copy rule
- X.s is already on the parser stack, thus, we can eliminate the copy rule (Y.i = X.s)
- The above is possible only if the grammar allows for the position of the attribyte value to be predicted
- the value of A.s in the image can be either one or two positions before the stack
- to work around this issue we should insert a marker non-terminal M with synthetized attribute M.s in rule(2) in order to have A.s always one position before C.
Simulating the evaluation of inherited attributes:
- A → X {Y.i = f(X.s) } Y
- X.s is a synthetized attribute
- Y.i is an inherited attribyte not defined by a copy rule
- since Y.i is not a copy of X.s the value is not already inside the stack prior to the reduction to Y
- it is possible to insert the new maker non terminal M before Y with:
- inherited attribute M.i = X.s
- a synthetized attribyte M.s to be copied in Y.i and to be evaluated in a new rule M → ε {M.s = f(M.i)}
Markers make possible to evaluate L-attributed translation schemes with bottom-up parsing, but:
- LR(1) grammar may not be an LR(1) grammar anymore after the insertion of markers.
- That is not the case for LL(1) grammars, that are a proper subset of LR(1)
LL(1) grammars
A grammar is said to be LL(1) if its predictive parsing table has no multily-defined entries:
- whenever A → 𝛂 | β then
- First(𝛂) ⋂ First(β) = ∅
- at most one of 𝛂 and β is nullable
- if 𝛂 is nullable then First(β) ⋂ Follow(A) = ∅
No ambiguos or left-recursive grammar can be LL(1).
LL(1) parser:
- scans the input from left to right (L)
- constructs a leftmost derivation (L)
- uses 1 lookahead input symbols in making parsing decisions.
The class of languages that can be parsed with LL(1) parsers a proper subset off the deterministic CFL’s.
SA: left recursive grammars
- a production like A → A 𝛂 is called left-recursive
- a grammar is left recursive if it can generate a derivation A => *A 𝛂
- a left-recursive gramma can cause a top-down parer to go into an infinite loop
Replacing left-recursive productions:
A → A 𝛂 | β ===
- A → β R
- R → 𝛂 R | 𝛆
SLR
SLR parsing tables for the same languages are much smaller respect to LR(1) parsing tables (several hundred states) but can contain conflicts.
LR parsers
- LR(0) parser
- scans the input from left to right (L)
- constructs a rightmost derivation in reverse (R)
- uses 0 lookahead input symbols in making parsing decisions
- parsable langagues are a subset of deterministc CFL’s
- LR(1) parser
- scans the input from left to right (L)
- constructs a rightmost derivation in reverse (R)
- uses 1 lookahead symbols in making parsing decisions
- parsable languages are exactly the calss of deterministic CFL’s
- parsing tables can be very large (several thousand states) for grammars generating common programming languages
- A grammar is LR(0) if the table generated by lr0Table(G) doesn’t include conflicts
- non-ambiguos
- A grammar is LR(1) if the table generated by lr1Table(G) doesn’t include conflicts
- LR(1) grammars are non-ambiguos.
*
- LR(1) grammars are non-ambiguos.
LALR(1) parsing
- Look-Ahead LR parser is a simplified version of a canonical LR parser, to parse (separate and analyze) a text according to a set of production rules specified by a formal grammar
- LALR(1) parsing tables have the same states of SLR tables and can conveniently express most programming languages
Core:
- two LR(1) items have the same core if they are identical except for the lookahead symbols
Union:
- a set of LALR(1) items is the union of sets of LR(1) items having the same core.
Conflicts:
- merging state with common cores can never produce shift/reduce conflict which was not present n one of the original states.
- merging may produce reduce/reduce conflicts
- class of parsable languages is a proper subset of the deterministic CFL’s
Predictive parsing
- backtracking can be avoided if it is possible to detect wich rule among A → 𝛂1 | 𝛂2 | … | 𝛂n has to be applied, by considering the current input symbol.