Unit 4 - Chapter 11 Flashcards
The program that performs lexical analysis is called a
lexical analyzer or more commonly, a scanner
tokens
are syntactical units that are treated as single, indivisible entities for the purposes of translations.
every scanner performs virtually the same set of operations:
(1) it discards blanks and other nonessential characters and looks for the beginning of a token; (2) when it finds the beginning, it puts characters together until (3) it detects the end of the token, at which point it classifies the token and begins looking for the next one. This algorithm works properly regardless of what tokens look like.
parsing phase
a compiler determines whether the tokens are syntactically legal statement of the programming language.
parse tree
It starts from the individual tokens in the statement, a,=,b,+,and c, and show how these tokens can be grouped into predefined grammatical categories such as , , and until the desired goal is reaches - in this case, . The successful constructions of a parse tree is proof that this statement is correctly formed according to the rules of the language. If a parser cannot produce such a parse tree, then the statement is not correctly formed.
In the field of compiler design, the process of diagramming a high-level language statement is called
parsing. and it is done by a program called a parser.
The parser must be given a formal description of the syntax - the grammatical structure - of the language that it is going to analyze. The most widely used notations for representing the syntax of a programming language is called
BNF, and acronym for Backus-Naur Form
In BNF
the syntax of a language is specified as a set of rules, also called productions. The entire collections of rules is called a grammar.
Terminals
are the actual tokens of the language recognized and returned by a scanner
nonterminal
A nonterminal is not an actual element of the language but an intermediate grammatical category used to help explain and organize the language.
Goal symbol
This is the final nonterminal, and it is the nonterminal object that the parser is trying to produce as it builds the parse tree.
lambda λ represents
the null string - nothing at all. It is possible that a nonterminal can be “empty”, and the symbol λ is used to indicate this.
look-ahead parsing algorithms
These algorithms “ look down the road” a few tokens to see what would happen if a certain choice is made.
Recursive definition
defines the nonterminal symbol in terms of itself.
Grammar that allows the construction of two or more distinct parse trees for the same statement is said to be
ambiguous.