2. Syntax Analysis Flashcards

1
Q

CFG

A

Context free grammar is a formal grammar which is used to generate all possible strings in given formal language
- CFG is defined by four tuples as:
**G= (V, T, P, S) **
where,
G- Grammar
T- finite set of Terminal symbols
V- Finite set of non terminal symbols
P- Set of production rules
S-start symbol
-Is used to derive the string
- we can derivatives string by repeatedly replacing a nonterminal by the right hand side of the production until all nonterminals have been replaced by terminal symbols

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Capabilities of CFG

A
  1. CFG Is useful to describe most of the programming languages
  2. Grammar is designed so properly that an efficient parser can be constructed automatically
  3. Using features of associatively and precedence information suitable grammars for expressions can be constructed
  4. Cfg is capable of describing nested structures like balanced parenthesis,
    matching begin-end,
    corresponding if-then-else’s and so on
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Derivation

A

Hierarchical representation of terminals and nonterminals
where,
Root of the parser- start symbol.
Leaves of the parse tree - terminals
Interior notes- non terminals
We have two options to decide which non terminal to be replaced with the production rule:
1. Leftmost derivation
2. Rightmost derivation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Leftmost derivation

A

Input is scanned and replaced with the production rule from left to right so we read the input string from left to right
eg:
production rules:
S = S + S
S = S - S
S = a | b |c

i/p:
a - b + c
The leftmost derivation is:
S = S + S
S = S - S + S
S = a - S + S
S = a - b + S
S = a - b + c

—- In the above example we can observe that Replacement according to the production rule is done from the left hand side

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Rightmost derivation

A

1.** Input is scanned and replaced with the production rule **from right to left
2. So we read the input string from left to right
Example:
S = S + S
S = S - S
S = a | b |c

Input:
a - b + c
The right-most derivation is:
S = S - S
S = S - S + S
S = S - S + c
S = S - b + c
S = a - b + c

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Parse tree

A
  1. Parse tree is the graphical representation of symbol. The symbol can be terminal or non-terminal.
  2. In parsing, the string is derived using the start symbol. The root of the parse tree is that start symbol.
  3. Parse tree follows the precedence of operators. The deepest sub-tree traversed first. So, the operator in the parent node has less precedence over the operator in the sub-tree.

Parse tree follows three points:
1. Leafnotes–terminal
2. interior – Non - terminals
3. In-order traversal gives original input string

https://www.javatpoint.com/parse-tree

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q
A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Ambiguity

A
  1. Grammar is set to be ambiguous if there exists more than one leftmost derivation or more than one rightmost derivation or more than 1 parse tree for a given input string
  2. or else called as unambiguous

eg: S = aSb | SS
S = ∈

If the grammar has ambiguity then it is not good for a compiler construction. No method can automatically detect and remove the ambiguity but you can remove ambiguity by re-writing the whole grammar without ambiguity.

https://www.javatpoint.com/ambiguity

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

parser

A
  1. parser is a compiler that used to break the data into smaller elements coming from lexical analysis phase
  2. Parcel takes input as sequence of tokens and produces output in the form of parse tree

Parsing is a two types:
1. Top-down parsing
2. Bottom-up parsing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Top - down parsing

A
  1. Top down parsing is also known as Recursive parsing or predictive parsing
  2. In parsing starts from start symbol and it transforms it into input symbol
    Like above examples

https://www.javatpoint.com/parser

There Are two types of top down parsing:
1. Backtracking
2. Predictive parsers
A. Recursive decent
B. LL1 parser

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Back tracking

A

Top down parser starts from the the start symbol and match the input string against production rules to replace them if matched
– It means one derivation of production fails the syntax analyzer other production role

eg: Mam’s document

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Recursive decent parsing

A
  1. Predictive parserno backtracking is required
  2. in this parsing technique** each non terminal is associated with recursive procedure**
    3.Look ahead pointer is used
    4.RHS of the production rule is directly Converted into code of respective procedure
    5.If RHS of the production rule is containing non terminal then it will invoke the respective procedure if it is terminal then its matched with the look ahead from the input string look ahead pointer is moved one position to right if match is found
    6.These procedures are responsible for matching the non terminal with the next part of the input
    7.If production rule have many alternatives then all alternatives are combined into single body of the procedure
    8.As it is top down parsing parser is activated by calling the procedure of start symbol
    eg:
    G:
    E –> i E’
    E’ –> + i E’ | e

    E()
    {
    if (lookahead == ‘i’)
    {
    match(‘i’);
    E’();
    }
    else if (lookahead == ‗$’)
    printf(“Parsing Successful”);
    else
    return error;
    }

E’()
{
if (lookahead == ‘+’)
{
match(‘+’);
if (lookahead == ‗i’)
match(‘i’);
E’();
}
}
match(char t)
{
if (lookhead== t)
{
lookahead = getchar();
}
else
printf(“Error”);
}

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

LL(1)

A

Diagram in the document
1. Non recursive top down parser
2. In LL(1),
1st L– Scanning of i/p from left to right
2nd L– Parsing technique that is leftmost derivation tree
1– we will read 1 symbol at a time

3. Predictive parser contains :
input
stack table
output
4. Input contains:
string to be passed
followed by $
Bottom of the stack marker
**5. Stack contains: **
Sequence of grammar symbols
Preceded by $
Bottom of the stack marker
— Stack holds leftmost derivation
5. Passing table is a **two dimensional array
M[A,a], **
Where,
A– Non terminal and
a– Terminal or symbol $

The parser is controlled by a programme that behaves as follows:
➢The program determines X, the symbol on top of the stack, and ‗a„the current input symbol.
➢These two symbols determine the action of the parser
There are three possibilities:
1. If X = a = $, the parser halts and announces
successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack
and advances the input pointer to the next
input symbol.
3. If X is a nonterminal, the program consults
entry M[X, a] of the parsing table M.
This entry will be either an X-production of the grammar or an error entry.
➢If M[X, a] = {X → UVW}, the parser replaces X on top of the stack
by WVU (with U on top).
➢If M[X, a] = error, the parser calls an error recovery routine.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

First & Follow
(Functions of LL1)

A
  1. Construction of predictive parser is aided by two functions First and follow which associated with grammar G
  2. These function allows us to fill the entries of productive parsing table for grammar G

FIRST:
Step for finding FIRST:
1. If X is terminal, then FIRST(X) is {X}
2. If X → ∈is a production , then add ∈to FIRST(X).
3. If X is a non-terminal and X →Y1Y2……….Yk
is a production, then place ‗a‘ in
FIRST(X) if for some i,‘a‘ i.s in FIRST(Yi
) and ∈is in all of FIRST(Y1
)….FIRST(Yi-1
) .
If ∈is in FIRST(Yj
) for all j=1,2,……,k then add ∈to FIRST(X)

** From every production rule Find the first letter and if epsilum comes at the last We need to continue to the next line
eg: https://www.youtube.com/watch?v=UXYqQ_CJsVE**

** FOLLOW:**
Step for finding FOLLOW:
1) FOLLOW(S) = { $ } // where S is the starting Non-Terminal and $ is the input right end
marker.
2) if there is a production A → αBβ ,then everything in FIRST(β) except for ∈is placed in
FOLLOW(B).
3) if there is a production A → αB or a production A → αBβ where FIRST(β ) contains ∈
then everything in FOLLOW(A) is in FOLLOW(B).
eg:
In the document

  1. Follow his calculator by the next letters in the right side of the first()
  2. Follow of starting is always dollar
    E-> T+ E———– Follow of E= $
  3. If Epsilon exists we need to jump to L H S

https://www.youtube.com/watch?v=WUwG50xJ9vc&t=0s

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Parsing table construction

A

Didnt study
in the document

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Bottom - up parsing

A
  1. Also known as shift reduce parsing
  2. used to construct parse tree for an input string
  3. Parsing starts with input symbol and constructs Parse tree up to start symbol by tracing out the rightmost derivation of string in reverse
    https://www.javatpoint.com/parser
17
Q

Shift reduce parsing
(Bottom up parsing)

A
  1. Process of reducing string to the start symbol of the grammar
    String—–(reduces to)—-start symbol
  2. Shift reduce parsing uses
    stack to hold the grammar
    input tape to hold the string
  3. It does two actions:
    Shift(Current symbol in the input string is pushed into stack)
    Reduce(Symbols will be replaced by non terminals)
  4. The symbol is the right side of the production and non terminal is the left side of the production
    eg:
    https://www.javatpoint.com/shift-reduce-parsing

There are two main categories of shift reduce parsing:
1. Operator-Precedence Parsing
2. LR-Parser

18
Q

LR parsing

A

Diagram in https://www.javatpoint.com/lr-parser
1. used to parse the large class of grammars
2. IN LR,
L– Scanning left to right
R– rightmost derivation in reverse
K– number of input symbols of the look ahead used to make number of parsing decision.
There are four types of LR Parsing:
1. LR (0) parsing,
2. SLR(1) parsing,
3. CLR(1) parsing
4. LALR(1) parsing.
LR algorithm:
1. The LR algorithm requires stack, input, output and parsing table.
2. In all type of LR parsing, input, output and stack are same but parsing table is different.

                   input buffer
 stacks
                   LR parser
									 
									 LR parsing table -  Input buffer is used to indicate end of the input and contains a string to be parsed followed by dollar symbol($string) - Stack is used to contain a sequence of grammar symbols which with dollar at the bottom of the stack - Parsing payable is a two Dimensional array contains two parts:  action part  go to part
19
Q

LR1 Parsing

A

Steps involved:
1. Forgiven input string write a cfg
2. check ambiguity of the grammar
3. Add Augment production in given grammar
4. Create canonical collection of LR0 items
5. Draw A data flow diagram dfa
6. Construct a LR 1 passing tree

20
Q

Augment grammar

A

Argument grammar G Will be generated if we add one more production in the grammar given G
– It will help the parser to identify when to stop the parsing and announce acceptance of input
Example
Given grammar
S → AA
A → aA | b
The Augment grammar G is represented by
S → S
S → AA
A → aA | b

21
Q

Canonical Collection of LR(0) items

A
  1. L R zero item is a production G with dot at some point on right side of the production
  2. These items are useful to indicate that how much of input has been scanned up to a given point in process of parsing
    eg: eg, its diag and its parsing table is in
    https://www.javatpoint.com/canonical-collection-of-lr-0-items
22
Q

SLR (1) Parsing

A
  1. SLR (1) refers to simple LR Parsing. It is same as LR(0) parsing. The only difference is in the parsing table.To construct SLR (1) parsing table, we use canonical collection of LR (0) item.
  2. In the SLR (1) parsing, we place the reduce move only in the follow of left hand side.
  3. The only difference between LR1 and SLR1 is In LR1 reduction(R1, R2..) I’ve written in the whole row of states(L1, L2…) But in SLR follow method is used to find the follow and only at that state reduction is written
23
Q

YACC

A
  1. ** Yet another compiler compiler**
  2. it provides a tool to produce a parser for given grammar
  3. yacc is a programme designed to compile LALR(1) grammar
  4. It is used to produce source code of syntactic analyzer of the language produced by LALR(1) grammar
  5. i/p—- The rules are grammar(A CFG- file.y)
    o/p—- C program(A parser y.tab.c (yacc))
  6. Output file.output contains parsing tables
  7. The file.tab.h Contains declarations
  8. The parser called yyparse()
  9. Parser expects to use a function called yylex() to get tokens

Basic operational sequence of yacc:
1. gram.y– This file contains desired grammar in YACC format
2. YACC - - it shows the YACC programme
3. y.tab.c—- It is a source programme created by yacc
4. cc or gcc—C compiler
5. a. Out - - - executable file that will parse grammar given in gram.y