Compiler Construction, Lecture 2: Recursive-descent Parser / Predictive Parser

Recursive-descent parsing ::= a top-down method that uses a set of recursive procedures to recognize its input with no backtracking.
  - Create a procedure for each nonterminal.

ex) G : S → aA | bB
        A → aA | c
        B → bB | d

    procedure ps;
    begin
      if nextsymbol = ta then begin getnextsymbol; pa end
      else if nextsymbol = tb then begin getnextsymbol; pb end
      else error
    end;

    procedure pa;
    begin
      if nextsymbol = ta then begin getnextsymbol; pa end
      else if nextsymbol = tc then getnextsymbol
      else error
    end;

    procedure pb; ...

    /* main */
    begin
      getnextsymbol; ps;
      if nextsymbol = '$' then accept else error
    end.

ω = aac$ : the calls are ps, pa, pa, which corresponds to S ⇒ aA ⇒ aaA ⇒ aac.
procedure call sequence ::= leftmost derivation
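The procedures above can be sketched in Python as follows. This is only an illustration, assuming single-character tokens and the grammar S → aA | bB, A → aA | c, B → bB | d; the class name `Parser`, the `calls` list, and the helper `accepts` are hypothetical names introduced here, not part of the lecture.

```python
class ParseError(Exception):
    """Raised when the input is not a sentence of G."""


class Parser:
    """Recursive-descent recognizer for S -> aA | bB, A -> aA | c, B -> bB | d."""

    def __init__(self, text):
        self.text = text + "$"   # append the end marker
        self.pos = 0
        self.calls = []          # nonterminal procedures in the order they are entered

    @property
    def nextsymbol(self):
        # The current token is read by position, so no initial
        # getnextsymbol call is needed (unlike the Pascal main program).
        return self.text[self.pos]

    def getnextsymbol(self):
        self.pos += 1

    def ps(self):
        self.calls.append("S")
        if self.nextsymbol == "a":
            self.getnextsymbol()
            self.pa()
        elif self.nextsymbol == "b":
            self.getnextsymbol()
            self.pb()
        else:
            raise ParseError("S: expected a or b")

    def pa(self):
        self.calls.append("A")
        if self.nextsymbol == "a":
            self.getnextsymbol()
            self.pa()
        elif self.nextsymbol == "c":
            self.getnextsymbol()
        else:
            raise ParseError("A: expected a or c")

    def pb(self):
        self.calls.append("B")
        if self.nextsymbol == "b":
            self.getnextsymbol()
            self.pb()
        elif self.nextsymbol == "d":
            self.getnextsymbol()
        else:
            raise ParseError("B: expected b or d")


def parse(text):
    """Return the call sequence, or raise ParseError."""
    p = Parser(text)
    p.ps()
    if p.nextsymbol != "$":
        raise ParseError("input left over after S")
    return p.calls


def accepts(text):
    try:
        parse(text)
        return True
    except ParseError:
        return False
```

For ω = aac, `parse("aac")` records the calls S, A, A, which mirrors the leftmost derivation S ⇒ aA ⇒ aaA ⇒ aac.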

The main problem in constructing a recursive-descent syntax analyzer is choosing which production to apply when a procedure is first entered. To resolve this problem, we compute the lookahead of each production.

LOOKAHEAD of a production
  Definition : LOOKAHEAD(A → α) = FIRST({ω | S ⇒* μAβ ⇒ μαβ ⇒* μω, μω ∈ V_T*}).
  Meaning : the set of terminals that can begin a string generated from α; if α ⇒* ε, then FOLLOW(A) is added to the set.
  Computing formula : LOOKAHEAD(A → X1 X2 ... Xn) = FIRST(X1 X2 ... Xn) ⊕ FOLLOW(A)
    where ⊕ is the ring sum: P ⊕ Q = P if ε ∉ P, and (P − {ε}) ∪ Q if ε ∈ P.

ex) G : S → aSA | ε
        A → c

    Nullable Set = {S}
    FIRST(S) = {a, ε}    FOLLOW(S) = {$, c}
    FIRST(A) = {c}       FOLLOW(A) = {$, c}

    LOOKAHEAD(S → aSA) = FIRST(aSA) ⊕ FOLLOW(S) = {a}
    LOOKAHEAD(S → ε)   = FIRST(ε) ⊕ FOLLOW(S)   = {$, c}
    LOOKAHEAD(A → c)   = FIRST(c) ⊕ FOLLOW(A)   = {c}

Order of computing LOOKAHEAD : Nullable => FIRST => FOLLOW => LOOKAHEAD
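The sets in this example can be reproduced with a small fixed-point computation. The sketch below is an assumption-laden illustration: the grammar encoding (a list of `(lhs, rhs)` pairs with an empty list for ε) and all function names are hypothetical, and nullability is folded into FIRST by letting ε be a member of FIRST sets rather than computed as a separate first pass.

```python
EPS = "ε"
END = "$"

# Example grammar: S -> a S A | ε,  A -> c
GRAMMAR = [
    ("S", ["a", "S", "A"]),
    ("S", []),              # empty right-hand side stands for ε
    ("A", ["c"]),
]
START = "S"
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of_string(symbols, first):
    """FIRST of a symbol string, given the FIRST sets of the nonterminals."""
    result = set()
    for x in symbols:
        fx = first[x] if x in NONTERMINALS else {x}
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)         # every symbol was nullable, or the string is empty
    return result


def compute_first():
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:          # iterate until no FIRST set grows
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of_string(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def compute_follow(first):
    follow = {n: set() for n in NONTERMINALS}
    follow[START].add(END)
    changed = True
    while changed:          # iterate until no FOLLOW set grows
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, x in enumerate(rhs):
                if x not in NONTERMINALS:
                    continue
                tail = first_of_string(rhs[i + 1:], first)
                new = tail - {EPS}
                if EPS in tail:
                    new |= follow[lhs]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


def lookahead(lhs, rhs, first, follow):
    """FIRST(rhs), with FOLLOW(lhs) added in place of ε when rhs is nullable."""
    f = first_of_string(rhs, first)
    if EPS in f:
        return (f - {EPS}) | follow[lhs]
    return f


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
NULLABLE = {n for n in NONTERMINALS if EPS in FIRST[n]}
LOOKAHEAD = [lookahead(lhs, rhs, FIRST, FOLLOW) for lhs, rhs in GRAMMAR]
```

`LOOKAHEAD` holds one set per production, in the order the productions are listed in `GRAMMAR`.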

Strong LL condition
  Definition : for every A → α | β ∈ P, LOOKAHEAD(A → α) ∩ LOOKAHEAD(A → β) = ∅.
  Meaning : for each distinct pair of productions with the same left-hand side, the parser can select the unique alternative that derives a string beginning with the input symbol.

The grammar G is said to be strong LL(1) if it satisfies the strong LL condition.

ex) G : S → aSA | ε
        A → c

    LOOKAHEAD(S → aSA) = {a}
    LOOKAHEAD(S → ε) = FOLLOW(S) = {$, c}
    LOOKAHEAD(S → aSA) ∩ LOOKAHEAD(S → ε) = ∅

    Therefore G is strong LL(1).
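The condition is a pairwise disjointness test, which can be sketched as below. The input format (a list of `(lhs, rhs, lookahead_set)` triples) and the function names are assumptions made for this illustration, and the second grammar is a hypothetical counterexample that does not appear in the lecture.

```python
from itertools import combinations


def ll1_conflicts(productions):
    """Return the pairs of alternatives whose LOOKAHEAD sets overlap.

    productions: list of (lhs, rhs, lookahead_set) triples.
    """
    by_lhs = {}
    for lhs, rhs, la in productions:
        by_lhs.setdefault(lhs, []).append((rhs, la))

    conflicts = []
    for lhs, alternatives in by_lhs.items():
        # Compare every distinct pair of alternatives of the same nonterminal.
        for (rhs1, la1), (rhs2, la2) in combinations(alternatives, 2):
            common = la1 & la2
            if common:
                conflicts.append((lhs, rhs1, rhs2, common))
    return conflicts


def is_strong_ll1(productions):
    return not ll1_conflicts(productions)


# Grammar from the example: S -> aSA | ε, A -> c
G = [
    ("S", "aSA", {"a"}),
    ("S", "ε", {"$", "c"}),
    ("A", "c", {"c"}),
]

# Hypothetical counterexample: both alternatives of S begin with a.
G_BAD = [
    ("S", "ab", {"a"}),
    ("S", "ac", {"a"}),
]
```

Here `is_strong_ll1(G)` holds, while `ll1_conflicts(G_BAD)` reports the overlap on `a`.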

If a grammar is strong LL(1), we can construct a parser for sentences of the grammar using the following scheme.

Terminal procedure: for each a ∈ V_T,

    procedure pa;   /* getnextsymbol => scanner */
    begin
      if nextsymbol = ta then getnextsymbol
      else error
    end;

getnextsymbol : the scanner routine. It builds one token from the input stream and assigns it to the variable nextsymbol.

Text p.284

Nonterminal procedure: for each A ∈ V_N,

    procedure pA;
    var i: integer;
    begin
      case nextsymbol of
        LOOKAHEAD(A → X1 X2 ... Xm): for i := 1 to m do pXi;
        LOOKAHEAD(A → Y1 Y2 ... Yn): for i := 1 to n do pYi;
          :
        LOOKAHEAD(A → Z1 Z2 ... Zr): for i := 1 to r do pZi;
        LOOKAHEAD(A → ε): ;
        otherwise: error
      end /* case */
    end;

Improving the efficiency and structure of recursive-descent parser

1) Eliminating terminal procedures ::= In practice it is better not to write a procedure for each terminal. Instead, the action of advancing the input marker can always be initiated by the nonterminal procedures. In this way many redundant tests can be eliminated.
   ex) text p.285 [Example 7.9]

2) BNF → EBNF : reduce the number of productions and nonterminals.
   ① repetitive part : { }
   ② optional part : [ ]
   ③ alternation : ( | )

<if_st> ::= if <cond> then <st> [ else <st> ]

    procedure pif;
    begin
      if nextsymbol = tif then
        begin
          getnextsymbol;
          pcond;
          if nextsymbol = tthen then
            begin getnextsymbol; pst end
          else error(10)
        end
      else error(11);
      if nextsymbol = telse then
        begin getnextsymbol; pst end
    end;

<id_list> ::= id {, id }

    procedure pid_list;
    begin
      if nextsymbol = tid then
        begin
          getnextsymbol;
          while (nextsymbol = tcomma) do
            begin
              getnextsymbol;
              if nextsymbol = tid then getnextsymbol
              else error(100)
            end
        end
    end;

<Problem> Convert the following grammar to extended BNF and write the corresponding procedure for a recursive-descent parser.

    <D> ::= label <L> | integer <L>
    <L> ::= <id> <R>
    <R> ::= ; | , <L>

Since <L> ⇒* <id> {, <id>} ; the grammar reduces to:

    <D> ::= ( label | integer ) <id> {, <id>} ;

    procedure pd;
    begin
      if nextsymbol in [qlabel,qinteger] then
        begin
          getnextsymbol;
          if nextsymbol = tid then
            begin
              getnextsymbol;
              while (nextsymbol = tcomma) do
                begin
                  getnextsymbol;
                  if nextsymbol = tid then getnextsymbol
                  else error(3)
                end;
            end
          else error(2);
          if nextsymbol = tsemi then getnextsymbol
          else error(4)
        end
      else error(1)
    end;
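The same procedure can be sketched in Python to show how the EBNF constructs map onto control flow: the alternation ( label | integer ) becomes a membership test and the repetition {, <id>} becomes a while loop. This is an illustration only. It assumes the scanner has already produced a list of token names ("label", "integer", "id", ",", ";"), and unlike the Pascal version it stops at the first error by raising an exception. The names `parse_d` and `ParseError` are hypothetical; the error codes follow the ones used above.

```python
class ParseError(Exception):
    """Carries the error number used in the Pascal procedure."""

    def __init__(self, code):
        super().__init__(f"syntax error {code}")
        self.code = code


def parse_d(tokens):
    """Recognize <D> ::= ( label | integer ) <id> {, <id>} ;

    Returns the number of tokens consumed.
    """
    toks = list(tokens) + ["$"]    # end marker guards against running off the end
    pos = 0

    # alternation: ( label | integer )
    if toks[pos] not in ("label", "integer"):
        raise ParseError(1)
    pos += 1

    # first <id>
    if toks[pos] != "id":
        raise ParseError(2)
    pos += 1

    # repetition: {, <id>}
    while toks[pos] == ",":
        pos += 1
        if toks[pos] != "id":
            raise ParseError(3)
        pos += 1

    # closing semicolon
    if toks[pos] != ";":
        raise ParseError(4)
    pos += 1
    return pos


def error_code(tokens):
    """Helper for experiments: the error number, or None on success."""
    try:
        parse_d(tokens)
        return None
    except ParseError as e:
        return e.code
```

For example, `parse_d(["label", "id", ",", "id", ";"])` consumes all five tokens, while `error_code(["label", ";"])` yields 2 because an identifier is missing.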

Implement a recursive-descent syntax analyzer for the grammar given in exercise 5.31 (text p. 229).

Problem Specification
  Input : an SPL program to find a Minimum and a Maximum.
  Output : left parse
  Methods :
    (1) write the getnextsymbol routine.
    (2) compute LOOKAHEADs for each production.
    (3) create a procedure for each nonterminal.
    (4) integrate the procedures with the main program.

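Drawback of RDP
  - When the grammar changes, the program must be modified.

Predictive parser
  - Separated into a program and a table: driver routine + table.
  - Overcomes the drawback of RDP: when the grammar changes, only the table is rebuilt.

Predictive parsing ::= a deterministic top-down parsing method using a stack.

The stack contains a sequence of grammar symbols.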

The input buffer contains the string to be parsed, followed by $.

Parsing proceeds according to the relationship between the current input symbol and the stack top symbol.

Initial configuration :

    STACK    INPUT
    $S       ω$

Parsing table (LL) : determines the parsing action.
  M[X,a] = r : when the stack top symbol is X and the current input symbol is a, expand by production number r.

Parsing Actions
  X : stack top symbol, a : current input symbol
  1. if X = a = $, then accept.
  2. if X = a, then pop X and advance input.
  3. if X ∈ V_N, then if M[X,a] = r (X → ABC), then replace X by ABC (A on top) else error.

Text p.291

    Algorithm Predictive_Parser_Action;
    begin
      // set ip to point to the first symbol of ω$;
      repeat
        // let X be the top stack symbol and a the symbol pointed to by ip;
        if X is a terminal or $ then
          if X = a then pop X from the stack and advance ip
          else error(1)
        else /* X is nonterminal */
          if M[X,a] = X → Y1 Y2 ... Yk then
            begin
              pop X from the stack;
              push Yk, Yk-1, ..., Y1 onto the stack, with Y1 on top;
              output the production X → Y1 Y2 ... Yk
            end
          else error(2)
      until X = a = $ /* stack is empty */
    end.
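The driver routine can be sketched in Python as follows. This is a minimal illustration, not the textbook's implementation. The table and productions are those of the four-production example traced later in these notes, under the assumption that production 3 is A → aA (the only reading consistent with the entry M[A,a] = 3). The names `PRODUCTIONS`, `TABLE`, `predictive_parse`, and `ParseError` are hypothetical.

```python
class ParseError(Exception):
    """Carries the error number used in the algorithm: 1 or 2."""

    def __init__(self, code):
        super().__init__(f"parse error {code}")
        self.code = code


# Numbered productions (assumed grammar: S -> aSb | bA, A -> aA | b)
PRODUCTIONS = {
    1: ("S", ["a", "S", "b"]),
    2: ("S", ["b", "A"]),
    3: ("A", ["a", "A"]),
    4: ("A", ["b"]),
}
NONTERMINALS = {"S", "A"}

# M[X, a] = production number; missing entries mean error
TABLE = {
    ("S", "a"): 1, ("S", "b"): 2,
    ("A", "a"): 3, ("A", "b"): 4,
}


def predictive_parse(text, start="S"):
    """Return the left parse (list of production numbers) of text."""
    stack = ["$", start]          # top of the stack is the end of the list
    inp = list(text) + ["$"]
    ip = 0
    output = []
    while True:
        x, a = stack[-1], inp[ip]
        if x == "$" and a == "$":
            return output         # accept
        if x not in NONTERMINALS:
            # X is a terminal or $: it must match the current input symbol
            if x != a:
                raise ParseError(1)
            stack.pop()
            ip += 1
        else:
            r = TABLE.get((x, a))
            if r is None:
                raise ParseError(2)
            stack.pop()
            # push the right-hand side in reverse so that Y1 ends up on top
            stack.extend(reversed(PRODUCTIONS[r][1]))
            output.append(r)
```

Because the grammar lives entirely in `PRODUCTIONS` and `TABLE`, changing the grammar means rebuilding these two structures while the loop stays the same. `predictive_parse("aabbbb")` returns the left parse `[1, 1, 2, 4]`.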

ex) G : 1. S → aSb
        2. S → bA
        3. A → aA
        4. A → b

    string : aabbbb

Parsing Table (. = no entry, i.e. error):

                   terminals
    nonterminal    a    b    $
    S              1    2    .
    A              3    4    .

    STACK     INPUT      ACTIONS               OUTPUT
    $S        aabbbb$    expand 1              1
    $bSa      aabbbb$    pop a and advance
    $bS       abbbb$     expand 1              1
    $bbSa     abbbb$     pop a and advance
    $bbS      bbbb$      expand 2              2
    $bbAb     bbbb$      pop b and advance
    $bbA      bbb$       expand 4              4
    $bbb      bbb$       pop b and advance
    $bb       bb$        pop b and advance
    $b        b$         pop b and advance
    $         $          Accept

(The top of the stack is at the right end.)

How to construct a predictive parsing table for the grammar.