ch4. Lexical and syntax analysis

Computers/Programming Language

ch4. Lexical and syntax analysis

emzei 2012. 3. 15. 14:13

▷ PL에 대해 접근하는 방식

♠ Compilation

- use complier / 고급언어를 기계언어로 translate해줌

- PLs used for large application, such as C++, COBOL

♠ Pure interpreter

- no translate

- usually used for smaller systems in which execution efficiency is not critical, such as scripts embedded in HTML, written in languages such as JavaScript

♠ Hybrid System

- translate programs written in high-level languages into intermediate forms, which are interpreted.

- JIT(just-in-time) compiler가 전파되면서 실행속도 개선.. Java와 .NET

- JIT컴파일러는 하이브리드 시스템을 지연된 컴파일러 시스템으로 변환시킴.

♠ Formal description of PL syntax - Context-free Grammar (type-2), BNF

* Advantage of BNF

- clear, concise

- used as the direct basis for the syntax analyzer

- easy to maintain because of their modularity

♠ compiler는 문법 분석을 2가지로 나눠서 함 : lexical analysis, syntax analysis

* lexical analysis : small-scale language construct 다룸. such as names & numeric literals.

* syntax analysis : large-scale construct. such as expressions, statements, and program units.

♣ 2가지로 나눈 이유 : simplicity / efficiency / portability

- efficiency : lexical한건 device로부터 읽어와서 블라블라~ 속도 느림.

syntax하나 건 계산만 바로하면 되니까 속도 빠름.

- portability : reuse!!!

▷ Lexical Analysis : essentially a pattern matcher

※ pattern matcher attempts to find a substring of a given string of characters that matches a given character pattern : 즉 서브스트링 찾는거...

- 초기의 패턴 매처는 유닉스에 도입된 초기 버전의 텍스트 에디터.

♠ serve as the front end of a syntax analyzer

- syntax analysis at the lowest level of program structure.

- lexeme : collect logical groupings

- token : assign internal codes for categories of these groupings

- skipping comments and blanks (프로그램의 의미와는 전혀 상관없으니까 무시)

- insert lexems for user-defined names into the symbol table (컴파일러가 나중에 프레이즈 하게...)

- detect syntactic errors & report errors to the user

◈ 3 approches to building a lexical analyzer

- formal description of the token patterns of the language

(↑↑ regular expression)

- state transition diagram (state diagram)

: directed graph.

: if/switch statement-based program

: table-driven program

◈ state-diagram representation : finite automata

(lexical analyzer를 사용한 형식의 state-diagram이 수학 머신의 클래스로 다시 표현된거)

- finite automata can be designed to recognize a class of languagese

=> regular language

- regular expressions

- regular grammar (type-3 grammar)

▷The Parsing Problem: syntax analysis

♠ Construct parse trees for given program - include all of the syntactic info.

♠ 2 Goals of syntax analysis

> to check whether the input program is syntactically correct

> to produce either a complete parse tree, or trace the structure of the complete parse tree

♠ Parsing algorithms never look ahead more than one token into the input program : sufficient to allow parsing : 토큰은 하나씩 읽는것 만으로 충분

♠ According to building direction...

◑Top-down

- trace/build a parse tree in pre-order ⇔ leftmost

- parser's task : to find the next sentential form in that leftmost derivation

* sentential form : xAα (x:terminal/A:non-term/α:mixing string)

* Next choosing correct grammar rule is that has A as its LHS

- under one toen look ahead

* must choose the correct RHS on the basis of the next token of the input

♥ LL Parser : Table-driven parsing

♥ Recursive-descent parser

: Syntax Diagram (함부로 그리는게 아니얌! 규칙이 이쒀!!!)

: if-then-else programming technique

: restrictions in syntax diagram

◑Bottom-up

- Rightmost derivation

- Given a right sentential form α,

- LR algorithm (Left-to-right scan, Rightmost derivation)

▷ Recursive descent parsing

♠ Recursive-descent parsing process

- collection of subprograms

- recursive (nested) calls

- produce a parse tree in top-down manner

♠ EBNF representation

♠ Parser has a subprogram for each nonterminal in the grammar

▷ LL Grammar Class : using symbol table

♠ Problem 1 : left recursion

ex. A:=A+B 하면 recursion 발생하니까

left recursion removal ::: 책참고!!!.

♠ Problem 2 : to choose the correct RHS (PASS)

♠ LL Parser 따라 풀어보기!

▷ LR Parser : Bottom-up Parsing/Shift-Reduce Algorithm (Rightmost derivation)

♠ Reverse of a right-most derivation

- starts with the last sentential form (input symbol)

- until all that remains is the start symbol

♠ Task is

- To find the specific RHS in the sentential form (handle)

♠ Shift-reduce

* shift action: move the next input token onto the parser's stack

* reduce action : replaces a RHS on top of the parser's stack by its corresponding LHS

♠ Pushdown automata (PDA)

♠ LR parser : use a relatively small amount of code and a parsing table

♠ LR Parser 따라 풀어보기!

저작자표시 비영리 변경금지 (새창열림)

'Computers > Programming Language' 카테고리의 다른 글

ch6-2. Data Type (0)	2012.03.19
ch6-1. Data Type (0)	2012.03.17
ch5. Names, bindings, type checking, and scope (0)	2012.03.16
ch3. describe syntax and semantics (0)	2012.03.14
ch1. preliminaries (0)	2012.03.13

현재글ch4. Lexical and syntax analysis

xen guest installation, inheritance, Android, GUI, DICTIONARY, Tree, java, layout, Python, string, XEN, xen compile, domu, polymorphism, 자료구조, xen 4.4, protected, interface, list, Data Structure,

Today :
Yesterday :

일	월	화	수	목	금	토
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

emzei