COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

Pune Vidyarthi Griha's COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR By Prof. Anand N. Gharu (Assistant Professor), PVGCOE Computer Dept. 22nd Jan 2018

CONTENTS :-
1. Role of lexical analysis
2. Parsing, token, pattern, lexemes, lexical errors
3. Regular definitions for language constructs & strings
4. Sequences, comments & transition diagrams for recognition of tokens, reserved words & identifiers
5. Introduction to Compilers & Interpreters
6. General model of Compiler
7. Compare compiler and interpreter
8. Use of interpreter & components of interpreter
9. Overview of Lex & YACC Specifications

What's a compiler? All computers only understand machine language. This is a program: 10000010010110100100101. Therefore, high-level language instructions must be translated into machine language prior to execution.

What's a compiler? Compiler: a piece of system software that translates high-level languages into machine language.
Source file program.c:
    while (c != 'x') {
        if (c == 'a' || c == 'e' || c == 'i')
            printf("Congrats!");
        else if (c != 'x')
            printf("You Loser!");
    }
Compiler command: gcc -o prog program.c
Output file prog: 10000010010110100100101

Compiler:- A system program that automatically translates a high-level language program into a machine language program.
Source program (high-level language program) -> Compiler -> Target program (machine language program)

Types of Compiler
Cross Assembler:- A system program that automatically translates an assembly language program compatible with machine A into a machine language program compatible with machine A, while the assembler itself runs on a different machine B.
Source program (assembly language program compatible with M/C A) -> Cross Assembler (running on M/C B) -> Target program (machine language program compatible with M/C A)

Types of Compiler
Cross Compiler:- A system program that automatically translates a high-level language (HLL) program compatible with machine A into a machine language program compatible with machine A, while the underlying machine on which the compiler runs is machine B.
Source program (HLL program compatible with M/C A) -> Cross Compiler (running on M/C B) -> Target program (machine language program compatible with M/C A)

Types of Compiler

Interpreter - It is the language translator which executes the source program line by line without translating it into machine language. - It does not generate object code.
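Line-by-line execution without object code can be sketched as follows. This is a minimal illustration in Python for a made-up toy language; the function name `interpret` and the two statement forms (`name = a + b` and `print name`) are assumptions chosen for the example, not part of the lecture.

```python
def interpret(source):
    """Execute a toy program line by line; no object code is produced."""
    env = {}      # variable name -> integer value
    output = []   # values produced by 'print' statements
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("print "):
            # Look the variable up and record its current value.
            output.append(env[line[len("print "):].strip()])
        else:
            # Hypothetical toy syntax: name = operand + operand + ...
            name, expr = line.split("=", 1)
            total = 0
            for operand in expr.split("+"):
                operand = operand.strip()
                total += int(operand) if operand.isdigit() else env[operand]
            env[name.strip()] = total
    return output
```

Each statement takes effect as soon as it is read, which is why an interpreter needs the source text at run time while a compiled program does not.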

Compiler vs Interpreter, C++, Visual Basic

Phases of compiler

Structure of Compiler
Any compiler must perform two major tasks:
o Analysis of the source program
o Synthesis of a machine-language program

Structure of Compiler
Source Program (Character Stream) -> Scanner -> Tokens -> Parser -> Syntactic Structure -> Semantic Routines -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code
Symbol and Attribute Tables (Used by all Phases of The Compiler)

Structure of Compiler: Scanner (Lexical Analysis)
The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens).
o RE (Regular Expression)
o NFA (Non-deterministic Finite Automata)
o DFA (Deterministic Finite Automata)
o LEX
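The character-by-character grouping performed by a scanner can be sketched by hand. The token names `ID`, `NUM` and `SYM` and the function name `scan` are assumptions made for this illustration; a scanner generated by Lex would be table-driven instead.

```python
def scan(text):
    """Group characters into tokens by reading the input one character at a time."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1                                  # skip whitespace
        elif ch.isalpha():
            start = i
            while i < len(text) and text[i].isalnum():
                i += 1
            tokens.append(("ID", text[start:i]))    # identifier
        elif ch.isdigit():
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("NUM", text[start:i]))   # integer literal
        else:
            tokens.append(("SYM", ch))              # any other single character
            i += 1
    return tokens
```

For example, `scan("x1 = 42 + y")` yields one `ID`, one `SYM`, one `NUM`, another `SYM` and a final `ID` token.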

Structure of Compiler: Parser (Syntax Analysis)
Given a formal syntax specification (typically as a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used. As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree.
o CFG (Context-Free Grammar)
o BNF (Backus-Naur Form)
o GAA (Grammar Analysis Algorithms)
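A parser that groups tokens according to productions and builds a syntax tree might look like the sketch below. The grammar `expr -> id (('+' | '-') id)*`, the nested-tuple tree shape and the name `parse_expr` are assumptions for illustration; tokens are given as plain strings.

```python
def parse_expr(tokens):
    """Parse  expr -> id (('+' | '-') id)*  and return a nested-tuple syntax tree."""
    pos = 0

    def expect_id():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError("identifier expected at token %d" % pos)
        pos += 1
        return tokens[pos - 1]

    tree = expect_id()
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        tree = (op, tree, expect_id())   # left-associative interior node
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return tree
```

Here the parser builds a tree; the alternative named in the text, calling a semantic routine at each recognized construct, would replace the tuple construction with a function call.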

Structure of Compiler: Semantic Routines
Perform two functions:
o Check the static semantics of each construct
o Do the actual translation
The heart of a compiler.
o Syntax Directed Translation
o Semantic Processing Techniques
o IR (Intermediate Representation)
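The two functions of the semantic routines, checking static semantics and translating, can be illustrated together. The sketch below assumes a syntax tree of nested tuples `(op, left, right)` with identifier strings as leaves, a "declared before use" rule as the only static check, and a three-address IR of the form `(temp, op, arg1, arg2)`; all of these are illustrative choices, not the lecture's definitions.

```python
def translate(tree, declared):
    """Check that identifiers are declared, then emit three-address IR."""
    code = []
    counter = 0

    def walk(node):
        nonlocal counter
        if isinstance(node, str):                  # leaf: an identifier
            if node not in declared:               # static semantic check
                raise NameError("undeclared identifier: " + node)
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        counter += 1
        temp = "t%d" % counter
        code.append((temp, op, l, r))              # means: temp = l op r
        return temp

    walk(tree)
    return code
```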

Structure of Compiler: Optimizer
The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code. This phase can be very complex and slow. It includes peephole optimization, loop optimization, register allocation and code scheduling.
o Register and Temporary Management
o Peephole Optimization
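A very small peephole-style pass gives a feel for "functionally equivalent but improved" IR. The sketch below looks at one instruction at a time (a window of size one, the simplest case) and assumes instructions are tuples `(dest, op, arg1, arg2)`; the `const` and `copy` opcodes and the name `peephole` are hypothetical.

```python
def peephole(code):
    """Rewrite each (dest, op, a, b) instruction into a cheaper equivalent when possible."""
    improved = []
    for dest, op, a, b in code:
        if isinstance(a, int) and isinstance(b, int) and op in ("+", "-", "*"):
            value = {"+": a + b, "-": a - b, "*": a * b}[op]
            improved.append((dest, "const", value, None))   # constant folding
        elif op == "+" and b == 0:
            improved.append((dest, "copy", a, None))        # x + 0 -> x
        elif op == "*" and b == 1:
            improved.append((dest, "copy", a, None))        # x * 1 -> x
        else:
            improved.append((dest, op, a, b))               # leave unchanged
    return improved
```

Real optimizers apply many such rules over larger windows and repeat until nothing changes, which is one reason this phase can be slow.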

Structure of Compiler: Code Generator
o Interpretive Code Generation
o Generating Code from Tree/DAG
o Grammar-Based Code Generator
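Generating code from a tree can be sketched as a post-order walk. The target here is a hypothetical stack machine with `PUSH`, `ADD` and `SUB` instructions, and the tree is assumed to be nested tuples `(op, left, right)` with identifier leaves; none of this comes from the lecture.

```python
def gen_code(tree):
    """Emit instructions for a hypothetical stack machine by a post-order tree walk."""
    if isinstance(tree, str):
        return ["PUSH " + tree]                    # leaf: load the operand
    op, left, right = tree
    mnemonic = {"+": "ADD", "-": "SUB"}[op]
    # Operands first, operator last: the operator pops both results.
    return gen_code(left) + gen_code(right) + [mnemonic]
```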

Structure of Compiler
Scanner [Lexical Analyzer] -> Tokens -> Parser [Syntax Analyzer] -> Parse tree -> Semantic Process [Semantic Analyzer] -> Abstract Syntax Tree w/ Attributes -> Code Generator [Intermediate Code Generator] -> Non-optimized Intermediate Code -> Code Optimizer -> Optimized Intermediate Code -> Code Generator -> Target machine code

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Code Optimization

Code Generation

Code Optimization

Structure of Compiler
Compiler writing tools: compiler generators or compiler-compilers
o e.g. scanner and parser generators
o examples: Yacc, Lex

Overview of Lex & YACC
Lex: Theory. Execution. Example.
Yacc: Theory. Description. Example.
Lex & Yacc linking. Demo.

Lex: lex is a program (a generator) that generates lexical analyzers and is widely used on Unix. It is mostly used with the Yacc parser generator. It was written by Eric Schmidt and Mike Lesk. It reads an input stream that specifies the lexical analyzer and outputs source code implementing that analyzer in the C programming language. In other words, Lex reads patterns (regular expressions) and produces C code for a lexical analyzer that scans the input for text matching those patterns, for example identifiers.

STRUCTURE OF LEX

Lex A simple pattern: letter(letter|digit)* Regular expressions are translated by lex to a computer program that mimics a finite state automaton (FSA). This pattern matches a string of characters that begins with a single letter followed by zero or more letters or digits.
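What "a program that mimics an FSA" means for this pattern can be sketched by hand with two states. The state names `start` and `in_id` and the function name `matches_identifier` are assumptions for illustration; lex generates a table-driven automaton rather than code like this.

```python
def matches_identifier(s):
    """Simulate a two-state automaton for the pattern letter(letter|digit)*."""
    state = "start"
    for ch in s:
        if state == "start" and ch.isalpha():
            state = "in_id"            # first character must be a letter
        elif state == "in_id" and (ch.isalpha() or ch.isdigit()):
            state = "in_id"            # stay while letters or digits follow
        else:
            return False               # no transition defined: reject
    return state == "in_id"            # accept only if the final state was reached
```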

Lex Some limitations: Lex cannot be used to recognize nested structures such as parentheses, since it only has states and transitions between states and therefore has no way to remember an arbitrary nesting depth. So, Lex is good at pattern matching, while Yacc handles more challenging tasks such as nested constructs.
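The missing ingredient is memory for the nesting depth. A minimal sketch, assuming only round parentheses matter and using a hypothetical function name `balanced`, shows that an explicit stack is enough:

```python
def balanced(text):
    """Check nesting of parentheses with an explicit stack, which a finite automaton lacks."""
    stack = []
    for ch in text:
        if ch == "(":
            stack.append(ch)           # remember an open parenthesis
        elif ch == ")":
            if not stack:
                return False           # closing parenthesis with nothing open
            stack.pop()
    return not stack                   # balanced only if nothing is left open
```

A fixed set of states can only count up to a fixed depth, whereas the stack grows with the input; this is the kind of structure a Yacc-generated parser maintains.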

Lex Pattern Matching Primitives

Lex Pattern Matching examples.

Lex The input structure to Lex:
... definitions section ...
%%
... rules section ...
%%
... C code section (subroutines) ...
ECHO is an action and a predefined macro in lex that writes the text matched by the pattern.

Lex predefined variables.

Lex Whitespace must separate the defining term and the associated expression. Code in the definitions section is simply copied as-is to the top of the generated C file and must be bracketed with %{ and %} markers. Substitutions in the rules section are surrounded by braces ({letter}) to distinguish them from literals.

Yacc Theory: Yacc reads the grammar and generates C code for a parser. Grammars are written in Backus-Naur Form (BNF). A BNF grammar is used to express context-free languages. E.g., to parse an expression, do the reverse operation: reduce the expression, step by step, back to a single nonterminal. This is known as bottom-up or shift-reduce parsing. A stack (LIFO) is used for storing the symbols seen so far.
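The shift and reduce moves on a stack can be traced with a hand-coded sketch. It assumes the grammar `E -> E + E | E - E | id`, reduces as early as possible (which makes the operators left-associative), and uses a hypothetical function name `shift_reduce`; a Yacc-generated parser is table-driven and decides its moves from parse tables instead.

```python
def shift_reduce(tokens):
    """Recognize  E -> E + E | E - E | id  with an explicit stack; return (accepted, trace)."""
    stack, trace = [], []
    for tok in tokens:
        stack.append(tok)                       # shift the next token
        trace.append("shift " + tok)
        reduced = True
        while reduced:                          # reduce for as long as a rule applies
            reduced = False
            if stack[-1] == "id":
                stack[-1] = "E"                 # reduce E -> id
                trace.append("reduce E -> id")
                reduced = True
            elif (len(stack) >= 3 and stack[-3] == "E"
                  and stack[-2] in ("+", "-") and stack[-1] == "E"):
                op = stack[-2]
                del stack[-3:]
                stack.append("E")               # reduce E -> E op E
                trace.append("reduce E -> E %s E" % op)
                reduced = True
    return stack == ["E"], trace
```

The input is accepted when the whole token sequence has been reduced to the single start symbol `E` on the stack.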

STRUCTURE OF YACC

Yacc Input to yacc is divided into three sections:
... definitions ...
%%
... rules ...
%%
... subroutines ...

Yacc
The definitions section consists of:
o token declarations
o C code bracketed by %{ and %}
The rules section consists of:
o BNF grammar
The subroutines section consists of:
o user subroutines

Yacc & Lex Together
The grammar:
program -> program expr | ε
expr -> expr + expr | expr - expr | id
program and expr are nonterminals. id is a terminal (a token returned by lex).
An expression may be:
o a sum of two expressions
o a difference of two expressions
o or an identifier

Lex file

Yacc file

Linking lex & yacc

Thank You Gharu.anand@gmail.com 1/22/2018