Life Cycle of Source Program - Compiler Design

Similar documents
About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

CST-402(T): Language Processors

COMPILER DESIGN LECTURE NOTES

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.


CS 4201 Compilers 2014/2015 Handout: Lab 1

Formal Languages and Compilers Lecture I: Introduction to Compilers

COMPILER DESIGN UNIT I LEXICAL ANALYSIS. Translator: It is a program that translates one language to another Language.


KINGS COLLEGE OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING ACADEMIC YEAR / EVEN SEMESTER

SYLLABUS UNIT - I UNIT - II UNIT - III UNIT - IV CHAPTER - 1 : INTRODUCTION CHAPTER - 4 : SYNTAX AX-DIRECTED TRANSLATION TION CHAPTER - 7 : STORA

Dixita Kagathara Page 1

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

LESSON 13: LANGUAGE TRANSLATION

Question Bank. 10CS63:Compiler Design

COMP455: COMPILER AND LANGUAGE DESIGN. Dr. Alaa Aljanaby University of Nizwa Spring 2013

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

LECTURE NOTES ON COMPILER DESIGN P a g e 2

COMPILER DESIGN. For COMPUTER SCIENCE

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

VIVA QUESTIONS WITH ANSWERS

Compiler, Assembler, and Linker

VALLIAMMAI ENGINEERING COLLEGE

INTRODUCTION TO COMPILER AND ITS PHASES

COMPILER DESIGN LEXICAL ANALYSIS, PARSING

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

Front End. Hwansoo Han

Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES ( ) (ODD) Code Optimization

Introduction to Compilers

DEPARTMENT OF INFORMATION TECHNOLOGY / COMPUTER SCIENCE AND ENGINEERING UNIT -1-INTRODUCTION TO COMPILERS 2 MARK QUESTIONS

VALLIAMMAI ENGNIEERING COLLEGE SRM Nagar, Kattankulathur

LANGUAGE TRANSLATORS

PRINCIPLES OF COMPILER DESIGN

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

Group A Assignment 3(2)

Compiler Design. Subject Code: 6CS63/06IS662. Part A UNIT 1. Chapter Introduction. 1.1 Language Processors

Compiler Optimization

UNIT I- LEXICAL ANALYSIS. 1.Interpreter: It is one of the translators that translate high level language to low level language.

Introduction to Compiler Construction

Intermediate Code Generation

SEM / YEAR : VI / III CS2352 PRINCIPLES OF COMPLIERS DESIGN UNIT I - LEXICAL ANALYSIS PART - A

GUJARAT TECHNOLOGICAL UNIVERSITY

Introduction to Compiler Construction

4. An interpreter is a program that

Introduction to Compiler Construction

Pioneering Compiler Design

Principles of Compiler Design

A Simple Syntax-Directed Translator

CS 321 IV. Overview of Compilation

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

SYSTEMS PROGRAMMING. Srimanta Pal. Associate Professor Indian Statistical Institute Kolkata OXFORD UNIVERSITY PRESS

QUESTION BANK CHAPTER 1 : OVERVIEW OF SYSTEM SOFTWARE. CHAPTER 2: Overview of Language Processors. CHAPTER 3: Assemblers

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

PSD3A Principles of Compiler Design Unit : I-V. PSD3A- Principles of Compiler Design

INSTITUTE OF AERONAUTICAL ENGINEERING (AUTONOMOUS)

GUJARAT TECHNOLOGICAL UNIVERSITY

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Compilers and Interpreters

Compiler Design Aug 1996

Computer Software: Introduction

Undergraduate Compilers in a Day

Compilers. Prerequisites

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTER SCIENCE AND ENGINEERING COURSE PLAN

Compiler Design Overview. Compiler Design 1

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

Name of chapter & details

TABLE OF CONTENTS S.No DATE TOPIC PAGE No UNIT I LEXICAL ANALYSIS 1 Introduction to Compiling-Compilers 6 2 Analysis of the source program 7 3 The

CA Compiler Construction

AS-2883 B.Sc.(Hon s)(fifth Semester) Examination,2013 Computer Science (PCSC-503) (System Software) [Time Allowed: Three Hours] [Maximum Marks : 30]

Alternatives for semantic processing

Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

COMPILERS BASIC COMPILER FUNCTIONS

Intermediate Representations

Early computers (1940s) cost millions of dollars and were programmed in machine language. less error-prone method needed

UNIT-4 (COMPILER DESIGN)

CS101 Introduction to Programming Languages and Compilers

CS131: Programming Languages and Compilers. Spring 2017

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program.

I. OVERVIEW 1 II. INTRODUCTION 3 III. OPERATING PROCEDURE 5 IV. PCLEX 10 V. PCYACC 21. Table of Contents

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano

Intermediate Representations

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

CS 132 Compiler Construction

Computer Hardware and System Software Concepts

VETRI VINAYAHA COLLEGE OF ENGINEERING AND TECHNOLOGY

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

Working of the Compilers

Principles of Compiler Design

CMPT 379 Compilers. Anoop Sarkar.

Program Analysis ( 软件源代码分析技术 ) ZHENG LI ( 李征 )

Introduction. Interpreters High-level language intermediate code which is interpreted. Translators. e.g.

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Transcription:

Life Cycle of Source Program - Compiler Design Vishal Trivedi * Gandhinagar Institute of Technology, Gandhinagar, Gujarat, India E-mail: raja.vishaltrivedi@gmail.com Abstract: This Research paper gives brief information on how the source program gets evaluated and from which sections source code has to pass and parse in order to generate target code or predicted output. In addition to that, this paper also explains the concept of Pre-processors, Translators, Linkers and Loaders and procedure to generate target code. This paper concentrates on Concept of Compiler and Phases of Compiler. Keywords: Macro; Token; Lexemes; Identifier; Operators; Operands; Sentinel; Prefix; Postfix; Quadruple; Triple; Indirect triple; Subroutine I. INTRODUCTION Whenever we create a source code and start the process of evaluating it, computer only shows the output and errors (if occurred). We don t know the actual process behind it. In this research paper, the exact procedure behind the compilation task and step by step evaluation of source code are explained. In addition to that touched topics are High level languages, Low level languages, Pre-processors, Translators, Compilers, Phases of Compiler, Cousins of Compiler, Assemblers, Interpreters, Symbol Table, Error Handling, Linkers and Loaders [1]. Figure 1: Life cycle of source code.

II. HIGH LEVEL LANGUAGES Source program is in the form of high level language which uses natural language elements and is easier to create program. It is programming language having very strong abstraction [1]. It makes the process of developing source code easier, simpler and more understandable. High level languages are very much closer to English language and uses English structure for program coding. Examples of high level languages are Visual Basic, PHP, Python, Delphi, FORTRAN, COBOL, C, Pascal, C++, LISP, BASIC etc. III. LOW LEVEL LANGUAGES Low level languages are languages which can be directly understand by machines. It is programming language having little or no abstraction. These languages are described as close to hardware. Examples of low level languages are machine languages, binary language, assembly level languages and object code etc. IV. PRE-PROCESSORS Pre-processor is a computer program that manipulates its input data in order to generate output which is ultimately used as input to some other program or compiler. Input of pre-processor is high level languages and output of pre-processor is pure high level languages. Pure high level language refers to the language which is having Macros in the program and File Inclusion. Macro means some set of instructions which can be used repeatedly in the program. Macro preprocessing task is done by pre-processor. Pre-processor allows user to include header files which may be required by program known as File Inclusion. Example: # defines PI 3.14 shows that whenever PI encountered in a program, it is replaced by 3.14 values (Figure 1). V. TRANSLATORS Translator is a program that takes input as a source program and converts it into another form as output. Translator takes input as a high level language and convert it into low level language [2]. There are mainly three types of translators (Figure 2). 1) Compilers 2) Assemblers 3) Interpreters Figure 2: Translators. VI. COMPILERS Compiler reads whole program at a time and generate errors (if occurred). Compiler generates intermediate code in order to generate target code. Once the whole program is checked, errors are displayed. Example of compilers is Borland Compiler, Turbo C Compiler. Generated target code is easy to understand after the process of compilation [3]. The process of compilation must be done efficiently. There are mainly two parts of compilation process (Figure 3).

6.1 Analysis Phase This phase of compilation process is machine independent. The main objective of analysis phase is to divide to source code into parts and rearrange these parts into meaningful structure. The meaning of source code is determined and then intermediate code is created from the source program. Analysis phase contains mainly three sub-phases named lexical analysis, syntax analysis and semantic analysis. 6.2 Synthesis Phase This phase of compilation process is machine dependent. The intermediate code is taken and converted into an equivalent target code. Synthesis phase contains mainly three sub-phases named intermediate code, code optimization and code generation. Figure 3: Compilers. VII. PHASES OF COMPILER As mentioned above, compiler contains lexical analysis, syntax analysis, semantic analysis, intermediate code, code optimization and code generation phases (Figure 4). Figure 4: Phases of compiler.

7.1 Lexical Analysis Lexical Analysis is first phase of compiler. Lexical Analysis is also known as Linear Analysis or Scanning. First of all, lexical analyser scans the whole program and divides it into Token. Token refers to the string with meaning. Token describes the class or category of input string. Example: Identifiers, Keywords, Constants etc. Sentinel refers to the end of buffer or end of token. Pattern refers to set of rules that describe the token. Lexemes refers to the sequence of characters in source code that are matched with the pattern of tokens. Example: int, i, num etc. There are two pointers in lexical analysis named Lexeme pointer and Forward pointer [4]. In order to perform token recognition, Regular Expressions are used to construct Finite automata which are separate topic itself (Table 1). Input is source code and output is token. Consider an Example: Input: a=a + b*c*2; Output: = a + b * c 2 Table 1: Tokens or tables of tokens. 7.2 Syntax Analysis Syntax analysis is also known as syntactical analysis or parsing or hierarchical analysis. Syntax refers to the arrangement of words and phrases to create well-formed sentences in a language. Tokens generated by lexical analyzer are grouped together to form a hierarchical structure which is known as syntax tree which is less detailed (Figure 5). Figure 5: Lexical and syntax analyzer. Input is token and output is syntax tree (Figure 6). Grammatical errors are checked during this phase. Example: Parenthesis missing, semicolon missing, syntax errors etc.

For above given example (Table 2). Input: = A + B * C 2 Table 2: Tokens or tables of tokens. Output: Figure 6: Syntax tree. 7.3 Semantic Analysis Semantic analyzer checks the meaning of source program. Logical errors are checked during this phase. Example: divide by zero, variable undeclared etc. Example of logical errors int a; float b; char c; c=a+b; Parse tree refers to the tree having meaningful data. Parse tree is more specified and more detailed. Input is syntax tree and output is parse tree (syntax tree with meaning) [5,6]. Output for above given syntax tree is parse tree (Figure 7). Figure 7: Parse tree.

7.4 Intermediate Code Input Intermediate code (IC) is code between high level language and low level language or we can say IC is code between source code and target code. Intermediate code can be easily converted to target code. Intermediate code acts as an effective mediator between front end and back end [7]. Types of intermediate code are three address code, abstract syntax tree, prefix (polish), postfix (reverse polish) etc. Directed Acyclic Graph (DAG) is kind of abstract syntax tree which optimizes repeated expressions in syntax tree (Figure 8). Figure 8: DAG Most commonly used intermediate code is three address code which is having no more than three operands. There are three representations used for three address code such as Quadruple, Triple and Indirect Triple. Input: Parse Tree Output: Three address code t1=int to float(2); t2=id4*t1; t3=id3*t2; t4=id2+t3; t4=id1; 7.5 Code Optimization 1. Code optimization is used to improve the intermediate code and execution speed. 2. It is necessary to have a faster executing code or less consumption of memory. 3. There are mainly two ways to optimize the code named Front-end (Analysis) and Back-end (Synthesis) [8,9]. 4. In Front-end, programmer or developer can optimizes the code. In Back-end, compiler can optimizes the code. Figure 9: Code optimization ways. 5. Mentioned below are various techniques for code optimization (Figure 9). Compile Time Evaluation Constant Folding

Constant Propagation Common Sub Expression Elimination Variable Propagation Code Movement Loop Invariant Computation Strength Reduction Dead Code Elimination Code Motion Induction Variables and Reduction in Strength Loop Unrolling Loop Fusion etc. Input: three address codes Output: Optimized three address code t1=id4*2.0; t2=t1*id3; id1=t2+id2; 6. Code Generation a) Code Generation is final phase of compiler. b) Intermediate code is translated into machine language which is pass and parse from above phases and lastly optimize. c) Properties desired by code generation phase are mentioned below. Correctness High Quality Quick Code Generation Efficient use of resources of target machine d) Common issues in code generator are mentioned below. Memory management Input of code generator Target programs Approaches to code generation Choice of evaluation order Register allocation Instruction selection etc. e) Target code which is now low level language goes into linker and loader. f) Input: Optimized three address code Output: Machine language MOV id4, R1 MUL #2.0, R1 MOV id3, R2 MUL R2, R1 MOV id2, R2 ADD R2, R1 MOV R1, id1

VIII. COUSINS OF COMPILERS Cousins refer to the context in which the compiler typically operates. There are mainly three cousins of compiler. 1. Pre-processors 2. Assemblers 3. Linkers and Loaders IX. ASSEMBLERS Assembler is a translator which takes assembly language as an input and generates machine language as an output. Output of compiler is input of assembler which is assembly language. Assembly code is mnemonic version of machine code. Binary codes for operations are replaced by names. Binary language or relocatable machine code is generated from the assembler [10]. Assembler uses two passes. Pass means one complete scan of the input program (Figure 10). Figure 10: Assemblers. X. INTERPRETERS Interpreter performs the line by line execution of source code. It takes single instruction as an input, reads the statement, analyzes it and executes it. It shows errors immediately if occur [11]. Interpreter is machine independent and which does not produces object code or intermediate code as it directly generates the target code. Many languages can be implemented using both compilers and interpreters such as BASIC, Python, C#, Pascal, Java, and Lisp etc. Example of interpreter is UPS Debugger (Built in C interpreter). There are mainly two phases of Interpreter (Figure 11). 1) Analysis Phase: Analysis phase contains mainly three sub-phases named lexical analysis, syntax analysis and semantic analysis. This phase of compilation process is machine independent. Analysis phase of interpreter works same as analysis phase of compiler. 2) Synthesis Phase: This phase of compilation process is machine dependent. Synthesis phase contains sub-phase named code generation (direct execution) as it does not generate intermediate code. Figure 11: Interpreters.

XI. SYMBOL TABLE AND ERROR HANDLING Translators such as compilers or assemblers use data structure known as Symbol table to store the information about attributes. It stores the names encountered in source program with its attributes. Symbol table is used to store the information about entities such as interfaces, objects, classes, variable names, function names, keyword, constants, subroutines, label name and identifier etc. Each and every phase of compiler detects errors which must be reported to error handler whose task is to handle the errors so that compilation can proceed [9]. Lexical errors contain spelling errors, exceeding length of identifier or numeric constants, appearance of illegal characters etc. Syntax errors contain errors in structure, missing operators, missing parenthesis etc. Semantic errors contain incompatible types of operands, undeclared variables, not matching of actual arguments with formal arguments etc. There are various strategies to recover the errors which can be implementing by analyzers (Figure 12). Figure 12: Symbol table. XII. LINKERS AND LOADERS Linker combines two or more separate object programs. It combines target program with other library routines. Linker links the library files and prepares single module or file. Linker also solves the external reference. Linker allows us to create single program from several files [8]. Loader is utility program which takes object code as input and prepares it for execution. It also loads the object code into executable code. Loader refers to initializing of execution process. Tasks done by loaders are mentioned below. Allocation Relocation Link editing Loading Relocation of an object means allocation of load time addresses and placement of load time addresses into memory at proper locations. XIII. CONCLUSION To conclude this research, source program has to pass and parse from above mentioned all sections to be converted into predicted target program. After studying this research paper, one can understand the exact procedure behind the compilation task and step by step evaluation of source code which contains pre-processors, translators, compilers, phases of compiler, cousins of compiler, assemblers, interpreters, symbol table, error handling, linkers and loaders.

XIV. ACKNOWLEDGMENT I am using this opportunity to express my gratitude to everyone who supported me in this research. I am thankful for their aspiring guidance, invaluably constructive criticism and friendly advice during the research. I am sincerely grateful to them for sharing their truthful and illuminating views on a number of issues related to the research work. XV. REFERENCES 1. High-level programming language and Compiler 2. https://www.draw.io/s 3. B Vangie, High-level language. Webopedia. 4. A Puntambekar Anuradha, Compiler Design. Technical Publication Second Revised Edition 2016. 5. K Dixita, Compiler Design. Darshan Institute of Engineering and Technology. 6. DA Hardik, System Programing. Darshan Institute of Engineering and Technology 7. Compiler Design - Symbol Table. Tutorials Point. 8. P Matt, W Christopher, Compilers. Department of Computer Science University of Wales Swansea, UK 2007. 9. P Neha, WM Niharika, et al. Introduction to Compilers. International Journal of Science and Research 2015; 4: 2399-2405. 10. A Charu, A Chetna, et al. Research Paper on Phases Of Compiler. International Journal of Innovative Research in Technology 2014; 1: 259-260. 11. VA Alfred, LS Monica, et al. Compilers: Principles, Techniques and Tools. Second Edition, Pearson, 2014.