Chapter 10 Language Translation


A program written in a high-level language must be translated into machine language before it can be executed. This is done by a special piece of software, the compiler. Compared to assemblers, compilers are very difficult to design, and many person-years go into building one. Each assembly language instruction corresponds to exactly one machine language instruction, so an assembler essentially just replaces each symbolic instruction with its machine language equivalent.

On the other hand, one high-level language statement may lead to many machine language instructions. For example, the statement

    A = B + C - D;

corresponds to the following four instructions:

    LOAD      B
    ADD       C
    SUBTRACT  D
    STORE     A

To generate these instructions, a compiler must thoroughly analyze both the structure (syntax) and the meaning (semantics) of the program, which is complicated and difficult.

What should the compiler do?

When performing translation, the foremost goal is correctness. The generated machine language program must do exactly what the original program does, no more, no less. For example, the machine code

    LOAD      B
    ADD       C
    STORE     B
    SUBTRACT  D
    STORE     A

does not correctly translate the statement A = B + C - D; because the first STORE overwrites the original value of B.
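The difference between the two translations can be checked by simulating them. The following is a minimal sketch, assuming a hypothetical one-accumulator machine that understands only the four opcodes used on these slides; the function name `run` and the starting values are made up for illustration.

```python
def run(program, memory):
    """Execute (opcode, address) pairs on a one-register machine; return the final memory."""
    mem = dict(memory)   # work on a copy so the caller's data stays untouched
    acc = 0              # the single accumulator register
    for op, addr in program:
        if op == "LOAD":
            acc = mem[addr]
        elif op == "ADD":
            acc += mem[addr]
        elif op == "SUBTRACT":
            acc -= mem[addr]
        elif op == "STORE":
            mem[addr] = acc
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


# Illustrative starting values (assumed, not from the slides).
start = {"A": 0, "B": 5, "C": 7, "D": 2}

correct = [("LOAD", "B"), ("ADD", "C"), ("SUBTRACT", "D"), ("STORE", "A")]
faulty = [("LOAD", "B"), ("ADD", "C"), ("STORE", "B"),
          ("SUBTRACT", "D"), ("STORE", "A")]

print(run(correct, start))  # B keeps its original value
print(run(faulty, start))   # B has been overwritten with B + C
```

Both programs leave the same value in A, which is why the error is easy to miss: the faulty version is wrong only because of its side effect on B.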

The second goal is that the resulting machine code should be efficient and concise. For example, to compute the sum $2x_1 + 2x_2 + \cdots + 2x_{50000}$, given the following poorly written Pascal program

    sum := 0.0;
    for i := 1 to 50000 do
        sum := sum + (2.0 * x[i]);

the compiler should generate more efficient code, as if the program had been written as follows:

    sum := 0.0;
    for i := 1 to 50000 do
        sum := sum + x[i];
    sum := 2.0 * sum;

Compilation process

There are generally four phases in the translation process.

1. Lexical analysis: The compiler looks at the individual characters in the source program and groups them into syntactic units, called tokens.
2. Parsing: The sequence of tokens is checked to see whether it forms a syntactically correct program according to the rules of the programming language.
3. Semantic analysis and code generation: The compiler analyzes the meaning of the program and generates the proper code.
4. Code optimization: The compiler tries to make the generated code more efficient.

Lexical analysis

In this phase, the lexical analyzer, a part of the compiler, reads in a sequence of characters and groups them into tokens. For example, for the statement

    a = b + 319 - delta;

starting from the individual characters a, =, b, +, 3, 1, 9, -, d, e, l, t, a, ;, the analyzer forms the following 8 tokens: a, =, b, +, 319, -, delta, ;. From now on, the compiler can work at the level of symbols, numbers, and operators.

Token classification

Besides forming tokens, the analyzer also tries to categorize them. For example, all names are assigned category 1, while all numbers, regardless of their values, are assigned category 2, and so on. We can have the following table.

    Token type    Classification
    symbol        1
    number        2
    =             3
    +             4
    -             5
    ;             6
    ==            7
    if            8
    then          9
    else          10

Tokens can be categorized because, when checking structure, we care only about which kind of token occurs where. For example, the following is a legal assignment no matter which symbols are used and what value the number has.

    symbol = symbol + number ;

To summarize, the input to a lexical analyzer is a high-level language statement from the source program. Its output is a list of all the tokens contained in that statement, together with their classification numbers.

Homework: Exercises 1, 2 and 3.
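As a rough illustration of this phase, here is a minimal Python sketch of a lexical analyzer that forms tokens and attaches the classification numbers from the table. The function name `tokenize`, the regular expression, and the rule for what counts as a symbol (a letter or underscore followed by letters, digits, or underscores) are assumptions made for this sketch.

```python
import re

# Classification numbers taken from the token table; 1 and 2 are the
# catch-all categories for names and numbers.
SYMBOL, NUMBER = 1, 2
CLASSIFICATION = {"=": 3, "+": 4, "-": 5, ";": 6, "==": 7,
                  "if": 8, "then": 9, "else": 10}

# One token per match: a number, a name, or an operator ("==" is tried before "=").
TOKEN_PATTERN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(==|[=+\-;]))")


def tokenize(source):
    """Return a list of (token text, classification number) pairs."""
    source = source.rstrip()
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unrecognized input near position {pos}")
        number, name, operator = match.groups()
        if number is not None:
            tokens.append((number, NUMBER))
        elif name is not None:
            # Reserved words have their own category; other names are symbols.
            tokens.append((name, CLASSIFICATION.get(name, SYMBOL)))
        else:
            tokens.append((operator, CLASSIFICATION[operator]))
        pos = match.end()
    return tokens


for text, category in tokenize("a = b + 319 - delta;"):
    print(text, category)
```

For the sample statement this yields the same 8 tokens as listed earlier, with `a`, `b`, and `delta` all falling into category 1.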

Parsing

During the parsing phase, the compiler determines whether the recognized tokens fit together in a grammatically correct way, i.e., whether they form a syntactically legal statement of the programming language. For example, the assignment statement

    a = b + c;

is a legal statement because we can construct the following parse tree, which shows how the tokens are grouped together.

[Figure: parse tree for a = b + c;]
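One way to picture this check is a small recursive-descent parser. The sketch below assumes a hypothetical grammar in which an assignment is a symbol, an `=`, one or more operands joined by `+` or `-`, and a closing `;`. The nested-tuple tree format and the name `parse_assignment` are choices made for illustration, and reserved words are not treated specially.

```python
def parse_assignment(tokens):
    """Parse  symbol = operand {(+|-) operand} ;  and return a nested-tuple parse tree.

    Raises SyntaxError if the token list is not a legal assignment.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(text):
        nonlocal pos
        if peek() != text:
            raise SyntaxError(f"expected {text!r} but found {peek()!r}")
        pos += 1

    def operand():
        nonlocal pos
        tok = peek()
        if tok is None or not (tok.isidentifier() or tok.isdigit()):
            raise SyntaxError(f"expected a symbol or number but found {tok!r}")
        pos += 1
        return tok

    def expression():
        nonlocal pos
        tree = operand()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            tree = (op, tree, operand())   # group left to right
        return tree

    target = operand()
    if not target.isidentifier():
        raise SyntaxError(f"cannot assign to {target!r}")
    expect("=")
    value = expression()
    expect(";")
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} after ';'")
    return ("=", target, value)


print(parse_assignment(["a", "=", "b", "+", "c", ";"]))
```

A token sequence such as `a = + ;` makes the parser raise `SyntaxError`, which corresponds to the case where no parse tree can be constructed.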

Semantics and code generation

During parsing, the compiler deals with the syntax of a statement, i.e., its grammatical structure. However, not every syntactically correct sentence makes sense, e.g., "The man bit the dog." This problem is dealt with by checking the semantics of the statement. The compiler analyzes the meaning of the tokens to understand the actions the statement tries to perform. If the statement does not make sense, it is rejected; otherwise, it is translated into machine language.

Consider the following statement:

    sum = a + b;

Although it is syntactically correct, it still might not make sense if we know that the types of a and b are char and integer, respectively.

Semantic record

The previous example tells us that we have to add some additional information, such as the type of a data item, to the parse tree. In general, we attach a semantic record to every node in the parse tree. For example, the following is a more general parse tree for a + b:

[Figure: parse tree for a + b with a semantic record attached to each node]

Based on this information, the compiler can easily reject an expression that produces the following tree.

[Figure: parse tree for a + b in which the semantic records of the operands have different types]
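The idea of semantic records can be sketched in a few lines of Python. This is only an illustration under stated assumptions: parse trees are nested tuples such as `("+", "a", "b")`, a record is a dictionary with a name and a type, and the type rule (both operands must have exactly the same type) is deliberately stricter than what many real languages enforce. The names `semantic_record` and `declared_types` are hypothetical.

```python
def semantic_record(node, declared_types):
    """Return the semantic record for a parse-tree node, or raise TypeError.

    Leaves are variable names; inner nodes are (operator, left, right) tuples.
    """
    if isinstance(node, str):
        # Leaf: look the variable up among the declared types.
        return {"name": node, "type": declared_types[node]}

    operator, left, right = node
    left_rec = semantic_record(left, declared_types)
    right_rec = semantic_record(right, declared_types)
    if left_rec["type"] != right_rec["type"]:
        raise TypeError(
            f"{left_rec['name']} ({left_rec['type']}) {operator} "
            f"{right_rec['name']} ({right_rec['type']}) does not make sense"
        )
    # The result of the operation is an unnamed temporary of the same type.
    return {"name": "temp", "type": left_rec["type"]}


print(semantic_record(("+", "a", "b"), {"a": "integer", "b": "integer"}))

try:
    semantic_record(("+", "a", "b"), {"a": "char", "b": "integer"})
except TypeError as error:
    print("rejected:", error)
```

Because each node's record is built from the records of its children, the check visits every branch of the tree before any code is produced.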

Thus, the first step of code generation has to be semantic analysis, which checks every branch of the parse tree to make sure that it is semantically meaningful. If it is not, the compiler reports the errors. Otherwise, it moves on to the next phase and produces the code. For example, the following is a parse tree for x = y + z;

[Figure: parse tree for x = y + z;]

The completed code

If we follow the appropriate procedure, we translate the above parse tree into the following machine instructions:

            LOAD   Y
            ADD    Z
            STORE  TEMP
            LOAD   TEMP
            STORE  X
            ...
    X:      .DATA  0
    Y:      .DATA  0
    Z:      .DATA  0
    TEMP:   .DATA  0

Homework: Exercise 18.
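A naive code generator that produces this kind of instruction sequence can be sketched as follows. The sketch assumes parse trees given as nested tuples such as `("=", "x", ("+", "y", "z"))`, supports only `+` and `-`, and omits the `.DATA` declarations; the function name `generate` and the naming scheme for extra temporaries are hypothetical.

```python
OPCODES = {"+": "ADD", "-": "SUBTRACT"}


def generate(tree):
    """Translate ('=', target, expression) into accumulator-machine instructions."""
    code = []
    temp_count = 0

    def emit_expression(node):
        """Emit code for node and return the location that holds its value."""
        nonlocal temp_count
        if isinstance(node, str):
            return node.upper()            # a variable already lives in memory
        operator, left, right = node
        left_loc = emit_expression(left)
        right_loc = emit_expression(right)
        temp_count += 1
        temp = "TEMP" if temp_count == 1 else f"TEMP{temp_count}"
        code.append(f"LOAD {left_loc}")
        code.append(f"{OPCODES[operator]} {right_loc}")
        code.append(f"STORE {temp}")       # every intermediate result is saved
        return temp

    _, target, expression = tree
    result_loc = emit_expression(expression)
    code.append(f"LOAD {result_loc}")
    code.append(f"STORE {target.upper()}")
    return code


for instruction in generate(("=", "x", ("+", "y", "z"))):
    print(instruction)
```

Handling each tree node in isolation keeps the generator simple, but it is also why the output stores a value to TEMP and immediately loads it back.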

Code optimization

When compilers first appeared in the 1950s, they were not well accepted. The major reason was that the code they generated was not very efficient, hence the need for code optimization.

There are two types of optimization: local and global. In local optimization, the compiler looks at a very small block of instructions to see whether any improvement can be made. For example, if an expression can be fully evaluated at compile time, it should be. Hence the following constant evaluation:

    LOAD   ONE                 LOAD   TWO
    ADD    ONE      =====>     STORE  X
    STORE  X

Other techniques

We also want to use simpler and less time-consuming operations, hence the following strength reduction:

    LOAD      X                LOAD   X
    MULTIPLY  TWO   =====>     ADD    X
    STORE     X                STORE  X

We also want to eliminate unnecessary work:

    LOAD   Y                   LOAD   Y
    STORE  X        =====>     STORE  X
    LOAD   X                   STORE  Z
    STORE  Z
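These local rewrites are often implemented as a "peephole" pass that slides over the instruction list and looks only at neighboring instructions. The Python sketch below covers two of the rules shown here, under the assumptions that the code is a straight-line sequence with no jumps into it and that TWO names a memory cell holding the constant 2. The function name `peephole` is hypothetical.

```python
def peephole(program):
    """Apply two local optimizations to a list of (opcode, operand) pairs."""
    optimized = []
    for op, operand in program:
        previous = optimized[-1] if optimized else None

        # Unnecessary work: after STORE X the accumulator still holds X,
        # so an immediately following LOAD X can be dropped.
        if op == "LOAD" and previous == ("STORE", operand):
            continue

        # Strength reduction: LOAD X / MULTIPLY TWO becomes LOAD X / ADD X.
        if (op, operand) == ("MULTIPLY", "TWO") and previous is not None \
                and previous[0] == "LOAD":
            optimized.append(("ADD", previous[1]))
            continue

        optimized.append((op, operand))
    return optimized


print(peephole([("LOAD", "Y"), ("STORE", "X"), ("LOAD", "X"), ("STORE", "Z")]))
print(peephole([("LOAD", "X"), ("MULTIPLY", "TWO"), ("STORE", "X")]))
```

Applied to the five instructions generated for x = y + z, the first rule removes the redundant LOAD TEMP. Removing the STORE TEMP as well would require knowing that TEMP is never read again, which this sketch does not attempt.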

Global optimization

We have seen a few local optimizations. Global optimization requires the ability to see the big picture, which is more difficult and not always done. As an example, consider the following code:

    sum = 0.0;
    i = 0;
    while (i <= 50000) {
        sum = sum + (2.0 * x[i]);
        i = i + 1;
    }

Global reduction

We can eliminate almost all of the multiplications by moving this operation out of the loop:

    sum = 0.0;
    i = 0;
    while (i <= 50000) {
        sum = sum + x[i];
        i = i + 1;
    }
    sum = 2.0 * sum;

Homework: Exercises 22 and 23.
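The effect of this transformation can be illustrated with a short Python sketch that runs both versions of the loop and counts the multiplications. The function names and the sample data are made up for illustration. With a factor of 2.0 the two results agree exactly in binary floating point, but for other factors the rounding can differ slightly, which is one reason a real compiler may be cautious about this rewrite.

```python
def sum_with_multiply_inside(x):
    """Original loop: one multiplication per element."""
    total = 0.0
    multiplications = 0
    for value in x:
        total = total + (2.0 * value)
        multiplications += 1
    return total, multiplications


def sum_with_multiply_moved_out(x):
    """Transformed loop: the multiplication is done once, after the loop."""
    total = 0.0
    for value in x:
        total = total + value
    total = 2.0 * total
    return total, 1


# Illustrative data (assumed): the values 1.0, 2.0, ..., 100.0.
data = [float(i) for i in range(1, 101)]
print(sum_with_multiply_inside(data))
print(sum_with_multiply_moved_out(data))
```

Both calls return the same sum, while the multiplication count drops from one per element to a single multiplication.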