SOURCE LANGUAGE DESCRIPTION

1. Simple Integer Language (SIL)

The language specification given here is informal and leaves the designer considerable freedom to write the grammatical specification to his or her own taste. The following features are the minimal requirements for the language.

2. General Program Structure

    Global Declarations
    Function Definitions
    Main Function Definition

3. Global Declarations

The global declaration part of a SIL program begins with the keyword decl and ends with the keyword enddecl. Declarations should be made for global variables and functions defined in the SIL program. Global variables may be of type integer, boolean, integer array or boolean array. Variables declared globally must be allocated statically. Boolean variables can hold only the special Boolean constants TRUE and FALSE. Global variables are visible throughout the program unless suppressed by a redeclaration within the scope of some function. Array variables can be declared only globally, and only single-dimensional arrays are allowed. Variables cannot be assigned values in the declaration part.

Every function defined in a SIL program, except the main function, must have a declaration. A function declaration should specify the name of the function, the name and type of each of its arguments, and the return type of the function. A function can have integer or boolean arguments. Parameters may be passed by value or by reference. Arrays cannot be passed as arguments. If a global variable name appears as an argument, then within the scope of the function the new declaration is valid and the global variable declaration is suppressed. Different functions may have arguments of the same name. However, the same name cannot be given to different arguments of one function. The return type of a function must be either integer or boolean.
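The suppression rule above can be illustrated with a small Python sketch. It is not part of the SIL specification: the Scope class, its methods, and the function argument used in the example are hypothetical, and a real compiler would record more than a type name for each symbol.

```python
# Hypothetical sketch of SIL name resolution: one global scope plus one
# scope per function. All class and method names are assumptions.

class SilScopeError(Exception):
    """Raised for a duplicate declaration or an undeclared name."""


class Scope:
    def __init__(self, parent=None):
        self.parent = parent      # enclosing (global) scope, or None
        self.symbols = {}         # name -> type name, e.g. "integer"

    def declare(self, name, type_name):
        # Declaring the same name twice in one scope is a compile error.
        if name in self.symbols:
            raise SilScopeError(f"'{name}' declared twice in the same scope")
        self.symbols[name] = type_name

    def lookup(self, name):
        # The innermost declaration wins, so an argument or local variable
        # suppresses a global variable of the same name.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise SilScopeError(f"'{name}' is not declared")


global_scope = Scope()
global_scope.declare("x", "integer")
global_scope.declare("y", "integer")

# Scope of a hypothetical function whose argument is "boolean x".
f_scope = Scope(parent=global_scope)
f_scope.declare("x", "boolean")

print(f_scope.lookup("x"))       # boolean (the global x is suppressed)
print(f_scope.lookup("y"))       # integer (the global y is still visible)
print(global_scope.lookup("x"))  # integer
```

Keeping one dictionary per scope makes the duplicate-declaration check a single membership test, while the parent link implements static scoping with only two levels, which is all SIL needs.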
The general form of declarations is as follows:

    Type VarName/FunctionName [ArraySize]/(ParameterList); // Third part needed only for arrays/functions

For example:

    decl
        integer x,y,a[10],b[20]; // x,y are integers, a,b are integer arrays
        integer f1(integer a1,a2; boolean b1; integer &c1), f2(); // c1 is passed by reference, rest by value
        boolean t, q[10], f3(integer x); // variable, array and a function declared together
        integer swap(integer &x, &y); // x, y are passed by reference
    enddecl // Please note the use of "," and ";"

Declaring functions at the beginning avoids the "forward reference" problem and facilitates simpler single-pass compilation. Note that the declaration syntax of functions is structurally the same as that of variables. Finally, inside swap, the global variables x and y are no longer visible because of the redeclaration, and the global declaration of x is suppressed in f3. If a variable or function is declared twice at the same point, a compilation error should result.

4. Function Structure and Local Variables

All globally declared variables are visible inside a function unless suppressed by a redeclaration. Variables declared inside a function are invisible outside it. The general form of a function definition is given below:

    <Type> FunctionName(ArgumentList) {
        Local Declarations
        Function Body
    }

The arguments and return type of each function definition should match exactly with the corresponding declaration. Argument names must be type checked for name equivalence against the declaration. Every declared function must have a definition; the compiler should report an error otherwise. The syntax of local declarations and definitions is similar to that of global declarations, except that arrays and functions cannot be declared inside a function. Local variables are visible only within the scope of the function where they are declared. Scope rules for parameters are identical to those for variables.

The main() function, by specification, must be a zero-argument function of type integer. Program execution begins from the body of the main function. The main function need not be declared. The definition of main should be given in the same format as any other function.

The body of a function is a collection of statements embedded within the keywords begin and end. The definition of swap declared above may look like the following:

    integer swap (integer &x, &y) {
    decl
        integer q; // q is redeclared, causing suppression of the global declaration
    enddecl
    begin
        q = x; x = y; y = q; // Note the syntax for using variables passed by reference.
        return 1; // swap must return an integer.
    end // Note that SIL doesn't support void functions.
    }

Local variables and parameters should be allocated space on the run-time stack. The language supports recursion, and static scope rules apply.

5. Main and Function Body

A body is a collection of statements embedded within the keywords begin and end. Each statement should end with a ";", which is called the terminator. There are five types of statements in SIL:

a) Assignment statement
b) Conditional statement
c) Iterative statement
d) Return statement
e) Input/Output

Before taking up statements, we should look at the different kinds of expressions supported by SIL.

6. Expressions

SIL has two kinds of expressions: a) arithmetic and b) logical.

6.1 Arithmetic Expressions

Any constant, integer variable or indexed array variable is a SIL expression, provided the expression is within the scope of the concerned variable declarations. SIL treats a function call as an expression, and its value is the return value of the function. SIL supports recursion. SIL provides five arithmetic operators, +, -, *, / (integer division) and % (modulo), through which arithmetic expressions may be combined. Expression syntax and semantics follow standard practice in programming languages, and the normal rules of precedence, associativity and parenthesization hold. SIL is strongly typed, and any type mismatch must be reported at compile time.

Examples: 5, a[a[5+x]]+x and (f2() + b[x] + 5) are arithmetic expressions.

6.2 Logical Expressions

Logical expressions can take the values TRUE or FALSE. Logical expressions may be formed by combining arithmetic expressions using relational operators. The relational operators supported by SIL are <, >, <=, >=, == and !=. Again, standard syntax and semantics conventions apply. TRUE and FALSE are constant logical expressions. Every boolean variable is a logical expression, and its value is the value stored in its location. Logical expressions themselves may be combined using the logical operators AND, OR and NOT. Logical expressions may be assigned to boolean variables only.

Note that a relational operator can compare only two arithmetic expressions, not two logical expressions. Similarly, a logical operator can connect only two logical expressions (except for NOT, which is a unary logical operator). For example, ((x==y)==a[3]) is not a valid SIL expression, because (x==y) is a logical expression while a[3] is an arithmetic expression, and "==" operates only between two arithmetic expressions.

7. Assignment Statement

The SIL assignment statement assigns the value of an expression to a variable, or an indexed array element, of the same type.
Type errors must be reported at compile time. The general syntax is as follows:

    <Variable> = <Expression>;

For example, q[3]=(x==y); and t=TRUE; are both valid assignments to boolean variables.

8. Conditional Statement

The SIL conditional statement has the following syntax:

    if <Logical Expression> then
        Statements
    else
        Statements
    endif;

The else part is optional. The statements inside an if-block may be conditional, iterative, assignment or input/output statements, but not the return statement.

9. Iterative Statement

The SIL iterative statement has the following syntax:

    while <Logical Expression> do
        Statements
    endwhile;

Standard conventions apply in this case too. The statements inside a while-block may be conditional, iterative, assignment or input/output statements, but not the return statement.

10. Return Statement

The main body, as well as each function body, should have exactly one return statement, and it should be the last statement in the body. The syntax is:

    return <Expression>;

The return value of the function is the value of the expression. The return type should match the type of the expression; otherwise, a compilation error should occur. The return type of main is integer by specification.

11. Input/Output

SIL has two I/O statements, read and write. The syntax is as follows:

    read (<IntegerVariable>);
    write (<Arithmetic Expression>);

The read statement reads an integer value from the standard input device into an integer variable or an indexed array variable. The write statement outputs the value of the arithmetic expression to the standard output. Note that input/output operations are not allowed on the boolean type.

    read (a[x]);
    write (7*(5+a[9]));

12. An Example SIL Program

The following SIL program calculates and prints the factorials of the first n positive integers, where the value of n is read from standard input.

    decl
        integer factorial(integer n);
    enddecl

    integer factorial (integer n) {
    decl
        integer rvalue;
    enddecl
    begin
        if (n==1) then
            rvalue = 1;
        else
            rvalue = n * factorial (n-1);
        endif;
        return rvalue; // Note only one RETURN statement is allowed.
    end
    }

    integer main( ) { // Main definition should always begin like this
    decl
        integer n,i;
    enddecl
    begin
        read (n);
        i = 1;
        while ( i <= n) do
            write ( factorial(i));
            i = i + 1;
        endwhile;
        return 1; // Any integer value may be returned
    end
    }

ESIL: Extended SIL

Providing user-defined types

User-defined types may be supported by allowing the creation of compound data types from simple or already defined data types using the typedef statement. All type definitions should be given before the global declarations. The syntax is as follows:

    typedef name1 {
        integer x;
        boolean y;
    }

The member fields of a newly defined type may be of type integer, boolean or a previously defined type. Arrays are not allowed.

    typedef name2 {
        name1 g;
        boolean t;
    }

Once a type is defined, variables of the type may be defined in the usual manner. The type declaration is visible throughout the program.

    decl
        name2 w[10], u;
        integer temp;
    enddecl

The following statements illustrate access to a variable of a user-defined type.

    temp = w[5].g.x; // The dot operator is used to address the fields.
    w[5] = u; // Name equivalence of types must be checked here.

For each user-defined type, a type expression tree must be created at the time of parsing the type definition. The relative address, or offset, of each field (from the starting storage location of a variable of that type) can be fixed at the time of type definition and may be stored in the expression tree itself. The symbol table entry for a variable shall contain a pointer to the corresponding type expression tree.
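The offset bookkeeping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: integer and boolean are each assumed to occupy one memory word, fields are assumed to be laid out in declaration order, and the names TypeNode and field_offset are hypothetical rather than part of ESIL.

```python
# Hypothetical sketch: field offsets for ESIL user-defined types.
# Assumption: integer and boolean each occupy one memory word.
PRIMITIVE_SIZE = {"integer": 1, "boolean": 1}


class TypeNode:
    """Type expression tree node built when a typedef is parsed."""

    def __init__(self, name, fields, known_types):
        self.name = name
        self.fields = {}          # field name -> (offset, type name)
        self.size = 0             # total size in words
        for field_name, type_name in fields:
            if type_name in PRIMITIVE_SIZE:
                field_size = PRIMITIVE_SIZE[type_name]
            elif type_name in known_types:
                field_size = known_types[type_name].size
            else:
                raise ValueError(f"unknown type '{type_name}'")
            # The offset of a field is the size of everything before it.
            self.fields[field_name] = (self.size, type_name)
            self.size += field_size


def field_offset(type_node, path, known_types):
    """Offset of a dotted field path such as ['g', 'x'] from the variable start."""
    offset = 0
    for field_name in path:
        field_off, type_name = type_node.fields[field_name]
        offset += field_off
        type_node = known_types.get(type_name)   # None for primitive fields
    return offset


types = {}
types["name1"] = TypeNode("name1", [("x", "integer"), ("y", "boolean")], types)
types["name2"] = TypeNode("name2", [("g", "name1"), ("t", "boolean")], types)

# Address of w[5].g.x relative to the start of the array w.
element = types["name2"]
relative = 5 * element.size + field_offset(element, ["g", "x"], types)
print(element.size, relative)   # 3 15
```

Because each node stores its own size and field offsets, the compiler can resolve an access such as w[5].g.x with a few additions at compile time, and name equivalence in w[5] = u reduces to comparing the type names held in the two symbol table entries.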

TARGET MACHINE ARCHITECTURE

Brief description of the machine architecture

SIM, or Simple Integer Machine, is a hypothetical machine with an elementary instruction set that supports integer arithmetic. The machine has eight general purpose registers, R0..R7, each of which can hold an integer. The memory words are numbered 0, 1, 2, ... and each one can hold an integer. The arithmetic operations supported by the ALU are addition, multiplication, subtraction, division and modulo on integers. The logical operations support comparison between values in two registers. The branching instructions allow control transfer based on the result of a comparison. Each instruction in the instruction set of SIM fits into one memory word.

The machine has three special registers: Stack Pointer (SP), Base Pointer (BP) and Instruction Pointer (IP). The stack pointer is generally used to point to the last element of the stack and is normally initialised immediately below the global data of the program. When data is pushed on to the stack (using the PUSH instruction), the stack pointer is automatically incremented. Thus, the stack grows towards higher memory locations. The instruction pointer carries the address of the instruction currently under execution and is automatically incremented to point to the next instruction once the current instruction completes. The base pointer is generally used to store the base address of an activation record for procedure invocations. Although any other register can act as the base pointer, the availability of an explicit base pointer gives better structure and clarity to the run-time environment generation phase of program compilation.

SIM INSTRUCTION SET

SIM has eight instruction classes. All SIM arithmetic and logical instructions act on integer operands only.

1. Data transfer : MOV
2. Arithmetic : ADD, SUB, MUL, DIV, MOD, INR, DCR
3. Logical : LT, GT, EQ, NE, GE, LE
4. Branching : JZ, JNZ, JMP
5. Stack : PUSH, POP
6. Subroutine : CALL, RET
7. Input/Output : IN, OUT
8. Start/Halt : START, HALT

Instruction Syntax and Semantics

Comments

Comments are specified after "//" following an instruction on the same line.

    MOV R0, R1 // This is a comment.

Data Transfer

Immediate Addressing:

    MOV Ri, NUM // The value NUM is transferred to the register Ri.

    MOV R0, -9 // Register R0 now contains -9

Register Addressing:

    MOV Ri, Rj // Copy contents of Rj to Ri

    MOV R0, -9
    MOV R1, 8
    MOV R0, R1 // R0 now contains 8

Register Indirect Addressing:

    MOV Ri, [Rj] // Copy contents of the memory location pointed to by Rj to Ri
    MOV [Ri], Rj // Contents of Rj are copied to the location whose address is in Ri

Let the memory location 1005 have value 1237.

    MOV R0, 1005
    MOV R1, [R0] // Now R1 contains 1237

Direct Addressing:

    MOV [LOC], Rj // Contents of Rj are transferred to the address LOC
    MOV Rj, [LOC] // Contents of the memory location LOC are transferred to Rj

Let the memory location 1005 have value 1237.

    MOV R0, [1005] // Now R0 has value 1237.

Note: Ri and Rj can be SP or BP as well as the other registers. No instruction can take IP as an argument.

Arithmetic

ADD, SUB, MUL, DIV and MOD have the following general format:

    OP Ri, Rj // The result of Ri op Rj is stored in Ri

INR and DCR are used to increment or decrement the value of a register by one.

    INR Rj // Similar syntax for DCR

Here Ri and Rj may be any registers except SP, BP and IP.

    MOV R0, 3
    MOV R1, 5
    MOD R1, R0 // Now R1 stores value 2

Logical

For all logical operators, the operands may be any two registers except SP, BP and IP.

a. LT Ri, Rj // Stores 1 in Ri if the value stored in Ri is less than that in Rj. Ri is set to 0 otherwise.
b. GT Ri, Rj // Stores 1 in Ri if the value stored in Ri is greater than that in Rj. Ri is set to 0 otherwise.
c. EQ Ri, Rj // Stores 1 in Ri if the value stored in Ri is equal to that in Rj. Ri is set to 0 otherwise.
d. NE Ri, Rj // Stores 1 in Ri if the value stored in Ri is not equal to that in Rj. Ri is set to 0 otherwise.
e. GE Ri, Rj // Stores 1 in Ri if the value stored in Ri is greater than or equal to that in Rj. Ri is set to 0 otherwise.
f. LE Ri, Rj // Stores 1 in Ri if the value stored in Ri is less than or equal to that in Rj. Ri is set to 0 otherwise.

Branching

Branching is achieved by changing the value of the IP to the address of a specified LABEL. However, this is an implicit process and is transparent to the programmer.

a) JZ Ri, LABEL // Jumps to LABEL if the contents of Ri are zero.
b) JNZ Ri, LABEL // Jumps to LABEL if the contents of Ri are not zero.
c) JMP LABEL // Unconditional jump to the instruction at LABEL

Here Ri can be any register except SP, BP and IP.

    MOV R0, 4
    MOV R1, 5
    L1: MOV R3, R0 // Compare on a copy, because NE overwrites its first operand
    NE R3, R1 // If R0 and R1 contain different values, set R3 to 1, else set R3 to 0.
    JZ R3, L2 // If R3 is 0, jump to label L2
    MOV R2, 1
    ADD R0, R2 // This increments the value of R0 by 1
    JMP L1 // Unconditional jump to label L1
    L2: OUT R0 // This outputs the value of R0. (See discussion that follows.)

Stack

SP is normally set to the address of the last element of the stack. It is the programmer's responsibility to suitably initialise SP. The stack has to be allocated in the memory of SIM.

a) PUSH Ri // Increment SP by 1 and copy contents of Ri to the location pointed to by SP.
b) POP Ri // Copy contents of the location pointed to by SP into Ri and decrement SP by 1.

For both these instructions Ri may be any register except IP.

    MOV SP, 1000 // Initialise SP to 1000
    MOV R0, 7
    PUSH R0 // Now the memory location 1001 contains value 7. SP takes value 1001
    POP BP // Now BP contains value 7. SP has value 1000.

Subroutine

The CALL instruction copies the address of the next instruction to be fetched (IP + 1) on to the stack and transfers control to the label specified. The RET instruction restores the IP value stored on the stack, and execution continues by fetching the instruction pointed to by IP. The subroutine instructions provide a neat mechanism for procedure invocations.

a. CALL LABEL // Increments SP by 1, transfers IP+1 to the location pointed to by SP and jumps to LABEL
b. RET // Sets IP to the value pointed to by SP and decrements SP.

    MOV SP, 2000 // SP is initialised
    MOV R0, 3
    CALL L1 // SP takes value 2001 and the address of L2 is stored in that location
    L2: HALT // Machine halts (See discussion below).
    L1: MOV R0, 00
    RET // The stack top is transferred to IP, SP is decremented to 2000.

Input/Output

a. IN Ri // Transfers the contents of the standard input to Ri
b. OUT Ri // Transfers the contents of Ri to the standard output

Ri can be any register except IP, BP and SP.

    MOV R0, 6
    OUT R0 // 6 is printed on the standard output

Start/Halt

    START // IP is initialised to this instruction automatically when a program is taken for execution, and execution starts from the next instruction after START
    HALT // This instruction halts the machine.
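The instruction semantics described in this section can be tried out with a small interpreter. The Python sketch below is an illustration, not a reference implementation of SIM. It makes several assumptions: the function names parse and run are hypothetical, instructions are kept in a list separate from data memory, uninitialised memory reads as 0, the register restrictions are not enforced, and DIV and MOD use Python's floor semantics, which only matters for negative operands.

```python
import operator

# Hypothetical SIM interpreter sketch; names and memory model are assumptions.
ARITH = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul,
         "DIV": operator.floordiv, "MOD": operator.mod}
LOGIC = {"LT": operator.lt, "GT": operator.gt, "EQ": operator.eq,
         "NE": operator.ne, "GE": operator.ge, "LE": operator.le}


def parse(source):
    """Return (instructions, labels); each instruction is (opcode, operands)."""
    code, labels = [], {}
    for line in source.splitlines():
        line = line.split("//")[0].strip()       # drop the comment, if any
        if ":" in line:                          # leading LABEL:
            label, line = line.split(":", 1)
            labels[label.strip()] = len(code)
        if line.strip():
            op, _, rest = line.strip().partition(" ")
            code.append((op.upper(), [a.strip() for a in rest.split(",") if a.strip()]))
    return code, labels


def run(source, inputs=()):
    """Execute a SIM program and return the list of values written by OUT."""
    code, labels = parse(source)
    reg = {f"R{i}": 0 for i in range(8)}
    reg.update(SP=0, BP=0)
    mem, out, pending = {}, [], list(inputs)
    ip = 0

    def address(arg):                  # "[R0]" or "[1005]" -> memory address
        inner = arg[1:-1]
        return reg[inner] if inner in reg else int(inner)

    def value(arg):                    # source operand: memory, register or number
        if arg.startswith("["):
            return mem.get(address(arg), 0)
        return reg[arg] if arg in reg else int(arg)

    while ip < len(code):
        op, a = code[ip]
        ip += 1                        # IP now holds the next instruction (IP + 1)
        if op == "HALT":
            break
        elif op == "START":
            pass
        elif op == "MOV":
            if a[0].startswith("["):
                mem[address(a[0])] = value(a[1])
            else:
                reg[a[0]] = value(a[1])
        elif op in ARITH:
            reg[a[0]] = ARITH[op](reg[a[0]], reg[a[1]])
        elif op in LOGIC:
            reg[a[0]] = int(LOGIC[op](reg[a[0]], reg[a[1]]))
        elif op in ("INR", "DCR"):
            reg[a[0]] += 1 if op == "INR" else -1
        elif op == "JMP":
            ip = labels[a[0]]
        elif op in ("JZ", "JNZ"):
            if (reg[a[0]] == 0) == (op == "JZ"):
                ip = labels[a[1]]
        elif op == "PUSH":
            reg["SP"] += 1
            mem[reg["SP"]] = reg[a[0]]
        elif op == "POP":
            reg[a[0]] = mem[reg["SP"]]
            reg["SP"] -= 1
        elif op == "CALL":
            reg["SP"] += 1
            mem[reg["SP"]] = ip        # return address
            ip = labels[a[0]]
        elif op == "RET":
            ip = mem[reg["SP"]]
            reg["SP"] -= 1
        elif op == "IN":
            reg[a[0]] = pending.pop(0)
        elif op == "OUT":
            out.append(reg[a[0]])
        else:
            raise ValueError(f"unknown instruction {op}")
    return out


demo = """
START
MOV SP, 1000      // stack starts at 1000
MOV R0, 7
PUSH R0           // location 1001 now holds 7
POP BP            // BP = 7, SP back to 1000
MOV R1, 5
MOV R2, 3
MOD R1, R2        // R1 = 2
OUT R1
CALL DOUBLE
OUT R1
HALT
DOUBLE: ADD R1, R1
RET
"""
print(run(demo))   # [2, 4]
```

Storing the program as a list indexed by IP reflects the rule that every instruction fits in one memory word, so the return address pushed by CALL is simply the index of the following instruction.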