The ACK Modula-2 Compiler

Similar documents
Amsterdam Compiler Kit-Pascal reference manual. Johan W. Stevenson. (January 4, 1983 ) (revised) Hans van Eck. (May 1, 1989 )

Pascal Validation Suite Report

1 Lexical Considerations

Lexical Considerations

The PCAT Programming Language Reference Manual

Lexical Considerations

A Fast Review of C Essentials Part I

CSc 10200! Introduction to Computing. Lecture 2-3 Edgardo Molina Fall 2013 City College of New York

Programming Languages Third Edition. Chapter 9 Control I Expressions and Statements

Language Reference Manual simplicity

XQ: An XML Query Language Language Reference Manual

The SPL Programming Language Reference Manual

G Programming Languages - Fall 2012

Programming Languages Third Edition. Chapter 7 Basic Semantics

A Short Summary of Javali

Variables, Constants, and Data Types

Preface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS

Introduction to Visual Basic and Visual C++ Arithmetic Expression. Arithmetic Expression. Using Arithmetic Expression. Lesson 4.

BLM2031 Structured Programming. Zeyneb KURT

UNIT- 3 Introduction to C++

CS112 Lecture: Primitive Types, Operators, Strings

Reference Grammar Meta-notation: hfooi means foo is a nonterminal. foo (in bold font) means that foo is a terminal i.e., a token or a part of a token.

Programming in C++ 5. Integral data types

MATVEC: MATRIX-VECTOR COMPUTATION LANGUAGE REFERENCE MANUAL. John C. Murphy jcm2105 Programming Languages and Translators Professor Stephen Edwards

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual

Draw a diagram of an empty circular queue and describe it to the reader.

Lecture Overview. [Scott, chapter 7] [Sebesta, chapter 6]

Language Basics. /* The NUMBER GAME - User tries to guess a number between 1 and 10 */ /* Generate a random number between 1 and 10 */

More Programming Constructs -- Introduction

Examining the Code. [Reading assignment: Chapter 6, pp ]

CSE 1001 Fundamentals of Software Development 1. Identifiers, Variables, and Data Types Dr. H. Crawford Fall 2018

G Programming Languages - Fall 2012

Bits, Words, and Integers

Homework #3 CS2255 Fall 2012

4. Inputting data or messages to a function is called passing data to the function.

Topic IV. Block-structured procedural languages Algol and Pascal. References:

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

This book is licensed under a Creative Commons Attribution 3.0 License

Course Text. Course Description. Course Objectives. StraighterLine Introduction to Programming in C++

Type Checking and Type Equality

CS /534 Compiler Construction University of Massachusetts Lowell. NOTHING: A Language for Practice Implementation

Amsterdam Compiler Kit-ANSI C compiler compliance statements

LECTURE 17. Expressions and Assignment

Operators and Expressions

Exercise: Using Numbers

22c:111 Programming Language Concepts. Fall Types I

Topic IV. Parameters. Chapter 5 of Programming languages: Concepts & constructs by R. Sethi (2ND EDITION). Addison-Wesley, 1996.

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

Features of C. Portable Procedural / Modular Structured Language Statically typed Middle level language

Spoke. Language Reference Manual* CS4118 PROGRAMMING LANGUAGES AND TRANSLATORS. William Yang Wang, Chia-che Tsai, Zhou Yu, Xin Chen 2010/11/03

Objectives. Chapter 2: Basic Elements of C++ Introduction. Objectives (cont d.) A C++ Program (cont d.) A C++ Program

>B<82. 2Soft ware. C Language manual. Copyright COSMIC Software 1999, 2001 All rights reserved.

Chapter 2: Basic Elements of C++

Chapter 2: Basic Elements of C++ Objectives. Objectives (cont d.) A C++ Program. Introduction

Introduction. Problem Solving on Computer. Data Structures (collection of data and relationships) Algorithms

System Software Assignment 1 Runtime Support for Procedures

Introduction to Programming Using Java (98-388)

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

CS356: Discussion #6 Assembly Procedures and Arrays. Marco Paolieri

Review of the C Programming Language for Principles of Operating Systems

There are four numeric types: 1. Integers, represented as a 32 bit (or longer) quantity. Digits sequences (possibly) signed are integer literals:

Intermediate Code Generation

Mirage. Language Reference Manual. Image drawn using Mirage 1.1. Columbia University COMS W4115 Programming Languages and Translators Fall 2006

Types. What is a type?

C Language Part 1 Digital Computer Concept and Practice Copyright 2012 by Jaejin Lee

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:

Quick Reference Guide

CODE TIME TECHNOLOGIES. Abassi RTOS MISRA-C:2004. Compliance Report

Index. object lifetimes, and ownership, use after change by an alias errors, use after drop errors, BTreeMap, 309

Run time environment of a MIPS program

IEEE LANGUAGE REFERENCE MANUAL Std P1076a /D3

Reference Grammar Meta-notation: hfooi means foo is a nonterminal. foo (in bold font) means that foo is a terminal i.e., a token or a part of a token.

A Unix process s address space appears to be three regions of memory: a read-only text region (containing executable code); a read-write region

M/s. Managing distributed workloads. Language Reference Manual. Miranda Li (mjl2206) Benjamin Hanser (bwh2124) Mengdi Lin (ml3567)

Chapter 2 Basic Elements of C++

B.V. Patel Institute of Business Management, Computer & Information Technology, Uka Tarsadia University

C/C++ Programming for Engineers: Working with Integer Variables

G Programming Languages - Fall 2012

CHAPTER 1 Introduction to Computers and Programming CHAPTER 2 Introduction to C++ ( Hexadecimal 0xF4 and Octal literals 031) cout Object

BITG 1233: Introduction to C++

Chapter 2. C++ Syntax and Semantics, and the Program Development Process. Dale/Weems 1

Review of the C Programming Language

Chapter 12 Variables and Operators

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

C++ Programming: From Problem Analysis to Program Design, Third Edition

We do not teach programming

Lecture 12: Data Types (and Some Leftover ML)

By the end of this section you should: Understand what the variables are and why they are used. Use C++ built in data types to create program

University of Arizona, Department of Computer Science. CSc 520 Assignment 4 Due noon, Wed, Apr 6 15% Christian Collberg March 23, 2005

EDIABAS BEST/2 LANGUAGE DESCRIPTION. VERSION 6b. Electronic Diagnostic Basic System EDIABAS - BEST/2 LANGUAGE DESCRIPTION

Control Structures. Lecture 4 COP 3014 Fall September 18, 2017

CS 230 Programming Languages

Introduction Primitive Data Types Character String Types User-Defined Ordinal Types Array Types. Record Types. Pointer and Reference Types

Overview: Programming Concepts. Programming Concepts. Names, Values, And Variables

Overview: Programming Concepts. Programming Concepts. Chapter 18: Get With the Program: Fundamental Concepts Expressed in JavaScript

Programming, numerics and optimization

Lecture 9: Control Flow

A brief introduction to C programming for Java programmers

Transcription:

The ACK Modula-2 Compiler Ceriel J.H. Jacobs Department of Mathematics and Computer Science Vrije Universiteit Amsterdam The Netherlands 1. Introduction This document describes the implementation-specific features of the ACK Modula-2 compiler. It is not intended to teach Modula-2 programming. For a description of the Modula-2 language, the reader is referred to [1]. The ACK Modula-2 compiler is currently available for use with the VAX, Motorola MC68020, Motorola MC68000, PDP-11, and Intel 8086 code-generators. For the 8086, MC68000, and MC68020, floating point emulation is used. This is made available with the -fp option, which must be passed to ack[4,5]. 2. The language implemented This section discusses the deviations from the Modula-2 language as described in the "Report on The Programming Language Modula-2", as it appeared in [1], from now on referred to as "the Report". Also, the Report sometimes leaves room for interpretation. The section numbers mentioned are the section numbers of the Report. 2.1. Syntax (section 2) The syntax recognized is that of the Report, with some extensions to also recognize the syntax of an earlier definition, given in[2]. Only one compilation unit per file is accepted. 2.2. Vocabulary and Representation (section 3) The input "10.." isparsed as two tokens: "10" and "..". The empty string"" has type ARRAY [0.. 0] OF CHAR and contains one character:0c. When the text of a comment starts with a $, it may be a pragma. Currently, the following pragmas exist: (*$F (F stands for Foreign) *) (*$R[+ -] (Runtime checks, on or off, default on) *) (*$A[+ -] (Array bound checks, on or off, default off) *) (*$U (Allow for underscores within identifiers) *) The Foreign pragma is only meaningful in a DEFINITION MODULE, and indicates that this DEFINI- TION MODULE describes an interface to a module written in another language (for instance C, Pascal, or EM). Runtime checks that can be disabled are: range checks, CARDINAL overflow checks, checks when assigning a CARDINAL to an INTEGER and vice versa, and checks that FOR-loop control-variables are not changed in the body of the loop. Array bound checks can be enabled, because many EM implementations do not implement the array bound checking of the EM array instructions. When enabled, the compiler

-2- generates a check before generating an EM array instruction. Even when underscores are enabled, they still may not start an identifier. Constants of type LONGINT are integers with a suffix letter D (for instance 1987D). Constants of type LONGREAL have suffix D if a scale factor is missing, or have D in place of E in the scale factor (f.i. 1.0D, 0.314D1). This addition was made, because there was no way to indicate long constants, and also because the addition was made in Wirth s newest Modula-2 compiler. 2.3. Declarations and scope rules (section 4) Standard identifiers are considered to be predeclared, and valid in all parts of a program. They are called pervasive. Unfortunately, the Report does not state how this pervasiveness is accomplished. However, page 87 of [1] states: "Standard identifiers are automatically imported into all modules". Our implementation therefore allows redeclarations of standard identifiers within procedures, but not within modules. 2.4. Constant expressions (section 5) Each operand of a constant expression must be a constant: a string, a number, a set, an enumeration literal, a qualifier denoting a constant expression, a type transfer with a constant argument, or one of the standard procedures ABS, CAP, CHR, LONG, MAX, MIN, ODD, ORD, SIZE, SHORT, TSIZE, or VAL, with constant argument(s);tsize andsize may also have a variable as argument. Floating point expressions are never evaluated compile time, because the compiler basically functions as a cross-compiler, and thus cannot use the floating point instructions of the machine on which it runs. Also,MAX(REAL) andmin(real) are not allowed. 2.5. Type declarations (section 6) 2.5.1. Basic types (section 6.1) The type CHAR includes the ASCII character set as a subset. Values range from 0C to 377C, not from0c to177c. 2.5.2. Enumerations (section 6.2) The maximum number of enumeration literals in any one enumeration type ismax(integer). 2.5.3. Record types (section 6.5) The syntax of variant sections in [1] is different from the one in [2]. Our implementation recognizes both, giving a warning for the older one. However, see section 3. 2.5.4. Set types (section 6.6) The only limitation imposed by the compiler is that the base type of the set must be a subrange type, an enumeration type, CHAR, or BOOLEAN. So, the lower bound may be negative. However, if a neg ative lower bound is used, the compiler gives awarning of the restricted class (see the manual page of the compiler). The standard typebitset is defined as TYPE BITSET = SET OF [0.. 8*SIZE(INTEGER)-1]; 2.6. Expressions (section 8) 2.6.1. Operators (section 8.2) 2.6.1.1. Arithmetic operators (section 8.2.1) The Report does not specify the priority of the unary operators+or-: Itdoes not specify whether

-3- means or -1+1 -(1+1) (-1) + 1 I have seen some compilers that implement the first alternative, and others that implement the second. Our compiler implements the second, which is suggested by the fact that their priority is not specified, which might indicate that it is the same as that of their binary counterparts. And then the rule about left to right decides for the second. On the other hand one might argue that, since the grammar only allows for one unary operator in a simple expression, it must apply to the whole simple expression, not just the first term. 2.7. Statements (section 9) 2.7.1. Assignments (section 9.1) The Report does not define the evaluation order in an assignment. Our compiler certainly chooses an evaluation order, but it is explicitly left undefined. Therefore, programs that depend on it may cease to work later. The types INTEGER and CARDINAL are assignment-compatible with LONGINT, and REAL is assignment-compatible withlongreal. 2.7.2. Case statements (section 9.5) The size of the type of the case-expression must be less than or equal to the word-size. The Report does not specify what happens if the value of the case-expression does not occur as a label of any case, and there is noelse-part. In our implementation, this results in a runtime error. 2.7.3. For statements (section 9.8) The Report does not specify the legal types for a control variable. Our implementation allows the basic types (except REAL), enumeration types, and subranges. A runtime warning is generated when the value of the control variable is changed by the statement sequence that forms the body of the loop, unless runtime checking is disabled. 2.7.4. Return and exit statements (section 9.11) The Report does not specify which result-types are legal. Our implementation allows any result type. 2.8. Procedure declarations (section 10) Function procedures must exit through a RETURN statement, or a runtime error occurs. 2.8.1. Standard procedures (section 10.2) Our implementation supports NEW and DISPOSE for backwards compatibility, but issues warnings for their use. However, see section 3. Also, some new standard procedures were added, similar to the new standard procedures in Wirth s newest compiler: LONG converts an argument of typeinteger orreal to the typeslongint orlongreal. SHORT performs the inverse transformation, without range checks. FLOATD is analogous tofloat, but yields a result of typelongreal. TRUNCD is analogous totrunc, but yields a result of typelongint.

-4-2.9. System-dependent facilities (section 12) The type BYTE is added to the SYSTEM module. It occupies a storage unit of 8 bits. ARRAY OF BYTE has a similar effect to ARRAY OF WORD, but is safer. Insome obscure cases the ARRAY OF WORD mechanism does not quite work properly. The procedureiotransfer is not implemented. 3. Backwards compatibility Besides recognizing the language as described in [1], the compiler recognizes most of the language described in [2], for backwards compatibility. It warns the user for old-fashioned constructions (constructions that [1] does not allow). If the -Rm2-3 option (see [6]) is passed to ack, this backwards compatibility feature is disabled. Also, it may not be present on some smaller machines, like the PDP-11. 4. Compile time errors The compile time error messages are intended to be self-explanatory, and not listed here. The compiler also sometimes issues warnings, recognizable by a warning-classification between parentheses. Currently, there are 3 classifications: (old-fashioned use) These warnings are given onconstructions that are not allowed by [1], but are allowed by [2]. (strict) These warnings are given on constructions that are supported by the ACK Modula-2 compiler, but might not be supported by others. Examples: functions returning structured types, SET types of subranges with negative lower bound. (warning) The other warnings, such as warnings about variables that are never assigned, never used, etc. 5. Runtime errors The ACK Modula-2 compiler produces code for an EM machine as defined in [3]. Therefore, it depends on the implementation of the EM machine for detection some of the runtime errors that could occur. The Tr aps module enables the user to install his own runtime error handler. The default one just displays what happened and exits. Basically, a trap handler is just a procedure that takes an INTEGER as parameter. The INTEGER is the trap number. This INTEGER can be one of the EM trap numbers, listed in [3], or one of the numbers listed in the Tr aps definition module. The following runtime errors may occur: array bound error range bound error Range bound errors are always detected, unless runtime checks are disabled. set bound error The current implementations detect this error. integer overflow cardinal overflow This error is detected, unless runtime checks are disabled. cardinal underflow This error is detected, unless runtime checks are disabled. real overflow

-5- real underflow divide by 0 divide by 0.0 undefined integer undefined real conversion error This error occurs when assigning a negative value of type INTEGER to a variable of type CARDI- NAL, or when assigning a value of CARDINAL that is > MAX(INTEGER), to a variable of type INTEGER. It is detected, unless runtime checking is disabled. stack overflow heap overflow Might happen when ALLOCATE fails. case error This error occurs when non of the cases in a CASE statement are selected, and the CASE statement has no ELSE part. All current EM implementations detect this error. stack size of process too large This is most likely to happen if the reserved space for a coroutine stack is too small. In this case, increase the size of the area given to NEWPROCESS. Itcan also happen if the stack needed for the main process is too large and there are coroutines. In this case, the only fix is to reduce the stack size needed by the main process, f.i. by avoiding local arrays. too many nested traps + handlers This error can only occur when the user has installed his own trap handler. Itmeans that during execution of the trap handler another trap has occurred, and that several times. In some cases, this is an error because of overflow ofsome internal tables. no RETURN from function procedure This error occurs when a function procedure does not return properly ("falls" through). illegal instruction This error might occur when floating point operations are used on an implementation that does not have floating point. In addition, some of the library modules may give error messages. The Traps-module has a suitable mechanism for this. 6. Calling the compiler See [4,5,6] for a detailed explanation. The compiler itself has no version checking mechanism. Aspecial linker would be needed to do that. Therefore, a makefile generator is included [7]. 7. The procedure call interface Parameters are pushed on the stack in reversed order, so that the EM AB (argument base) register indicates the first parameter. For VAR parameters, its address is passed, for value parameters its value. The only exception to this rule is with conformant arrays. For conformant arrays, the address is passed, and an array descriptor is passed. The descriptor is an EM array descriptor. It consists of three fields: the lower

-6- bound (always 0), upper bound - lower bound, and the size of the elements. The descriptor is pushed first. If the parameter is a value parameter, the called routine must make sure that its value is never changed, for instance by making its own copy ofthe array. The Modula-2 compiler does exactly this. When the size of the return value of a function procedure is larger than the maximum of SIZE(LONGREAL) and twice the pointer-size, the caller reserves this space on the stack, above the parameters. Callee then stores its result there, and returns no other value. 8. References [1] Niklaus Wirth, Programming in Modula-2, third, corrected edition, Springer-Verlag, Berlin (1985) [2] Niklaus Wirth, Programming in Modula-2, Stringer-Verlag, Berlin (1983) [3] A.S.Tanenbaum, J.W.Stevenson, Hans van Staveren, E.G.Keizer, Description of a machine architecture for use with block structured languages, Informatica rapport IR-81, Vrije Universiteit, Amsterdam [4] UNIX manual ack(1) [5] UNIX manual modula-2(1) [6] UNIX manual em_m2(6) [7] UNIX manual m2mm(1)