Appendix: Generic PbO programming language extension

Similar documents
IPCoreL. Phillip Duane Douglas, Jr. 11/3/2010

CS201 - Introduction to Programming Glossary By

Short Notes of CS201

RSL Reference Manual

The SPL Programming Language Reference Manual

The PCAT Programming Language Reference Manual

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

1 Lexical Considerations

Intermediate Code Generation

Programming Languages Third Edition. Chapter 9 Control I Expressions and Statements

Full file at

6.096 Introduction to C++

CE221 Programming in C++ Part 1 Introduction

GraphQuil Language Reference Manual COMS W4115

Lecture 2: Big-Step Semantics

Decaf Language Reference Manual

Objectives. In this chapter, you will:

Lecture 2 Tao Wang 1

CSI33 Data Structures

Objectives. Chapter 2: Basic Elements of C++ Introduction. Objectives (cont d.) A C++ Program (cont d.) A C++ Program

Chapter 2: Basic Elements of C++

Chapter 2: Basic Elements of C++ Objectives. Objectives (cont d.) A C++ Program. Introduction

CPS122 Lecture: From Python to Java last revised January 4, Objectives:

Learning Language. Reference Manual. George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104)

Lexical Considerations

Arrays. Defining arrays, declaration and initialization of arrays. Designed by Parul Khurana, LIECA.

CSc 10200! Introduction to Computing. Lecture 2-3 Edgardo Molina Fall 2013 City College of New York

2.2 Syntax Definition

Variables and Constants

Lecture 05 I/O statements Printf, Scanf Simple statements, Compound statements

Features of C. Portable Procedural / Modular Structured Language Statically typed Middle level language

3. Except for strings, double quotes, identifiers, and keywords, C++ ignores all white space.

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

IEEE LANGUAGE REFERENCE MANUAL Std P1076a /D3

Lexical Considerations

Chapter 1 INTRODUCTION SYS-ED/ COMPUTER EDUCATION TECHNIQUES, INC.

SMURF Language Reference Manual Serial MUsic Represented as Functions

Cpt S 122 Data Structures. Introduction to C++ Part II

Type Checking and Type Equality

Programming Language 2 (PL2)

The role of semantic analysis in a compiler

CPS122 Lecture: From Python to Java

EDIABAS BEST/2 LANGUAGE DESCRIPTION. VERSION 6b. Electronic Diagnostic Basic System EDIABAS - BEST/2 LANGUAGE DESCRIPTION

1.1 Introduction to C Language. Department of CSE

Language Basics. /* The NUMBER GAME - User tries to guess a number between 1 and 10 */ /* Generate a random number between 1 and 10 */

6.001 Notes: Section 8.1

Declaration and Memory

2 nd Week Lecture Notes

SPARK-PL: Introduction

CS112 Lecture: Primitive Types, Operators, Strings

Language Reference Manual simplicity

Array. Prepared By - Rifat Shahriyar

QUIZ. What is wrong with this code that uses default arguments?

Appendix A. The Preprocessor

Haskell Introduction Lists Other Structures Data Structures. Haskell Introduction. Mark Snyder

CSc Introduction to Computing

Announcements. Working on requirements this week Work on design, implementation. Types. Lecture 17 CS 169. Outline. Java Types

Brief Summary of Java

Introduction to Programming Using Java (98-388)

Advanced Algorithms and Computational Models (module A)

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Topic 2: Introduction to Programming

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

A Short Summary of Javali

ucalc Patterns Examine the following example which can be pasted into the interpreter (paste the entire block at the same time):

DATA STRUCTURES CHAPTER 1

Arrays. Comp Sci 1570 Introduction to C++ Array basics. arrays. Arrays as parameters to functions. Sorting arrays. Random stuff

CS313D: ADVANCED PROGRAMMING LANGUAGE

Programming Language Concepts, cs2104 Lecture 04 ( )

Honu. Version November 6, 2010

GBIL: Generic Binary Instrumentation Language. Language Reference Manual. By: Andrew Calvano. COMS W4115 Fall 2015 CVN

Wellesley College CS251 Programming Languages Spring, 2000 FINAL EXAM REVIEW PROBLEM SOLUTIONS

Computer Programming

.. Cal Poly CPE 101: Fundamentals of Computer Science I Alexander Dekhtyar..

Logical Methods in... using Haskell Getting Started

ECE 122 Engineering Problem Solving with Java

6.001 Notes: Section 1.1

Agenda & Reading. VB.NET Programming. Data Types. COMPSCI 280 S1 Applications Programming. Programming Fundamentals

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University

The Dynamic Typing Interlude

C How to Program, 6/e by Pearson Education, Inc. All Rights Reserved.

Topic 1: Introduction

C++ Programming: From Problem Analysis to Program Design, Third Edition

1 Introduction Java, the beginning Java Virtual Machine A First Program BlueJ Raspberry Pi...

IT 374 C# and Applications/ IT695 C# Data Structures

Chapter 2 Basic Elements of C++

What are the characteristics of Object Oriented programming language?

Chapter-8 DATA TYPES. Introduction. Variable:

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

9. Elementary Algebraic and Transcendental Scalar Functions

Introduction to C Final Review Chapters 1-6 & 13

Tree Parsing. $Revision: 1.4 $

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

CSC312 Principles of Programming Languages : Functional Programming Language. Copyright 2006 The McGraw-Hill Companies, Inc.

Fundamental Concepts and Definitions

c) Comments do not cause any machine language object code to be generated. d) Lengthy comments can cause poor execution-time performance.

VARIABLES AND CONSTANTS

Programming Languages Third Edition. Chapter 7 Basic Semantics

printf( Please enter another number: ); scanf( %d, &num2);

FRAC: Language Reference Manual

Transcription:

Holger H. Hoos: Programming by Optimization Appendix: Generic PbO programming language extension As explained in the main text, we propose three fundamental mechanisms to be covered by a generic PbO programming language extension; these provide light-weight support for exposing parameters, for specifying alternative blocks of code and for flexibly exposing information collected at run-time. While the precise syntax and realization of the mechanisms may undergo changes in the future, we believe that what is described in the following forms a solid basis for design space specifications in the context of PbO-based software development. After outlining the three mechanisms that form the generic language extension and illustrating their use by means of brief examples, we explain the design objectives underlying our proposal and discuss potential drawbacks. A.1 Support for exposing parameters We propose the following light-weight construct for declaring parameters that are to be exposed as commandline arguments (where the part in square brackets is optional): ##PARAM( type name [ = value ] ) type indicates the type of the parameter and can take one of the following values: bool = boolean int = integer float = real char = character str = string where the two literals shown in each row are synonyms. Furthermore, to support ranges and sets of allowable values, complex type specifiers of the following forms can be used: int[a,b] = int[a..b] int{i 1,i 2,,i n } float[a,b] float{x 1,x 2,,x n } char{c 1,c 2,,c n } str{s 1,s 2,,s n } where a, b, i k, x k, c k, s k are valid literals of the respective types and bracket symbols,, and.. are used as shown. 1 name indicates the name of the parameter and needs to be a legal variable name in the programming language under consideration. value is an optional default value assigned to the parameter when the program is called; if a different value is assigned on the command line or configuration file, that value overrides the default. For Boolean parameters, values can be specified as false and true, f and t, 1 Further extensions of this syntax e.g., to include hints on potential discretization of real-valued parameters or preferred values can be easily imagined; however, we believe that it would be preferable to specify this type of information, possibly along with more complex constraints on allowable values, in a separate input file to the PbO weaver.

or as 0 and 1. Character and string values are delimited by single or double quotes. An error is generated when the specified default value does not have the correct type. Examples: ##PARAM(int numiterations) integer parameter numiterations gets exposed; there is no default value, and an error will be generated when calling the executable without assigning a value to this parameter. ##PARAM(int numiterations=10) integer parameter numiterations gets exposed, and its default value is set to 10. ##PARAM(int[0..1000] numiterations=10) integer parameter numiterations with possible values between 0 and 1000 gets exposed, and its default value is set to 10. Parameter declarations can occur at any place in the source code, but we suggest that they should be located consistently either at the beginning of the source for the main procedure, function, object or class of the program, or just before their first occurrence (i.e., the place where the parameter value is first accessed). We also suggest to use one separate line for each parameter declaration. Throughout the program, parameter values are accessed using the following syntax: ##name Parameters can be read arbitrarily often and at arbitrary places throughout the program, wherever it would be legal for a literal representing the parameter value to occur. Access to parameters declared via this mechanism is read-only, i.e., their values cannot be changed other than via a command-line argument or an entry in the configuration file. Example: ##numiterations the value of parameter numiterations (read-only) Resolution by the PbO weaver To process source code using these constructs for declaring and accessing parameters, the PbO weaver collects all parameter declarations from a software project. It checks for validity of parameter names and types, as well as for agreements between types and default values (where those have been specifies). It also ensures that all declarations use different parameter names. In its parametric mode, the weaver generates source code for creating a distinct global variable for each parameter; parsing and processing command-line arguments and configuration files;

generating error messages, if invalid values are being assigned to parameters via the command line or configuration file, or if no value is assigned to a default-less parameter; accessing parameter values where needed. This code is integrated into the given source(s), replacing or removing the original parameter references and declarations (##name and ##PARAM statements) as needed. In instantiation mode, the weaver replaces some or all exposed parameters with literal values. This is done by replacing the parameter wherever its value is read with the respective literal. Instantiated parameters are thus hard-wired into the source code of the program under consideration and cannot be changed. Attempts to set them via the command line or configuration file produce the same error message as an attempt to set an undeclared parameter. The parameters to be instantiated and their respective values are given as an input to the weaver via its user interface (GUI or command line / specific instantiation file). Any uninstantiated parameters are treated as in parametric mode and thus exposed to be set at run time via the command line or configuration file. A.2 Support for specifying alternative blocks of code The mechanism used for the specification of design alternatives in the form of alternative blocks of code is based on so-called choices and choice points. A choice is a set of interchangeable blocks of code that represent design alternatives (where one alternative might be an empty block); those blocks are called instances of the choice. A choice point is a location in a program at which a choice is available. During execution, an instance of a choice is called active if it has been explicitly selected as such (via a command-line argument, configuration file or instantiation by the weaver). Choices are declared using the following syntax (where the part in square brackets is optional): ##BEGIN CHOICE name [ = id ]) code ##END CHOICE name name is the name of the choice and needs to be a valid variable name in the programming language under consideration; it also needs to be different from any parameter exposed via the mechanism described previously. id is an (optional) identifier given to the instance of the choice represented by code fragment code (which can be empty); this identifier can be an arbitrary sequence of letters and digits. Examples: ##BEGIN CHOICE preprocessing block 1 the code in block 1 is marked up as a choice named preprocessing, and will only be executed if preprocessing is active at run time.

##BEGIN CHOICE preprocessing=standard block s ##BEGIN CHOICE preprocessing=enhanced block e a choice named preprocessing with two alternative instances, named standard and enhanced, which correspond to the code fragments block s and block e, respectively. As a shorthand for declaring choices with multiple instances, the syntax demonstrated in the following example can be used: ##BEGIN CHOICE preprocessing=standard block s ##CHOICE preprocessing=enhanced block e semantically equivalent shorthand form of the previous example. The same choice name (and instances) can appear in multiple places within a program. At each of these choice points, the fragments of code specified for the respective choice declared at that point are available. Choices of this kind are called distributed choices. Example: ##BEGIN CHOICE preprocessing block 1... ##BEGIN CHOICE preprocessing block 2 two occurrences (choice points) of the same choice; the code fragments block 1 and block 2 (which may be different, of course), will only be executed if choice preprocessing is active at run time. For distributed choices, the set of instances available at each choice point may differ, as in the following example: ##BEGIN CHOICE preprocessing=standard block s ##CHOICE preprocessing=enhanced block e...

##BEGIN CHOICE preprocessing=enhanced block f if instance standard of choice preprocessing is active at run time, at the first choice point, block s is executed, while at the second choice point, an empty choice is made (since no code was specified for choice preprocessing=standard). Choices can be nested, as in the following example: ##BEGIN CHOICE steptype=1 block 1 ##BEGIN CHOICE steppostoptimization block 2 ##END CHOICE steppostoptimization block 3 ##END CHOICE steptype block 1 and block 3 are executed if choice steptype=1 is active; block 2 is executed (between block 1 and block 3 ) if steptype=1 and steppostoptimization are active at the same time. Choices nested within instances of other choices are only available if all the enclosing instances are active (in the example above, choice steppostoptimization, and therefore, block 2, is only available if choice steptype=1 is active). Such nested choices are therefore said to be conditional with respect to the enclosing (higher-level) choices. Resolution by the PbO weaver To handle choices, the PbO weaver, in its parametric mode, checks the validity of choice names and instance identifiers; it then introduces new, exposed parameters that allow for the control of choices at run time and transforms the choice declarations as required to facilitate this control. In instantiation mode, the weaver replaces the respective choice declarations with the code fragment corresponding to a specific instance for that choice. Instantiated choices are thus hard-wired into the source code of the program under consideration and cannot be controlled at run time, exactly like instantiated parameters. The choices to be instantiated and their respective instances are given as an input to the weaver via its user interface (GUI or command line / specific instantiation file). Any uninstantiated choices are treated as in parametric mode and thus exposed to be controlled at run time via the command line or configuration file. A.3 Support for exposing information collected at run-time To provide a light-weight mechanism for exposing information collected during run-time which could serve, for example, as input to execution controllers that adaptively control certain parameter settings at run-time we propose the construct ##LOG( type name = value ) where value is an expression whose value is logged (i.e., collected) at this point of the computation, name is a user-defined name used to identify the information being collected and needs to be a legal variable name in

the programming language under consideration, and type indicates the data type of the expression being evaluated and is one of the type indicators described in Section A.1. 2 Example: ##LOG(int timesincelastimpr = (curriter - lastimpriter)) the value of the expression curriter - lastimpriter (referring to two ordinary variables in the target languague) is logged under the name timesincelastimpr as program execution reaches the location where this ##LOG statement appears. Resolution by the PbO weaver The weaver replaces the ##LOG statements with calls to routines that perform type checking and write the value of the given expression to a destination determined by weaver settings. The weaver can also be instructed, via further settings, to ignore individual logging statements (these are identified by their names), or to disable logging alltogether. In both cases, the respective ##LOG statements are simply excised from the source. We note that logging statements intended to provide information to adaptive control mechanisms that only apply to certain design alternatives would tyically be found within instances of the respective choices, and would therefore only be activated along with those choices. Specifying information to be logged via this mechanism, rather than using target language constructs, enables transparent support for different logging mechanisms: Depending on settings of the PbO weaver, time-stamped values logged in this way can be written to one or more files (in particular, to the standard output); they could also be written to a database or sent directly to an execution controller, using other mechanisms, such as remote procedure calls or network sockets. A.4 Design objectives and potential drawbacks The main design objective behind the generic language extensions discussed in the previous sections was to render PbO design space specifications as straightforward as possible. This concerns in particular level 1, which we believe is likely to be the entry point for many developers exploring the use of PbO. In this context, the mechanisms for exposing parameters and design alternatives need to be as intuitive and light-weight as possible, and the respective constructs have to be designed with this in mind. By allowing parameter declarations to be made anywhere in the code, and in particular, just before the first actual use of the respective parameter, it becomes possible to expose a magical constant (or hidden parameter) by means of an extremely simple and, more importantly, entirely local modification of an existing source. Likewise, we judged it important to keep the overhead involved in declaring alternative blocks of code as low as possible. We believe that these design decisions, although motivated strongly by the needs of level 1 PbO-based software development, also benefit higher levels of PbO. A second, rather obvious design objective was to avoid clashes with target language constructs, and to provide a clean and obvious separation between the constructs of the PbO language extension and the remainder of the sources. This facilitates the construction of PbO weavers, which do not have to parse target language syntax. (This is a useful property also found in commonly used preprocessors.) The ability to parse the PbO constructs easily and independently of the target language also facilitates the implementation of special 2 Altough types could in most cases be inferred during static program analysis, we see some benefits in forcing the programmer to provide them explicitly; for example, explicit type information simplifies the design of the PbO weaver and can provide the basis for additional type checking.

support for them in software development environments, such as syntax highlighting, folding and convenient affordances for managing parameters, choices and logs. A third goal was to allow the design space of a program to be explored without the overhead of repeated compilation during the design optimization process, as well as the instantiation of some or all design choices. The first part matters, because compilation, even when restricted to certain modules and using caching of compiled code, often takes enough time to seriously slow down automated design optimization. The second part is important, because instantiation leads to faster and leaner executables, and furthermore, because in certain contexts, it might not be desirable to release a version of a program that gives the user access to the full design space (examples include demo versions, restricted versions, as well as situations where optimization for a given use context is a paid service). One potential drawback of the generic language extension presented here stems from the fact that the names of parameters, choices and logs are global to a software project. This raises the possibility of naming clashes, particularly when separately developed parts of a PbO-based project are integrated or merged, and may be seen to be at odds with established practices for modularization of name spaces. The decision to use global names was made because they offer the most light-weight way of dealing with the fact that design choices often cut across the structure of a program (and in particular, across module boundaries). Furthermore, we expect that even at the highest levels of PbO, there will be orders of magnitude fewer design choices (and logs) than other named entities (such as variables, functions or objects), and that therefore, segregation of name spaces is much less important for the former than for the latter. Finally, naming clashes are easily detected by the PbO weaver, and resolving them (e.g., when assembling parts of a system previously developed or used in different contexts) will be easy. The need for such resolution can be further reduced by adopting canonic naming schemes for parameters, choices and logs used in libraries and other parts of a project that are designed to be reused; for example, all names introduced by PbO constructs could be prefixed with the name of the library or module to which they belong.