String Computation Program

Similar documents
String Computation Program

CSCI 1061U Programming Workshop 2. C++ Basics

Language Reference Manual

Language Reference Manual simplicity

Overview. - General Data Types - Categories of Words. - Define Before Use. - The Three S s. - End of Statement - My First Program

OUTLINES. Variable names in MATLAB. Matrices, Vectors and Scalar. Entering a vector Colon operator ( : ) Mathematical operations on vectors.

1 Lexical Considerations

Full file at

printf( Please enter another number: ); scanf( %d, &num2);

Java+- Language Reference Manual

Programming Language Basics

The SPL Programming Language Reference Manual

22c:111 Programming Language Concepts. Fall Types I

JME Language Reference Manual

age = 23 age = age + 1 data types Integers Floating-point numbers Strings Booleans loosely typed age = In my 20s

Programming with C++ as a Second Language

d-file Language Reference Manual

Lexical Considerations

M/s. Managing distributed workloads. Language Reference Manual. Miranda Li (mjl2206) Benjamin Hanser (bwh2124) Mengdi Lin (ml3567)

easel LANGUAGE REFERENCE MANUAL

Getting started with Java

EDIABAS BEST/2 LANGUAGE DESCRIPTION. VERSION 6b. Electronic Diagnostic Basic System EDIABAS - BEST/2 LANGUAGE DESCRIPTION

Scheme: Strings Scheme: I/O

EZ- ASCII: Language Reference Manual

CHAD Language Reference Manual

The PCAT Programming Language Reference Manual

Chapter 2 Working with Data Types and Operators

Key Differences Between Python and Java

Sprite an animation manipulation language Language Reference Manual

Standard 11. Lesson 9. Introduction to C++( Up to Operators) 2. List any two benefits of learning C++?(Any two points)

Lexical Considerations

.. Cal Poly CPE 101: Fundamentals of Computer Science I Alexander Dekhtyar..

Such JavaScript Very Wow

VLC : Language Reference Manual

Input File Syntax The parser expects the input file to be divided into objects. Each object must start with the declaration:

UNIT- 3 Introduction to C++

CS 541 Spring Programming Assignment 2 CSX Scanner

Pace University. Fundamental Concepts of CS121 1

VHDL Lexical Elements

Project 1: Scheme Pretty-Printer

Programming Languages & Translators. XML Document Manipulation Language (XDML) Language Reference Manual

A lexical analyzer generator for Standard ML. Version 1.6.0, October 1994

DaMPL. Language Reference Manual. Henrique Grando

Typescript on LLVM Language Reference Manual

A First Program - Greeting.cpp

BoredGames Language Reference Manual A Language for Board Games. Brandon Kessler (bpk2107) and Kristen Wise (kew2132)

CS1 Lecture 3 Jan. 22, 2018

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual

CSI33 Data Structures

Program Fundamentals

A Simple Syntax-Directed Translator

sottotitolo A.A. 2016/17 Federico Reghenzani, Alessandro Barenghi

Use of scanf. scanf("%d", &number);

GraphQuil Language Reference Manual COMS W4115

Introduction to Programming Using Java (98-388)

Python for Non-programmers

COMS 3101 Programming Languages: Perl. Lecture 1

Binghamton University. CS-211 Fall Syntax. What the Compiler needs to understand your program

CGS 3066: Spring 2015 JavaScript Reference

CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer

C++ Basics. Lecture 2 COP 3014 Spring January 8, 2018

Programming Languages Third Edition. Chapter 7 Basic Semantics

UNIVERSAL SERIAL INTERFACE

SOAPScript For squeaky-clean code, use SOAPScript Language Summary Bob Nisco Version 0.5 β

Primitive Data Types: Intro

ECE251 Midterm practice questions, Fall 2010

Tools : The Java Compiler. The Java Interpreter. The Java Debugger

Chapter 2 Using Data. Instructor s Manual Table of Contents. At a Glance. Overview. Objectives. Teaching Tips. Quick Quizzes. Class Discussion Topics

Binghamton University. CS-120 Summer Introduction to C. Text: Introduction to Computer Systems : Chapters 11, 12, 14, 13

Introduction to Computation for the Humanities and Social Sciences. CS 3 Chris Tanner

MEIN 50010: Python Flow Control

CS1 Lecture 3 Jan. 18, 2019

ECS 120 Lesson 7 Regular Expressions, Pt. 1

Petros: A Multi-purpose Text File Manipulation Language

Cellular Automata Language (CAL) Language Reference Manual

Introduction to: Computers & Programming: Review prior to 1 st Midterm

MR Language Reference Manual. Siyang Dai (sd2694) Jinxiong Tan (jt2649) Zhi Zhang (zz2219) Zeyang Yu (zy2156) Shuai Yuan (sy2420)

CSEN 102 Introduction to Computer Science

First Java Program - Output to the Screen

Makefile Brief Reference

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #29 Arrays in C

Chapter 2 Basic Elements of C++

1. Lexical Analysis Phase

Wentworth Institute of Technology. Engineering & Technology WIT COMP1000. Java Basics

Section 2.2 Your First Program in Java: Printing a Line of Text

Chapter 2 Elementary Programming

ucalc Patterns Examine the following example which can be pasted into the interpreter (paste the entire block at the same time):

JavaScript I Language Basics

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Chapter 2: Using Data

Ordinary Differential Equation Solver Language (ODESL) Reference Manual

ARG! Language Reference Manual

UNIT - I. Introduction to C Programming. BY A. Vijay Bharath

Assigns a number to 110,000 letters/glyphs U+0041 is an A U+0062 is an a. U+00A9 is a copyright symbol U+0F03 is an

Python review. 1 Python basics. References. CS 234 Naomi Nishimura

Full file at

x = 3 * y + 1; // x becomes 3 * y + 1 a = b = 0; // multiple assignment: a and b both get the value 0

Features of C. Portable Procedural / Modular Structured Language Statically typed Middle level language

Mr. Monroe s Guide to Mastering Java Syntax

CS321 Languages and Compiler Design I. Winter 2012 Lecture 4

Transcription:

String Computation Program Project Proposal Scott Pender scp2135@columbia.edu COMS4115 Fall 2012 9/26/2012 Project proposal for the COMS4115 term project: Explain what problem the language solves & how it is used; describe an interesting representative program in your language; provide examples of its syntax

Mission Design and implement a custom language whose compiler is built using O Caml. The language should be able to solve a problem by accepting a combination of commands, algorithms, and data, perform appropriate computation, and then return meaningful output. Additionally, this goal should be achieved using a minimalistic base language. Language Description String Computation Program (scam) is a small, yet powerful compiled language created for the sole purpose of text processing. Using a minimal character set, scam allows a program to interpret and process string tokens. Adapting a hand-chosen blend of features from PERL, Python, and awk scripting languages, scam uses shorthand notation to denote relationships between strings. Key components of the scam language include variables, arrays, functions, control statements, strings, and integers. Simply put, scam may be used to compute algorithms on string data. The language itself does not include any algorithms to process this data beyond Boolean comparison operators. Rather than include processing rules in the language itself, scam provides the ability to access included data structures and their respective attributes so that you as the developer can create your own rules. Syntax Among the most compelling features of scam is the ability to call functions using postfix notation. For example, the user-defined function to test for equality, = might accept two arguments X and Y and return the strings yes or no To call this function, a program would include a line similar to that listed below. @: Does X=Y? X Y = #Does X=Y? no This line of code combines three main tasks into one command. First, the compiler parses from the right and sees the = function label. Since the function takes two values, the next two space-delimited variables to the left are accepted to the function and a yes or no string is returned. As the parser continues to the left it concatenates the two string values. When it reaches the colon, it reads the token to the left to see what to do with the data. The lone @ to the left of the colon tells the compiler to send the value to the right of the colon to the system console. The concatenated values are printed to the console without quotes. For a more detailed description of language syntax and behavior, an alpha version of the scam reference manual can be found in APPENDIX A. 1

Sample Program A sample of a small program in scam is provided below.? This is a program to compare files called diff? The program is called using the following command:? scam diff filename1 filename2 (without quotes) MAIN: data1() : @1 data2() : @2 data1() data2() DIFF @: Diff test complete DIFF: file1() file2() linenum: (0) file1unfinished: (1) file2unfinished: (1) WHILE: file1unfinished file2unfinished OR IF: linenum (file1()) > file1unfinished: (0) @: DIFF: Line linenum of @2 does not exist in @1 IF: linenum (file2()) > file2unfinished: (0) @: DIFF: Line linenum of @1 does not exist in @2 IF: file1unfinished file2unfinished AND IF: file1(linenum) file2(linenum)!= COMPARE_LINES: file1(linenum) file2(linenum) linenum linenum: linenum (1) + COMPARE_LINES: line1 line2 linenum charnum = (0) testfinished: (0) WHILE: testfinished! IF: [line1(charnum) line2(charnum) =]! @: DIFF: Character charnum of line linenum does not match IF: [charnum (line1) >=] [charnum (line2) >=] OR testfinished: (1) charnum: charnum (1) + 2

APPENDIX A: Language Reference Manual (alpha) Parsing rules Right-first traversal Token properties @ external source/destination o @./filename.txt o @ no following symbols means console o @1 @2 @3 etc. represent command-line arguments : define something or set left to value of right (integers) o Integer numbers are always read as strings except when surrounded by () o Mathematical functions + and are defined only for use on integers, using postfix notation. " strings o No need to escape characters - everything is valid until the closing " except for string formatting characters o String formatting characters ~N for new line ~T for tab character o Implicit concatenation of everything but entire arrays and numbers (ignore these) o All null values or missing parameters default to "" o A string follows the same rules as an array Sub-array o "mystring"(0:2) references the first 3 characters in a string Array cell o "mystring"(0) references the first char in a string Length o ("mystring") returns (8) o Single character string,"a" ("A") if string length = 1, return the ascii decimal value of the character [] Group expressions together? Single-line comments Input/Output Output o Output X to console (concatenates all string values) @: X o Output to file @filepath: "text to print to file" @"C:/my folder/textfile": "text to print to file" Input o From console X(): @ o From file X(): @filepath 3

Functions, arrays, & variables Arrays o X() array of undefined size. Used both to define and reference the array. o X(3) references cell 4 of an array - 0 based o X(0:2) represents the sub-array containing first 3 cells o M(): X Y Z create an array where M(0)=X, M(1)=Y, & M(2)=Z o Index order must be sequential and starts at 0 o X(indexVal) gets cell of index value, previously defined as a variable. If undefined use return. o (X(12)) gets the length -> (13) o Arrays may not store numbers. Strings only! o No space between array name and () Functions o mylabel: X Y Z Impl of mylabel mylabel: "returnvalue" o Declaration of a function 'mylabel' that accepts parameters X,Y, and Z. Note that tab is used as the delimiter for the inside of a function. If there is no tab-delimited line following the method signature, the definition will be parsed as a variable definition. o Within function def, the function label may be set like a variable to indicate a return value. Default return value is "". Since this is set like a variable, it can be used to make recursive calls o X Y Z mylabel...call the function Y Z mylabel...y is read as X, Z is read as Y Counts left looking for the exact number of args it accepts. Will stop early if there are extras. If less than expected it will stop once a non-quote symbol or eol is hit and use only tokens before (to the right of) the symbol. o Implicit return from function (can return string or integer) @: X Y Z mylabel Variables o X: 2 Define variable X as string value of 2 o X: (2) Define variable X as integer value of 2 o XY:"strinng" Define variable XY as string "strinng" o (X) Length of value in X. default is 0. o All global (outside of functions) vars always accessible (except if a local var within a procedure hides it) Defined Functions Boolean evaluation o =, <, >, <=, >= will return (0) or (1) o OR & AND o! Not Control Statements 4

Pre-defined labels using the same notation as a standard function. The right-hand side of the evaluation will accept (1) as true (unless! is used). All other values are false. IF: X Do a bunch of stuff WHILE: X Do a bunch of stuff FOREACH: cellx myarr() Do a bunch of stuff with cellx o Only applies to arrays or strings 5