CIF Changes to the specification. 27 July 2011

Size: px
Start display at page:

Download "CIF Changes to the specification. 27 July 2011"

Transcription

1 CIF Changes to the specification 27 July 2011 This document specifies changes to the syntax and binary form of CIF. We refer to the current syntax specification of CIF as CIF1, and the new specification as CIF2. To date all archival CIFs are CIF1. The changes to syntax are necessitated by the adoption of new dictionary functionalities that introduce several extensions, including new data types, and method definitions using drel. It is assumed the reader has a thorough understanding of the CIF1 specification. TERMINOLOGY Reference to character(s) means abstract characters assigned code points by Unicode. Specific characters are referenced according to Unicode convention, U+xxxx[x[x]], where xxxx[x[x]] is the four- to six-digit hexadecimal representation of the assigned code point. The preferred character encoding for CIF2 is UTF-8. Reference to ASCII characters means characters U+0000 through U+007F, or, equivalently the first 128 characters of the ISO (LATIN-1) character set. Reference to newline or \n means the sequence that conventionally terminates a line record (which is environment dependent). See Change 3. Reference to whitespace means the characters ASCII space (U +0020), ASCII horizontal tab (U+0009) and the newline characters. Without regard to local convention, the various other characters that Unicode classifies as whitespace (character categories Zs and Zp) do not constitute whitespace for the purposes of CIF2. PREAMBLE CIF2 significantly extends CIF1 functionality, primarily through new dictionary features. CIF2 is not fully backwards-compatible with CIF1: many files compliant with CIF1 are also compliant with CIF2, but some are not (see especially change 5, below). The CIF1 standard will continue to operate for the foreseeable future in parallel with CIF2. CHANGE 1 NEW (MAGIC CODE) CIF2 file is uniquely identified by a required magic code at the beginning of its first line. The code is, #\#CIF_2.0 followed immediately by whitespace. The immediately following space on this line is reserved for encoding disambiguation signatures. Note that where a Unicode BOM is used, it would appear prior to the magic code in the byte stream and does not form part of the CIF text.

2 CHANGE 2 NEW (CHARACTER SET) CIF2 files are standard variable length plain text files, which for compatibility with older processing systems will have a maximum line length of 2048 characters. As discussed above and below, however, there are some restrictions on the character set for token delimiters, separators and data names. For compatibility with CIF1 behaviour, there is no formal restriction on the encoding of CIF2 files, providing they contain only code points from the ASCII range. If a CIF2 file contains characters equivalent to Unicode code points greater than U+007F (127 decimal), then the particular encoding used must either be UTF8 or algorithmically identifiable from the CIF2 file itself. Acceptable identification algorithms will be published as necessary as annexes to this standard (see description of magic code and encoding disambiguation in Change 1). Annexes notwithstanding, (i) a CIF2 file containing characters outside the ASCII range with no BOM and no disambiguation signature will be a UTF8 file, and (ii) a CIF2 file containing characters outside the ASCII range with a valid UTF8 or UTF16 BOM and no disambiguation signature, will be a Unicode file written in the indicated encoding. In keeping with XML restrictions we allow the characters U+0009 U+000A U+000D U+0020 U+007E U+00A0 U+D7FF U+E000 U+FDCF U+FDF0 U+FFFD U U+10FFFD In addition, character U+FEFF and characters U+xFFFE or U+xFFFF where x is any hexadecimal digit are disallowed. Unicode reserves the code points E000 F8FF for private use. The IUCr and only the IUCr may specify what characters are assigned to these code points in the context of CIF2. Reasoning: There is growing demand for the wider character set afforded by Unicode to be made available in applications, especially those where internationalisation is an issue. CHANGE 3 RESTRICTION Treatment of Newline CIF2 processors are required to treat <U+000A>, <U+000D> and <U+000D><U+000A> as newline characters, by normalising them to <U+000A> on read. No other characters or character sequences may represent newline. In particular, CIF2 processors should not interpret the Unicode characters U+2028 (line separator) or U+2029 (paragraph separator) as newline. CHANGE 4 DEFINITION

3 Character set for data names. In CIF2 the tags referred to as data names are composed of characters from the allowed set above, excluding a whitespace, since this terminates data name string. A data name begins with an ASCII _ and may be followed by any number of characters within the 2048 character restriction. All data names in a valid CIF must be defined in a CIF dictionary referenced implicitly or explicitly by the CIF, and the DDLs in which such dictionaries are written may place additional restrictions on the data names that can be defined. For example, data names defined in a DDLm dictionary must match the regular expression _[A Za z0 9_.]+ (the. is the explicit ASCII period character). Note: In CIF2, as with CIF1, there is no explicit meaning to the sequence of characters, or the placement of _ or. in the data name. The separation of the category and attribute in a data name by a period (.) is purely a convention. CHANGE 5 RESTRICTION Whitespace delimited data values. A data value in CIF2 may be a whitespace delimited string of allowed characters. A whitespace delimited string cannot contain any of the ASCII characters [ { ] }. Furthermore, the first character of a whitespace delimited string cannot be any of the ASCII characters " ' _ $. STAR keywords may not appear as whitespace delimited strings: loop_ global_ save_* stop_ data_* (case insensitive), where * refers to zero or more characters. Reasoning: The above exclusions are required for CIF2 syntax to be unambiguous. CHANGE 6 RESTRICTION Delimited strings. The delimited strings accepted in CIF2 are, (1) A string delimited by ASCII single-quotes('). The string is initiated by an ASCII singlequote, can consist of allowed characters excluding the newline, and is terminated by the first subsequent ASCII single-quote. Clearly, the string within cannot contain ASCII single-quote characters. CIF2 does not specify any interpretation of the contents of the string. For example loop author.family _name 'Harris' 'Gr\"uber' As far as CIF2 is concerned, the string values are Harris and Gr\"uber; handling of any elide characters is left to the calling application. (2) A string delimited by ASCII double quotes ("). The string is initiated by an ASCII double-quote, can consist of allowed characters excluding the newline, and is terminated by the first subsequent ASCII double-quote. Clearly, the string within cannot contain ASCII double-quote characters.

4 CIF2 does not specify any interpretation of the contents of the string. For example _quote.literal "He said, 'We're going in circles'" The string value is He said, 'We're going in circles', including the embedded single-quotes. (3) A string delimited by ASCII newline semi-colon (\n;). The string is initiated by an ASCII newline semi-colon sequence, consists of any of the allowed characters, and is terminated by the first subsequent ASCII newline semi-colon sequence. Clearly, the strings within cannot contain an ASCII newline semi-colon sequence. The string is interpreted according to the CIF Line-Folding Protocol (Change 11) when its signature is present, and according to the CIF Text Prefix Protocol (Change 12) when its signature is present. Otherwise, CIF2 does not specify any interpretation of the contents of the string. For example _recipe.ingredients ;Sugar Flour Butter ; The string value is Sugar\nFlour\nButter, where \n is the literal newline. When the Line-Folding Protocol and Text Prefix Protocol are applied to the same value, the value shall appear in CIF as if line-folding had been performed first, followed by prefixing. With respect to changes 8, 9, and 10, the newline in the opening delimiter is considered to separate a delimited string of this kind from the preceding syntax element. CHANGE 7 NEW Triple quote delimited strings. The ASCII """ sequence (alternatively ASCII ''') delimits the beginning of a string that may contain any printable character and whitespace and is terminated by the first subsequent """ sequence (alternatively '''). CIF2 does not specify any interpretation of the contents of the string. The string can contain separable " and "" characters, (alternatively ' and ''). Clearly, the string within cannot contain an ASCII """ (or alternatively ASCII '''). For example """He said "His name is O'Hearly".""" '''In {\bf \TeX} the accents are \' and \".''' The string values are, He said "His name is O'Hearly". and In {\bf \TeX} the accents are \' and \".. No interpretation of any elides is undertaken; this is the responsibility of the calling application. The triple quote string supports embedded newlines, which are considered part of the string.

5 CHANGE 8 NEW List data type. The ASCII square bracket ([]) is accepted in STAR for delimiting the List compound data type. A List is an ordered sequence of values. A data value of type List is initiated by an ASCII left square bracket([) and terminated by the pair-matching ASCII right square bracket (]). The List elements are separated from each other by whitespace. For example loop colour_name _colour_value_rgb red [1 0 0] green [0 1 0] The elements of a List can be any CIF2 data values, and hence it is a recursive data type. For example _refln.hklfofc [[1 3 4] 23.32(9) 22.97(11)] Since List elements are whitespace separated (and may include \n;-delimited strings), a List can span more than one physical line. For example _refln.hklfofc [[1 3 4] 23.32(9) 22.97(11)] is identical to the previous example list. Whitespace is allowed, but not required, between the [] delimiters and the values within, and between the opening and closing delimiters of an empty List. CHANGE 9 NEW Table data type. The ASCII curly brace ({}) is accepted in STAR for delimiting the Table (Associative Array) compound data type. A Table is an unordered mapping from string labels ( indices ) to CIF2 data values. Labels must be unique within each Table (excluding Tables nested within it). A data value of type Table is initiated by an ASCII left curly brace ({) and terminated by the pair-matching ASCII right curly brace (}). The Table elements are separated from each other by whitespace, and consist of: an index label as a single- or double-quote delimited string, or as a triple-quoted string; an ASCII colon(:); and a CIF2 data value. For example {"symm":"p 4n 2 3 1n" 'avec':[ ] 'bvec':[ ] 'cvec':[ ] "description":"""cubic space group and metric cell vectors"""}

6 A Table is a recursive data type. Since Table elements are whitespace separated (and also since values may be \n;-delimited strings), a Table may span more than one physical line. Whitespace is allowed, but not required, between the {} delimiters and the elements within, and between the opening and closing delimiters of an empty Table. CHANGE 10 REFINEMENT to CIF1 Separating syntax elements. CIF2 keywords, data block headers, save frame headers, data names, and data values must all be separated from each other by whitespace. Whitespace not otherwise part of a CIF2 syntax element is significant only for this purpose. Reasoning: The CIF1 specification relies implicitly on the syntactic structure of the language to require whitespace separators between syntax elements. The CIF2 syntax no longer implicitly provides whitespace separators in some cases (notably, after most types of data values), therefore the requirement is now made explicit. CHANGE 11 REFINEMENT to CIF1 Line Folding Protocol. The CIF line-folding protocol is a mechanism for splitting logical lines of text across two or more physical lines of a CIF semicolon-delimited data value ("text field"). A version of this protocol appears among the "Common Semantic Features" of CIF 1.1 and is in wide use in that context; in CIF 2.0, line-folding is part of the CIF syntax, as described below. The protocol applies to text fields whose contents (after interpretation of the text prefix protocol, if applicable) begin with a backslash, followed by any number of whitespace characters other than newline, followed by newline or the end of the text field; that sequence is designated \<ws>*\n below. There must not be any whitespace preceding the initial backslash. Given un-prefixed (Change 12) text field contents to which the line-folding protocol applies, the logical text it represents is derived from it by removing each occurrence of \<ws>*\n, including the initial one. Different lines may have different amounts of whitespace between the trailing backslash and newline. Note that the line-folding protocol cannot elide the terminating \n; of a text field because the \n of that delimiter is not accounted part of the field contents. It follows from the definition of \<ws>*\n, however, that if the last line ends with \<ws>* then that will not appear in the unfolded value. CHANGE 12 NEW Text Prefix Protocol. The CIF text-prefix protocol is a mechanism for formatting the logical content of a CIF semicolon-delimited value ("text field") so as to avoid misinterpretation of embedded appearances of \n;. It may also be useful for improving human readability of some CIFs.

7 The protocol applies to text fields whose physical contents begin with a prefix (see below), followed by one or two backslashes (\), optionally followed by any amount of whitespace other than a newline, followed by a newline. A prefix consists of a sequence of one or more characters that are permitted in a text field, except for backslash or newline, and does not begin with a semicolon. The second and all subsequent physical lines of the contents of a prefixed text field must begin with the designated prefix for that field. The line containing the terminating semicolon is not part of the contents for this purpose. The logical (i.e. "un-prefixed") contents of the field is derived from the physical contents by the following procedure: 1) Remove the prefix from each line, including the first. 2) If the (remaining part of the) first line starts with two backslashes then remove the first of them; otherwise remove the whole line. Example 1: data_providing_example _example ;CIF>\ CIF>data_example CIF>_text CIF>;This is an embedded multiline value CIF>; ; # here the field terminates. Example 2: data_providing_example _example ; \ data_example _text ;This is an embedded multiline value ; ; # here the field terminates. Reasoning: together with the line-folding protocol, the text-prefix protocol provides a mechanism for representing arbitrary strings as CIF2 data values.

ECMA-404. The JSON Data Interchange Syntax. 2 nd Edition / December Reference number ECMA-123:2009

ECMA-404. The JSON Data Interchange Syntax. 2 nd Edition / December Reference number ECMA-123:2009 ECMA-404 2 nd Edition / December 2017 The JSON Data Interchange Syntax Reference number ECMA-123:2009 Ecma International 2009 COPYRIGHT PROTECTED DOCUMENT Ecma International 2017 Contents Page 1 Scope...

More information

Typescript on LLVM Language Reference Manual

Typescript on LLVM Language Reference Manual Typescript on LLVM Language Reference Manual Ratheet Pandya UNI: rp2707 COMS 4115 H01 (CVN) 1. Introduction 2. Lexical Conventions 2.1 Tokens 2.2 Comments 2.3 Identifiers 2.4 Reserved Keywords 2.5 String

More information

Filter Query Language

Filter Query Language 1 2 3 4 Document Number: DSP0212 Date: 2012-12-13 Version: 1.0.0 5 6 7 8 Document Type: Specification Document Status: DMTF Standard Document Language: en-us 9 DSP0212 10 11 Copyright notice Copyright

More information

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines. Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information

Full file at

Full file at Java Programming: From Problem Analysis to Program Design, 3 rd Edition 2-1 Chapter 2 Basic Elements of Java At a Glance Instructor s Manual Table of Contents Overview Objectives s Quick Quizzes Class

More information

Expr Language Reference

Expr Language Reference Expr Language Reference Expr language defines expressions, which are evaluated in the context of an item in some structure. This article describes the syntax of the language and the rules that govern the

More information

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input. flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input. More often than not, though, you ll want to use flex to generate a scanner that divides

More information

Chapter 2: Basic Elements of C++

Chapter 2: Basic Elements of C++ Chapter 2: Basic Elements of C++ Objectives In this chapter, you will: Become familiar with functions, special symbols, and identifiers in C++ Explore simple data types Discover how a program evaluates

More information

Objectives. Chapter 2: Basic Elements of C++ Introduction. Objectives (cont d.) A C++ Program (cont d.) A C++ Program

Objectives. Chapter 2: Basic Elements of C++ Introduction. Objectives (cont d.) A C++ Program (cont d.) A C++ Program Objectives Chapter 2: Basic Elements of C++ In this chapter, you will: Become familiar with functions, special symbols, and identifiers in C++ Explore simple data types Discover how a program evaluates

More information

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual Contents 1 Introduction...2 2 Lexical Conventions...2 3 Types...3 4 Syntax...3 5 Expressions...4 6 Declarations...8 7 Statements...9

More information

Lexical Considerations

Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Fall 2005 Handout 6 Decaf Language Wednesday, September 7 The project for the course is to write a

More information

Chapter 2: Basic Elements of C++ Objectives. Objectives (cont d.) A C++ Program. Introduction

Chapter 2: Basic Elements of C++ Objectives. Objectives (cont d.) A C++ Program. Introduction Chapter 2: Basic Elements of C++ C++ Programming: From Problem Analysis to Program Design, Fifth Edition 1 Objectives In this chapter, you will: Become familiar with functions, special symbols, and identifiers

More information

UNIT- 3 Introduction to C++

UNIT- 3 Introduction to C++ UNIT- 3 Introduction to C++ C++ Character Sets: Letters A-Z, a-z Digits 0-9 Special Symbols Space + - * / ^ \ ( ) [ ] =!= . $, ; : %! &? _ # = @ White Spaces Blank spaces, horizontal tab, carriage

More information

Honu. Version November 6, 2010

Honu. Version November 6, 2010 Honu Version 5.0.2 November 6, 2010 Honu is a family of languages built on top of Racket. Honu syntax resembles Java. Like Racket, however, Honu has no fixed syntax, because Honu supports extensibility

More information

Microsoft Power Query for Excel Formula Language Specification

Microsoft Power Query for Excel Formula Language Specification Microsoft Power Query for Excel Formula Language Specification August, 2015 2015 Microsoft Corporation. All rights reserved. This specification is provided as is and Microsoft disclaims all warranties

More information

Java Notes. 10th ICSE. Saravanan Ganesh

Java Notes. 10th ICSE. Saravanan Ganesh Java Notes 10th ICSE Saravanan Ganesh 13 Java Character Set Character set is a set of valid characters that a language can recognise A character represents any letter, digit or any other sign Java uses

More information

Lexical Considerations

Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout Decaf Language Tuesday, Feb 2 The project for the course is to write a compiler

More information

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of

More information

CSC Web Technologies, Spring Web Data Exchange Formats

CSC Web Technologies, Spring Web Data Exchange Formats CSC 342 - Web Technologies, Spring 2017 Web Data Exchange Formats Web Data Exchange Data exchange is the process of transforming structured data from one format to another to facilitate data sharing between

More information

Using the Command-Line Interface

Using the Command-Line Interface This chapter describes how to use the CLI on the Cisco ASA. The CLI uses similar syntax and other conventions to the Cisco IOS CLI, but the ASA operating system is not a version of Cisco IOS software.

More information

TECkit version 2.0 A Text Encoding Conversion toolkit

TECkit version 2.0 A Text Encoding Conversion toolkit TECkit version 2.0 A Text Encoding Conversion toolkit Jonathan Kew SIL Non-Roman Script Initiative (NRSI) Abstract TECkit is a toolkit for encoding conversions. It offers a simple format for describing

More information

CSC 467 Lecture 3: Regular Expressions

CSC 467 Lecture 3: Regular Expressions CSC 467 Lecture 3: Regular Expressions Recall How we build a lexer by hand o Use fgetc/mmap to read input o Use a big switch to match patterns Homework exercise static TokenKind identifier( TokenKind token

More information

Digital Imaging and Communications in Medicine (DICOM) Part 10: Media Storage and File Format for Media Interchange

Digital Imaging and Communications in Medicine (DICOM) Part 10: Media Storage and File Format for Media Interchange PS 3.10-2000 Digital Imaging and Communications in Medicine (DICOM) Part 10: Media Storage and File Format for Media Interchange Warning: This copyrighted electronic document is a final draft document,

More information

IPCoreL. Phillip Duane Douglas, Jr. 11/3/2010

IPCoreL. Phillip Duane Douglas, Jr. 11/3/2010 IPCoreL Programming Language Reference Manual Phillip Duane Douglas, Jr. 11/3/2010 The IPCoreL Programming Language Reference Manual provides concise information about the grammar, syntax, semantics, and

More information

ucalc Patterns Examine the following example which can be pasted into the interpreter (paste the entire block at the same time):

ucalc Patterns Examine the following example which can be pasted into the interpreter (paste the entire block at the same time): [This document is far from complete but discusses some pattern essentials. Check back for more in the future; ask questions for clarity] ucalc Patterns ucalc patterns represent a key element of all current

More information

Full file at C How to Program, 6/e Multiple Choice Test Bank

Full file at   C How to Program, 6/e Multiple Choice Test Bank 2.1 Introduction 2.2 A Simple Program: Printing a Line of Text 2.1 Lines beginning with let the computer know that the rest of the line is a comment. (a) /* (b) ** (c) REM (d)

More information

INTRODUCTION 1 AND REVIEW

INTRODUCTION 1 AND REVIEW INTRODUTION 1 AND REVIEW hapter SYS-ED/ OMPUTER EDUATION TEHNIQUES, IN. Programming: Advanced Objectives You will learn: Program structure. Program statements. Datatypes. Pointers. Arrays. Structures.

More information

Objectives. In this chapter, you will:

Objectives. In this chapter, you will: Objectives In this chapter, you will: Become familiar with functions, special symbols, and identifiers in C++ Explore simple data types Discover how a program evaluates arithmetic expressions Learn about

More information

CSCI 2010 Principles of Computer Science. Data and Expressions 08/09/2013 CSCI

CSCI 2010 Principles of Computer Science. Data and Expressions 08/09/2013 CSCI CSCI 2010 Principles of Computer Science Data and Expressions 08/09/2013 CSCI 2010 1 Data Types, Variables and Expressions in Java We look at the primitive data types, strings and expressions that are

More information

Linear Tape File System (LTFS) Format Specification

Linear Tape File System (LTFS) Format Specification Linear Tape File System (LTFS) Format Specification April 12, 2010 LTFS Format Version 1.0 This document presents the requirements for an interchanged tape conforming to a self describing format. This

More information

The PCAT Programming Language Reference Manual

The PCAT Programming Language Reference Manual The PCAT Programming Language Reference Manual Andrew Tolmach and Jingke Li Dept. of Computer Science Portland State University September 27, 1995 (revised October 15, 2002) 1 Introduction The PCAT language

More information

RTL Reference 1. JVM. 2. Lexical Conventions

RTL Reference 1. JVM. 2. Lexical Conventions RTL Reference 1. JVM Record Transformation Language (RTL) runs on the JVM. Runtime support for operations on data types are all implemented in Java. This constrains the data types to be compatible to Java's

More information

Electronic data interchange for administration, commerce and Transport (EDIFACT) - Application level syntax rules

Electronic data interchange for administration, commerce and Transport (EDIFACT) - Application level syntax rules ISO 9735 : 1988 (E) Electronic data interchange for administration, commerce and Transport (EDIFACT) - Application level syntax rules 1 Scope This International Standard gives syntax rules for the preparation

More information

Parser Design. Neil Mitchell. June 25, 2004

Parser Design. Neil Mitchell. June 25, 2004 Parser Design Neil Mitchell June 25, 2004 1 Introduction A parser is a tool used to split a text stream, typically in some human readable form, into a representation suitable for understanding by a computer.

More information

c) Comments do not cause any machine language object code to be generated. d) Lengthy comments can cause poor execution-time performance.

c) Comments do not cause any machine language object code to be generated. d) Lengthy comments can cause poor execution-time performance. 2.1 Introduction (No questions.) 2.2 A Simple Program: Printing a Line of Text 2.1 Which of the following must every C program have? (a) main (b) #include (c) /* (d) 2.2 Every statement in C

More information

Language Reference Manual

Language Reference Manual ALACS Language Reference Manual Manager: Gabriel Lopez (gal2129) Language Guru: Gabriel Kramer-Garcia (glk2110) System Architect: Candace Johnson (crj2121) Tester: Terence Jacobs (tj2316) Table of Contents

More information

The Logical Design of the Tokeniser

The Logical Design of the Tokeniser Page 1 of 21 The Logical Design of the Tokeniser Purpose 1. To split up a character string holding a RAQUEL statement expressed in linear text, into a sequence of character strings (called word tokens),

More information

d-file Language Reference Manual

d-file Language Reference Manual Erwin Polio Amrita Rajagopal Anton Ushakov Howie Vegter d-file Language Reference Manual COMS 4115.001 Thursday, October 20, 2005 Fall 2005 Columbia University New York, New York Note: Much of the content

More information

Appendix. Grammar. A.1 Introduction. A.2 Keywords. There is no worse danger for a teacher than to teach words instead of things.

Appendix. Grammar. A.1 Introduction. A.2 Keywords. There is no worse danger for a teacher than to teach words instead of things. A Appendix Grammar There is no worse danger for a teacher than to teach words instead of things. Marc Block Introduction keywords lexical conventions programs expressions statements declarations declarators

More information

Contents ISO/IEC :2003/FDAM-1. Page

Contents ISO/IEC :2003/FDAM-1. Page Contents Page Annex C (normative) RELAX NG Compact Syntax... 1 C.1 Introduction... 1 C.2 Syntax... 1 C.3 Lexical structure... 4 C.4 Declarations... 5 C.5 Annotations... 6 C.5.1 Support for annotations...

More information

Decaf Language Reference

Decaf Language Reference Decaf Language Reference Mike Lam, James Madison University Fall 2016 1 Introduction Decaf is an imperative language similar to Java or C, but is greatly simplified compared to those languages. It will

More information

CS 374 Fall 2014 Homework 2 Due Tuesday, September 16, 2014 at noon

CS 374 Fall 2014 Homework 2 Due Tuesday, September 16, 2014 at noon CS 374 Fall 2014 Homework 2 Due Tuesday, September 16, 2014 at noon Groups of up to three students may submit common solutions for each problem in this homework and in all future homeworks You are responsible

More information

4 Programming Fundamentals. Introduction to Programming 1 1

4 Programming Fundamentals. Introduction to Programming 1 1 4 Programming Fundamentals Introduction to Programming 1 1 Objectives At the end of the lesson, the student should be able to: Identify the basic parts of a Java program Differentiate among Java literals,

More information

Internet Engineering Task Force (IETF) Request for Comments: 5987 Category: Standards Track August 2010 ISSN:

Internet Engineering Task Force (IETF) Request for Comments: 5987 Category: Standards Track August 2010 ISSN: Internet Engineering Task Force (IETF) J. Reschke Request for Comments: 5987 greenbytes Category: Standards Track August 2010 ISSN: 2070-1721 Abstract Character Set and Language Encoding for Hypertext

More information

DICOM Correction Item

DICOM Correction Item DICOM Correction Item Correction Number CP-601 Log Summary: Type of Modification Addition Name of Standard PS 3.4 2006 + CP 620 Rationale for Correction Several aspects of matching during queries are either

More information

The SPL Programming Language Reference Manual

The SPL Programming Language Reference Manual The SPL Programming Language Reference Manual Leonidas Fegaras University of Texas at Arlington Arlington, TX 76019 fegaras@cse.uta.edu February 27, 2018 1 Introduction The SPL language is a Small Programming

More information

1. Lexical Analysis Phase

1. Lexical Analysis Phase 1. Lexical Analysis Phase The purpose of the lexical analyzer is to read the source program, one character at time, and to translate it into a sequence of primitive units called tokens. Keywords, identifiers,

More information

Network Working Group. Obsoletes: 1342 September 1993 Category: Standards Track

Network Working Group. Obsoletes: 1342 September 1993 Category: Standards Track Network Working Group K. Moore Request for Comments: 1522 University of Tennessee Obsoletes: 1342 September 1993 Category: Standards Track MIME (Multipurpose Internet Mail Extensions) Part Two: Message

More information

Universal Format Plug-in User s Guide. Version 10g Release 3 (10.3)

Universal Format Plug-in User s Guide. Version 10g Release 3 (10.3) Universal Format Plug-in User s Guide Version 10g Release 3 (10.3) UNIVERSAL... 3 TERMINOLOGY... 3 CREATING A UNIVERSAL FORMAT... 5 CREATING A UNIVERSAL FORMAT BASED ON AN EXISTING UNIVERSAL FORMAT...

More information

Oli Language Documentation

Oli Language Documentation Oli Language Documentation Release 0.0.1 Tomas Aparicio Sep 27, 2017 Contents 1 Project stage 3 2 Document stage 5 2.1 Table of Contents............................................. 5 2.1.1 Overview............................................

More information

Learning Language. Reference Manual. George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104)

Learning Language. Reference Manual. George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104) Learning Language Reference Manual 1 George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104) A. Introduction Learning Language is a programming language designed

More information

Linear Tape File System (LTFS) Format Specification

Linear Tape File System (LTFS) Format Specification Linear Tape File System (LTFS) Format Specification Publication of this Working Draft for review and comment has been approved by the LTFS TWG. This draft represents a best effort attempt by the LTFS TWG

More information

JME Language Reference Manual

JME Language Reference Manual JME Language Reference Manual 1 Introduction JME (pronounced jay+me) is a lightweight language that allows programmers to easily perform statistic computations on tabular data as part of data analysis.

More information

The New C Standard (Excerpted material)

The New C Standard (Excerpted material) The New C Standard (Excerpted material) An Economic and Cultural Derek M. Jones derek@knosof.co.uk Copyright 2002-2008 Derek M. Jones. All rights reserved. 1456 6.7.2.3 Tags 6.7.2.3 Tags type contents

More information

Representing text on the computer: ASCII, Unicode, and UTF 8

Representing text on the computer: ASCII, Unicode, and UTF 8 Representing text on the computer: ASCII, Unicode, and UTF 8 STAT/CS 287 Jim Bagrow Question: computers can only understand numbers. In particular, only two numbers, 0 and 1 (binary digits or bits). So

More information

ISO/IEC JTC1/SC22/WG20 N

ISO/IEC JTC1/SC22/WG20 N Reference number of working document: ISO/IEC JTC1/SC22/WG20 N Date: 2001-12-25 Reference number of document: ISO/IEC DTR2 14652 Committee identification: ISO/IEC JTC1/SC22 Secretariat: ANSI Information

More information

IDP : conga/minisat(id)

IDP : conga/minisat(id) IDP : conga/minisat(id) What is conga? Grounder developed here at the University of Kentucky that implements a subset of the IDP language Creates output for MinisatID to solve 2 7/28/2011 What is IDP?

More information

Direct Functions in Dyalog APL

Direct Functions in Dyalog APL Direct Functions in Dyalog APL John Scholes Dyalog Ltd. john@dyalog.com A Direct Function (dfn) is a new function definition style, which bridges the gap between named function expressions such as and

More information

The MaSH Programming Language At the Statements Level

The MaSH Programming Language At the Statements Level The MaSH Programming Language At the Statements Level Andrew Rock School of Information and Communication Technology Griffith University Nathan, Queensland, 4111, Australia a.rock@griffith.edu.au June

More information

Gabriel Hugh Elkaim Spring CMPE 013/L: C Programming. CMPE 013/L: C Programming

Gabriel Hugh Elkaim Spring CMPE 013/L: C Programming. CMPE 013/L: C Programming 1 Literal Constants Definition A literal or a literal constant is a value, such as a number, character or string, which may be assigned to a variable or a constant. It may also be used directly as a function

More information

Linear Tape File System (LTFS) Format Specification

Linear Tape File System (LTFS) Format Specification Linear Tape File System (LTFS) Format Specification Version 2.2.0 This document has been released and approved by the SNIA. The SNIA believes that the ideas, methodologies and technologies described in

More information

User Commands sed ( 1 )

User Commands sed ( 1 ) NAME sed stream editor SYNOPSIS /usr/bin/sed [-n] script [file...] /usr/bin/sed [-n] [-e script]... [-f script_file]... [file...] /usr/xpg4/bin/sed [-n] script [file...] /usr/xpg4/bin/sed [-n] [-e script]...

More information

Touchstone File Format Specification

Touchstone File Format Specification Touchstone File Format Specification Version 2. Touchstone File Format Specification Version 2. Ratified by the IBIS Open Forum April 24, 29 Copyright 29 by TechAmerica. This specification may be distributed

More information

Data and Expressions. Outline. Data and Expressions 12/18/2010. Let's explore some other fundamental programming concepts. Chapter 2 focuses on:

Data and Expressions. Outline. Data and Expressions 12/18/2010. Let's explore some other fundamental programming concepts. Chapter 2 focuses on: Data and Expressions Data and Expressions Let's explore some other fundamental programming concepts Chapter 2 focuses on: Character Strings Primitive Data The Declaration And Use Of Variables Expressions

More information

Regexator. User Guide. Version 1.3

Regexator. User Guide. Version 1.3 Regexator User Guide Version 1.3 Regexator User Guide C O N T E N T S 1 INTRODUCTION 5 1.1 Main Window 5 1.2 Regex Categories 6 1.3 Switcher 6 1.4 Tab Reordering 6 2 PROJECT EXPLORER 7 2.1 Project 7 2.2

More information

Crayon (.cry) Language Reference Manual. Naman Agrawal (na2603) Vaidehi Dalmia (vd2302) Ganesh Ravichandran (gr2483) David Smart (ds3361)

Crayon (.cry) Language Reference Manual. Naman Agrawal (na2603) Vaidehi Dalmia (vd2302) Ganesh Ravichandran (gr2483) David Smart (ds3361) Crayon (.cry) Language Reference Manual Naman Agrawal (na2603) Vaidehi Dalmia (vd2302) Ganesh Ravichandran (gr2483) David Smart (ds3361) 1 Lexical Elements 1.1 Identifiers Identifiers are strings used

More information

Babu Madhav Institute of Information Technology, UTU 2015

Babu Madhav Institute of Information Technology, UTU 2015 Five years Integrated M.Sc.(IT)(Semester 5) Question Bank 060010502:Programming in Python Unit-1:Introduction To Python Q-1 Answer the following Questions in short. 1. Which operator is used for slicing?

More information

Category: Standards Track January 2008

Category: Standards Track January 2008 Network Working Group P. Guenther, Ed. Request for Comments: 5228 Sendmail, Inc. Obsoletes: 3028 T. Showalter, Ed. Category: Standards Track January 2008 Status of This Memo Sieve: An Email Filtering Language

More information

C++ Basics. Lecture 2 COP 3014 Spring January 8, 2018

C++ Basics. Lecture 2 COP 3014 Spring January 8, 2018 C++ Basics Lecture 2 COP 3014 Spring 2018 January 8, 2018 Structure of a C++ Program Sequence of statements, typically grouped into functions. function: a subprogram. a section of a program performing

More information

Object oriented programming. Instructor: Masoud Asghari Web page: Ch: 3

Object oriented programming. Instructor: Masoud Asghari Web page:   Ch: 3 Object oriented programming Instructor: Masoud Asghari Web page: http://www.masses.ir/lectures/oops2017sut Ch: 3 1 In this slide We follow: https://docs.oracle.com/javase/tutorial/index.html Trail: Learning

More information

Chapter 2. Lexical Elements & Operators

Chapter 2. Lexical Elements & Operators Chapter 2. Lexical Elements & Operators Byoung-Tak Zhang TA: Hanock Kwak Biointelligence Laboratory School of Computer Science and Engineering Seoul National Univertisy http://bi.snu.ac.kr The C System

More information

UNIT-2 Introduction to C++

UNIT-2 Introduction to C++ UNIT-2 Introduction to C++ C++ CHARACTER SET Character set is asset of valid characters that a language can recognize. A character can represents any letter, digit, or any other sign. Following are some

More information

INTERNATIONAL TELECOMMUNICATION UNION

INTERNATIONAL TELECOMMUNICATION UNION INTERNATIONAL TELECOMMUNICATION UNION ITU-T X.680 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU Corrigendum 1 (06/99) SERIES X: DATA NETWORKS AND OPEN SYSTEM COMMUNICATIONS OSI networking and system

More information

Operators. Java operators are classified into three categories:

Operators. Java operators are classified into three categories: Operators Operators are symbols that perform arithmetic and logical operations on operands and provide a meaningful result. Operands are data values (variables or constants) which are involved in operations.

More information

psed [-an] script [file...] psed [-an] [-e script] [-f script-file] [file...]

psed [-an] script [file...] psed [-an] [-e script] [-f script-file] [file...] NAME SYNOPSIS DESCRIPTION OPTIONS psed - a stream editor psed [-an] script [file...] psed [-an] [-e script] [-f script-file] [file...] s2p [-an] [-e script] [-f script-file] A stream editor reads the input

More information

Python in 10 (50) minutes

Python in 10 (50) minutes Python in 10 (50) minutes https://www.stavros.io/tutorials/python/ Python for Microcontrollers Getting started with MicroPython Donald Norris, McGrawHill (2017) Python is strongly typed (i.e. types are

More information

PART I. Part II Answer to all the questions 1. What is meant by a token? Name the token available in C++.

PART I.   Part II Answer to all the questions 1. What is meant by a token? Name the token available in C++. Unit - III CHAPTER - 9 INTRODUCTION TO C++ Choose the correct answer. PART I 1. Who developed C++? (a) Charles Babbage (b) Bjarne Stroustrup (c) Bill Gates (d) Sundar Pichai 2. What was the original name

More information

DICOM Correction Proposal

DICOM Correction Proposal DICOM Correction Proposal STATUS Final Text Date of Last Update 2017/09/17 Person Assigned Submitter Name Ulrich Busch (ulrich.busch@varian.com) Ulrich Busch (ulrich.busch@varian.com) Submission Date 2016/05/03

More information

Language Fundamentals Summary

Language Fundamentals Summary Language Fundamentals Summary Claudia Niederée, Joachim W. Schmidt, Michael Skusa Software Systems Institute Object-oriented Analysis and Design 1999/2000 c.niederee@tu-harburg.de http://www.sts.tu-harburg.de

More information

Variables, Constants, and Data Types

Variables, Constants, and Data Types Variables, Constants, and Data Types Strings and Escape Characters Primitive Data Types Variables, Initialization, and Assignment Constants Reading for this lecture: Dawson, Chapter 2 http://introcs.cs.princeton.edu/python/12types

More information

HAWK Language Reference Manual

HAWK Language Reference Manual HAWK Language Reference Manual HTML is All We Know Created By: Graham Gobieski, George Yu, Ethan Benjamin, Justin Chang, Jon Adelson 0. Contents 1 Introduction 2 Lexical Convetions 2.1 Tokens 2.2 Comments

More information

CSE 12 Abstract Syntax Trees

CSE 12 Abstract Syntax Trees CSE 12 Abstract Syntax Trees Compilers and Interpreters Parse Trees and Abstract Syntax Trees (AST's) Creating and Evaluating AST's The Table ADT and Symbol Tables 16 Using Algorithms and Data Structures

More information

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1 Regular Expressions Todd Kelley kelleyt@algonquincollege.com CST8207 Todd Kelley 1 POSIX character classes Some Regular Expression gotchas Regular Expression Resources Assignment 3 on Regular Expressions

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Variables, Data Types, Data Structures, Control Structures Janyl Jumadinova February 3, 2016 Data Type Data types are the basic unit of information storage. Instances of

More information

A A B U n i v e r s i t y

A A B U n i v e r s i t y A A B U n i v e r s i t y Faculty of Computer Sciences O b j e c t O r i e n t e d P r o g r a m m i n g Week 4: Introduction to Classes and Objects Asst. Prof. Dr. M entor Hamiti mentor.hamiti@universitetiaab.com

More information

Weiss Chapter 1 terminology (parenthesized numbers are page numbers)

Weiss Chapter 1 terminology (parenthesized numbers are page numbers) Weiss Chapter 1 terminology (parenthesized numbers are page numbers) assignment operators In Java, used to alter the value of a variable. These operators include =, +=, -=, *=, and /=. (9) autoincrement

More information

Introduction to: Computers & Programming: Strings and Other Sequences

Introduction to: Computers & Programming: Strings and Other Sequences Introduction to: Computers & Programming: Strings and Other Sequences in Python Part I Adam Meyers New York University Outline What is a Data Structure? What is a Sequence? Sequences in Python All About

More information

Reference C0.2. bool The type of booleans has just two elements, true and false. They are used for conditions, as well as contracts.

Reference C0.2. bool The type of booleans has just two elements, true and false. They are used for conditions, as well as contracts. C0 Reference 15-122: Principles of Imperative Computation Frank Pfenning August 2010 Compiler revision 546 (Sat Aug 28 2010) 1 Introduction The programming language C0 is a carefully crafted subset of

More information

TCL - STRINGS. Boolean value can be represented as 1, yes or true for true and 0, no, or false for false.

TCL - STRINGS. Boolean value can be represented as 1, yes or true for true and 0, no, or false for false. http://www.tutorialspoint.com/tcl-tk/tcl_strings.htm TCL - STRINGS Copyright tutorialspoint.com The primitive data-type of Tcl is string and often we can find quotes on Tcl as string only language. These

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CMPS Introduction to Computer Science Lecture Notes Binary Numbers Until now we have considered the Computing Agent that executes algorithms to be an abstract entity. Now we will be concerned with techniques

More information

Topic 2. Big C++ by Cay Horstmann Copyright 2018 by John Wiley & Sons. All rights reserved

Topic 2. Big C++ by Cay Horstmann Copyright 2018 by John Wiley & Sons. All rights reserved Topic 2 1. Reading and writing text files 2. Reading text input 3. Writing text output 4. Parsing and formatting strings 5. Command line arguments 6. Random access and binary files Reading Words and Characters

More information

Wiki markup is a core feature for Topic pages, tightly integrating all the other parts of Trac into a flexible and powerful whole.

Wiki markup is a core feature for Topic pages, tightly integrating all the other parts of Trac into a flexible and powerful whole. Help: Wiki Formatting Wiki Formatting Wiki markup is a core feature for Topic pages, tightly integrating all the other parts of Trac into a flexible and powerful whole. This site has a built in small and

More information

Overview: Programming Concepts. Programming Concepts. Names, Values, And Variables

Overview: Programming Concepts. Programming Concepts. Names, Values, And Variables Chapter 18: Get With the Program: Fundamental Concepts Expressed in JavaScript Fluency with Information Technology Third Edition by Lawrence Snyder Overview: Programming Concepts Programming: Act of formulating

More information

Overview: Programming Concepts. Programming Concepts. Chapter 18: Get With the Program: Fundamental Concepts Expressed in JavaScript

Overview: Programming Concepts. Programming Concepts. Chapter 18: Get With the Program: Fundamental Concepts Expressed in JavaScript Chapter 18: Get With the Program: Fundamental Concepts Expressed in JavaScript Fluency with Information Technology Third Edition by Lawrence Snyder Overview: Programming Concepts Programming: Act of formulating

More information

XDS An Extensible Structure for Trustworthy Document Content Verification Simon Wiseman CTO Deep- Secure 3 rd June 2013

XDS An Extensible Structure for Trustworthy Document Content Verification Simon Wiseman CTO Deep- Secure 3 rd June 2013 Assured and security Deep-Secure XDS An Extensible Structure for Trustworthy Document Content Verification Simon Wiseman CTO Deep- Secure 3 rd June 2013 This technical note describes the extensible Data

More information

CellML Specification Documentation

CellML Specification Documentation CellML Specification Documentation Release The CellML Editorial Board July 23, 2014 Contents 1 Definitions 3 2 General matters 5 2.1 CellML and XML............................................ 5 2.2 Equivalent

More information

Flex and lexical analysis. October 25, 2016

Flex and lexical analysis. October 25, 2016 Flex and lexical analysis October 25, 2016 Flex and lexical analysis From the area of compilers, we get a host of tools to convert text files into programs. The first part of that process is often called

More information

Cookie/TCP Protocol Specification

Cookie/TCP Protocol Specification CS 471G Computer Networks Spring 2013 Cookie/TCP Protocol Specification Version 1.3.0 11 April 2013 1 Overview This document specifies Version 1.3.0 of the Cookie Protocol, which runs over TCP. This toy

More information

Homework 1 Due Tuesday, January 30, 2018 at 8pm

Homework 1 Due Tuesday, January 30, 2018 at 8pm CSECE 374 A Spring 2018 Homework 1 Due Tuesday, January 30, 2018 at 8pm Starting with this homework, groups of up to three people can submit joint solutions. Each problem should be submitted by exactly

More information