XDS: An Extensible Structure for Trustworthy Document Content Verification
Simon Wiseman, CTO, Deep-Secure
3rd June 2013
This technical note describes the extensible Data Structure (XDS), a format specifically designed for presenting the business information found in all kinds of documents to the verification component of a high assurance guard.

Document Content Verification

Attackers are often able to gain control of a system by exploiting mistakes in the way applications handle unusual or malformed structures in documents or other data. Verifying that a document contains only structures that applications can handle safely is therefore an important part of defending a system from attack. However, to be an effective defence, the verification system must itself be resistant to attacks involving malformed structures. This is difficult because most document formats are highly complex, so it is hard to be sure that a verifier will work correctly under all conditions. Worse still, there are many document formats in common usage and a separate robust verifier is required for each of them. Consequently it takes significant time to introduce support for new formats, and the overall cost of this approach is prohibitive.

Converting all documents into a single common format before verification can reduce costs. Documents are first converted to the common format, the data is verified using a common verification component, and then a new document is constructed in the appropriate format for delivery. This process is referred to as Transshipment. New formats can be supported without changing the security critical verification component, so the approach scales. However, the solution is only effective if the format is simple enough to verify easily and flexible enough to handle the wide variety of information conveyed by common document formats. Deep-Secure has designed a format specifically to meet these goals.
Called the extensible Data Structure (XDS), this is a way of encoding arbitrary structured data that is rich enough to represent complex documents and yet simple enough that trustable software can be produced to check that an encoding adheres to some defined structures. Deep-Secure is using XDS as an intermediate format in its next generation high assurance Transshipment Guards.

The main complexity of a guard is in its protocol proxy software and in the parsers and renderers for the complex data formats it must handle. To avoid this complex functionality becoming security critical, the parser and renderer are kept separate from the verifier. The parser converts the complex formats into an XDS representation that is handed to the security critical data verifier. If the verifier passes the data, it is then given to the renderer for conversion into the complex format needed for delivery.

Essential Characteristics of a Common Format for Verification

The common format needs to be capable of representing a wide range of documents, including word processing, spreadsheets, imagery and structured data, so it must be extensible and general purpose.

In a guard, the structure representing a document needs to be passed from source proxy to verifier and from verifier to destination proxy, but if the proxies and verifier share memory to hold the structure it is difficult to be sure they are unable to communicate in other ways. So the structure must be serialised and passed as a byte stream from one component to another, which means it must be easy to produce trustable serialisation software for use within the security critical verifier.

The verifier is security critical, so any configuration errors need to be trapped before any damage is done. Strong type enforcement within the verifier achieves this, which means the data, and the schemas that define which structures are acceptable, need to support a variety of data types.

The overriding requirement is for simplicity, since some security critical software will need to handle the common format.

Is XML a Good Candidate?

XML lacks the essential characteristics needed to act as a common format for trustworthy document content verification. Its main virtue is its extensibility, but in other regards it is problematic. The main issue is that XML, together with its related toolset, is complicated to understand, use effectively and implement.
Most of this complexity arises from XML being serialised as a mark-up language rather than a data structure, but its use of namespaces adds further complication. However, the principle of using a tagged data structure for extensibility is sound and formed the basis of Deep-Secure's work to create XDS.

Another major disadvantage of XML is that it only supports the string data type. This not only makes XML inefficient, because binary data must be encoded as text in some way, but also means there is no opportunity for intrinsic type checking.

The toolset associated with XML is also complex. In particular, it is too difficult to assure the correctness of implementations of the path language and schema definition languages for XML, making them unsuitable for use in a high assurance verifier. The XDS equivalents are similar but have been carefully designed to have simple, well-defined semantics that can be implemented easily.

So XML is not the candidate of choice for a common method of representing data for format verification.

XDS Structures

An XDS structure is made up of tags that logically form a tree [1]. There are different types of tag, with all tags having a name and possibly some attributes. Tags are either Empty, Container, Text or Binary. Empty tags are leaf nodes in the tree and contain nothing. Container tags contain a, possibly empty, sequence of tags. Text tags contain a, possibly empty, Unicode text string. Binary tags contain a, possibly empty, byte sequence.

Note that Empty tags, Container tags with an empty sequence of children, Text tags containing the empty string and Binary tags containing an empty sequence of bytes are all different and distinguished. There is no equivalent distinction in XML because of that language's mark-up roots.

Each of a tag's attributes has a unique name and a typed value. The types are Unicode text string, binary (sequence of bytes), Boolean, unsigned integer (64-bit), signed integer (64-bit) and floating point number.

Tags and attributes have simple case-sensitive names, with characters taken from the set A-Z, a-z, 0-9 and underscore. This limitation allows an implementation to avoid the complexities and expense of Unicode when handling these names. It is not expected that any application level string data will be encoded as tag or attribute names; rather, it will be held as the values of attributes and Text tags.

Representing an XDS Structure as Text

Since XDS is a data structure capable of handling typed data, it is difficult to show examples in a document such as this. Consequently a text encoding is also defined.
This is primarily for use within documentation, but it could be used to create editable text representations of XDS structures used for configuration data or similar purposes.

An XDS structure can be serialised to a sequence of either 7-bit ASCII characters or Unicode characters using one of the standard representations (UTF-8, UTF-16BE/LE, UTF-32BE/LE) indicated by a Byte Order Mark.

[1] Strictly, XDS is defined in terms of an acyclic graph with a single root node, meaning that a tree that has common sub-trees need only store them once.
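The encoding rule above can be sketched in Python as follows. This is only an illustration: the function name `decode_textual_xds` is hypothetical, and treating input without a Byte Order Mark as 7-bit ASCII is an assumption drawn from the sentence above rather than from a specification. The BOM constants are the ones provided by Python's standard `codecs` module.

```python
import codecs

# Order matters: the UTF-32LE mark (FF FE 00 00) begins with the
# UTF-16LE mark (FF FE), so the 32-bit marks are tested first.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_textual_xds(data: bytes) -> str:
    """Return the characters of a textual XDS byte stream (hypothetical helper)."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    # Assumption: no Byte Order Mark means the 7-bit ASCII encoding.
    return data.decode("ascii")
```

Because the decision is made from the first few bytes alone, the decoder never has to start parsing in one encoding and switch part way through.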
Empty tags are rendered as <TAG/> and Container tags as <TAG> </TAG>. Note that <TAG></TAG> is not equivalent to <TAG/>: the former is a Container tag with no children and the latter is an Empty tag. Text tags are rendered as <TAG>text:text</TAG> and Binary tags as <TAG>base64:binary</TAG>. Text tags may also be rendered as <TAG>text</TAG> where there is no ambiguity (that is, the text is not empty and does not start with a less-than character).

With an ASCII encoding, characters in the text that have no representation in ASCII must be escaped using a hexadecimal representation of their Unicode character code; for example, &20AC; must be used to represent the Euro character. Control characters, including tab, newline and carriage return, must also be escaped in all encodings. Similar escaping is used with Unicode encodings to represent the less-than character, in order to distinguish a less-than in the text from the less-than that terminates the text. Also, in any encoding, since ampersand is chosen as the herald of an escape sequence it must always be escaped as &26;. Character escaping is permitted even if not necessary, so for example &40; can be used to represent the @ character even though this is not necessary.

Since tag and attribute names are drawn from a simple character set, they render without escaping whatever representation is chosen for characters.

The text inside Text and Binary tags may contain newlines, which are ignored. Any leading or trailing whitespace surrounding the lines of text is also ignored. Should any such whitespace be significant, it must be escaped.

Attributes are listed as name=value pairs after the tag name. String values are rendered as "string", Boolean values as true or false, unsigned integers as digits, signed integers as +digits or -digits and binary values as base64:binary.
Each representation of the different types starts with a different set of characters, so the type of an attribute's value can be determined from the first character of its representation.

Below is the text representation of an example XDS structure:

    <DOC Width=320>
      <PARA>Title:&9;An Example Document
            &20;About XDS</PARA>
      <PARA>Author:&9;Deep-Secure</PARA>
    </DOC>

Here the DOC tag is a Container and the two PARA tags are Text tags. The DOC tag's Width attribute is an unsigned integer. The first PARA tag contains text that is split across two lines. The newline and the leading whitespace on the second line are ignored, but the space character before About is escaped as &20; and so is significant. The text also contains tab characters, escaped as &9;.
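The whitespace and escaping rules described above can be sketched in a few lines of Python. This is an illustrative reading of the rules, not a reference implementation: the function name `decode_text_content` is hypothetical, it handles only the content between the tag delimiters, and accepting lower-case hexadecimal digits is an assumption.

```python
import re

# An escape is an ampersand, one or more hexadecimal digits, and a semicolon.
_ESCAPE = re.compile(r"&([0-9A-Fa-f]+);")


def decode_text_content(raw: str) -> str:
    """Decode the raw characters found inside a textual Text tag (hypothetical helper)."""
    # Newlines, and the whitespace surrounding each line, are not significant.
    joined = "".join(line.strip() for line in raw.splitlines())
    # Escapes are expanded only after stripping, so escaped whitespace survives.
    # re.sub makes a single pass, so an escaped ampersand is never re-expanded.
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), joined)


first_para = decode_text_content("Title:&9;An Example Document\n        &20;About XDS")
```

Applied to the first PARA tag of the example, this yields a single 36-character string in which the tab and the escaped space are preserved and the line break is dropped.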
Comparison with XML

In XDS an Empty tag is distinguishable from a Container tag with no children. This is an important difference, as it allows type checking to detect more errors when validating XDS against a schema and when evaluating path expressions.

XML supports international characters in tag and attribute names, while XDS only allows simple ASCII alphanumeric names. This simplification allows implementations to be more efficient without needing to introduce complex mechanisms. It does not affect the applicability of XDS, as the names are intended to encode structure, not application data.

XDS attributes and values are typed, whereas XML only supports character strings. This not only makes the representation more efficient, by avoiding the need to store numeric values as strings, but also allows type checking to be effective.

XML allows mixed content, where the sequence of elements contained in a tag can be a mixture of text and tags. The main problem with mixed content is that it makes the type system more complicated, as the type of a tag's element cannot be determined statically. It is relevant when XML is used as a mark-up language, as in XHTML, but is not a particularly useful construct in a data structure.

XDS lacks any equivalent of XML namespaces. New attributes or new tags can be defined to extend structures, but different extensions may use the same names for different purposes. Thus XDS is not as easily extended as XML, but the framework for representing arbitrary data formats can easily be defined to accommodate extensibility using attribute values, so there is no disadvantage here, and the clear advantage is in the simplicity of the XDS design and implementation.

Since XDS is a binary data structure rather than a text mark-up language, it has no issues regarding the handling of whitespace, hence there is no equivalent of xml:space.
XDS does not have the special control attributes xml:lang or xml:id, as any information about language and any unique identifiers in a structure are part of the structure and are represented using tags and attributes like any other data.

XDS does not provide any equivalent of XML's CDATA construct, processing instructions or document type definitions.

The textual representation of an XDS structure also differs from the way XML is represented. The character set used to represent XML is not known until part way through the document (the encoding attribute in the XML declaration partly governs the choice), which complicates parsing. In XDS a Byte Order Mark at the start of the text always defines the encoding.
XML supports textual names, decimal values and hexadecimal values in escape sequences, whereas XDS only supports hexadecimal. This simplification means the parser for textual XDS is easier to test, and it represents no loss in capability.

Leading and trailing whitespace is never significant in textual XDS, while it can be in XML, where it is a source of much error and confusion. There is also no equivalent of XML's CDATA sections.

XDS attributes are typed and the representation of the value determines its type, whereas XML only supports the string type and schemas then impose constraints on the strings to give them a type.

Comments in the textual representation of XDS are shown as <!>...</!>, while in XML they are <!-- ... -->.

XDS Binary Serialisation

Applications are free to serialise XDS in any way they see fit, but a standard binary serialisation that represents an XDS structure as a byte stream is defined to allow independently developed system components to pass XDS between one another.

All integers are serialised in Little Endian format, rather than Big Endian, to reflect the dominance of Intel processors.

All characters are represented in Unicode using a 32-bit integer, despite Unicode only requiring 21 bits. This is on the assumption that the receiving application will represent strings as arrays of 32-bit integers to keep string processing simple.

As it is common for an XDS structure to use tag and attribute names many times, the names are represented by 4-byte integers in the serialisation. The mapping table that translates the integers to the names is either known a priori to the sender and receiver or is sent once at the start of the structure.

Tags are represented by the number of their name, a counted list of attributes, a one-byte type code indicating what type of tag they are and the serialisation of the tag's contents, if any.
Attributes are represented by the number of their name, a one-byte type code indicating the type of the attribute's value and the value itself.

The content of Text tags is a counted sequence of Unicode characters, while for Binary tags it is a counted sequence of bytes and for Container tags it is a counted list of tags. Empty tags have no content.
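The layout described above might be sketched in Python as follows. Several details here are assumptions made only for illustration: counts are written as 4-byte unsigned integers, the type codes are single ASCII letters (C, T and U follow the note's worked example, while E and B are invented), only unsigned integer attributes are handled, and the name mapping table is taken to be known a priori, so it is not emitted. The `Tag` class and `serialise` function are hypothetical names.

```python
import struct
from dataclasses import dataclass, field


def u32(n: int) -> bytes:
    """Pack a 4-byte little-endian unsigned integer."""
    return struct.pack("<I", n)


@dataclass
class Tag:
    name_id: int                 # number of the tag's name in the mapping table
    kind: str                    # "E" Empty, "C" Container, "T" Text, "B" Binary (codes assumed)
    attrs: list = field(default_factory=list)   # (name number, unsigned integer value) pairs
    content: object = None       # list of Tag, str, bytes or None, depending on kind


def serialise(tag: Tag) -> bytes:
    out = bytearray(u32(tag.name_id))
    out += u32(len(tag.attrs))                   # counted list of attributes
    for name_id, value in tag.attrs:
        # name number, one-byte type code, then a 64-bit unsigned value
        out += u32(name_id) + b"U" + struct.pack("<Q", value)
    out += tag.kind.encode("ascii")              # one-byte tag type code
    if tag.kind == "T":
        out += u32(len(tag.content))             # count of characters
        out += tag.content.encode("utf-32-le")   # 32 bits per character, no BOM
    elif tag.kind == "B":
        out += u32(len(tag.content)) + tag.content
    elif tag.kind == "C":
        out += u32(len(tag.content))             # count of child tags
        for child in tag.content:
            out += serialise(child)
    return bytes(out)                            # Empty tags carry no content


doc = Tag(1, "C", attrs=[(1, 320)], content=[Tag(2, "T", content="Hi")])
encoded = serialise(doc)
```

Every field is either fixed width or preceded by its count, so a receiver can walk the stream without lookahead, which is part of what keeps a trustworthy deserialiser small.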
The example discussed previously would be serialised as follows:

    (1)      number of attribute names
    Width    attribute name 1, in ASCII
    (2)      number of tag names
    DOC      tag name 1, in ASCII
    PARA     tag name 2, in ASCII
    (1)      DOC tag
    (1)      attribute count
    (1)      Width attribute
    U        unsigned type indicator
    320      attribute value (unsigned integer)
    C        Container tag indicator
    (2)      count of child tags
    (2)      PARA tag
    (0)      attribute count (no attributes)
    T        Text tag indicator
    (36)Title:(tab)An Example Document About XDS
    (2)      PARA tag
    (0)      attribute count (no attributes)
    T        Text tag indicator
    (19)Author:(tab)Deep-Secure

Canonical Representation

Neither the binary nor the textual representation of XDS is suitable for generating hashes that uniquely identify structures, because both are capable of representing the same document in different ways. Consequently a canonical form of the binary serialisation is defined, with additional constraints that make it possible to represent an XDS structure in only one way.

In the canonical form the numeric identifiers are allocated to tag and attribute names in alphabetical order, and a tag's set of attributes is ordered into a sequence by name. With these additional constraints there is only one way of representing an XDS structure as a sequence of bytes, and hence hashes can be generated to uniquely identify a particular XDS structure.

XDSPath

XDSPath is a language for calculating values based on an XDS structure. An XDSPath expression defines how a sequence of tags or a sequence of scalar values is to be derived from an XDS tag and some context.

Superficially, XDSPath is very much like XML's XPath, but it dispenses with the notion of axes, arranges results as sequences rather than sets and is a strongly typed expression language.

The simplest path expression is the name of a tag. Given a Container tag, this produces the sub-sequence of the tag's child tags that have the given name. For example, the expression PARA applied to the example document above will return a sequence of two PARA tags.
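As a rough illustration of the simplest path expression, the Python sketch below models the four kinds of tag and selects children by name. The `Tag` class and `select` function are hypothetical, and raising a type error when the expression is applied to a tag that is not a Container is an assumption about how a strongly typed evaluator might behave, not something the note specifies.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Tag:
    name: str
    attrs: dict = field(default_factory=dict)
    # None = Empty tag, list = Container, str = Text, bytes = Binary.
    # An empty list, empty string and empty bytes all remain distinct from None.
    content: Union[None, list, str, bytes] = None


def select(tag: Tag, name: str) -> list:
    """Evaluate a tag-name path expression against a Container tag."""
    if not isinstance(tag.content, list):
        # Assumption: applying a name step to a non-Container is a type error.
        raise TypeError(f"{tag.name} is not a Container tag")
    # The result is a sequence: document order is kept and duplicates are not merged.
    return [child for child in tag.content if child.name == name]


doc = Tag("DOC", {"Width": 320}, [
    Tag("PARA", content="Title:\tAn Example Document About XDS"),
    Tag("PARA", content="Author:\tDeep-Secure"),
])
paras = select(doc, "PARA")
```

Here `paras` holds the two PARA tags in document order, matching the result described for the example document.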
Arithmetic expressions produce a scalar value given a tag. The expression can calculate a result from the values of the given tag's attributes using the usual arithmetic and string operators. For example, an expression naming the Width attribute returns the unsigned integer value of the given tag's Width attribute. If the attribute has a different type the expression is invalid, but if the tag does not have an attribute with this name the result is the special Null value. If the example expression is applied to the example document above, it returns a sequence of one unsigned integer whose value is 320.

Two path expressions can be combined, using the / operator, so that the second is evaluated with each of the first's results in turn. The resulting sequence of sequences is concatenated to produce a single sequence as the overall result. For example, the expression PARA/text() applied to the example structure given above returns a sequence of two Unicode strings: "Title:(tab)An Example Document About XDS" and "Author:(tab)Deep-Secure".

A path expression can also be used to filter the results of another, using the syntax path[filter-path]. The filter is evaluated with each of the tags produced by the first path in turn to produce a Boolean value. The overall result is the sub-sequence of the tags produced by the first path for which the filter path evaluated to True. For example, the expression PARA[length(text())>20]/text() applied to the example structure given above returns a sequence of one string, "Title:(tab)An Example Document About XDS".

The XDSPath expression language also supports parameters and external functions, and has many advanced features similar to those in XPath 2, but the language design means it has clean, simple semantics and can be implemented simply and efficiently.

XDS-Schema

XDS-Schema is a written language, based on regular expression syntax, for defining a set of conforming XDS structures. It serves the same purpose as XML Schema does for XML, but is more compact and readable.

An XDS-Schema is a set of grammar rules that describe all XDS structures that conform to the specification. Each rule has a discrimination part that describes the name and attributes of a conforming tag.
For content tags there are additional grammar rules that describe the structure of the tag's content.

It is also possible to attach arbitrary XDSPath constraints to the discrimination part of a grammar rule. The path condition is evaluated against the tag and determines whether the rule applies.

If a schema contains a choice rule, the discrimination part of each choice is evaluated to determine which choice to take. If no choices match, the structure does not conform to the schema. If more than one choice matches, the input data structure is considered ambiguous and non-conformant to the schema. This means the schema validator ignores content when considering which choice applies, but path conditions can be used to guide the validation explicitly if this is required.
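The choice rule described above can be illustrated with a small Python sketch. All names here (`Choice`, `pick_choice`, `SchemaError`) are hypothetical, the discrimination part is reduced to a tag name for brevity (attribute declarations are omitted), and an ordinary Python predicate over the tag's attributes stands in for an XDSPath constraint.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class SchemaError(Exception):
    """Raised when a tag does not conform to the schema."""


@dataclass
class Choice:
    label: str
    tag_name: str
    # Hypothetical stand-in for an attached XDSPath constraint.
    condition: Optional[Callable[[dict], bool]] = None

    def discriminates(self, name: str, attrs: dict) -> bool:
        # Only the tag's name and attributes are examined, never its content.
        if name != self.tag_name:
            return False
        return self.condition is None or self.condition(attrs)


def pick_choice(choices: list, name: str, attrs: dict) -> Choice:
    matches = [c for c in choices if c.discriminates(name, attrs)]
    if not matches:
        raise SchemaError(f"no choice matches tag {name}")
    if len(matches) > 1:
        labels = [c.label for c in matches]
        raise SchemaError(f"ambiguous: {labels} all match tag {name}")
    return matches[0]


narrow = Choice("narrow", "DOC", lambda attrs: attrs.get("Width", 0) < 500)
wide = Choice("wide", "DOC", lambda attrs: attrs.get("Width", 0) >= 500)
chosen = pick_choice([narrow, wide], "DOC", {"Width": 320})
```

Without the two conditions both choices would match any DOC tag and the structure would be rejected as ambiguous; the conditions are what guide the validator to exactly one choice.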
The following schema is given as an example. The sample data structure shown above conforms to this schema.

    # Example schema
    main = TAG DOC : container, ATTR Width : uint / para*;
    para = TAG PARA : text;

XDS-Transform

Transformations to be applied to XDS data structures can be defined using the XDS-Transform language. A transform is itself an XDS structure that declaratively describes the transformation of one XDS document into another.

An XDS-Transform consists of a list of templates that are selected by the input document's tags as they are encountered. Each template describes an XDS fragment that is created in the destination document and directs the transformation of subsequently selected input tags.

The XDS fragment description composes the output from new text, tag, attribute and binary objects and by copying sections of the input document. XDSPath is used throughout XDS-Transform for selecting and filtering the input document into the output document.

The following template example changes the name of the PARA tags of the earlier example whilst keeping the textual content the same:

    <TEMPLATE match="PARA">
      <TAG name="MYPARA">
        <COPYOF select="./text()" />
      </TAG>
    </TEMPLATE>

This would output something like:

    <DOC Width=320>
      <MYPARA>Title:&9;An Example Document
              &20;About XDS</MYPARA>
      <MYPARA>Author:&9;Deep-Secure</MYPARA>
    </DOC>

Note that the select and match attributes are XDSPath based.

In many places where literal values are used, such as the name attribute in the template example above, the literal can be replaced with a quick selector by using the '?' character as the first letter. Quick selectors generate literal values as the result of an XDSPath expression evaluated against the input document. The XDSPath expression is placed after the '?' character.

Some limited flow control is also supplied by the IF, FOREACH and CHOOSE constructs. The operands for these also use XDSPath expressions evaluated over the input.
To attain high assurance in a verifier it must be kept simple, so it is unlikely that XDS-Transform will be used in such a verifier. However, the sub-systems that surround the verifier may well need to apply transformations to XDS data. For example, a parser may extract all possible information from an input document, but in a particular deployment only a subset of the data may be required or permitted to pass through the guard. The parser could be made configurable as to what data to include, but this complicates its implementation and will never be fully general in the options it offers. The alternative is to apply a transformation after parsing to trim the data back to what is needed, and this is one role of XDS-Transform.

Summary

The extensible Data Structure (XDS) has been devised as a means of representing the information found in all kinds of documents in a way that allows simple software to verify its structure. XDS is similar to XML but there are significant differences, in particular the use of strong typing.

Two languages accompany the data structure definition: XDSPath, for searching XDS structures and calculating values based on the data, and XDS-Transform, for defining transformations from one structure to another.
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Fall 2005 Handout 6 Decaf Language Wednesday, September 7 The project for the course is to write a
More informationMIB BROADCAST STREAM SPECIFICATION
MIB BROADCAST STREAM SPECIFICATION November 5, 2002, Version 1.0 This document contains a specification for the MIB broadcast stream. It will be specified in a language independent manner. It is intended
More informationUnit 3. Constants and Expressions
1 Unit 3 Constants and Expressions 2 Review C Integer Data Types Integer Types (signed by default unsigned with optional leading keyword) C Type Bytes Bits Signed Range Unsigned Range [unsigned] char 1
More informationChapter 2: Introduction to C++
Chapter 2: Introduction to C++ Copyright 2010 Pearson Education, Inc. Copyright Publishing as 2010 Pearson Pearson Addison-Wesley Education, Inc. Publishing as Pearson Addison-Wesley 2.1 Parts of a C++
More informationIntermediate Code Generation
Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target
More information.. Cal Poly CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar..
.. Cal Poly CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar.. XML in a Nutshell XML, extended Markup Language is a collection of rules for universal markup of data. Brief History
More informationChapter 2: Special Characters. Parts of a C++ Program. Introduction to C++ Displays output on the computer screen
Chapter 2: Introduction to C++ 2.1 Parts of a C++ Program Copyright 2009 Pearson Education, Inc. Copyright 2009 Publishing Pearson as Pearson Education, Addison-Wesley Inc. Publishing as Pearson Addison-Wesley
More informationMaciej Sobieraj. Lecture 1
Maciej Sobieraj Lecture 1 Outline 1. Introduction to computer programming 2. Advanced flow control and data aggregates Your first program First we need to define our expectations for the program. They
More informationRDGL Reference Manual
RDGL Reference Manual COMS W4115 Programming Languages and Translators Professor Stephen A. Edwards Summer 2007(CVN) Navid Azimi (na2258) nazimi@microsoft.com Contents Introduction... 3 Purpose... 3 Goals...
More informationCS52 - Assignment 8. Due Friday 4/15 at 5:00pm.
CS52 - Assignment 8 Due Friday 4/15 at 5:00pm https://xkcd.com/859/ This assignment is about scanning, parsing, and evaluating. It is a sneak peak into how programming languages are designed, compiled,
More informationNumber Systems Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur Number Representation
Number Systems Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur 1 Number Representation 2 1 Topics to be Discussed How are numeric data items actually
More informationB oth element and attribute declarations can use simple types
Simple types 154 Chapter 9 B oth element and attribute declarations can use simple types to describe the data content of the components. This chapter introduces simple types, and explains how to define
More informationFeatures of C. Portable Procedural / Modular Structured Language Statically typed Middle level language
1 History C is a general-purpose, high-level language that was originally developed by Dennis M. Ritchie to develop the UNIX operating system at Bell Labs. C was originally first implemented on the DEC
More informationChapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.
Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of
More informationINTERNATIONAL TELECOMMUNICATION UNION
INTERNATIONAL TELECOMMUNICATION UNION ITU-T X.691 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (07/2002) SERIES X: DATA NETWORKS AND OPEN SYSTEM COMMUNICATIONS OSI networking and system aspects Abstract
More information1. Describe History of C++? 2. What is Dev. C++? 3. Why Use Dev. C++ instead of C++ DOS IDE?
1. Describe History of C++? The C++ programming language has a history going back to 1979, when Bjarne Stroustrup was doing work for his Ph.D. thesis. One of the languages Stroustrup had the opportunity
More informationLexical Considerations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout Decaf Language Tuesday, Feb 2 The project for the course is to write a compiler
More informationprintf( Please enter another number: ); scanf( %d, &num2);
CIT 593 Intro to Computer Systems Lecture #13 (11/1/12) Now that we've looked at how an assembly language program runs on a computer, we're ready to move up a level and start working with more powerful
More informationCHAPTER 3 LITERATURE REVIEW
20 CHAPTER 3 LITERATURE REVIEW This chapter presents query processing with XML documents, indexing techniques and current algorithms for generating labels. Here, each labeling algorithm and its limitations
More information\n is used in a string to indicate the newline character. An expression produces data. The simplest expression
Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of
More informationCMPT 125: Lecture 3 Data and Expressions
CMPT 125: Lecture 3 Data and Expressions Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 3, 2009 1 Character Strings A character string is an object in Java,
More informationCompiler Design. Subject Code: 6CS63/06IS662. Part A UNIT 1. Chapter Introduction. 1.1 Language Processors
Compiler Design Subject Code: 6CS63/06IS662 Part A UNIT 1 Chapter 1 1. Introduction 1.1 Language Processors A compiler is a program that can read a program in one language (source language) and translate
More informationObjectives. In this chapter, you will:
Objectives In this chapter, you will: Become familiar with functions, special symbols, and identifiers in C++ Explore simple data types Discover how a program evaluates arithmetic expressions Learn about
More informationLECTURE 02 INTRODUCTION TO C++
PowerPoint Slides adapted from *Starting Out with C++: From Control Structures through Objects, 7/E* by *Tony Gaddis* Copyright 2012 Pearson Education Inc. COMPUTER PROGRAMMING LECTURE 02 INTRODUCTION
More informationXML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11
!important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... 7:4 @import Directive... 9:11 A Absolute Units of Length... 9:14 Addressing the First Line... 9:6 Assigning Meaning to XML Tags...
More informationThe SPL Programming Language Reference Manual
The SPL Programming Language Reference Manual Leonidas Fegaras University of Texas at Arlington Arlington, TX 76019 fegaras@cse.uta.edu February 27, 2018 1 Introduction The SPL language is a Small Programming
More informationXPath Expression Syntax
XPath Expression Syntax SAXON home page Contents Introduction Constants Variable References Parentheses and operator precedence String Expressions Boolean Expressions Numeric Expressions NodeSet expressions
More informationStarting with a great calculator... Variables. Comments. Topic 5: Introduction to Programming in Matlab CSSE, UWA
Starting with a great calculator... Topic 5: Introduction to Programming in Matlab CSSE, UWA! MATLAB is a high level language that allows you to perform calculations on numbers, or arrays of numbers, in
More informationInput And Output of C++
Input And Output of C++ Input And Output of C++ Seperating Lines of Output New lines in output Recall: "\n" "newline" A second method: object endl Examples: cout
More informationChapter 3. Describing Syntax and Semantics
Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:
More informationPART I. Part II Answer to all the questions 1. What is meant by a token? Name the token available in C++.
Unit - III CHAPTER - 9 INTRODUCTION TO C++ Choose the correct answer. PART I 1. Who developed C++? (a) Charles Babbage (b) Bjarne Stroustrup (c) Bill Gates (d) Sundar Pichai 2. What was the original name
More informationThe Specification Xml Failed To Validate Against The Schema Whitespace
The Specification Xml Failed To Validate Against The Schema Whitespace go-xsd - A package that loads XML Schema Definition (XSD) files. Its *makepkg* tool generates a Go package with struct type-defs to
More informationThe PCAT Programming Language Reference Manual
The PCAT Programming Language Reference Manual Andrew Tolmach and Jingke Li Dept. of Computer Science Portland State University September 27, 1995 (revised October 15, 2002) 1 Introduction The PCAT language
More informationXML: Parsing and Writing
XML: Parsing and Writing Version 5.1 Paul Graunke and Jay McCarthy February 14, 2011 (require xml) The xml library provides functions for parsing and generating XML. XML can be represented as an instance
More informationVisual C# Instructor s Manual Table of Contents
Visual C# 2005 2-1 Chapter 2 Using Data At a Glance Instructor s Manual Table of Contents Overview Objectives s Quick Quizzes Class Discussion Topics Additional Projects Additional Resources Key Terms
More informationCPS122 Lecture: From Python to Java last revised January 4, Objectives:
Objectives: CPS122 Lecture: From Python to Java last revised January 4, 2017 1. To introduce the notion of a compiled language 2. To introduce the notions of data type and a statically typed language 3.
More informationM359 Block5 - Lecture12 Eng/ Waleed Omar
Documents and markup languages The term XML stands for extensible Markup Language. Used to label the different parts of documents. Labeling helps in: Displaying the documents in a formatted way Querying
More informationASN2XML. ASN.1 to XML Translator. Version 2.1. Reference Manual. Objective Systems July 2010
ASN2XML ASN.1 to XML Translator Version 2.1 Reference Manual Objective Systems July 2010 The software described in this document is furnished under a license agreement and may be used only in accordance
More informationJava EE 7: Back-end Server Application Development 4-2
Java EE 7: Back-end Server Application Development 4-2 XML describes data objects called XML documents that: Are composed of markup language for structuring the document data Support custom tags for data
More informationPart VII. Querying XML The XQuery Data Model. Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2005/06 153
Part VII Querying XML The XQuery Data Model Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2005/06 153 Outline of this part 1 Querying XML Documents Overview 2 The XQuery Data Model The XQuery
More informationXML Information Set. Working Draft of May 17, 1999
XML Information Set Working Draft of May 17, 1999 This version: http://www.w3.org/tr/1999/wd-xml-infoset-19990517 Latest version: http://www.w3.org/tr/xml-infoset Editors: John Cowan David Megginson Copyright
More informationUnderstanding the Business Rules Method Palette. Sun Microsystems, Inc Network Circle Santa Clara, CA U.S.A.
Understanding the Business Rules Method Palette Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. Part No: 820 3779 02/05/2008 Copyright 2008 Sun Microsystems, Inc. 4150 Network Circle,
More informationObject oriented programming. Instructor: Masoud Asghari Web page: Ch: 3
Object oriented programming Instructor: Masoud Asghari Web page: http://www.masses.ir/lectures/oops2017sut Ch: 3 1 In this slide We follow: https://docs.oracle.com/javase/tutorial/index.html Trail: Learning
More informationXML: Parsing and Writing
XML: Parsing and Writing Version 7.2.0.2 Paul Graunke and Jay McCarthy January 17, 2019 (require xml) package: base The xml library provides functions for parsing and generating XML. XML can be represented
More informationBits, Words, and Integers
Computer Science 52 Bits, Words, and Integers Spring Semester, 2017 In this document, we look at how bits are organized into meaningful data. In particular, we will see the details of how integers are
More informationExcerpt from: Stephen H. Unger, The Essence of Logic Circuits, Second Ed., Wiley, 1997
Excerpt from: Stephen H. Unger, The Essence of Logic Circuits, Second Ed., Wiley, 1997 APPENDIX A.1 Number systems and codes Since ten-fingered humans are addicted to the decimal system, and since computers
More informationA Short Summary of Javali
A Short Summary of Javali October 15, 2015 1 Introduction Javali is a simple language based on ideas found in languages like C++ or Java. Its purpose is to serve as the source language for a simple compiler
More informationSingle-pass Static Semantic Check for Efficient Translation in YAPL
Single-pass Static Semantic Check for Efficient Translation in YAPL Zafiris Karaiskos, Panajotis Katsaros and Constantine Lazos Department of Informatics, Aristotle University Thessaloniki, 54124, Greece
More informationIntroduction to C# Applications
1 2 3 Introduction to C# Applications OBJECTIVES To write simple C# applications To write statements that input and output data to the screen. To declare and use data of various types. To write decision-making
More informationECMA-404. The JSON Data Interchange Syntax. 2 nd Edition / December Reference number ECMA-123:2009
ECMA-404 2 nd Edition / December 2017 The JSON Data Interchange Syntax Reference number ECMA-123:2009 Ecma International 2009 COPYRIGHT PROTECTED DOCUMENT Ecma International 2017 Contents Page 1 Scope...
More informationCPS122 Lecture: From Python to Java
Objectives: CPS122 Lecture: From Python to Java last revised January 7, 2013 1. To introduce the notion of a compiled language 2. To introduce the notions of data type and a statically typed language 3.
More informationLearning Language. Reference Manual. George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104)
Learning Language Reference Manual 1 George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104) A. Introduction Learning Language is a programming language designed
More informationJSON-LD 1.0 Processing Algorithms and API
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group;
More information[MS-PICSL]: Internet Explorer PICS Label Distribution and Syntax Standards Support Document
[MS-PICSL]: Internet Explorer PICS Label Distribution and Syntax Standards Support Document Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft
More informationDecaf Language Reference Manual
Decaf Language Reference Manual C. R. Ramakrishnan Department of Computer Science SUNY at Stony Brook Stony Brook, NY 11794-4400 cram@cs.stonybrook.edu February 12, 2012 Decaf is a small object oriented
More informationCS113: Lecture 3. Topics: Variables. Data types. Arithmetic and Bitwise Operators. Order of Evaluation
CS113: Lecture 3 Topics: Variables Data types Arithmetic and Bitwise Operators Order of Evaluation 1 Variables Names of variables: Composed of letters, digits, and the underscore ( ) character. (NO spaces;
More informationCSc 10200! Introduction to Computing. Lecture 2-3 Edgardo Molina Fall 2013 City College of New York
CSc 10200! Introduction to Computing Lecture 2-3 Edgardo Molina Fall 2013 City College of New York 1 C++ for Engineers and Scientists Third Edition Chapter 2 Problem Solving Using C++ 2 Objectives In this
More informationSprite an animation manipulation language Language Reference Manual
Sprite an animation manipulation language Language Reference Manual Team Leader Dave Smith Team Members Dan Benamy John Morales Monica Ranadive Table of Contents A. Introduction...3 B. Lexical Conventions...3
More informationTML Language Reference Manual
TML Language Reference Manual Jiabin Hu (jh3240) Akash Sharma (as4122) Shuai Sun (ss4088) Yan Zou (yz2437) Columbia University October 31, 2011 1 Contents 1 Introduction 4 2 Lexical Conventions 4 2.1 Character
More informationChapter 2. Data Representation in Computer Systems
Chapter 2 Data Representation in Computer Systems Chapter 2 Objectives Understand the fundamentals of numerical data representation and manipulation in digital computers. Master the skill of converting
More informationGetting started with Java
Getting started with Java Magic Lines public class MagicLines { public static void main(string[] args) { } } Comments Comments are lines in your code that get ignored during execution. Good for leaving
More informationENGINEERING COMMITTEE Digital Video Subcommittee
ENGINEERING COMMITTEE Digital Video Subcommittee SCTE 164 2010 Emergency Alert Metadata Descriptor NOTICE The Society of Cable Telecommunications Engineers (SCTE) Standards are intended to serve the public
More information