Development of Character Object Technology with Character Databases
MORIOKA Tomohiko, Christian Wittern

ABSTRACT. The CHISE (CHaracter Information Service Environment) project develops a character processing system based on the proposed character object model. This model is based on character property databases instead of coded character sets. Currently the system consists of two subsystems: XEmacs UTF-2000 and a prototype of a Topic Maps engine using Zope. XEmacs UTF-2000 is an extensible editor into which a character database has been embedded. Within XEmacs UTF-2000, each character is created as an object defined by a set of character attributes. To achieve higher expressive power, a topic map of characters based on the ISO Topic Maps standard (ISO/IEC 13250) is under development. To maintain this topic map, a prototype topic map engine has been developed on the Zope object database server. In addition, a database of glyph expressions using IDS sequences for the more than Chinese characters contained in ISO/IEC :2000 has been developed.

CHISE (CHaracter Information Service Environment); SGML/XML; UTF-2000; Topic Maps [3]; XEmacs [9]; XEmacs UTF-2000; WWW; Zope [10]
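As a rough illustration of the character object model described in the abstract, the following Python sketch defines each character by a set of attributes and finds characters by attribute value rather than by a position in one coded character set. It is not the XEmacs UTF-2000 implementation: `CharacterDatabase`, its `define_char` and `find` methods, and the attribute names `ucs` and `sample-group` are hypothetical names chosen for this example.

```python
class CharacterObject:
    """A character defined by its attributes, not by a single code point."""

    def __init__(self, attributes):
        self.attributes = dict(attributes)

    def get(self, name, default=None):
        """Return one attribute value, or `default` if it is not defined."""
        return self.attributes.get(name, default)


class CharacterDatabase:
    """Hypothetical in-memory store of character objects."""

    def __init__(self):
        self._characters = []

    def define_char(self, attributes):
        """Create a character object from a mapping of attribute names to values."""
        char = CharacterObject(attributes)
        self._characters.append(char)
        return char

    def find(self, criteria):
        """Return every character whose attributes match all given criteria."""
        return [
            char for char in self._characters
            if all(char.get(name) == value for name, value in criteria.items())
        ]


db = CharacterDatabase()
# Attribute names and values below are illustrative assumptions.
a = db.define_char({"ucs": 0x8AAA, "sample-group": "g1"})
b = db.define_char({"ucs": 0x8AAC, "sample-group": "g1"})
# A character without any code point is still a valid object in this model.
c = db.define_char({"sample-group": "g2"})

print(len(db.find({"sample-group": "g1"})))
```

Because a character is only a bundle of attributes, an object such as `c` can exist without any coded-character-set position, which a code-point-centred representation cannot express directly.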
Figure 1: XEmacs UTF-2000

XEmacs UTF-2000; UTF-2000; XEmacs [9]; GNU Emacs [7]; Emacs Lisp [6]; GNU Emacs/XEmacs; WWW; 30 bit; id; define-char; /a/
coded-charset; char-table; lazy-loading; UTF-2000; XEmacs UTF-2000; XEmacs-Mule; define-char; Emacs Lisp; CHISE

i386 Linux; XEmacs-Mule 10 MB; 40 MB; Unicode Database; CNS; CDP (Chinese Document Processing); 27 MB; CBETA (Chinese Buddhist Electronic Text Association); CHINA3

Unicode [8]; The Unicode Standard [8]; morohashi-daikanwa; =>ucs; lazy-loading; database; Berkeley DB Version 3; Debian GNU/Linux (sid); TM 5800; 1970

XEmacs UTF-2000 dump; ISO/IEC : [4]; MB (strip 22 MB); lazy-loading 15 MB (strip 10 MB); XEmacs-Mule; 10 MB (strip 6 MB); 5 MB

IDS (Ideographic Description Sequence); IDC (Ideographic Description Characters); ideographic-structure; Emacs Lisp code; coded-charset; Topic Maps; char-table; char-id-table; CDP; coded-charset (CBETA)
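The fragments above name lazy-loading together with an external database (Berkeley DB). The general idea can be sketched as follows: attribute values stay on disk and are fetched only when a character is first queried. This is a minimal sketch under stated assumptions, not the XEmacs UTF-2000 mechanism. Python's standard `dbm` module stands in for Berkeley DB, and `LazyAttributeTable`, `demo`, the file name `sample-attribute`, and the stored value are all hypothetical.

```python
import dbm
import os
import tempfile


class LazyAttributeTable:
    """Hypothetical attribute table that reads values from disk on demand."""

    def __init__(self, path):
        self._path = path
        self._cache = {}      # code point -> value already fetched
        self.disk_reads = 0   # how many times the on-disk store was consulted

    def get(self, code_point):
        """Return the attribute value, loading it from disk on first access."""
        if code_point not in self._cache:
            with dbm.open(self._path, "r") as store:
                raw = store.get(str(code_point).encode("ascii"))
            self.disk_reads += 1
            self._cache[code_point] = None if raw is None else raw.decode("utf-8")
        return self._cache[code_point]


def demo():
    """Build a tiny on-disk table, then query the same character twice."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample-attribute")
        with dbm.open(path, "c") as store:
            # The stored value is a placeholder, not real character data.
            store[str(0x8AAA).encode("ascii")] = b"example value"
        table = LazyAttributeTable(path)
        first = table.get(0x8AAA)
        second = table.get(0x8AAA)   # served from the in-memory cache
        return first, second, table.disk_reads


print(demo())
```

The trade-off illustrated here is that start-up memory stays small because nothing is preloaded, at the cost of one disk access the first time each character's attribute is needed.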
ASCII; CDP; A B; A B + C; IDS; CHISE; XEmacs UTF-2000; ideographic-structure

Figure 2: Ideographic Description Characters

XML; Topic Maps; CDP; CBETA; GPL; GT; IDS; Unicode

a. CDP: ISO/IEC; CDP (Chinese Document Processing) [5]; A, B; Big5; 14; quail; GNU Emacs/XEmacs; (1) (2) (3); IDS; Zope Topic Maps; IDC; IDS (script)

b. CBETA: UTF; Big5; Christian Wittern; 1994; XML
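An IDS (Ideographic Description Sequence) is written in prefix notation: an Ideographic Description Character (IDC) is followed by its component operands, each of which may itself be an IDS. The Python sketch below parses such a sequence into a nested tuple. It assumes the twelve IDCs at U+2FF0 to U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the rest take two. The function names `parse_ids` and `_parse` are hypothetical, and the placeholder components `A`, `B`, `C` follow the notation used in the fragment above rather than real characters.

```python
# IDCs that combine three components (left-middle-right, top-middle-bottom).
TERNARY_IDCS = {chr(0x2FF2), chr(0x2FF3)}
# All remaining IDCs in U+2FF0..U+2FFB combine two components.
BINARY_IDCS = {chr(cp) for cp in range(0x2FF0, 0x2FFC)} - TERNARY_IDCS


def _parse(ids, pos):
    """Parse one IDS node starting at `pos`; return (node, next position)."""
    if pos >= len(ids):
        raise ValueError("IDS ended before all operands were read")
    ch = ids[pos]
    if ch in BINARY_IDCS:
        arity = 2
    elif ch in TERNARY_IDCS:
        arity = 3
    else:
        return ch, pos + 1          # a plain component character
    pos += 1
    operands = []
    for _ in range(arity):
        node, pos = _parse(ids, pos)
        operands.append(node)
    return (ch, *operands), pos


def parse_ids(ids):
    """Parse a complete IDS string into a nested tuple (operator, operands...)."""
    tree, end = _parse(ids, 0)
    if end != len(ids):
        raise ValueError("trailing characters after a complete IDS")
    return tree


LEFT_RIGHT = chr(0x2FF0)
ABOVE_BELOW = chr(0x2FF1)
# "A above (B beside C)", written in prefix order.
tree = parse_ids(ABOVE_BELOW + "A" + LEFT_RIGHT + "BC")
print(tree)
```

A nested structure of this kind is convenient for storing as a character attribute (the text names one called `ideographic-structure`), since components can then be searched recursively.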
Topic Maps; Character Topic Maps (CTM); define-char; Lisp

Figure 3: <occurrence>

Figure 4: U+8AAA define-char

Topic Maps: topic; SGML/XML; occurrence; ideographic-structure; <association>
1) topic (occurrence)
2) associations

Topic Maps 2000 [3]; SGML [1]; HyTime [2]; Architectural Forms; DTD (Document Type Definition); XML; XTM (XML Topic Maps); ISO (amendment)

Zope Topic Maps: topic map; Zope (Zope Object Publishing Environment); Zope Corporation (Digital Creations); Web; C; Python; HTTP; WebDAV; XML-RPC; XML; XEmacs
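The fragments above list the building blocks of a topic map: topics, the occurrences attached to them, and associations between topics. A minimal Python sketch of that data model follows. It is an illustration only, not the Zope-based engine: the `TopicMap` class, its methods, the association type `has-component`, and the topic `component-a` are hypothetical.

```python
class TopicMap:
    """Hypothetical minimal topic map: topics, occurrences, associations."""

    def __init__(self):
        self.topics = {}        # topic id -> {"occurrences": {...}}
        self.associations = []  # list of (association type, {role: topic id})

    def add_topic(self, topic_id, **occurrences):
        """Register a topic together with its occurrences (attached data)."""
        self.topics[topic_id] = {"occurrences": dict(occurrences)}

    def associate(self, assoc_type, **roles):
        """Relate existing topics; each keyword names the role a topic plays."""
        for topic_id in roles.values():
            if topic_id not in self.topics:
                raise KeyError(topic_id)
        self.associations.append((assoc_type, dict(roles)))

    def related(self, topic_id, assoc_type):
        """Return the other topics linked to `topic_id` by `assoc_type`."""
        found = set()
        for kind, roles in self.associations:
            if kind == assoc_type and topic_id in roles.values():
                found.update(t for t in roles.values() if t != topic_id)
        return sorted(found)


tm = TopicMap()
# Topic ids, occurrence values and the association type are illustrative.
tm.add_topic("U+8AAA", ucs="0x8AAA")
tm.add_topic("component-a", note="hypothetical component topic")
tm.associate("has-component", whole="U+8AAA", part="component-a")

print(tm.related("U+8AAA", "has-component"))
```

Compared with a flat attribute list, associations are first-class objects here, so a relation between two characters can be queried from either side.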
Figure 5: U+8AAA Character Topic Map

Figure 6: U+8AAA Character Topic Map

Figure 7: Zope Topic Maps

UTF-2000; Topic Map; Topic Maps; topic map; Topic Maps Query Language (TMQL); Topic Map Basename Constraint (TMBC); topic; XEmacs UTF-2000; CHISE; Zope Topic Map
Topic Maps; XEmacs UTF-2000; Zope Topic Map; PostgreSQL

Figure 8: Communication between XEmacs UTF-2000, Zope and a relational database server

[1] International Organization for Standardization (ISO). Information processing - Text and office systems - Standard Generalized Markup Language (SGML), ISO 8879:1986.
[2] International Organization for Standardization (ISO). Information processing - Text and office systems - Hypermedia/Time-based Structuring Language (HyTime), ISO 10744:1997.
[3] International Organization for Standardization (ISO). Information technology - SGML Applications - Topic Maps, January. ISO/IEC 13250:2000.
[4] International Organization for Standardization (ISO). Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane (BMP), March. ISO/IEC :2000.
[5] International Organization for Standardization (ISO). Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 2: Supplementary Planes, November. ISO/IEC :
[6] Bil Lewis, Dan LaLiberte, and Richard Stallman. GNU Emacs Lisp Reference Manual. Free Software Foundation, 2.5 edition, May. for Emacs Version
[7] Richard M. Stallman et al. GNU Emacs version ftp://ftp.gnu.org/gnu/emacs-21.2.tar.gz, March
[8] The Unicode Consortium. The Unicode Standard, Version 3.0, February
[9] XEmacs.
[10] Zope.