Infrastructure for High-Quality Arabic

Transcription

1 TUG 06 Marrakech Infrastructure for High-Quality Arabic Yannis Haralambous École Nationale Supérieure des Télécommunications de Bretagne Technopôle Brest Iroise, CS 83818, Brest Cedex TUG 06 Marrakech Infrastructure for High-Quality Arabic p. 1/39

2 About Characters and Glyphs: The Incessant Ballet

3 Characters and Glyphs: A A A A 0041 latin capital letter a

4 Characters and Glyphs: A A A A 0041 latin capital letter a; A 0391 greek capital letter alpha

5 Characters and Glyphs: 0643 arabic letter kaf, 062A arabic letter teh, 0627 arabic letter alef, 0628 arabic letter beh

6 Characters and Glyphs: 0939 devanagari letter ha, 093F devanagari vowel sign i, 0928 devanagari letter na, [virama], 0926 devanagari letter da, 0940 devanagari vowel sign ii; together these characters spell हिन्दी.

7 The TeX Way

8 From Input to DVI: Input: character a (code 97); token: a_11 (code 97, category 11); char_node: <font cmr10, glyph index 97>; opcode: set_char_97 (implicit glyph index 97); result: a.

9 From Input to DVI: Input: characters fi (codes 102, 105); tokens: f_11 i_11 (codes 102 and 105, category 11); lig_node: <font cmr10, glyph index 12, hlist: f i>; opcode: set_char_12 (implicit glyph index 12); result: fi.

10 From Input to DVI: Input: characters \discretionary{k-}{k}{ck}; tokens: \discretionary {_1 k_11 -_12 }_2 {_1 k_11 }_2 {_1 c_11 k_11 }_2; disc_node: <pre-break hlist: k-, post-break hlist: k, nodes to remove: 2>, char_nodes: <font cmr10, 99>, <font cmr10, 107>; opcodes: set_char_99, set_char_107 if no hyphenation; set_char_107, set_char_45, line break, set_char_107 otherwise; result: ck, or k- + line break + k.

11 TeX Reads Characters, Writes Glyphs: TeX reads characters; a character token is a character and a catcode; a char_node is a font ID and a glyph index; a DVI set_char opcode is a glyph index (in the current font). Example: in the DVI file, fi is glyph 12 of font cmr10 (no reference to characters f, i).
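The path from characters to glyph indices summarized on slides 8 to 11 can be sketched in a few lines of Python. This is only an illustration, not TeX's actual implementation: the names Token, CharNode, LIGATURES, tokenize, build_nodes and to_dvi are hypothetical, every input character is assumed to be a letter (category 11), and the ligature table holds only the f + i pair quoted on the slides.

```python
from dataclasses import dataclass

LETTER = 11  # TeX category code for letters


@dataclass(frozen=True)
class Token:
    char_code: int  # the character
    catcode: int    # its category code


@dataclass(frozen=True)
class CharNode:
    font: str         # font ID
    glyph_index: int  # position of the glyph in that font


# Hypothetical ligature table: (font, left glyph, right glyph) -> ligature glyph.
# Only the pair from the slides is listed: f (102) + i (105) -> fi (12) in cmr10.
LIGATURES = {("cmr10", 102, 105): 12}


def tokenize(text):
    """Characters in, character tokens out (all assumed to be letters)."""
    return [Token(ord(ch), LETTER) for ch in text]


def build_nodes(tokens, font="cmr10"):
    """Tokens in, char nodes out; from here on only glyph indices remain."""
    nodes = []
    for tok in tokens:
        glyph = tok.char_code  # in cmr10, ASCII letters sit at their own codes
        key = (font, nodes[-1].glyph_index, glyph) if nodes else None
        if key in LIGATURES:
            nodes[-1] = CharNode(font, LIGATURES[key])  # merge into a ligature
        else:
            nodes.append(CharNode(font, glyph))
    return nodes


def to_dvi(nodes):
    """Char nodes in, DVI opcodes out: a set_char opcode is just a glyph index."""
    return [f"set_char_{node.glyph_index}" for node in nodes]


print(to_dvi(build_nodes(tokenize("a"))))
print(to_dvi(build_nodes(tokenize("fi"))))
```

Once build_nodes has run, nothing in the output refers back to the characters f and i, which is the point made on slide 11.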

12 Arabic Typeset With Omega 1: Input: characters […]; tokens: […]; the context analysis OTP reads […], replaces it by […] and pushes INSIDE mode; reads […], replaces it by […]; reads […], replaces it by […], pops mode; reads […] followed by EOB, leaves it unchanged; tokens: […]; char_nodes and DVI opcodes as usual, no reference to initial characters. ([…] marks Arabic characters and glyphs lost in extraction.)

13 Our Vision: Through mighty textemes and wonders

14 Textemes: "I'm a character"

15 Textemes: "I'm a character" / "I'm your glyph"

16 Textemes: "Let us be a texteme!"

17 Textemes: "Let us be a texteme!" ... and they lived happily ever after and had many properties...

18 The Texteme Notion: A texteme is a set {c, p_1 = v_1, ..., p_n = v_n, g} where: c is a Unicode character (can be void); for 0 < i ≤ n, p_i is a property name and v_i a property value (n can be 0); g is a glyph reference (can be void); all texteme components can be inserted, modified and/or deleted by the user, by OTPs, and by modules; they can also be locked and unlocked by the user and by OTPs.
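As a rough illustration of this definition, a texteme can be modeled as a small record with an optional character, an optional glyph reference, a dictionary of properties, and a set of locked property names. The class name Texteme and the methods lock, unlock and set_prop are hypothetical choices made for this sketch; the slides do not prescribe any API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Texteme:
    char: Optional[str] = None   # c: Unicode character, None when void
    glyph: Optional[str] = None  # g: glyph reference, None when void
    props: dict = field(default_factory=dict)  # p_i = v_i (may be empty)
    locked: set = field(default_factory=set)   # names of locked properties

    def lock(self, name):
        """Protect a property against later modification."""
        self.locked.add(name)

    def unlock(self, name):
        """Make a property modifiable again."""
        self.locked.discard(name)

    def set_prop(self, name, value):
        """Set a property unless it is locked; report whether it was set."""
        if name in self.locked:
            return False
        self.props[name] = value
        return True


teh = Texteme(char="\u062A", props={"color": "red"})
teh.set_prop("form", 2)          # an OTP or module computes the form
teh.lock("form")                 # the user freezes it
print(teh.set_prop("form", 0))   # the locked property is left untouched
print(teh.props)
```

Locking is kept per property here, so that a user could freeze the form of a letter while still letting later stages change its color or glyph.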

19 Notation: [c=w kern=-0.28pt g=w] [c=a hyph=1 g=a] [c=t color=yellow g=t] [c=e g=e] [c=r g=r] to obtain: water; [c=f gdef=f g=fi] [c=i gdef=i g=(void)] [c=l g=l] [c=l g=l] [c=e g=e] to obtain: fille

20 Typesetting Arabic With Textemes: 1) [c=0643 g=] [c=062a color=red g=] [c=0627 g=] [c=0628 g=]; 2) [c=0643 form=1 g=] [c=062a color=red form=2 g=] [c=0627 form=3 g=] [c=0628 form=0 g=]; 3) [c=0643 form=1 g=] [c=062a color=red form=2 g=] [c=0627 form=3 g=] [c=0628 form=0 g=] [glyph references lost in extraction]; and the result: [image lost in extraction]

21 Typesetting Hindi With Textemes: 1) [c=0939 g=] [c=093f g=] [c=0928 virama=1 g=] [c=0926 g=] [c=0940 g=]; 2) [c= link=‹lost› g=‹glyph›] [c=0939 g=‹glyph›] [c=093f g=] [c=0928 virama=1 g=] [c=0926 g=‹glyph›] [c=0940 g=‹glyph›]; and the result: हिन्दी. (‹glyph› and ‹lost› mark Devanagari glyph samples lost in extraction.)

22 Dynamic Typesetting: TeX uses character, ligature, discretionary nodes; Gutenberg used ligatures and variant forms for optimizing justification: "Omnia per ipsum facta sunt: et sine ipso factum est nihil" [Gutenberg's abbreviated setting of the same text; glyphs lost in extraction]. Arabic can also take advantage of dynamic typesetting to minimize the use of black glue (keshideh).

23 Extended Graph: TeX builds a directed graph of all feasible line breaks of a given paragraph. Arcs of this graph carry a metric: the badness of lines. The best choice of line breaks is the shortest path for that metric. Ω₂ will use an extended TeX graph where there can be more than one arc between two nodes, depending on variant glyphs and activation of ligatures.
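A minimal sketch of such an extended graph, assuming breakpoints are numbered in paragraph order and each arc is a tuple (source, destination, badness, variant). The function name best_breaks and the variant labels are hypothetical, and the path cost is simply the sum of badness values, which is a simplification of TeX's demerits.

```python
def best_breaks(n_nodes, arcs):
    """Shortest path from breakpoint 0 to breakpoint n_nodes - 1.

    arcs: list of (src, dst, badness, variant) with src < dst. Several arcs
    may join the same pair of nodes, one per choice of variants or ligatures.
    """
    INF = float("inf")
    cost = [INF] * n_nodes
    back = [None] * n_nodes  # back[dst] = (src, variant) on the best path
    cost[0] = 0.0
    # Since src < dst, visiting arcs by increasing src is a topological order.
    for src, dst, badness, variant in sorted(arcs, key=lambda a: a[0]):
        if cost[src] + badness < cost[dst]:
            cost[dst] = cost[src] + badness
            back[dst] = (src, variant)
    path, node = [], n_nodes - 1
    while back[node] is not None:
        src, variant = back[node]
        path.append((src, node, variant))
        node = src
    return cost[-1], path[::-1]


arcs = [
    (0, 1, 50, "plain"),     # first line set with default glyphs
    (0, 1, 10, "ligature"),  # same line with a ligature activated
    (1, 2, 20, "plain"),
    (0, 2, 100, "plain"),    # everything squeezed onto one line
]
print(best_breaks(3, arcs))
```

The two parallel arcs from node 0 to node 1 are the extension: the ordinary TeX graph would carry only one of them.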

24 Dynamicity of Arabic Script: Arabic text justification can be obtained by (in order of priority): using blank spaces of variable width (as in other scripts); enabling or disabling ligatures; choosing between alternative forms of glyphs; inserting keshideh connections between letters (preferably curvilinear ones), also called black glue.

25 The Typesetting Process

26 Step 0: Arabic forms: most systems assume that forms are calculated contextually; ZWJ is a clumsy solution; the Arabic form can become a texteme property.

27 Step 0: Short vowels: very useful for NLP but not written; Ahmed Lakhdar Ghazal said: ...; there is software for vowelizing (Sakhr Diacritizer, etc.); short vowels can become texteme properties, and so can their visibility; the same problem arises for all-caps Greek.

28 Step 0: Semitic roots and dots: Semitic roots are very useful for indexation; there are also prefixes and postfixes; old manuscripts do not have dots on letters; this information can be inserted in texteme properties.

29 Step 0: Semantic keshideh: keshideh (black glue) can be semantic; variant forms can have semantic overload; this information can be inserted in texteme properties.

30 Step 0: Color: color can be very useful, but it must not interfere with contextual analysis; this information can be inserted in texteme properties; the mandatory ligature lam-alif is a problem: [example lost in extraction]

31 Step 1: Hyphenation: the Arabic language is not hyphenated; Arabic script can be hyphenated!

32 Step 2: Contextual analysis: calculate forms when the property is not locked; this can be done either at OTP or at module level.
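The following sketch computes the form property for the four textemes of slide 20 and skips any texteme whose form is locked. Several things here are assumptions rather than statements from the slides: the form codes (0 isolated, 1 initial, 2 medial, 3 final) are inferred from the values shown on slide 20, the JOINING table covers only the four letters of the example, and textemes are plain dictionaries with hypothetical keys c, props and locked.

```python
ISOLATED, INITIAL, MEDIAL, FINAL = 0, 1, 2, 3  # codes inferred from slide 20
DUAL, RIGHT = "dual", "right"

# Hypothetical joining table, limited to the letters of the example:
# kaf, teh and beh join on both sides; alef joins only to the preceding letter.
JOINING = {0x0643: DUAL, 0x062A: DUAL, 0x0627: RIGHT, 0x0628: DUAL}


def texteme(c, **props):
    """Build a texteme as a plain dict (keys are assumptions of this sketch)."""
    return {"c": c, "props": dict(props), "locked": set()}


def assign_forms(textemes):
    """Set the 'form' property of every texteme whose form is not locked."""
    for i, t in enumerate(textemes):
        if "form" in t["locked"]:
            continue  # respect a form chosen and locked earlier
        here = JOINING.get(t["c"])
        prev = JOINING.get(textemes[i - 1]["c"]) if i > 0 else None
        nxt = JOINING.get(textemes[i + 1]["c"]) if i + 1 < len(textemes) else None
        joins_prev = here in (DUAL, RIGHT) and prev == DUAL
        joins_next = here == DUAL and nxt in (DUAL, RIGHT)
        if joins_prev and joins_next:
            t["props"]["form"] = MEDIAL
        elif joins_next:
            t["props"]["form"] = INITIAL
        elif joins_prev:
            t["props"]["form"] = FINAL
        else:
            t["props"]["form"] = ISOLATED
    return textemes


word = [texteme(0x0643), texteme(0x062A, color="red"),
        texteme(0x0627), texteme(0x0628)]
assign_forms(word)
print([t["props"]["form"] for t in word])
```

The color property on the second texteme plays no part in the computation, which matches the requirement on slide 30 that color must not interfere with contextual analysis.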

33 Step 2: Hamza rules: the idea belongs to Klaus Lagally; in fact, the Unicode Arabic table is not grammatical enough; Arabic grammar mentions the letter hamza; hamza can be: arabic letter hamza, arabic letter alef with hamza, arabic letter yeh with hamza or arabic letter waw with hamza, depending on context (including short vowel information).

34 Step 3: Bidi: "My friend said […]!" or "My friend […] said!" (Arabic words lost in extraction); RLE (RIGHT-TO-LEFT EMBEDDING), LRE (LEFT-TO-RIGHT EMBEDDING): push a level; RLO (RIGHT-TO-LEFT OVERRIDE), LRO (LEFT-TO-RIGHT OVERRIDE): push a behaviour; PDF (POP DIRECTIONAL FORMAT): pop; RLM (RIGHT-TO-LEFT MARK), LRM (LEFT-TO-RIGHT MARK): cheat.
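The push and pop behaviour of these formatting characters can be sketched with a simple stack. This is a simplified illustration, not an implementation of the Unicode Bidirectional Algorithm: it handles only the explicit embedding, override and pop characters, ignores the maximum depth and all implicit rules, and the function name embedding_levels is hypothetical. The code points and the odd/even level rule are those of the Unicode standard.

```python
LRE, RLE, PDF, LRO, RLO = "\u202A", "\u202B", "\u202C", "\u202D", "\u202E"


def embedding_levels(text, base_level=0):
    """Return (character, level, override) for every non-formatting character.

    Even levels are left-to-right, odd levels right-to-left.
    """
    stack = []        # saved (level, override) pairs
    level = base_level
    override = None   # None, "L" or "R"
    out = []
    for ch in text:
        if ch in (RLE, RLO):
            stack.append((level, override))
            level += 1 if level % 2 == 0 else 2  # next odd level
            override = "R" if ch == RLO else None
        elif ch in (LRE, LRO):
            stack.append((level, override))
            level += 2 if level % 2 == 0 else 1  # next even level
            override = "L" if ch == LRO else None
        elif ch == PDF:
            if stack:                            # an unmatched PDF is ignored
                level, override = stack.pop()
        else:
            out.append((ch, level, override))
    return out


sample = "a" + RLE + "b" + LRE + "c" + PDF + PDF + "d"
print(embedding_levels(sample))
```

RLM and LRM are not handled here: they do not touch the stack, they only influence how neighbouring characters are resolved.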

35 Step 4: OpenType and Arabic: 1. supply the glyph corresponding to the pair (character, contextual form), the form being provided as an OpenType feature [GSUB table, lookup of type 1, "single substitution"]; 2. supply grammatical (lam-alif) and esthetic ligatures [GSUB table, lookup of type 4, "ligature"]; 3. supply alternative forms for glyphs [GSUB table, lookup of type 3, "variant selection"]; 4. kerning between single or ligatured glyphs [GPOS table, lookup of type 2, "positioning of a pair of glyphs"]; 5. place short vowels and other diacritics on isolated glyphs or on ligature components [GPOS table, lookups of types 4, "diacritical marks", and 5, "diacritical marks on ligatures"].

36 Step 4: OpenType vs. Super-OpenType: in OpenType, once the longest glyph string is found, the substitution or positioning is applied and we move on; special case: backtrack and lookahead; in Super-OpenType we store all cases.
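The contrast can be illustrated with ligature substitution. The slides do not say how Super-OpenType stores its cases, so the sketch below is only one possible reading: greedy_substitute applies the longest match and moves on, while all_substitutions enumerates every way of applying (or not applying) the ligatures. The Latin ligature table and both function names are hypothetical.

```python
# Hypothetical ligature table: glyph sequence -> ligature glyph.
LIGATURES = {("f", "i"): "fi", ("f", "f", "i"): "ffi", ("f", "f"): "ff"}


def greedy_substitute(glyphs, ligatures):
    """Longest match first: apply the substitution and move on."""
    out, i = [], 0
    max_len = max((len(key) for key in ligatures), default=1)
    while i < len(glyphs):
        for n in range(min(max_len, len(glyphs) - i), 1, -1):
            key = tuple(glyphs[i:i + n])
            if key in ligatures:
                out.append(ligatures[key])
                i += n
                break
        else:  # no ligature starts here: keep the single glyph
            out.append(glyphs[i])
            i += 1
    return out


def all_substitutions(glyphs, ligatures):
    """Every possible outcome, keeping each ligature both on and off."""
    if not glyphs:
        return [[]]
    results = [[glyphs[0]] + rest
               for rest in all_substitutions(glyphs[1:], ligatures)]
    for key, lig in ligatures.items():
        if tuple(glyphs[:len(key)]) == key:
            results += [[lig] + rest
                        for rest in all_substitutions(glyphs[len(key):], ligatures)]
    return results


print(greedy_substitute(list("ffi"), LIGATURES))
print(all_substitutions(list("ffi"), LIGATURES))
```

Keeping all outcomes is what makes the alternatives available later, when the line breaker has to choose between them.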

37 Step 5: Fine-Tuning: move apart touching glyphs: letters, short vowels, diacritics.

38 Step 6: Extended Graph: Knuth dixit (p. 143 of Digital Typography): "It is interesting to consider how to extend the total-fit algorithm so that it could handle cases like the dropping of m's and n's in Figure 22. The badness function of a line would then depend not only on its natural width, stretchability, and shrinkability; it would also depend on the number of m's and the number of n's on that line. A similar technique could be used to typeset biblical Hebrew, which is never hyphenated; Hebrew fonts intended for sacred texts usually include wide variants of several letters, so that individual characters on a line can be replaced by their wider counterparts in order to avoid wide spaces between words."

39 Step 6: Extended Graph: an extended TeX graph is a graph where every arc is itself a graph with respect to the activation of variants and ligatures; this can lead to combinatorial explosion: ten lines of 60 glyphs with three forms each = [figure lost in extraction] combinations; we need a strategy.
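The figure on the slide did not survive extraction, but the order of magnitude is easy to recompute. The sketch below assumes every glyph independently takes one of three forms; whether the slide counted combinations per line or for the whole ten-line paragraph is not known, so both are shown.

```python
glyphs_per_line, lines, forms = 60, 10, 3

# Independent choice of one form per glyph.
per_line = forms ** glyphs_per_line
whole_paragraph = forms ** (glyphs_per_line * lines)

print(f"one line: a {len(str(per_line))}-digit number of combinations")
print(f"ten lines: a {len(str(whole_paragraph))}-digit number of combinations")
```

Either way, exhaustive enumeration is out of reach, which motivates the strategies on the next slide.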

40 Step 6: Strategies: 1. classify glyphs by width (the number of classes is inversely proportional to the amount of white and black glue authorized); 2. give an order to classes; 3. calculate badnesses of random combinations until one is less than our threshold; 4. other ideas...
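Strategy 3 can be sketched as a random search over variant choices for a single line. Everything specific here is an assumption: the badness formula is a made-up stand-in (not TeX's), the widths are invented, and the names line_badness and random_fit are hypothetical. Strategies 1 and 2 (width classes and their ordering) are not modeled.

```python
import random


def line_badness(width, target):
    """Hypothetical badness: squared relative deviation from the target width."""
    return 100 * ((width - target) / target) ** 2


def random_fit(variant_widths, target, threshold, max_tries=10_000, seed=0):
    """Try random variant combinations until one is below the threshold.

    variant_widths: one list of candidate widths per glyph on the line.
    Returns (chosen variant indices, resulting width), or None on failure.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        choice = [rng.randrange(len(widths)) for widths in variant_widths]
        width = sum(widths[i] for widths, i in zip(variant_widths, choice))
        if line_badness(width, target) < threshold:
            return choice, width
    return None


# Ten glyphs, each with a narrow, normal and wide variant (invented widths).
line = [[5.0, 6.0, 7.0] for _ in range(10)]
print(random_fit(line, target=60.0, threshold=1.0))
```

The search stops at the first acceptable combination rather than the best one, trading optimality for a bounded amount of work.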

41 Step 7: Post-processing: 1. mainly for keshideh: change the glyphs surrounding it and draw a nice curve (see the paper by Elyaakoubi et al.); 2. sometimes keshideh may carry a glyph: [example lost in extraction]

42 Conclusion: Arabic is a beautiful script; let us hope that some day computer science will be worthy of it!


SOFTWARE ARCHITECTURE 4. TEXT FORMATTING SYSTEM 1 SOFTWARE ARCHITECTURE 4. TEXT FORMATTING SYSTEM Tatsuya Hagino hagino@sfc.keio.ac.jp slides URL https://vu5.sfc.keio.ac.jp/sa/login.php 2 Text Formatting System Text Formatting Print out document nicely

More information

Level instruments. Communications and Displays SITRANS RD200. 5/306 Siemens FI Technical specifications. Overview

Level instruments. Communications and Displays SITRANS RD200. 5/306 Siemens FI Technical specifications. Overview Overview The is a universal input, panel mount remote digital display for process instrumentation. Benefits Easy setup and programming via front panel buttons or remotely using RD software Display readable

More information

7 TYPOGRAPHIC DESIGN Lesson overview

7 TYPOGRAPHIC DESIGN Lesson overview 7 TYPOGRAPHIC DESIGN Lesson overview In this lesson, you ll learn how to do the following: Use guides to position text in a composition. Make a clipping mask from type. Merge type with other layers. Format

More information

Proposed Update Unicode Standard Annex #9

Proposed Update Unicode Standard Annex #9 1 of 52 Technical Reports Proposed Update Unicode Standard Annex #9 Version Unicode 6.3.0 (draft 12) Editors Mark Davis (markdavis@google.com), Aharon Lanin (aharon@google.com), and Andrew Glass (andrew.glass@microsoft.com)

More information

New Font Offerings: Cochineal, Nimbus15, LibertinusT1Math

New Font Offerings: Cochineal, Nimbus15, LibertinusT1Math New Font Offerings: Cochineal, Nimbus15, LibertinusT1Math Michael Sharpe, UCSD TUG Toronto, July 2016 Cochineal an oldstyle text font family with Roman, Greek and Cyrillic alphabets derived from Sebastian

More information

Experiences typesetting OpenType math

Experiences typesetting OpenType math Experiences typesetting OpenType math with LuaLaTEX and XeLaTEX Dr. Ulrik Vieth Stuttgart, Germany 4th International ConTEXt Meeting, Brejlow, 2010 Overview of this talk Review of OpenType math support

More information

UNICODE BIDIRECTIONAL ALGORITHM

UNICODE BIDIRECTIONAL ALGORITHM Technical Reports Proposed Update Unicode Standard Annex #9 UNICODE BIDIRECTIONAL ALGORITHM Version Unicode 11.0.0 (draft 1) Editors Mark Davis (markdavis@google.com), Aharon Lanin (aharon@google.com),

More information

Typesetting ancient Greek using Ibycus-encoded fonts with the Babel system

Typesetting ancient Greek using Ibycus-encoded fonts with the Babel system Typesetting ancient Greek using Ibycus-encoded fonts with the Babel system Peter Heslin Walter Schmidt v3.0 2005/11/23 1 Overview The present document describes a new interface for Greek fonts with the

More information

The l3galley package Galley code

The l3galley package Galley code The l3galley package Galley code The L A TEX3 Project Released 2019-03-05 1 Introduction In L A TEX3 terminology a galley is a rectangular area which receives text and other material filling it from top.

More information

13 th Annual Johns Hopkins Math Tournament Saturday, February 19, 2011 Automata Theory EUR solutions

13 th Annual Johns Hopkins Math Tournament Saturday, February 19, 2011 Automata Theory EUR solutions 13 th Annual Johns Hopkins Math Tournament Saturday, February 19, 011 Automata Theory EUR solutions Problem 1 (5 points). Prove that any surjective map between finite sets of the same cardinality is a

More information

from The Elements of Typographic Style by Robert Bringhurst, page 171

from The Elements of Typographic Style by Robert Bringhurst, page 171 from The Elements of Typographic Style by Robert Bringhurst, page 171 from The Elements of Typographic Style by Robert Bringhurst, page 148 h Dunt augue et, sum ad dolore do od estionse feum iure magna

More information

Managing fonts with Corel Font Manager

Managing fonts with Corel Font Manager Managing fonts with Corel Font Manager By Anand Dixit Since version X8, CorelDRAW Graphics Suite comes with a standalone application called the Corel Font Manager, a very convenient way to manage fonts

More information

Standardizing the order of Arabic combining marks

Standardizing the order of Arabic combining marks UTC Document Register L2/14-127 Standardizing the order of Arabic combining marks Roozbeh Pournader, Google Inc. May 2, 2014 Summary The combining class of the combining characters used in the Arabic script

More information

MONGOLIAN SCRIPT IN UNICODE SAN JOSE 2018

MONGOLIAN SCRIPT IN UNICODE SAN JOSE 2018 MONGOLIAN SCRIPT IN UNICODE SAN JOSE 2018 Agenda Improvement of Mongolian Script Description in Unicode o Single control character instead of many o U1800 Separation of the font matter and the Unicode

More information

WATER (No kerning) WATER (Automatic Kerning) WATER (Manual Kerning).

WATER (No kerning) WATER (Automatic Kerning) WATER (Manual Kerning). Styles Learning to use styles is a very effective way to save time and improve the consistency of publications. A style is a group of attributes that can be applied at once, to one or more paragraphs,

More information

PCL Greek-8 - Code Page 869

PCL Greek-8 - Code Page 869 PCL Greek-8 - Code Page 869 Page 1 of 5 PCL Symbol Se t: 8G Unicode glyph correspondence tables. Contact:help@redtitan.com http://pcl.to $20 U0020 Space $90 U038A Ê Greek capita l letter iota with tonos

More information

Astonishing Soufflé $14.95 Astonishing Soufflé...$ Working with Type in Adobe Illustrator CS2

Astonishing Soufflé $14.95 Astonishing Soufflé...$ Working with Type in Adobe Illustrator CS2 Working with Type in Adobe Illustrator CS2 whitepaper TABLE OF CONTENTS 1 Executive overview 1 Introduction to the Illustrator CS2 text engine 3 OpenType features 6 Advanced typography 12 Asian text support

More information

adorn frames about with personality, along with classic straight edges to form rectangles and squares. Adorn Frames lauraworthingtontype.

adorn frames about with personality, along with classic straight edges to form rectangles and squares. Adorn Frames lauraworthingtontype. adorn frames about Q Adorn Frames Adorn Frames is a highly customizable set of elements offering a multitude of approaches to creating frames of any width, height, and style. Use it for corner elements,

More information

Irma Text Round Pro Irma Text Round Std

Irma Text Round Pro Irma Text Round Std Typotheque type specimen & OpenType feature specification. Please read before using the fonts. Irma Text Round Pro Irma Text Round Std OpenType font family supporting Latin, Cyrillic and Greek, with their

More information

****This proposal has not been submitted**** ***This document is displayed for initial feedback only*** ***This proposal is currently incomplete***

****This proposal has not been submitted**** ***This document is displayed for initial feedback only*** ***This proposal is currently incomplete*** 1 of 5 3/3/2003 1:25 PM ****This proposal has not been submitted**** ***This document is displayed for initial feedback only*** ***This proposal is currently incomplete*** ISO INTERNATIONAL ORGANIZATION

More information

Review of InDesign CS

Review of InDesign CS Review of InDesign CS Wayne Dirks May 2004 Last year I tackled the challenge of typesetting the Giryama New Testament trial edition in InDesign version 2 and was impressed with the results. With the release

More information

Arabic text justification

Arabic text justification Arabic text justification Mohamed Jamal Eddine Benatia, Mohamed Elyaakoubi and Azzeddine Lazrek Department of Computer Science, Faculty of Science, University Cadi Ayyad P.O. Box 2390, Marrakesh, Morocco

More information

Lumin Lumin Sans Lumin Sans Condensed Lumin Display

Lumin Lumin Sans Lumin Sans Condensed Lumin Display Typotheque type specimen & OpenType feature specification. Please read before using the fonts. Lumin Lumin Sans Lumin Sans Condensed Lumin Display OpenType font family supporting Latin based languages

More information

Location of Talk/Slides/Software/Demos

Location of Talk/Slides/Software/Demos Implementing Better Source Editing for Bidirectional HTML and XML in the Text Editor 35 th Internationalization and Unicode Conference October 18, 2011 Shunsuke Oshima Martin J. Dürst Aoyama Gakuin University,

More information

A question of character Typographic tips for technophobes II

A question of character Typographic tips for technophobes II In this fourth Tantamount Guide, we take a close-up look at some special features of characters and punctuation and also touch on some grammar issues. Typographic tips for technophobes II Capital letters

More information

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

More information

Proposal to encode the SANDHI MARK for Newa

Proposal to encode the SANDHI MARK for Newa Proposal to encode the SANDHI MARK for Newa Srinidhi A and Sridatta A Tumakuru, India srinidhi.pinkpetals24@gmail.com, sridatta.jamadagni@gmail.com December 23, 2016 1 Introduction This is a proposal to

More information

Nafees Nastaleeq v1.02 beta

Nafees Nastaleeq v1.02 beta Nafees Nastaleeq v1.02 beta Release Notes September 24, 2008 CENTER FOR RESEARCH IN URDU LANGUAGE PROCESSING NATIONAL UNIVERSITY OF COMPUTER AND EMERGING SCIENCES, LAHORE PAKISTAN Table of Contents 1 Introduction...4

More information

PROPOSALS FOR MALAYALAM AND TAMIL SCRIPTS ROOT ZONE LABEL GENERATION RULES

PROPOSALS FOR MALAYALAM AND TAMIL SCRIPTS ROOT ZONE LABEL GENERATION RULES PROPOSALS FOR MALAYALAM AND TAMIL SCRIPTS ROOT ZONE LABEL GENERATION RULES Publication Date: 23 November 2018 Prepared By: IDN Program, ICANN Org Public Comment Proceeding Open Date: 25 September 2018

More information

Bringing ᬅᬓᬱᬭᬩᬮ to ios. Norbert Lindenberg

Bringing ᬅᬓᬱᬭᬩᬮ to ios. Norbert Lindenberg Bringing ᬅᬓᬱᬭᬩᬮ to ios Norbert Lindenberg Norbert Lindenberg 2015 Building blocks for the multilingual Web Internationalization at Wikipedia Alolita Sharma Director of Engineering Internationalization

More information

AFP Support for TrueType/Open Type Fonts and Unicode

AFP Support for TrueType/Open Type Fonts and Unicode AFP Support for TrueType/Open Type Fonts and Unicode Reinhard Hohensee Distinguished Engineer October 24, 2003 Ricoh Topics What is Unicode? What are TrueType and OpenType fonts? Why have we extended the

More information

Natalie Olson - Kisscut design Typography and Internal Book Design

Natalie Olson - Kisscut design Typography and Internal Book Design Natalie Olson - Kisscut design Typography and Internal Book Design www.kisscutdesign.com/blog Typography and the art of setting type in books hasn t changed a lot since the 1500s. The methods have changed.

More information

Proposal to encode three Arabic characters for Arwi

Proposal to encode three Arabic characters for Arwi Proposal to encode three Arabic characters for Arwi Roozbeh Pournader, Google (roozbeh@google.com) June 24, 2013 Requested action I would like to ask the UTC and the WG2 to encode the following three Arabic

More information

COMBINED WARNING EDITING GUIDANCE DOCUMENT. European Commission Health and Consumer Protection Directorate-General

COMBINED WARNING EDITING GUIDANCE DOCUMENT. European Commission Health and Consumer Protection Directorate-General COMBINED WARNING EDITING GUIDANCE DOCUMENT European Commission Health and Consumer Protection Directorate-General CONTENTS > INTRODUCTION 2 he longest line reaches the e > TYPOGRAPHY 3 FONT > THE COMBINED

More information

ENHANCING CONTEXTUAL SUBSTITUTION SUPPORT IN PANGO USING OPENTYPE

ENHANCING CONTEXTUAL SUBSTITUTION SUPPORT IN PANGO USING OPENTYPE ENHANCING CONTEXTUAL SUBSTITUTION SUPPORT IN PANGO USING OPENTYPE MS Thesis for the Degree of Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science (Computer Science)

More information