ISO/IEC JTC1/SC2/WG2 N2372
2001-10-05

Universal Multiple-Octet Coded Character Set
International Organization for Standardization
Organisation internationale de normalisation
Международная организация по стандартизации

Doc Type: Working Group Document
Title: Revised proposal for encoding the Tai Le script in the BMP of the UCS
Source: Ireland
Status: National Body contribution
Action: For adoption by JTC1/SC2/WG2
Date: 2001-10-06

Document N966 from China was introduced in 1994 and contained 58 characters. The repertoire of Tai Le characters was discussed (by experts from China, Ireland, the UK, and the US) and modified (to contain 35 characters) at the Athens meeting of WG2, and the results of this were published in N2292. This document contains the tables and character names for the script, and contains the proposal summary form. China has recently published N2371, which contains the same repertoire but gives the tone markers in different positions. We prefer the positions given here, but consider that the Tai Le script repertoire and names are mature and ready for encoding apart from verification of ordering requirements. The names of the tone letters in N2371 differ from those given here, but the ones given here are in accordance with WG2 decisions taken at San Jose regarding such names.

The Tai Le script has a history of 700-800 years, during which period several orthographic conventions were used. The modern form of the script, which is proposed for encoding here, was developed in the years following 1954; it rationalized the older system and added a systematic representation of tones with the use of combining diacritics. In 1988, the new system was again revised, and spacing tone marks were introduced to replace the combining diacritics. This proposal treats both orthographies.

Tai Nüa, Dehong Dai, Tai Mau, Tai Kong, and Chinese Shan are other names by which the Tai Le language is known.
This encoding prefers Tai Le, which is a transliteration of the indigenous designation, ᥖᥭᥰ ᥘᥫᥴ [tai2 lə6] (in older orthography ᥖᥭ̈ ᥘᥫ́). Tai Le orthography is simple and straightforward: initial consonants precede vowels, vowels precede final consonants, and tone marks, if any, follow the entire syllable. There is a one-to-one correspondence between the tone mark letters now used and the existing nonspacing diacritics in the UCS that were used prior to 1988. In both orthographies the tone mark is the last character in a syllable-string.

Typographically, when one of the combining diacritics follows a tall letter, it is represented as a spacing mark. It is recommended to let the font do this, and not to employ SP or NBSP in the encoded text to force this behaviour, or to use spacing diacritics with tall letters.

New orthography   Old orthography   New orthography   Old orthography
ᥐ ka              ᥐ ka              ᥐᥤ ki             ᥐᥤ ki
ᥐᥰ ka2            ᥐ̈ ka2             ᥐᥤᥰ ki2           ᥐᥤ̈ ki2
ᥐᥱ ka3            ᥐ̌ ka3             ᥐᥤᥱ ki3           ᥐᥤ̌ ki3
ᥐᥲ ka4            ᥐ̀ ka4             ᥐᥤᥲ ki4           ᥐᥤ̀ ki4
ᥐᥳ ka5            ᥐ̇ ka5             ᥐᥤᥳ ki5           ᥐᥤ̇ ki5
ᥐᥴ ka6            ᥐ́ ka6             ᥐᥤᥴ ki6           ᥐᥤ́ ki6
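The one-to-one correspondence between the old combining diacritics and the new spacing tone letters can be sketched as a simple character mapping. This is a minimal illustration, not part of the proposal: the code points U+1950..U+196D and U+1970..U+1974 are the positions later assigned in the Unicode Standard (the proposal only gives provisional "xx" positions), the pairing of each diacritic with a tone number follows the later Unicode description and is an assumption here, and the function names are hypothetical.

```python
# Assumption: final Unicode code points; the proposal only says xx20..xx24.
OLD_TO_NEW_TONE = {
    "\u0308": "\u1970",  # COMBINING DIAERESIS    -> TAI LE LETTER TONE-2
    "\u030C": "\u1971",  # COMBINING CARON        -> TAI LE LETTER TONE-3
    "\u0300": "\u1972",  # COMBINING GRAVE ACCENT -> TAI LE LETTER TONE-4
    "\u0307": "\u1973",  # COMBINING DOT ABOVE    -> TAI LE LETTER TONE-5
    "\u0301": "\u1974",  # COMBINING ACUTE ACCENT -> TAI LE LETTER TONE-6
}


def is_tai_le_letter(ch: str) -> bool:
    """True for the 30 consonant and vowel letters (assumed U+1950..U+196D)."""
    return "\u1950" <= ch <= "\u196D"


def old_to_new(text: str) -> str:
    """Replace old-orthography tone diacritics with new tone letters.

    A diacritic is converted only when it directly follows a Tai Le letter,
    so accented Latin text in the same string is left untouched.
    """
    out = []
    for ch in text:
        if ch in OLD_TO_NEW_TONE and out and is_tai_le_letter(out[-1]):
            out.append(OLD_TO_NEW_TONE[ch])
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    old_name = "\u1956\u196D\u0308 \u1958\u196B\u0301"  # tai2 le6, old style
    print(old_to_new(old_name))
```

Because the tone mark is the last character of the syllable in both orthographies, no reordering is needed; the conversion is a straight substitution.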

Figure: Sample of Tai Le in old orthography, 1988.

Digits. In China, European digits (U+0030..U+0039) are mainly used, though Myanmar digits (U+1040..U+1049) are also used, with slight glyph variants.

Figure: Myanmar digits.
Figure: Myanmar digits as used by Tai Le in China.

Apparently a script-specific set of numerals is used by Tai Le writers in Myanmar; these are not known in China, and are not proposed here pending further information.

Figure: Script-specific Tai Le digits.
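Since Tai Le text in China may carry either European digits or Myanmar digits, software that compares or parses numbers may want to fold one range onto the other. The following is a rough sketch under that assumption only; the function name is hypothetical and nothing in the proposal requires this normalization.

```python
# Build a translation table from the two ranges named in the text:
# Myanmar digits U+1040..U+1049 -> European digits U+0030..U+0039.
MYANMAR_TO_EUROPEAN = {0x1040 + i: 0x0030 + i for i in range(10)}


def normalize_digits(text: str) -> str:
    """Fold Myanmar digits to European digits; leave everything else as is."""
    return text.translate(MYANMAR_TO_EUROPEAN)


if __name__ == "__main__":
    print(normalize_digits("\u1042\u1040\u1049\u1041"))
```

The script-specific digits used in Myanmar are deliberately not covered, because they are not proposed for encoding here.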

Figure: Calendar from Myanmar showing script-specific Tai Le digits. The year is given as 2091 (= 1997).

Punctuation. Both CJK punctuation and western punctuation are used. Typographically, digits are about the same height and depth as the tall characters ᥪ and ᥫ; in some fonts, the baseline for punctuation is the depth of those characters.

Unicode Character Properties

Spacing letters, category Lo (other letter), bidi category L (strong left to right)
xx00-xx1d, xx20-xx24

Bibliography

Coulmas, Florian. 1996. The Blackwell encyclopedia of writing systems. Oxford and Cambridge: Blackwell. (s.v. Dehong writing, pp. 118-119.) ISBN 0-631-19446-0

Éç ãì éòç : àü ãï Ñìù Ññç ÅôÇ. Öîí àìí ãîí Ñó Ñó âìí Ñõ. (Tsa va 4 má 3 hó va 3 : la ta 6 mé 2 sá ai 3 seh va 2 xo ŋa 3. Yi na 5 lá na 5 mi na 5 su 4 su 4 pá na 2 se 3.) 1997. ISBN 7-5367-1455-6.

àìù ãú àìù Äç ãîƒéõƒàú ÖÇãõíflÄí: ÜìçèìíÄòã Ñìí Üñã. ÖîífiàìífiãîífiÑó Ñó âìí Ñõƒ. (Lá ai 2 maɯ 3 lá ai 2 ka va 3 mi 2 tse 2 laɯ ya pa me na 4 ka na: tá va ʔá na kó ma 6 sá na 2 teh ma 6. Yi na 5 lá na 5 mi na 5 su 4 su 4 pá na 2 se 3.) 1988. ISBN 7-5367-1100-4.

A. Administrative
1. Title: Revised proposal for encoding the Tai Le (Dehong Dai) script in the BMP of the UCS
2. Requester's name: Ireland
3. Requester type: National Body contribution
4. Submission date: 2001-03-19
5. Requester's reference:
6a. Completion: This is a complete proposal.
6b. More information to be provided? No

B. Technical -- General
1a. New script? Name? Tai Le.
1b. Addition of characters to existing block? Name?
2. Number of characters: 35
3. Proposed category: Category A
4. Proposed level of implementation and rationale: Level 1 for the new orthography; Level 3 for the older orthography (but the characters used in that orthography are already encoded and are not part of this proposal).
5a. Character names included in proposal?
5b. Character names in accordance with guidelines?
5c. Character shapes reviewable?
6a. Who will provide computerized font? Michael Everson, Everson Typography
6b. Font currently available?
6c. Font format? TrueType
7a. Are references (to other character sets, dictionaries, descriptive texts, etc.) provided? See bibliography.
7b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached? See below.
8. Does the proposal address other aspects of character data processing?

C. Technical -- Justification
1. Contact with the user community?

2. Information on the user community? There are approximately 250,000 Tai Le users in China, 120,000 users in Laos, and 72,400 in Myanmar (figures from the SIL Ethnologue).
3a. The context of use for the proposed characters? To write the Tai Le language.
3b. Reference
4a. Proposed characters in current use?
4b. Where? China, Laos, and Myanmar.
5a. Characters should be encoded entirely in BMP?
5b. Rationale: Contemporary use and in keeping with the Roadmap.
6. Should characters be kept in a continuous range?
7a. Can the characters be considered a presentation form of an existing character or character sequence?
7b. Where?
7c. Reference
8a. Can any of the characters be considered to be similar (in appearance or function) to an existing character?
8b. Where?
8c. Reference
9a. Combining characters or use of composite sequences included?
9b. List of composite sequences and their corresponding glyph images provided?
10. Characters with any special properties such as control function, etc. included? No

D. SC2/WG2 Administrative
To be completed by SC2/WG2
1. Relevant SC 2/WG 2 document numbers:
2. Status (list of meeting number and corresponding action or disposition)
3. Additional contact to user communities, liaison organizations etc.
4. Assigned category and assigned priority/time frame

Other Comments

Proposed Tai Le (Dehong Dai) encoding, 2001-09-19
China and Ireland

TABLE XXX - Row x0: TAI LE

     xx0   xx1   xx2
0    ᥐ     ᥠ     ᥰ
1    ᥑ     ᥡ     ᥱ
2    ᥒ     ᥢ     ᥲ
3    ᥓ     ᥣ     ᥳ
4    ᥔ     ᥤ     ᥴ
5    ᥕ     ᥥ
6    ᥖ     ᥦ
7    ᥗ     ᥧ
8    ᥘ     ᥨ
9    ᥙ     ᥩ
A    ᥚ     ᥪ
B    ᥛ     ᥫ
C    ᥜ     ᥬ
D    ᥝ     ᥭ
E    ᥞ
F    ᥟ

G = 00
P = 00

Proposed Tai Le (Dehong Dai) encoding, 2001-09-19
China and Ireland

TABLE XXX - Row x0: TAI LE

hex  Name
80   TAI LE LETTER KA
81   TAI LE LETTER XA
82   TAI LE LETTER NGA
83   TAI LE LETTER TSA
84   TAI LE LETTER SA
85   TAI LE LETTER YA
86   TAI LE LETTER TA
87   TAI LE LETTER THA
88   TAI LE LETTER LA
89   TAI LE LETTER PA
8A   TAI LE LETTER PHA
8B   TAI LE LETTER MA
8C   TAI LE LETTER FA
8D   TAI LE LETTER VA
8E   TAI LE LETTER HA
8F   TAI LE LETTER QA
90   TAI LE LETTER KHA
91   TAI LE LETTER TSHA
92   TAI LE LETTER NA
93   TAI LE LETTER A
94   TAI LE LETTER I
95   TAI LE LETTER EE
96   TAI LE LETTER EH
97   TAI LE LETTER U
98   TAI LE LETTER OO
99   TAI LE LETTER O
9A   TAI LE LETTER UE
9B   TAI LE LETTER E
9C   TAI LE LETTER AUE
9D   TAI LE LETTER AI
9E
9F
A0   TAI LE LETTER TONE-2
A1   TAI LE LETTER TONE-3
A2   TAI LE LETTER TONE-4
A3   TAI LE LETTER TONE-5
A4   TAI LE LETTER TONE-6
A5
A6
A7
A8
A9
AA
AB
AC
AD
AE
AF

Group 00 Plane 00 Row x0
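The names list and the stated character properties can be cross-checked against a Unicode character database. The sketch below uses Python's standard unicodedata module and assumes that the block starts at U+1950, which is where the script was later placed in the Unicode Standard; the proposal itself leaves the row as "xx". The constant and function names are illustrative only.

```python
import unicodedata

# Assumption: final block start in the Unicode Standard (proposal says "xx").
BLOCK_BASE = 0x1950

# Names in the order given by the proposal's names list.
LETTERS = ("KA XA NGA TSA SA YA TA THA LA PA PHA MA FA VA HA QA "
           "KHA TSHA NA A I EE EH U OO O UE E AUE AI").split()
TONES = ["TONE-2", "TONE-3", "TONE-4", "TONE-5", "TONE-6"]


def proposed_names():
    """Map block offsets to names: letters at 00..1D, tone letters at 20..24."""
    names = {off: f"TAI LE LETTER {n}" for off, n in enumerate(LETTERS)}
    names.update({0x20 + i: f"TAI LE LETTER {t}" for i, t in enumerate(TONES)})
    return names


def compare_with_ucd():
    """Return (offset, proposed name, UCD name, category, bidi class) rows."""
    rows = []
    for off, name in sorted(proposed_names().items()):
        ch = chr(BLOCK_BASE + off)
        rows.append((off, name, unicodedata.name(ch, "<unassigned>"),
                     unicodedata.category(ch), unicodedata.bidirectional(ch)))
    return rows


if __name__ == "__main__":
    for off, proposed, actual, cat, bidi in compare_with_ucd():
        flag = "" if proposed == actual else "  <-- differs"
        print(f"xx{off:02X}  {proposed:<24} {cat} {bidi}{flag}")
```

Any row flagged as differing would indicate a name that changed between this proposal and the database shipped with the Python version in use.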