Writing Domain Names in Different Arabic Script Based Languages

Size: px
Start display at page:

Download "Writing Domain Names in Different Arabic Script Based Languages"

Transcription

1 Writing Domain Names in Different Arabic Script Based Languages Language VS. Script ICANN Regional Meeting, Dubai April Dr. Abdulaziz H. Al-Zoman Director of SaudiNIC - CITC Chairman of Steering Committee Arabic Domain Name Pilot Project Raed I. Al-Fayez SaudiNIC - CITC Chairman of Technical Committee Arabic Domain Name Pilot Project

2 Agenda Part I: Arabic language Characteristics Basic linguistic issues Recommendations Accepted character set table Part II: What cctld communities are saying? Part III: Let us expand and look at the whole script About the Arabic script Unicode character code chart The Unicode codepoints and IDNA Issues need further investigations Part IV: Conclusions

3 About us Arab Team for Arabic Domain Names Formed under the auspices of Arab League, 2005 Supervising ADNPP: Arabic Domain Name Pilot Project SaudiNIC Administering the Saudi domain name space (.sa) since Operated by Communication and Information Technology Commission (CITC) governmental org. Leading the local community effort towards supporting Arabic language in DNS Head of the ADNPP Steering Committee Head of the ADNPP Technical Committee

4 Part I: Arabic language

5 Arabic Language Characteristics 1. Written from right to left. 2. Most characters have different shapes depending on their position (beginning, middle, or end) within a word probably conjugated with preceding and succeeding characters. These different shapes for a single character do not count as different code points but they are handled using different fonts Letters that can be joined are always joined in both hand-written and printed Arabic (cursive style).

6 Arabic Language Characteristics 3. Two sets of numerals are used: Arabic : 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 Arabic-Indic : ٩ ٨ ٧ ٦ ٥ ٤ ٣ ٢ ١ ٠ Numerals are written from left to right 4. Tashkeel (diacritic) A small sing (not a letter) that is usually put on top or under a character for the purpose of correct pronunciation It is not widely used except incase of the possibility of mispronouncing words that have the same letters but with different pronunciations, and hence having different meanings. 5. Abbreviations are not widely used When an abbreviation is written (in domain name) characters will be joined together leads to a different word and pronunciation

7 Arabic Language Basic Linguistic Issues Identified a number of linguistic issues with respect to domain names: Usage of Tashkeel (diacritics). Usage of Kashidah. Character folding. Which numerical digits should be supported. Connecting multiple words. Mixing with other languages. Usage of special characters.

8 Basic Linguistic Issues Tashkeel (Diacritics) Not a letter but a small sing that is usually put on top or under an Arabic letter for the purpose of correct pronunciation It is not widely used except incase of the possibility of mispronouncing words that have the same letters but with different pronunciations, and hence having different meanings. If supported, there will be huge number of combinations when registering a single domain name Good playground for phishing Many combinations Recommendation: Tashkeel can be supported only in the user interface but should not be stored in the zone file. Therefore, it can be striped off at the preparation of internationalized strings ("stringprep") phase.

9 Basic Linguistic Issues Kasheeda (Tatweel Tatweel) Not a letter. It is a horizontal line (like dash) used to lengthen the connection line between letters. Used sometimes to enhance the display of Arabic words on screens or printouts. Good playground for domain name phishing Recommendation: Kasheeda should not be used in Arabic domain names

10 Basic Linguistic Issues Character Folding A character folding is the process where multiple letters (that may have some similarity with respect to their shapes) are folded into one shape. This includes: Folding Teh Marbuta and Heh at the end of a word. Folding different forms of Hamzah. Folding Alif Maksura and Yeh at the end of a word. Folding Waw with Hamzah and Waw. Character folding is unacceptable in the Arabic language because it changes the meaning of the words and it is against the simplest spelling rules. Replacing a character with another character, which may have the same shape but different pronunciation, will give a different meaning. This will lead to have only one form (word) out many other forms of words that are made by all the combination of folded characters. Hence, the other forms will be masked by the common form. Hand writing mixes between different characters (e.g., Heh and Teh-Marbuta) and this is due to laziness or weakness in spelling. However, this is not the case in published and printed materials. One of the motivations to support the Arabic language in domain names is to preserve the language particularly with the spread of the globalization movement. Hence, character folding is working against this motivation since it is going to have a negative effect on the principles and ethics of the language. Recommendation: Character folding should not be allowed

11 Basic Linguistic Issues Numbers In the Arab world, there are two sets of numerical digits used: Set I: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), Mostly used in the western part of the Arab world (al-maghrib alarabi). Set II: ( ٩ ٨ ٧ ٦ ٥ ٤ ٣ ٢ ١ ٠), Mostly used in the eastern part of the Arab world (al-mashriq alarabi). Their writing direction (LTR) differs than the writing direction in the Arabic letters (RTL) Recommendation: Both sets should be supported in the user interface and both are folded to one set (Set I) at the preparation of internationalized strings (e.g., "stringprep") phase.

12 Basic Linguistic Issues Connecting Multiple Words In the Arab language words are separated by spaces. Connecting words without spaces is usually not acceptable. Decrease readability Recommendation: Space should be used to separate words if it is technically visible. Otherwise, it is recommended that multiple words are separated by the character "-" dash. If the space is used as a word separator in Arabic domain names then it should be only a single space and it should not be used at the beginning or at the end of words.

13 Arabic Linguistic Issues Recommendation Summary Tashkeel (Diacritics) Kasheeda Character folding: Teh Marbuta + Heh different forms of Hamzah Alif Maqsura+YaNumbers Numbers (numerical digits) Connecting Multiple Words Mixing Latin and Arabic Characters Special Characters #, $, %,...) Tashkeel should not be allowed. However, if there is a need to allowed users to entered it as part of a domain name then it should be stripped off by nameprep Kasheeda should be disallowed Folding should not be allowed If it is technically possible, it is preferred to support both (Latin and Arabic) sets with folding to one set. Otherwise, Latin set is sufficient It is recommended that multiple words are separated by the character "-". It is recommended that Arabic domain names be pure Arabic and they should not be mixed with other languages. It is recommended that Arabic domain names should follow the standard with respect to the use of special characters.

14 Arabic Language Accepted Character Set Table Development Principles Study and discuss the linguistic issues extensively (more than 3 years) by a team who work under the umbrella of Arab League Adhere to LDH convention Follow the inclusion mode recommended by the new IDNA standards Many characters are disallowed: symbols, punctuations, diacritics (tashkeel), Koranic annotation signs, honorifics, etc are not allowed. Arabic labels are not intended to write sentences or phrases The goal is to develop: A simple character set table that includes ONLY needed letters and digits

15 Arabic Language Accepted Character Set Table The following Unicode characters are accepted only: It is extremely important to standardize this character set table for the Arabic Language

16 Arabic Language Accepted Character Set Table Characters from Unicode Arabic Table ( FF) 0621 (ء ( Arabic Letter HAMZA 0622 (ا ) Arabic Letter ALEF with MADDA above 0623 (أ) Arabic Letter ALEF with HAMZA above 0624 (ؤ) Arabic Letter WAW with HAMZA above 0625 (إ) Arabic Letter ALEF with HAMZA below 0626 (ئ) Arabic Letter YEH with HAMZA above 0627 (ا) Arabic Letter ALEF 0628 (ب) Arabic Letter BEH 0629 (ة) Arabic Letter TEH MARBUTA 062A (ت) Arabic Letter TEH 062B (ث) Arabic Letter THEH 062C ( (ج Arabic Letter JEEM 062D ( (ح Arabic Letter HAH 062E ( (خ Arabic Letter KHAH 062F (د) Arabic Letter DAL 0630 (ذ) Arabic Letter THAL 0631 (ر) Arabic Letter REH 0632 (ز) Arabic Letter ZAIN 0633 (س ( Arabic Letter SEEN 0634 (ش ( Arabic Letter SHEEN 0635 (ص) Arabic Letter SAD 0636 (ض) Arabic Letter DAD 0637 (ط ( Arabic Letter TAH 0638 (ظ ( Arabic Letter ZAH 0639 (ع) Arabic Letter AIN 063A (غ) Arabic Letter GHAIN 0641 (ف) Arabic Letter FEH 0642 (ق ( Arabic Letter QAF 0643 (ك ( Arabic Letter KAF 0644 (ل) Arabic Letter LAM 0645 (م) Arabic Letter MEEM 0646 (ن) Arabic Letter NOON 0647 (ه) Arabic Letter HEH 0648 (و) Arabic Letter WAW 0649 (ى) Arabic Letter ALEF MAKSURA 064A (ي) Arabic Letter YEH 0660 (٠) Arabic-Indic Digit Zero 0661 (١) Arabic-Indic Digit One 0662 (٢) Arabic-Indic Digit Two 0663 (٣) Arabic-Indic Digit Three 0664 (٤) Arabic-Indic Digit Four 0665 (٥) Arabic-Indic Digit Five 0666 (٦) Arabic-Indic Digit Six 0667 (٧) Arabic-Indic Digit Seven 0668 (٨) Arabic-Indic Digit Eight 0669 (٩) Arabic-Indic Digit Nine

17 Arabic Language Accepted Character Set Table Characters from Unicode Basic Latin Table ( F): 0030 (0) Digit Zero 0031 (1) Digit One 0032 (2) Digit Two 0033 (3) Digit Three 0034 (4) Digit Four 0035 (5) Digit Five 0036 (6) Digit Six 0037 (7) Digit Seven 0038 (8) Digit Eight 0039 (9) Digit Nine 002D (-) Hyphen-Minus 002E (.) Full Stop (Dot)

18

19 Part II: What cctld Communities are saying?

20 What cctld Communities are saying? 2005: Chinese Domain Name Consortium (CDNC) The Proposal for Internationalizing cctld Names Give the priority to internationalizing cctlds. To ensure the system stability, it s recommended to IDN cctld before IDN gtld. One form of language character variant of internationalized cctld is accepted. cctld registries make their own choice of which IDN character sets. Register and operate the internationalized cctlds in the root DNS server in the form of IDNA Punycode. Supported by SaudiNIC 2007: APTLD Position on Top Level Internationalised Domain Names Allow each existing cctld to manage one additional country or territory specific cctld in a recognised non-ascii script of their country or territory. Support from CENTR Support from AFTLD

21 What cctld Communities are saying? URLs Chinese Domain Name Consortium (CDNC) The Proposal for Internationalizing cctld Names, June Supported by SaudiNIC APTLD Position on Top Level Internationalised Domain Names - June 21, Level%20Internationalised%20Domain%20Names%20- %20June%2021%202007%20v6.pdf Support from CENTR ( CTR/version/5/part/12/data/CENTR_backs_APTLD_position.pdf?bra nch=main&language=default Support from AFTLD ( D%20Proposal.pdf

22 A Joint Position Statement ADNPP and IRNIC

23 Part III: Let us expand and look at the whole script

24 References New IDNA200x documents PAN10 Allowable Characters (under dot-iran) Unicode ADNPP

25 About Arabic Script The 2 nd most widely used alphabetic writing system in the world Used by many languages such as: Persian, Urdu, Turkish, Kurdish, Pashto, Swahili, It is widely used by more than 43 countries more than one billion potential users could be concerned in using Arabic script domain names.

26 Arabic Script Unicode Character Code Chart Arabic language character codes occupy a very small section of the Arabic script Unicode chart Arabic script Unicode chart is used by many other languages that may add or change characters to represent phonemes that do not appear in Arabic phonology. A new character usually created based on modifying the basic shape of an existing Arabic character, for example, by adding more dots. These additions have meaning to the new language but not to the original language. Therefore, many characters would easily be confused with some other characters from other languages

27 Some languages may add or change characters to represent phonemes that do not appear in Arabic phonology. A new character usually created based on modifying the basic shape of an existing Arabic character, for example, by adding more dots.

28

29 Arabic Script The Unicode Codepoints and IDNA ; CONTEXTO # ARABIC NUMBER SIGN..ARABIC SIGN SAFHA A ; UNASSIGNED # <reserved>..<reserved> 060B..060F ; DISALLOWED # AFGHANI SIGN..ARABIC SIGN MISRA ; PVALID # ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..AR A ; UNASSIGNED # <reserved>..<reserved> 061B ; DISALLOWED # ARABIC SEMICOLON 061C..061D ; UNASSIGNED # <reserved>..<reserved> 061E..061F ; DISALLOWED # ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC Q 0620 ; UNASSIGNED # <reserved> A ; PVALID # ARABIC LETTER HAMZA..ARABIC LETTER GHAIN 063B..063F ; UNASSIGNED # <reserved>..<reserved> E ; PVALID # ARABIC TATWEEL..ARABIC FATHA WITH TWO DOTS 065F ; UNASSIGNED # <reserved> ; PVALID # ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT 066A..066D ; DISALLOWED # ARABIC PERCENT SIGN..ARABIC FIVE POINTED STA 066E ; PVALID # ARABIC LETTER DOTLESS BEH..ARABIC LETTER HIG ; DISALLOWED # ARABIC LETTER HIGH HAMZA ALEF..ARABIC LETTER D3 ; PVALID # ARABIC LETTER TTEH..ARABIC LETTER YEH BARREE 06D4 ; DISALLOWED # ARABIC FULL STOP 06D5..06DC ; PVALID # ARABIC LETTER AE..ARABIC SMALL HIGH SEEN 06DD ; CONTEXTO # ARABIC END OF AYAH 06DE ; DISALLOWED # ARABIC START OF RUB EL HIZB 06DF..06E8 ; PVALID # ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL 06E9 ; DISALLOWED # ARABIC PLACE OF SAJDAH 06EA..06FC ; PVALID # ARABIC EMPTY CENTRE LOW STOP..ARABIC LETTER 06FD..06FE ; DISALLOWED # ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SI 06FF ; PVALID # ARABIC LETTER HEH WITH INVERTED V Source: draft-faltstrom-idnbis-tables-05.txt

30

31 Disallowed Characters by IDNA200x 200 in general, characters are placed into "DISALLOWED" if they fall into one or more of the following groups: The character is a symbol or punctuation form or, more generally, something that is not a letter or digit. Symbols, drawing characters, and various notational characters are permanently excluded John Klensin, Internet-Draft: IDNA200X Rationale, Feb 2008

32 Undesirable Characters Punctuations, Signs, Marks, Annotations, Honorifics Subtending marks Currency sign 060B Punctuations 060C-060D, 061B, 061E- 061F, 066A-066D, 06D4 Poetic marks 060E-060F Honorifics Koranic annotation sign Tatweel , 06D6-06ED Points 064B-0652, 0670

33

34 Issues Need Further Investigations Combining Marks The use of combining marks with some base characters would confuse with other character. Combining maddah and hamza Other combining marks E ا + = ا أ U+0623 U+0654 U+0627

35 Issues Need Further Investigations Combining Marks input[0] = U+0623 input[1] = U+0646 input[2] = U+0627

36 Issues Need Further Investigations Diacritics (Tashkeel ( Tashkeel) Tashkeel Points: about 9 064B-0652, 0670 From user point view, he/she does not know Whether the label is with or without tashkeel Whether the label is totally (i.e., all characters) use tashkeel The correct pronunciations to put the correct taskeel There are many different pronunciations depending on the local dialogue There will be huge number of combinations when registering a single label good playground for phishing

37 Issues Need Further Investigations ZWNJ & ZWJ Control Characters The support of ZWJ and ZWNJ in domain names would result in some confusions Their use and concepts are not known to regular users

38 Issues Need Further Investigations ZWNJ & ZWJ Control Characters ZWNJ: Visually noticed حبل ح بل input[0] = U+062d input[1] = U+0628 input[2] = U+0644

39 Issues Need Further Investigations ZWNJ & ZWJ Control Characters ZWNJ: Visually Unnoticed ط بل طبل input[0] = U+0637 input[1] = U+0628 input[2] = U+0644

40 Issues Need Further Investigations ZWNJ & ZWJ Control Characters ZWJ Visually Unnoticed input[0] = U+0645 input[1] = U+062c input[2] = U+0645 input[3] = U+0639 input[4] = U+200d input[5] = U+002d input[6] = U+0627 input[7] = U+0644 input[8] = U+0631 input[9] = U+0628 input[10] = U+0627 input[11] = U+0637 input[12] = U+200d input[13] = U+002d input[14] = U+0627 input[15] = U+0644 input[16] = U+062f input[17] = U+0648 input[18] = U+0644 input[19] = U+064a ZWJ Visually noticed input[0] = U+0645 input[1] = U+062c input[2] = U+0645 input[3] = U+0639 input[4] = U+002d input[5] = U+0627 input[6] = U+0644 input[7] = U+0631 input[8] = U+0628 input[9] = U+0637 input[10] = U+002d input[11] = U+0627 input[12] = U+0644 input[13] = U+062f input[14] = U+0648 input[15] = U+0644 input[16] = U+064a

41 Issues Need Further Investigations Digits Arabic-Indic VS. Eastern Arabic-Indic digits ٩ ٨ ٧ ٦ ٥ ٤ ٣ ٢ ١ ٠ ٩ ٨ ٧ ۶ ۵ ۴ ٣ ٢ ١ ٠

42 Issues Need Further Investigations Digits Arabic-Indic vs. European-Arabic digits ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ Windows has supported number substitution by allowing the representation of different cultural shapes for the same digits while keeping the internal storage of these digits unified among different locales, for example numbers are stored in their well known hexadecimal values, 0x40, 0x41, but displayed according to the selected language. Source:

43 Issues Need Further Investigations Similar Shape Characters There are a number of groups of characters that have the same shapes, eg. Kaf, Heh, Yeh, Alef, groups

44

45 Issues Need Further Investigations Similar Shape Characters Char. Code Arabic Letter Confusing Letters From Other Languages ء ا أ ؤ إ ئ ا ب ة Hamza Alef With Madda Above Alef With Hamza Above Waw With Hamza Above Alef With Hamza Below Yeh With Hamza Above Alef Beh The Marbuta (06FD) (0674), (0671) ٱ (0675) ٵ,(0672) ٲ,(0671) ٱ (06CF) ۏ,(06C9) ۉ,(06C8) ۈ,(06C6) ۆ,(0677) ٷ,(0676) ٶ (0673) ٳ (06D3) ۓ (06CE), ێ,(0678) ٸ (066E), پ (0679), ٹ (067E) (06C2) ۀ,(06C0) ۀ,(06C3) ة

46 Issues Need Further Investigations Similar Shape Characters Char. Code Arabic Letter Confusing Letters From Other Languages ت ث ج ح خ د ذ ر ز 062A 062B 062C 062D 062E 062F Teh Theh Jeem Hah Khah Dal Thal Reh Zain (067C) ټ,(067A) ٺ (067F) ٿ,(067D) ٽ (0684) ڄ,(0683) ڃ (0681) ځ (0682) ڂ,(0681) ځ (068D) ڍ,(068A) ڊ,(0689) ډ,(0688) ڈ (068C) ڌ,(0688) ڈ (0695) ڕ,(0693) ړ,(0691) ڑ (0697) ڗ,(0692) ڒ,(0691) ڑ

47 Issues Need Further Investigations Similar Shape Characters Char. Code Arabic Letter Confusing Letters From Other Languages س ش ص ض ط ظ ع غ ف A 0641 Seen Sheen Sad Dad Tah Zah Ain Ghain Feh (06FA) ۺ (069D) ڝ (06FB) ۻ,(069E) ڞ (069F) ڟ (06A0) ڠ (06A4) ڤ,(06A3) ڣ,(06A2) ڢ,(06A1) ڡ

48 Issues Need Further Investigations Similar Shape Characters Char. م ن ه و ى ي Lam Meem Noon Heh Waw Alef Maksura Yeh 064A ڲ (06AF), گ (06AC), ڬ (06AB), ګ (06AA), ڪ,(06A9) ک ق ك ل Code Arabic Letter Qaf Kaf Confusing Letters From Other Languages (066F), ڨ (06A8) (06B2), ڳ (06B3) (06B7) ڷ,(06B6) ڶ,(06B5) ڵ (06FE) (06BC) ڼ (06BA), ں,(06B9) ڹ (06D5) ە,(06C1) ہ (06BE), ه (06C8) ۈ,(06C7) ۇ,(06C6) ۆ,(06C5) ۅ,(06C4) ۄ (06CE) ێ (06CD), ۍ (06CC), ی (06D2) ے,(06D1) ۑ,(06D0) ې,(067E) پ,(067B) ٻ

49 Issues Need Further Investigations Similar Shape Characters آلمني کلمني کلمنې input[0] = U+06a9 input[1] = U+0644 input[2] = U+0645 input[3] = U+0646 input[4] = U+06d0 input[0] = U+06a9 input[1] = U+0644 input[2] = U+0645 input[3] = U+0646 input[4] = U+064a input[0] = U+0643 input[1] = U+0644 input[2] = U+0645 input[3] = U+0646 input[4] = U+064a

50 Issues Need Further Investigations Similar Shape Characters بٻٺې بيٺې بيتې بيتي input[0] = U+0628 input[1] = U+067b input[2] = U+067a input[3] = U+06d0 input[0] = U+0628 input[1] = U+064a input[2] = U+067a input[3] = U+06d0 input[0] = U+0628 input[1] = U+064a input[2] = U+062a input[3] = U+06d0 input[0] = U+0628 input[1] = U+064a input[2] = U+062a input[3] = U+064a

51 Issues Need Further Investigations Similar Shape Characters هرفي هرفې هرفې input[0] = U+06be input[1] = U+0631 input[2] = U+06a7 input[3] = U+06d0 input[0] = U+0647 input[1] = U+0631 input[2] = U+0641 input[3] = U+06d0 input[0] = U+0647 input[1] = U+0631 input[2] = U+0641 input[3] = U+064a

52 Issues Need Further Investigations Similar Shape Characters إشارة إشارة ٳشارة input[0] = U+0673 input[1] = U+0634 input[2] = U+0627 input[3] = U+0631 input[4] = U+06c3 input[0] = U+0625 input[1] = U+0634 input[2] = U+0627 input[3] = U+0631 input[4] = U+06c3 input[0] = U+0625 input[1] = U+0634 input[2] = U+0627 input[3] = U+0631 input[4] = U+0629

53 Issues Need Further Investigations Similar Shape Characters کلی آلى کل input[0] = U+06a9 input[1] = U+0644 input[2] = U+06cd input[0] = U+06a9 input[1] = U+0644 input[2] = U+06cc input[0] = U+0643 input[1] = U+0644 input[2] = U+0649

54

55 Issues Need Further Investigations Bidirectional Behavior Arabic script domains will include characters that are LTR (e.g., dash, dot, AE digits) Exiting IDNA2003 (stringprep) assumes that the first and last characters of the label must be RandALCat characters.

56 Language VS. Script Protocol works with scripts except for certain special cases Eliminate undesirable characters (i.e. minimize the table to letters, and digits) Accepted character set table Guideline rules specific for language e.g., character folding Prohibition of mixing languages Unified character set table Guideline rules to Reduce confusions with duplicate characters across multiple languages restrict syntax

57 Part IV: Conclusions

58 Conclusions There are some work needed at the level of IDNA protocol The current PVALID/DISALLOWED list should be updated It should be for the whole script and suit all the languages Speedup the deployment of the fast track (At the TLD/Registry level) Start with IDN cctlds and each one dedicated for only one language (not script) Most cctlds are trying to help their local communities/language groups in a controllable manner that would help to reach a simple, comprehensive and more proper solutions to the users Each language using Arabic script should be treated separately a character set table per language Standardize accepted character set tables per language that would be used for all IDN TLD registries for the same language

59 Conclusions Scrip-based IDN TLD A single script IDN TLD (cctld or gtld) with multiple languages is not yet ready! the current experience accumulated during the last few years is not enough to clear the path for allowing language mixing under a single TLD. It needs to be studied very carefully and develop a universal character set table Devolving a unified accepted character set table for the whole script is a challenging task Needs cooperation among international bodies as well as language communities that are using the Arabic script

60 Thank you input[0] = U+0634 input[1] = U+06a9 input[2] = U+0631 input[3] = U+0627 input[0] = U+0634 input[1] = U+0643 input[2] = U+0631 input[3] = U+0627

Proposed Solution for Writing Domain Names in Different Arabic Script Based Languages

Proposed Solution for Writing Domain Names in Different Arabic Script Based Languages Proposed Solution for Writing Domain Names in Different Arabic Script Based Languages TF-AIDN, June/2014 Presented by AbdulRahman I. Al-Ghadir Researcher in SaudiNIC Content What we have done so far? Problem

More information

Arabic Text Segmentation

Arabic Text Segmentation Arabic Text Segmentation By Dr. Salah M. Rahal King Saud University-KSA 1 OCR for Arabic Language Outline Introduction. Arabic Language Arabic Language Features. Challenges for Arabic OCR. OCR System Stages.

More information

Nafees Nastaleeq v1.01 beta

Nafees Nastaleeq v1.01 beta Nafees Nastaleeq v1.01 beta Release Notes November 07, 2007 CENTER FOR RESEARCH IN URDU LANGUAGE PROCESSING NATIONAL UNIVERSITY OF COMPUTER AND EMERGING SCIENCES, LAHORE PAKISTAN Table of Contents 1 Introduction...4

More information

Character Set Supported by Mehr Nastaliq Web beta version

Character Set Supported by Mehr Nastaliq Web beta version Character Set Supported by Mehr Nastaliq Web beta version Sr. No. Character Unicode Description 1 U+0020 Space 2! U+0021 Exclamation Mark 3 " U+0022 Quotation Mark 4 # U+0023 Number Sign 5 $ U+0024 Dollar

More information

New Features in mpdf v5.6

New Features in mpdf v5.6 New Features in mpdf v5.6 HTML5 tags New tags introduced in HTML5 now have basic support in mpdf, and will thus support CSS style references. The following are treated as block elements similar to :

More information

1 See footnote 2, below.

1 See footnote 2, below. To: UTC From: Azzeddine Lazrek, Cadi Ayyad University, Marrakesh, Morocco (with Debbie Anderson, SEI, UC Berkeley, and with assistance from Murray Sargent, Laurentiu Iancu, and others) RE: Arabic Math

More information

Proposed keyboard layout for Swahili in Arabic script

Proposed keyboard layout for Swahili in Arabic script أ Proposed keyboard layout for Swahili in Arabic script Kevin Donnelly kevin@dotmon.com Version 0.1, March 2010 Introduction Swahili was originally written in Arabic script in its area of origin (the littoral

More information

Proposal for changes to ArabicShaping.txt to allow machine generation of Arabic fonts and glyphs. A. Generating Arabic glyphs from the Schematic Name

Proposal for changes to ArabicShaping.txt to allow machine generation of Arabic fonts and glyphs. A. Generating Arabic glyphs from the Schematic Name Proposal for changes to ArabicShaping.txt to allow machine generation of Arabic fonts and glyphs by Adil Allawi, Diwan Software Limited adil@diwan.com Introduction One of the big problems for Arabic text

More information

Nafees Nastaleeq v1.02 beta

Nafees Nastaleeq v1.02 beta Nafees Nastaleeq v1.02 beta Release Notes September 24, 2008 CENTER FOR RESEARCH IN URDU LANGUAGE PROCESSING NATIONAL UNIVERSITY OF COMPUTER AND EMERGING SCIENCES, LAHORE PAKISTAN Table of Contents 1 Introduction...4

More information

About SaudiNIC. What we have done. What is Next. Lessons learned

About SaudiNIC. What we have done. What is Next. Lessons learned About SaudiNIC What we have done What is Next Lessons learned SaudiNIC is a non-profit unit that is operated by Communication and Information Technology Commission (CITC) which is a semigovernmental entity.

More information

Multilingual Internet Arabic IDN

Multilingual Internet Arabic IDN Multilingual Internet Arabic IDN Arab Summer School on Internet Governance Cairo, June 27-30, 2009 Dr. Abdulaziz H. Al-Zoman Director of SaudiNIC - CITC Chairman of Steering Committee - Arabic Domain Name

More information

L2/11-033R 1 Introduction

L2/11-033R 1 Introduction To: UTC and ISO/IEC JTC1/SC2 WG2 Title: Proposal to add ARABIC MARK SIDEWAYS NOON GHUNNA From: Lorna A. Priest (SIL International) Date: 10 February 2011 1 Introduction ARABIC MARK SIDEWAYS NOON GHUNNA

More information

Others Symbols, Additional characters proposed to Unicode. Azzeddine Lazrek

Others Symbols, Additional characters proposed to Unicode. Azzeddine Lazrek JTC1/SC2/WG2 N 3088 Others Symbols, Additional characters proposed to Unicode Azzeddine Lazrek lazrek@ucam.ac.ma Cadi Ayyad University, Faculty of Sciences P.O. Box 2390, Marrakech, Morocco Phone: +212

More information

Modeling Nasta leeq Writing Style

Modeling Nasta leeq Writing Style Modeling Nasta leeq Writing Style Aamir Wali National University of Computer and Emerging Sciences Overview: Urdu اب پ ت ٹ ث ج چ ح خ د ڑ ڈ ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن ه ء ی ے ہ ں و In Urdu, a

More information

Center for Language Engineering Al-Khawarizmi Institute of Computer Science University of Engineering and Technology, Lahore

Center for Language Engineering Al-Khawarizmi Institute of Computer Science University of Engineering and Technology, Lahore Sarmad Hussain Sarmad Hussain Center for Language Engineering Al-Khawarizmi Institute of Computer Science University of Engineering and Technology, Lahore www.cle.org.pk sarmad@cantab.net Arabic Script

More information

Improved Method for Sliding Window Printed Arabic OCR

Improved Method for Sliding Window Printed Arabic OCR th Int'l Conference on Advances in Engineering Sciences & Applied Mathematics (ICAESAM'1) Dec. -9, 1 Kuala Lumpur (Malaysia) Improved Method for Sliding Window Printed Arabic OCR Prof. Wajdi S. Besbas

More information

About SaudiNIC. What we have done.. ( ) What is Next.. ( ) Lessons learned

About SaudiNIC. What we have done.. ( ) What is Next.. ( ) Lessons learned About SaudiNIC What we have done.. ( ) What is Next.. ( ) Lessons learned SaudiNIC is a non-profit unit that is operated by Communication and Information Technology Commission (CITC) which is a semi-governmental

More information

Domain Names in Pakistani Languages. IDNs for Pakistani Languages

Domain Names in Pakistani Languages. IDNs for Pakistani Languages ا ہ 6 5 a ز @ ں ب Domain Names in Pakistani Languages س a ی س a ب او اور را < ہ ر @ س a آف ا ر ا 6 ب 1 Domain name Domain name is the address of the web page pg on which the content is located 2 Internationalized

More information

Font Features for Lateef

Font Features for Lateef Font s for Lateef The Lateef font includes a number of optional features that provide alternative rendering that might be preferable for use in some contexts. The chart below enumerates the details of

More information

Umbrella. Branding & Guideline

Umbrella. Branding & Guideline Umbrella. Branding & Guideline OUR LOGO. OUR COLORS. #FFFFFF Font COLOR #2A3942 #64A0C6 mix color C: 75% M: 68% Y: 67% K: 90% H: 320 S:61% B:0 R:0 G:0 B:0 C: 75% M: 68% Y: 67% K: 90% H: 320 S:61% B:0 R:0

More information

SaudiNIC s Proposed Solution. MENOG 8, Khobar, May 14-18, 2011

SaudiNIC s Proposed Solution. MENOG 8, Khobar, May 14-18, 2011 SaudiNIC s Proposed Solution MENOG 8, Khobar, May 14-18, 2011 Arabic Script Major Issues Confusing Similar Characters Proposed solution Characteristics Language-level required tables Language-level required

More information

Proposal to encode productive Arabic-script modifier marks

Proposal to encode productive Arabic-script modifier marks Proposal to encode productive Arabic-script modifier marks Date: May 16, 2003 Author: Jonathan Kew, SIL International Kamal Mansour, Agfa Monotype Mark Davis, IBM Address: Horsleys Green High Wycombe Bucks

More information

ICANN IDN TLD Variant Issues Project. Presentation to the Unicode Technical Committee Andrew Sullivan (consultant)

ICANN IDN TLD Variant Issues Project. Presentation to the Unicode Technical Committee Andrew Sullivan (consultant) ICANN IDN TLD Variant Issues Project Presentation to the Unicode Technical Committee Andrew Sullivan (consultant) ajs@anvilwalrusden.com I m a consultant Blame me for mistakes here, not staff or ICANN

More information

REEM READYMIX Brand Guideline

REEM READYMIX Brand Guideline REEM READYMIX Brand Guideline Implementing Reem Readymix brand in communications V.I - February 2018 Introduction Reem Readymix is a leading supplier of all types of readymix concrete and cementbased plastering

More information

Recognition of secondary characters in handwritten Arabic using Fuzzy Logic

Recognition of secondary characters in handwritten Arabic using Fuzzy Logic International Conference on Machine Intelligence (ICMI 05), Tozeur, Tunisia, 2005 Recognition of secondary characters in handwritten Arabic using Fuzzy Logic Mohammed Zeki Khedher1 Ghayda Al-Talib2 1 Faculty

More information

Identity Guidelines. December 2012

Identity Guidelines. December 2012 Identity Guidelines December 2012 Identity Guidelines Contents 1.0 Our Logo Our logo Our wordmark Colour treatments Clear space, large and small sizes Correct logo placement Incorrect logo usage 2.0 Colour

More information

IDN - what s up? Patrik Fältström

IDN - what s up? Patrik Fältström IDN - what s up? Patrik Fältström paf@cisco.com 1 Old stuff (what is IDNA) What is it? What implications do we get? IDNA uses Unicode 3.2 2 Protocol issues Old protocols can only handle a subset of US-

More information

Minutes of Workshop May 15-16, 2009 Version 0.3

Minutes of Workshop May 15-16, 2009 Version 0.3 The second workshop on Internationalized Domain Names for Local Content Development in Pakistani Languages was organized by the Ministry of IT and Telecom on May 15-16, 2009 at the Center for Research

More information

OUR LOGO. SYMBOL LOGO SYMBOL LOGO ORIGINAL STRUCTURE

OUR LOGO. SYMBOL LOGO SYMBOL LOGO ORIGINAL STRUCTURE OUR LOGO. ORIGINAL STRUCTURE SYMBOL LOGO SYMBOL LOGO OUR COLORS. Infographic Color 2A3942 FED708 2A3942 E77525 804D9F CLEAR SPACE. PRINT SAFE AREA MINIMUM SIZE - PRINT H: 30 pt ONLINE SAFE AREA MINIMUM

More information

qatar national day 2017 brand guidelines 2017

qatar national day 2017 brand guidelines 2017 2017 brand guidelines 2017 the following guidelines demonstrate how best to apply the brand 2 CONTENTS 3 contents p5. vision & mission p7. logo p8. logo rationale p9. logo clear space p10. logo do s p11.

More information

ISO/IEC JTC 1/SC 2. Yoshiki MIKAMI, SC 2 Chair Toshiko KIMURA, SC 2 Secretariat JTC 1 Plenary 2012/11/05-10, Jeju

ISO/IEC JTC 1/SC 2. Yoshiki MIKAMI, SC 2 Chair Toshiko KIMURA, SC 2 Secretariat JTC 1 Plenary 2012/11/05-10, Jeju ISO/IEC JTC 1/SC 2 Yoshiki MIKAMI, SC 2 Chair Toshiko KIMURA, SC 2 Secretariat 2012 JTC 1 Plenary 2012/11/05-10, Jeju what is new Work items ISO/IEC 10646 2 nd ed. 3 rd ed. (2012) ISO/IEC 14651 Amd.1-2

More information

Strings 20/11/2018. a.k.a. character arrays. Strings. Strings

Strings 20/11/2018. a.k.a. character arrays. Strings. Strings ECE 150 Fundamentals of Programming Outline 2 a.k.a. character arrays In this lesson, we will: Define strings Describe how to use character arrays for strings Look at: The length of strings Copying strings

More information

THE LOGO Guidelines LOGO. Waste Free Environment Brand Guidelines

THE LOGO Guidelines LOGO. Waste Free Environment Brand Guidelines BRAND GUIDELINES THE LOGO Guidelines LOGO SYMBOL TYPEFACE 2 COLOR SCHEME When do I use the full-color logo? Use the full-color logo as frequently as possible to maximize and strengthen the brand. PRIMARY

More information

Segmentation and Recognition of Arabic Printed Script

Segmentation and Recognition of Arabic Printed Script Institute of Advanced Engineering and Science IAES International Journal of Artificial Intelligence (IJ-AI) Vol. 2, No. 1, March 2013, pp. 20~26 ISSN: 2252-8938 20 Segmentation and Recognition of Arabic

More information

Internationalized Domain Names Variant Issues Project

Internationalized Domain Names Variant Issues Project Internationalized Domain Names Variant Issues Project 1 P a g e 1. Background Internationalized Domain Names Variant Issues Project Arabic Variant TLD Issues and Requirements This document identifies issues

More information

A Proposed UNICODE-Based Extended Romanization System for Persian Texts. M. A. Mahdavi, Ph.D. Imam Khomeini International University, Iran

A Proposed UNICODE-Based Extended Romanization System for Persian Texts. M. A. Mahdavi, Ph.D. Imam Khomeini International University, Iran International Journal of Information Science and Management A Proposed UNICODE-Based Extended Romanization System for Persian Texts M. A. Mahdavi, Ph.D. Imam Khomeini International University, Iran Email:

More information

IDN - the protocol. Patrik Fältström

IDN - the protocol. Patrik Fältström IDN - the protocol Patrik Fältström paf@cisco.com 1 In the beginning 3454 Preparation of Internationalized Strings ("stringprep"). P. Hoffman, M. Blanchet. December 2002. (Format: TXT=138684 bytes) (Status:

More information

Arabic Script IDN Working Group (ASIWG)

Arabic Script IDN Working Group (ASIWG) Arabic Script IDN Working Group (ASIWG) ICANN Paris IDN Workshop 26 Jun08 Spot The Difference They Look The Same To Us But Not To A Computer When 1 is not 1 The Arabic Language is only a part of the Arabic

More information

SCALE-SPACE APPROACH FOR CHARACTER SEGMENTATION IN SCANNED IMAGES OF ARABIC DOCUMENTS

SCALE-SPACE APPROACH FOR CHARACTER SEGMENTATION IN SCANNED IMAGES OF ARABIC DOCUMENTS 31 st December 016. Vol.94. No. 005-016 JATIT & LLS. All rights reserved. ISSN: 199-8645 www.jatit.org E-ISSN: 1817-3195 SCALE-SPACE APPROACH FOR CHARACTER SEGMENTATION IN SCANNED IMAGES OF ARABIC DOCUMENTS

More information

Internationalized Domain Names

Internationalized Domain Names Internationalized Domain Names Introduction & Update MENOG 1 Bahrain April 3-5, 2007 By: Baher Esmat Middle East Liaison IP and DNS Internet 207.248.168.180 ISP icann.org 192.0.34.163 ISP Backbone ISP

More information

VOL. 3, NO. 7, Juyl 2012 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.

VOL. 3, NO. 7, Juyl 2012 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved. Arabic Hand Written Character Recognition Using Modified Multi-Neural Network Farah Hanna Zawaideh Irbid National University, Computer Information System Department dr.farahzawaideh@inu.edu.jo ABSTRACT

More information

Overview & Update. Manager, Regional Relations Middle East. Internet Festival Hammamet, Tunis August 2008

Overview & Update. Manager, Regional Relations Middle East. Internet Festival Hammamet, Tunis August 2008 Internationalized ti Domain Names: Overview & Update Baher Esmat Manager, Regional Relations Middle East Internet Festival Hammamet, Tunis August 2008 1 Introduction to IDNs IDN stands for Internationalized

More information

ICANN November Tina Dam Director, IDN Program

ICANN November Tina Dam Director, IDN Program ICANN 33 6 November 2008 Tina Dam Director, IDN Program IDN SLD registrations since 2001 (testbed) 2003(protocol) 2 IDNs what a year! Fast Track Draft Plan for public comments Outstanding key issues: Relation

More information

JTC1/SC2/WG2 N Introduction

JTC1/SC2/WG2 N Introduction JTC1/SC2/WG2 N3882 To: UTC and ISO/IEC JTC1/SC2 WG2 Title: Proposal to add Arabic script characters for African and Asian languages From: Lorna A. Priest, Martin Hosken (SIL International) Date: 12 August

More information

The right hehs for Arabic script orthographies of Sorani Kurdish and Uighur

The right hehs for Arabic script orthographies of Sorani Kurdish and Uighur The right hehs for Arabic script orthographies of Sorani Kurdish and Uighur Roozbeh Pournader, Google Inc. May 8, 2014 Summary The Arabic letter heh has some variants in the Unicode Standard, which has

More information

BRAND GUIDELINES JANUARY 2017

BRAND GUIDELINES JANUARY 2017 BRAND GUIDELINES JANUARY 2017 GETTING AROUND Page 03 05 06 07 08 09 10 12 14 15 Section 01 - Our Logo 02 - Logo Don ts 03 - Our Colors 04 - Our Typeface 06 - Our Art Style 06 - Pictures 07 - Call to Action

More information

Trusted Future of Internet with IDN and UA

Trusted Future of Internet with IDN and UA Trusted Future of Internet with IDN and UA GFCE Delhi 12 th Oct 2018 Dr. Ajay Data Co-Chair NBGP Coordinator EAI UASG ccnso Council Member Oct25th 1 ASCII Domain Name Label www.cafe-123.com Third Level

More information

The Unicode Standard Version 6.0 Core Specification

The Unicode Standard Version 6.0 Core Specification The Unicode Standard Version 6.0 Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers

More information

Universal Acceptance Technical Perspective. Universal Acceptance

Universal Acceptance Technical Perspective. Universal Acceptance Universal Acceptance Technical Perspective Universal Acceptance Warm-up Exercise According to w3techs, which of the following pie charts most closely represents the fraction of websites on the Internet

More information

ISeCure. The ISC Int'l Journal of Information Security. High Capacity Steganography Tool for Arabic Text Using Kashida.

ISeCure. The ISC Int'l Journal of Information Security. High Capacity Steganography Tool for Arabic Text Using Kashida. The ISC Int'l Journal of Information Security July 2010, Volume 2, Number 2 (pp. 107 118) http://www.isecure-journal.org High Capacity Steganography Tool for Arabic Text Using Kashida Adnan Abdul-Aziz

More information

ISO/IEC JTC/SC2/WG Universal Multiple Octet Coded Character Set (UCS)

ISO/IEC JTC/SC2/WG Universal Multiple Octet Coded Character Set (UCS) ISO/IEC JTC/SC2/WG2 ---------------------------------------------------------------------------- Universal Multiple Octet Coded Character Set (UCS) -------------------------------------------------------------------------------

More information

SHARP-EDGES METHOD IN ARABIC TEXT STEGANOGRAPHY

SHARP-EDGES METHOD IN ARABIC TEXT STEGANOGRAPHY SHARP-EDGES METHOD IN ARABIC TEXT STEGANOGRAPHY 1 NUUR ALIFAH ROSLAN, 2 RAMLAN MAHMOD, NUR IZURA UDZIR 3 1 Department of Multimedia, FSKTM, UPM, Malaysia-43400 2 Prof, Department of Multimedia, FSKTM,

More information

Blending Content for South Asian Language Pedagogy Part 2: South Asian Languages on the Internet

Blending Content for South Asian Language Pedagogy Part 2: South Asian Languages on the Internet Blending Content for South Asian Language Pedagogy Part 2: South Asian Languages on the Internet A. Sean Pue South Asia Language Resource Center Pre-SASLI Workshop 6/7/09 1 Objectives To understand how

More information

It Internationalized ti Domain Names W3C Track: An International Web

It Internationalized ti Domain Names W3C Track: An International Web It Internationalized ti Domain Names W3C Track: An International Web Tina Dam ICANN Director, IDN Program tina.dam@icann.org 17th International World Wide Web Conference, WWW2008 Beijing International

More information

A MELIORATED KASHIDA-BASED APPROACH FOR ARABIC TEXT STEGANOGRAPHY

A MELIORATED KASHIDA-BASED APPROACH FOR ARABIC TEXT STEGANOGRAPHY A MELIORATED KASHIDA-BASED APPROACH FOR ARABIC TEXT STEGANOGRAPHY Ala'a M. Alhusban and Jehad Q. Odeh Alnihoud Computer Science Dept, Al al-bayt University, Mafraq, Jordan ABSTRACT Steganography is an

More information

IDN and Variants. Presented at the Yeti DNS Workshop. Marc Blanchet Viagénie

IDN and Variants. Presented at the Yeti DNS Workshop. Marc Blanchet Viagénie IDN and Variants Presented at the Yeti DNS Workshop Marc Blanchet Viagénie marc.blanchet@viagenie.ca 2016-11-12 Context DNS does strict identifier mapping Humans using languages and scripts do fuzzy mappings

More information

DESIGNING OFFLINE ARABIC HANDWRITTEN ISOLATED CHARACTER RECOGNITION SYSTEM USING ARTIFICIAL NEURAL NETWORK APPROACH. Ahmed Subhi Abdalkafor 1*

DESIGNING OFFLINE ARABIC HANDWRITTEN ISOLATED CHARACTER RECOGNITION SYSTEM USING ARTIFICIAL NEURAL NETWORK APPROACH. Ahmed Subhi Abdalkafor 1* International Journal of Technology (2017) 3: 528-538 ISSN 2086-9614 IJTech 2017 DESIGNING OFFLINE ARABIC HANDWRITTEN ISOLATED CHARACTER RECOGNITION SYSTEM USING ARTIFICIAL NEURAL NETWORK APPROACH Ahmed

More information

USING COMBINATION METHODS FOR AUTOMATIC RECOGNITION OF WORDS IN ARABIC

USING COMBINATION METHODS FOR AUTOMATIC RECOGNITION OF WORDS IN ARABIC USING COMBINATION METHODS FOR AUTOMATIC RECOGNITION OF WORDS IN ARABIC 1 AHLAM MAQQOR, 2 AKRAM HALLI, 3 KHALID SATORI, 4 HAMID TAIRI 1,2,3,4 Sidi Mohamed Ben Abdellah University, Faculty of Science Dhar

More information

1. Brand Identity Guidelines.

1. Brand Identity Guidelines. 1. Brand Identity Guidelines 1.1 HCT Logo 2. Secondary left aligned for English language literature 1. Primary centre aligned stacked formal 3. Secondary right aligned for Arabic language literature 4.

More information

Online Arabic Handwritten Character Recognition Based on a Rule Based Approach

Online Arabic Handwritten Character Recognition Based on a Rule Based Approach Journal of Computer Science 2012, 8 (11), 1859-1868 ISSN 1549-3636 2012 doi:10.3844/jcssp.2012.1859.1868 Published Online 8 (11) 2012 (http://www.thescipub.com/jcs.toc) Online Arabic Handwritten Character

More information

Different Input Systems for Different Devices

Different Input Systems for Different Devices Different Input Systems for Different Devices Optimized Urdu Touch-Screen Keypad Designs Asad Habib, Masakazu Iwatate, Masayuki Asahara and Yuji Matsumoto Graduate School of Information Science Nara Institute

More information

Internationalized Domain Names

Internationalized Domain Names Internationalized Domain Names Fahd Batayneh Middle East DNS Forum 2018 26 April 2018 Agenda 1 2 3 ICANN s IDN Program Universal Acceptance Initiative Task Force on Arabic Script IDNs (TF- AIDN) 2 ICANN

More information

A Comparative Study of PDF Generation Methods:

A Comparative Study of PDF Generation Methods: MTR110043 MITRE TECHNICAL REPORT A Comparative Study of PDF Generation Methods: Measuring Loss of Fidelity When Converting Arabic and Persian MS Word Files to PDF Contract No.: W15P7T-11-C-F600 Project

More information

Internationalized Domain Names New gtld Program

Internationalized Domain Names New gtld Program Internationalized Domain Names New gtld Program Doug Brent Chief Operating Officer Hong Kong 24 July 2009 Karla Valente Director New gtld Program 0 Agenda Internationalized Domain Names (IDNs) defined

More information

2011 International Conference on Document Analysis and Recognition

2011 International Conference on Document Analysis and Recognition 20 International Conference on Document Analysis and Recognition On-line Arabic Handwrittenn Personal Names Recognition System based b on HMM Sherif Abdelazeem, Hesham M. Eraqi Electronics Engineering

More information

THE LOGO Guidelines LOGO. Waste Free Environment Brand Guidelines

THE LOGO Guidelines LOGO. Waste Free Environment Brand Guidelines BRAND GUIDELINES THE LOGO Guidelines LOGO SYMBOL TYPEFACE 2 COLOR SCHEME When do I use the full-color logo? Use the full-color logo as frequently as possible to maximize and strengthen the brand. PRIMARY

More information

American University of Beirut Logo and Visual Identity Manual. April 2011 version 1.0

American University of Beirut Logo and Visual Identity Manual. April 2011 version 1.0 American University of Beirut Logo and Visual Identity Manual April 2011 version 1.0 Contents Introduction Why a Visual Identity System...4 Visual Identity Policy...4 AUB Logo AUB Logo...6 General Guidelines...7

More information

GEOMETRIC-TOPOLOGICAL BASED ARABIC CHARACTER RECOGNITION, A NEW APPROACH

GEOMETRIC-TOPOLOGICAL BASED ARABIC CHARACTER RECOGNITION, A NEW APPROACH GEOMETRIC-TOPOLOGICAL BASED ARABIC CHARACTER RECOGNITION, A NEW APPROACH HAMED TIRANDAZ, MOHSEN AHMADNIA AND HAMIDREZA TAVAKOLI Electrical and Computer Engineering Department, Hakim Sabzevari University,

More information

3 Qurʾānic typography Qurʾānic typography involves getting the following tasks done.

3 Qurʾānic typography Qurʾānic typography involves getting the following tasks done. TUGboat, Volume 31 (2010), No. 2 197 Qurʾānic typography comes of age: Æsthetics, layering, and paragraph optimization in ConTEXt Idris Samawi Hamid 1 The background of Oriental TEX Attempts to integrate

More information

Research Article Offline Handwritten Arabic Character Recognition Using Features Extracted from Curvelet and Spatial Domains

Research Article Offline Handwritten Arabic Character Recognition Using Features Extracted from Curvelet and Spatial Domains Research Journal of Applied Sciences, Engineering and Technology 11(2): 158-164, 2015 DOI: 10.19026/rjaset.11.1702 ISSN: 2040-7459; e-issn: 2040-7467 2015 Maxwell Scientific Publication Corp. Submitted:

More information

The GCC Pilot Project for Arabic Domain Names..kw.qa.om.sa.bh.ae

The GCC Pilot Project for Arabic Domain Names..kw.qa.om.sa.bh.ae The GCC Pilot Project for Arabic Domain Names Raed Al-Fayez Head of the GCC Pilot Project Technical Taskforce SaudiNIC raed@isu.net.sa.kw.qa.om.sa.bh.ae Agenda Characteristics of A Domain Name IDN and

More information

Arabic Domain Names (ADN) Pilot Project

Arabic Domain Names (ADN) Pilot Project Joint UNESCO and ITU Global Symposium on Promoting the Multilingual Internet Arabic Domain Names (ADN) Pilot Project Imad Al-Sabouni Advisor to the Minister of Communications and Technology, Syria Vice

More information

Introduction. Requests. Background. New Arabic block. The missing characters

Introduction. Requests. Background. New Arabic block. The missing characters 2009-11-05 Title: Action: Author: Proposal to encode four combining Arabic characters for Koranic use For consideration by UTC and ISO/IEC JTC1/SC2/WG2 Roozbeh Pournader Date: 2009-11-05 Introduction Although

More information

L2/09-358R Introduction. Recommendation. Background. Rub El Hizb Symbol. For discussion at UTC and by experts. No action is requested.

L2/09-358R Introduction. Recommendation. Background. Rub El Hizb Symbol. For discussion at UTC and by experts. No action is requested. L2/09-358R 2009-10-28 Title: Action: Authors: Discussion document for polishing Koranic support in Unicode For discussion at UTC and by experts. No action is requested. Roozbeh Pournader Date: 2009-10-28

More information

NF-SAVO: Neuro-Fuzzy system for Arabic Video OCR

NF-SAVO: Neuro-Fuzzy system for Arabic Video OCR NF-SAVO: Neuro-Fuzzy system for Arabic Video OCR Mohamed Ben Halima, Hichem karray, Adel. M. Alimi REGIM: REsearch Group on Intelligent Machines University of Sfax, National School of Engineers (ENIS)

More information

Persian/Arabic Baffletext CAPTCHA 1

Persian/Arabic Baffletext CAPTCHA 1 Journal of Universal Computer Science, vol. 12, no. 12 (2006), 1783-1796 submitted: 20/3/06, accepted: 22/12/06, appeared: 28/12/06 J.UCS Persian/Arabic Baffletext CAPTCHA 1 Mohammad Hassan Shirali-Shahreza

More information

INTERNATIONAL STANDARD

INTERNATIONAL STANDARD INTERNATIONAL STANDARD First edition 1996-l 2-l 5 Information and documentation - Extension of the Arabic alphabet coded character set for bibliographic information interchange information et documentation

More information

Internationalized Domain Names an introduction

Internationalized Domain Names an introduction Internationalized Domain Names an introduction Tina Dam Director, IDN Program 1 March 2009 Agenda Where are we and where are we headed IDN TLD Processes IDN Definitions How does IDNs work including examples

More information

Nastaleeq: A challenge accepted by Omega

Nastaleeq: A challenge accepted by Omega Nastaleeq: A challenge accepted by Omega Atif Gulzar, Shafiq ur Rahman Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences, Lahore, Pakistan atif dot

More information

Proposed Draft: Unicode Technical Report #53 UNICODE ARABIC MARK ORDERING ALGORITHM

Proposed Draft: Unicode Technical Report #53 UNICODE ARABIC MARK ORDERING ALGORITHM UNICODE ARABIC MARK ORDERING ALGORITHM Authors Roozbeh Pournader ( roozbeh@unicode.org ), Bob Hallissy ( bob_hallissy@sil.org ), Lorna Evans ( lorna_evans@sil.org ) Date 2017-10-06 This version Previous

More information

Award Winning Typefaces by Linotype

Award Winning Typefaces by Linotype Newly released fonts and Midan awarded coveted design prize Award Winning Typefaces by Linotype Bad Homburg, 23 April 2007. Linotype has once again received critical recognition for their commitment to

More information

Overview. Coordinating with our partners, we help make the Internet work.

Overview. Coordinating with our partners, we help make the Internet work. ICANN Update Champika Wijayatunga Regional Security Engagement Manager Asia Pacific TWNIC OPM / TWNOG 27-28 November 2018 1 Overview Coordinating with our partners, we

More information

ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS 1. T Uhttp://www.dkuug.dk/JTC1/SC2/WG2/docs/summaryform.

ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS 1. T Uhttp://www.dkuug.dk/JTC1/SC2/WG2/docs/summaryform. T P P T Form T Uhttp://www.unicode.org/Public/UNIDATA/UCD.html T Uhttp://www.dkuug.dk/JTC1/SC2/WG2/docs/roadmaps.html T Uhttp://www.unicode.org T Uhttp://www.dkuug.dk/JTC1/SC2/WG2/docs/summaryform.html

More information

PCL Greek-8 - Code Page 869

PCL Greek-8 - Code Page 869 PCL Greek-8 - Code Page 869 Page 1 of 5 PCL Symbol Se t: 8G Unicode glyph correspondence tables. Contact:help@redtitan.com http://pcl.to $20 U0020 Space $90 U038A Ê Greek capita l letter iota with tonos

More information

Main Brandmark. Alternative option 1: White torch and white logotype on orange background

Main Brandmark. Alternative option 1: White torch and white logotype on orange background Style Guide Main Brandmark Torch Logotype Preferred brandmark The RFE/RL brandmark consists of two elements 1) the torch and 2) the Radio Free Europe/ Radio Liberty logotype. These two elements should

More information

Internationalized Domain Names: Technical Workshop

Internationalized Domain Names: Technical Workshop Internationalized Domain Names: Technical Workshop Prepared for: LACTLD Meeting Panama City, Panama 4 September 2008 Tina Dam Director, IDN Program tina.dam@icann.org 1 Workshop agenda Review of agenda

More information

Feature Extraction Techniques of Online Handwriting Arabic Text Recognition

Feature Extraction Techniques of Online Handwriting Arabic Text Recognition 2013 5th International Conference on Information and Communication Technology for the Muslim World. Feature Extraction Techniques of Online Handwriting Arabic Text Recognition Mustafa Ali Abuzaraida 1,

More information

The Unicode Standard Version 6.1 Core Specification

The Unicode Standard Version 6.1 Core Specification The Unicode Standard Version 6.1 Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers

More information

SHEFA STORE CORPORATE DESIGN MANUAL BRAND & FUNCTION // CORPORATE DESIGN GUIDELINES. 01 : Corporate Identity. 02 : Corporate Stationery

SHEFA STORE CORPORATE DESIGN MANUAL BRAND & FUNCTION // CORPORATE DESIGN GUIDELINES. 01 : Corporate Identity. 02 : Corporate Stationery BRAND & FUNCTION // CORPORATE DESIGN GUIDELINES SHEFA STORE CORPORATE DESIGN MANUAL 01 : Corporate Identity 02 : Corporate Stationery 03 : Interactive Designs www.shefa-store.com Corporate Identity 01

More information

IDN developments at IRNIC.ir cctld Alireza Saleh

IDN developments at IRNIC.ir cctld   Alireza Saleh IDN developments at IRNIC.ir cctld http://www.nic.ir Alireza Saleh alireza@irnic.ir Global Symposium on Promoting the Multilingual Internet @ Geneva/Switzerland Special thanks to : Dr. Siavash Shahshahani

More information

ARABIC LETTER BEH WITH SMALL ARABIC LETTER MEEM ABOVE MBEH (BEH WITH MEEM ABOVE) Represents mba sound

ARABIC LETTER BEH WITH SMALL ARABIC LETTER MEEM ABOVE MBEH (BEH WITH MEEM ABOVE) Represents mba sound Universal Multiple-Octet Coded Character Set International Organization for Standardization FOrganisation internationale de normalisation Международная организация по стандартизации Doc Type: Working Group

More information

Enabling Complex Asian Scripts on Mobile Devices

Enabling Complex Asian Scripts on Mobile Devices Enabling Complex Asian Scripts on Mobile Devices Waqar Ahmad Computer Science Department, National University of Computer and Emerging Sciences, Lahore, Pakistan waqar.ahmad@nu.edu.pk Sarmad Hussain Center

More information

Arabic Diacritics Based Steganography Mohammed A. Aabed, Sameh M. Awaideh, Abdul-Rahman M. Elshafei and Adnan A. Gutub

Arabic Diacritics Based Steganography Mohammed A. Aabed, Sameh M. Awaideh, Abdul-Rahman M. Elshafei and Adnan A. Gutub Arabic Diacritics Based Steganography Mohammed A. Aabed, Sameh M. Awaideh, Abdul-Rahman M. Elshafei and Adnan A. Gutub King Fahd University of Petroleum and Minerals Computer Engineering Department Dhahran

More information

Urdu Usage Guide ر ہاردو

Urdu Usage Guide ر ہاردو ب" Urdu Usage Guide و 6 اردو ر This guide will explain how to use Urdu on the computer. After Urdu support has been installed and it is possible to toggle between English and Urdu, two additional items

More information

Developing a Real Time Method for the Arabic Heterogonous DBMS Transformation

Developing a Real Time Method for the Arabic Heterogonous DBMS Transformation Developing a Real Time Method for the Arabic Heterogonous DBMS Transformation S. M. Hadi, S. Murtatha Department of Information & Comm. Eng. College of Engineering Al- Khawarizmi,University of Baghdad

More information

Introduction to International Domain Names for Applications (IDNA)

Introduction to International Domain Names for Applications (IDNA) White Paper Introduction to International Domain Names for Applications (IDNA) diamondip.com by Timothy Rooney Product management director BT Diamond IP for Applications (IDNA) By Tim Rooney, Director,

More information

Standardizing the order of Arabic combining marks

Standardizing the order of Arabic combining marks UTC Document Register L2/14-127 Standardizing the order of Arabic combining marks Roozbeh Pournader, Google Inc. May 2, 2014 Summary The combining class of the combining characters used in the Arabic script

More information

The IDN Variant TLD Program: Updated Program Plan 23 August 2012

The IDN Variant TLD Program: Updated Program Plan 23 August 2012 The IDN Variant TLD Program: Updated Program Plan 23 August 2012 Table of Contents Project Background... 2 The IDN Variant TLD Program... 2 Revised Program Plan, Projects and Timeline:... 3 Communication

More information

Internationalization of a Distance Exam Web Environment

Internationalization of a Distance Exam Web Environment Internationalization of a Distance Exam Web Environment Radouane Mrabet Ecole Nationale Supérieure d Informatique et d Analyse des Systèmes (Rabat, Morocco). Tele- Teaching Laboratory. Email: mrabet@ensias.ma

More information

Introduction to Search and Recommendation

Introduction to Search and Recommendation Introduction to Search and Recommendation Yi Zhang Associate Professor Information Retrieval and Knowledge Management Lab Graduate Director, Technology and Information Management University of California

More information