Multilingual Internet Arabic IDN


1 Multilingual Internet Arabic IDN Arab Summer School on Internet Governance Cairo, June 27-30, 2009 Dr. Abdulaziz H. Al-Zoman Director of SaudiNIC - CITC Chairman of Steering Committee - Arabic Domain Name Pilot Project

2 "Must we all learn English so that we can use the Internet? Is that the only option?" -- Dr. Tan Tin Wee, National University of Singapore
"Internationalization of the internet means that the internet is equally accessible from all languages and scripts" -- Tina Dam, Director, IDN Program, ICANN
"When we talk about Internet for all, we have to go beyond the people who speak English" -- Manal Ismail, vice chair of the GAC

3 Agenda
Part I: What is a Domain Name
Part II: The Needs for Multilingualization
Part III: Internationalized Domain Names
Part IV: Contribution Methodology
Part V: The Arabic Language - What has been done so far?
Part VI: Let us expand and look at the whole script - What are the issues?
Part VII: Arabic Script IDN Working Group (ASIWG)
Part VIII: IDN ccTLD Fast Track

4 About us SaudiNIC Administering the Saudi domain name space (.sa) since Operated by Communication and Information Technology Commission (CITC) governmental org. Leading the local community effort towards supporting Arabic language in DNS Head of the ADNPP Steering Committee Head of the ADNPP Technical Committee Arab Team for Arabic Domain Names Formed under the auspices of Arab League, 2005 Supervising ADNPP: Arabic Domain Name Pilot Project

5 Part I: What is a Domain Name?

6 In the beginning The First 4 Nodes of the Internet in 1969 then called the ARPANET Source:

7 now Source:

8 Domain Names
Domain names are the familiar, easy-to-remember names for computers on the Internet (RFC 1035).
A domain name consists of labels separated by dots: top-level domain label, second-level domain label, third-level domain label.
A fully qualified domain name has at most 255 characters.
A label may have up to 63 characters.
Maximum number of labels: 127
Accepted ASCII character set: a-z, 0-9, -
I want to access SaudiNIC web site:
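The limits on this slide can be checked mechanically. The following is a minimal Python sketch, assuming the 255/63 limits and the a-z, 0-9, hyphen character set quoted above; the function name `is_valid_ascii_domain` is illustrative, and the rule that a label may not start or end with a hyphen is an added assumption taken from common hostname practice, not from the slide.

```python
import re

MAX_NAME_LENGTH = 255   # whole-name limit quoted on the slide
MAX_LABEL_LENGTH = 63   # per-label limit quoted on the slide

# Letters, digits, hyphen (LDH). Assumption: no leading or trailing hyphen.
LDH_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")


def is_valid_ascii_domain(name: str) -> bool:
    """Return True if every label of an ASCII domain name fits the limits."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # A single trailing dot marks the root and is not a label of its own.
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".")
    return all(
        len(label) <= MAX_LABEL_LENGTH and LDH_LABEL.match(label) is not None
        for label in labels
    )


print(is_valid_ascii_domain("www.nic.net.sa"))    # True
print(is_valid_ascii_domain("bad_label.sa"))      # False: underscore is not LDH
print(is_valid_ascii_domain("a" * 64 + ".sa"))    # False: label longer than 63
```

An Arabic label fails this check by construction, which is the gap the rest of the deck addresses.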

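9 The DNS Tree [Diagram: the DNS tree, starting at the root zone file and branching into TLDs and lower-level labels. Labels shown: sa, uk, com, org, edu, com, net, icann, nic, ns, www.]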

10 Top-Level Domains
Generic top-level domains (gTLD): .com .org .net .mil .edu .gov .int .info .biz
Country-code top-level domains (ccTLD, 2 letters): there are ~250 ccTLDs, for example:
.ae United Arab Emirates
.bh Bahrain
.kw Kuwait
.de Germany
.eg Egypt
.in India
.pk Pakistan
.sa Saudi Arabia
.uk United Kingdom
.za South Africa

11 Domain Name System (DNS) We use domain names; machines use IP addresses. So how are names resolved to addresses? Resolving domain names to IP addresses (and vice versa) is done automatically by infrastructure services provided through the Domain Name System (DNS). Source: learnthenet.com

12 [Diagram: name resolution path. Hosts shown: ns1.mcit.gov.sa (local DNS resolver, on the LAN); D.ROOT-SERVERS.NET, ns1.isu.net.sa and ns.citc.gov.sa (DNS servers on the Internet).]
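The referral chain in this diagram can be illustrated with a toy model. This is a sketch only: it uses no real DNS queries, the server names are the ones shown on the slide, and the assumed roles of those servers, the queried name `www.citc.gov.sa` and the address `192.0.2.10` (a documentation-only address) are hypothetical.

```python
# Hypothetical referral data: which server each zone is delegated to.
DELEGATIONS = {
    "D.ROOT-SERVERS.NET": {"sa": "ns1.isu.net.sa"},
    "ns1.isu.net.sa": {"citc.gov.sa": "ns.citc.gov.sa"},
}

# Hypothetical final answers held by the authoritative server.
ANSWERS = {
    "ns.citc.gov.sa": {"www.citc.gov.sa": "192.0.2.10"},
}


def resolve(name, server="D.ROOT-SERVERS.NET"):
    """Follow referrals from the root until some server knows the address."""
    path = [server]
    while True:
        if name in ANSWERS.get(server, {}):
            return ANSWERS[server][name], path
        # Look for a delegated zone that the queried name falls under.
        referral = next(
            (nxt for zone, nxt in DELEGATIONS.get(server, {}).items()
             if name == zone or name.endswith("." + zone)),
            None,
        )
        if referral is None:
            raise LookupError(f"{server} cannot answer or refer {name}")
        server = referral
        path.append(server)


address, path = resolve("www.citc.gov.sa")
print(address)
print(" -> ".join(path))
```

A local resolver does this walk on behalf of the user, which is why the user only ever types the name.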

13 Root Servers
13 root servers together contain authoritative databases listing all top-level domains (e.g., org, net, uk, ae, sa).
Root server names have the form letter.root-servers.net, e.g., m.root-servers.net.
name  org                   city                    type
a     NSI                   Herndon, VA, US         com
b     USC-ISI               Marina del Rey, CA, US  edu
c     PSInet                Herndon, VA, US         com
d     U of Maryland         College Park, MD, US    edu
e     NASA                  Mt View, CA, US         usg
f     Internet Software C.  Palo Alto, CA, US       com
g     DISA                  Vienna, VA, US          usg
h     ARL                   Aberdeen, MD, US        usg
i     NORDUnet              Stockholm, SE           int
j     NSI (TBD)             Herndon, VA, US         (com)
k     RIPE                  London, UK              int
l     ICANN                 Marina del Rey, CA, US  org
m     WIDE                  Tokyo, JP               edu

14 Part II: The Needs for Multilingualization

15 What is Multilingualization? As the Internet continues to grow, many people around the world wish to go online using their native languages. "Making the Internet accessible to all peoples in their own native languages, regardless of what an individual's language may be." Khalid Fattal - CEO (MINC)

16 Demands for Multilingualism The availability of information in local languages and the development of local content are key elements in promoting multilingualism on the Internet. Globalization of the Internet has resulted in a growing number of users who are not familiar with ASCII (English). Native speakers of many languages (e.g., Arabic, Persian, Urdu, Chinese, Japanese, Korean, Russian) who use non-ASCII scripts are at a disadvantage. [Chart: language growth (%) for English, Chinese, Spanish, Japanese, French, German, Arabic, Portuguese, Korean, Italian; values shown: 755%, 668%, 405%, 459%, 204%, 100%, 121%, 83%, 163%.] Source:

17 Internet in the Arab World Population of the Arab world: 343 M (5% of world population). Arab Internet users represent 3.2% of world users. Average Internet penetration in the Arab world < 13%. Language is considered a barrier for many (>70% are using an Arabic OS). [Chart: percentage by Arab country, axis 0% to 60%: Bahrain, Iraq, Jordan, Kuwait, Lebanon, Oman, Palestine, Qatar, Saudi Arabia, Syria, UAE, Yemen, Algeria, Comoros, Egypt, Libya, Mauritania, Morocco, Somalia, Sudan, Tunisia, Total.]

18 International Movements
First phase of WSIS ( )
Declaration of Principles
Internet should take into account multilingualism (B.48)
Applications should be adapted to local needs in languages and cultures (B.51)
Information society should be founded on respect for cultural identity, [as well as] cultural and linguistic diversity (B.52)
Action Plan
Put into place technical conditions to facilitate the presence and use of all world languages on the internet (AP-B.6i)
Create policies that support, respect, preserve, promote, and enhance cultural and linguistic diversity within the Information Society (AP-C.23a)
Enhance the capacity of indigenous peoples to develop and access content in their own languages (AP-C.23k)
Source:

19 International Movements UNESCO General Conference 2003 Recommendation concerning the Promotion and Use of Multilingualism and Universal Access to Cyberspace The public and private sectors and the civil society at local, national, regional and international levels should work to provide the necessary resources and take the necessary measures to alleviate language barriers and promote human interaction on the Internet by encouraging the creation and processing of, and access to, educational, cultural and scientific content in digital form, so as to ensure that all cultures can express themselves and have access to cyberspace in all languages, including indigenous ones. Member States and international organizations should encourage and support capacity-building for the production of local and indigenous content on the Internet. Member States should formulate appropriate national policies on the crucial issue of language survival in cyberspace, designed to promote the teaching of languages, including mother tongues, in cyberspace. International support and assistance to developing countries should be strengthened and extended to facilitate the development of freely accessible materials on language education in electronic form and to the enhancement of human capital skills in this area. Member States, international organizations and information and communication technology industries should encourage collaborative participatory research and development on, and local adaptation of, operating systems, search engines and web browsers with extensive multilingual capabilities, online dictionaries and terminologies. They should support international cooperative efforts with regard to automated translation services accessible to all, as well as intelligent linguistic systems such as those performing multilingual information retrieval, summarizing/abstracting and speech understanding, while fully respecting the right of translation of authors. 
UNESCO, in cooperation with other international organizations, should establish a collaborative online observatory on existing policies, regulations, technical recommendations, and best practices relating to multilingualism and multilingual resources and applications, including innovations in language computerization.

20 International Movements IGF - Reaching the Next Billion. Some important dialog topics (Hyderabad, December 2008): the importance of having content in local languages; the importance of localization and the availability of tools, both software and hardware, such as keyboards and other devices, search engines, browsers, and translation tools, which should be available in multiple languages; efforts to internationalize domain names, with their technological difficulties and complex policy and political aspects. According to speakers at IGF in Hyderabad, India: the Internet must support the large number of languages in the world at all levels, including content, hardware, software, and internationalized domain names, if it is to reach the next billion people. Source:

21 Part III: Internationalized Domain Names (IDN)

22 Internationalized Domain Names "IDN represent a step toward an Internet that is equally accessible from all languages and scripts, they, at best, address only a small part of that very broad objective." - RFC 4690 [Diagram: Multilingualization, with components Software, IDN, Contents, Hardware.]

23 Why is it needed? Current ASCII-based domain names cannot represent Arabic characters. Arabic sites are difficult to reach using English domain names (pronunciation and spelling problems). Full Arabic domain names will encourage Arab users to use the Internet more widely. Examples: Arabic newspaper (صحيفة الشرق الأوسط), e-government site (يسر).

24 IDN is There are many technical challenges standing against accessing the Internet in non-Latin script languages, since the Internet was initially deployed to support only ASCII characters. The Domain Name System (DNS) was not built for multilingualism. Only ASCII characters have been used, even by people using non-ASCII characters daily. Allowed characters (LDH): ASCII letters [A-Za-z], digits [0-9], hyphen [-]. Hence, an Internationalized Domain Name is a technical challenge: representing a domain name with non-ASCII characters (i.e., using Unicode-based labels), for example هيئة-الاتصالات.السعودية

25 Technical Considerations Key technical issues: representation of non-ASCII codes; support location of non-ASCII codes (client side or DNS server side); mapping non-ASCII domain names to current DNS technology. Basic technical requirements: preservation of compatibility with current domain names; preservation of uniqueness of the domain name space. The Internet must not be divided into islands.

26 IDNA Standard
IDNA Standards (2003): specify how the conversion from non-ASCII to ASCII is done at the user/application level (web browsers, clients).
RFC 3454 Preparation of Internationalized Strings ("Stringprep"): a generic mechanism (a collection of rules, tables, and operations) for taking a Unicode string and converting it into a canonical format.
RFC 3490 Internationalizing Domain Names in Applications (IDNA): the base specification in this group, which specifies a mechanism for handling non-ASCII labels.
RFC 3491 Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN): processing rules that allow end users to enter IDNs into applications.
RFC 3492 Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA): a mechanism for encoding a Unicode string in ASCII characters.
IDNA is currently under revision (IDNA200x): RFC 4690 and associated Internet drafts suggest revisions and solutions to some problems.

27 IDN History
Late 1990s: Multilingual domain names were developed at the National University of Singapore
Asia Pacific Networking Group (idns, idomain)
Several prototypes and testbeds emerged
Several companies began commercialization of the multilingual domain name technology
IDN (Internationalized Domain Name) Working Group in IETF
Country/regional organizations:
MINC (Multilingual Internet Names Consortium)
AINC (Arabic Internet Names Consortium)
Arabic Team for Domain Names (2003)
CDNC (Chinese Domain Name Consortium)
INFITT (International Forum for IT in Tamil)
JDNA (Japanese Domain Names Association)
Internationalized Domain Name (IDN) Working Group in ICANN Board
IDNA adopted
2005: ICANN President's Advisory Committee for IDNs
2007: Eleven IDNA TLDs were added to the root nameservers in order to evaluate their use at the top level of the DNS
2008: ICANN successfully evaluated the .test IDN TLDs; IDN ccTLD Fast-Track Process

28 موقع.السعودية IDNA xn--4gbrim.xn--mgberp4a5d4ar
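The conversion on this slide can be reproduced with Python's built-in `idna` codec, which implements the IDNA 2003 RFCs (Nameprep plus Punycode) rather than the IDNA200x revision discussed later. This is a sketch for illustration; the helper names `to_ascii` and `to_unicode` are made up here, and the Arabic name is written with escapes so that the exact code points are unambiguous.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible (xn--) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


# The two labels shown on the slide, separated by a dot.
name = ("\u0645\u0648\u0642\u0639"                       # first label
        "."
        "\u0627\u0644\u0633\u0639\u0648\u062f\u064a\u0629")  # second label

ace = to_ascii(name)
print(ace)                      # xn--4gbrim.xn--mgberp4a5d4ar
print(to_unicode(ace) == name)  # True
```

Only the ASCII form travels through the DNS; applications are expected to convert in both directions at the user interface.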

29 Part IV: Contribution Methodology

30 Contribution Methodology
1 Understanding problems & areas of contributions
2 Group efforts
3 Publishing reports and papers
4 Conducting Web surveys
5 Meeting linguists (face to face)
6 Disseminating information
7 Testing and building local experiences

31 1 Contribution Methodology Identifying areas of contributions
Levels of an A-IDN Solution
1 Linguistic issues: to define the accepted Arabic character set to be used for writing Arabic domain names
2 Arabic TLDs: to define the top-level domains of the Arabic domain name tree structure (i.e., Arabic gTLDs and ccTLDs)
3 Technical solutions
4 Arabic root servers
IETF (IDNA), UNICODE, ICANN/IANA (e.g., IDN ccTLD Fast Track)

32 1 Contribution Methodology Identifying areas of contributions
Linguistic issues
ISSUE 1.1: Tashkeel
ISSUE 1.2: Kasheeda
ISSUE 1.3: Taa-Marbota+Haa
ISSUE 1.4: Hamzah
ISSUE 1.5: Alif Maqsura+Ya
ISSUE 1.6: Numbers
ISSUE 1.7: Dot or Arabic Zero
ISSUE 1.8: Connecting Multiple Words
ISSUE 1.9: Space
ISSUE 1.10: Mixing Latin & Arabic Characters
ISSUE 1.11: Special Characters
ISSUE 1.12: Accepted Character Set
Arabic TLDs
ISSUE 2.1: Criteria for selecting an Arabic gTLD
ISSUE 2.2: Suggested list of Arabic gTLDs
ISSUE 2.3: Criteria for selecting an Arabic ccTLD
ISSUE 2.4: Suggested list of Arabic ccTLDs

33 2 Contribution Methodology Group Efforts
MINC: Multilingual Internet Names Consortium, 2000. Arabic Working Group.
AINC: Arab Internet Names Consortium, April 2001. Founder and member of the board; Chairman of the Linguistic Committee.
ADNTF: Arabic Domain Name Task Force, Q2/2003. Formed under the auspices of ESCWA (UN). Issuing an Internet Draft for supporting the Arabic language in domain names.
GCC ccTLDs Group: formed under the auspices of the ITC committee of GCC. GCC Arabic domain name pilot project.
Arab Team for Arabic Domain Names, 2005. Formed under the auspices of the Arab League. Arabic domain name pilot project.
ASIWG: Arabic Script IDN Working Group, 2008.

34 3 Contribution Methodology Publishing Reports & Papers 5 Scientific research papers published in conference proceedings and journals "Arabic Top-Level Domain Names", International Journal of Computer Processing of Oriental Languages, Volume 17 Number 3 September 2004, To Appear. "Linguistic Issues in Arabic Domain Names", In Proceedings of the 17th NCC, KAAU, Al-Madina Almunawarah, Saudi Arabia, 5-8 April, 2004, pp [in Arabic] "Arabic Top-Level Domain Names", In Proceedings of the 17th NCC, KAAU, Al- Madina Almunawarah, Saudi Arabia, 5-8 April, 2004, pp [in Arabic] "Using Arabic Language in writing domain names", Arab journal of library and information science, Vol 22, No. 3, July 2002, pp [in Arabic]. " Using Arabic Language in writing domain names ", In Proceedings of IACIT 2001, JUST, Irbid, Jordan, Nov., 2001, pp [in Arabic] Technical reports Supporting the Arabic Language in Domain Names, submitted to ADNTF-ESCWA, October 2003 The base for the internet draft Status Report of the Arabic Linguistic Committee of AINC-September 2001 Status Report of the Arabic Linguistic Committee of AINC-April 2002 Internet drafts

35 4 Contribution Methodology Conducting Web Surveys 3 online web surveys covering most of the linguistic issues, with more than 550 responses. The collected information has been analyzed and compared with the recommendations of the AINC linguistic committee. Results have been published and presented at conferences.

36 5 Contribution Methodology Meeting Linguistic Experts SaudiNIC met with 4 Arabic linguists to get their guidance regarding the Arabic linguistic issues in domain names. Presented the result during the general assembly of the Arabic Language Association

37 6 Contribution Methodology Information Dissemination Web sites (in Arabic and English) Participating in local/regional/international conferences and meetings Publishing scientific research papers Publishing articles in newspaper and magazine Radio programs Seminars to public and interested groups

38 7 Contribution Methodology Test Implementations
Country level: done individually by some Arab countries (ccTLDs). Arabic.English, e.g., com.sa.نطاق. Problem of mixing languages (left-to-right and right-to-left). Not accepted (linguistically and socially).
GCC level: implementing a pilot project for Arabic domain names in the GCC countries.
Arab world level: extending the GCC pilot project to include members of the Arab League. Renamed "Arabic Domain Names Pilot Project", under the auspices of the Arab League. 2 committees (steering and technical) as part of the Arabic Team for Domain Names.
ICANN: Arabic example.test (مثال.إختبار). Moderating the Arabic site for the IDNwiki gateway. Published a technical report about the test: IDN Top Level Domain Evaluations and Testing Report.

39 7 Contribution Methodology ADNPP: Participants so far Participated Countries: United Arab Emirates Saudi Arabia Qatar Oman Palestine Egypt Tunisia Syria Jordan Morocco Libya

40 7 Contribution Methodology Applications and tools
Developed many tools and systems that support Arabic domain names:
Browser plug-in (Arabic.Arabic)
Simple IDN registry system
IDN/ASCII converting tool: converts domain names from IDN to ASCII and vice versa.
Web-based Whois service.
DNS checker for Arabic domains: checks whether an IDN domain name is hosted on any name servers.
Host checker for Arabic domains: resolves IDN domains to the corresponding IP address and vice versa.
Zone file editor for Arabic domains: creates and manages Arabic zone files.

41 Compare with:

42 Part V: The Arabic Language What's been done so far?

43 Arabic Language Characteristics 1. Written from right to left in a cursive style. 2. Most characters have different shapes depending on their position (beginning, middle, or end) within a word and on whether they are joined to the preceding and succeeding characters. These different shapes of a single character do not count as different code points; they are handled by the font. Letters that can be joined are always joined in both handwritten and printed Arabic. Example: the letters ج ب ا ن join to form جبان.
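The point that positional shapes are not separate letters can be illustrated with Unicode itself. Unicode does carry legacy "presentation form" code points for the isolated, final, initial and medial shapes, but compatibility normalization (NFKC) maps each of them back to the one base letter. The sketch below uses the letter JEEM as an example; the variable and function names are illustrative.

```python
import unicodedata

JEEM = "\u062c"  # the single code point used for the letter in normal text

# Legacy presentation forms of JEEM (Arabic Presentation Forms-B block).
PRESENTATION_FORMS = {
    "isolated": "\ufe9d",
    "final": "\ufe9e",
    "initial": "\ufe9f",
    "medial": "\ufea0",
}


def to_base_letter(ch: str) -> str:
    """Map a presentation form back to its base letter using NFKC."""
    return unicodedata.normalize("NFKC", ch)


for position, form in PRESENTATION_FORMS.items():
    same = to_base_letter(form) == JEEM
    print(f"{position:9} U+{ord(form):04X} -> base letter: {same}")
```

Nameprep in IDNA 2003 applies NFKC, so a label typed with presentation forms ends up as the same label as one typed with the base letters.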

44 Arabic Language Characteristics 3. Tashkeel (diacritics): a small sign (not a letter) that is usually put above or below a character for the purpose of correct pronunciation. It is not widely used, except where words that have the same letters could be mispronounced and so taken to have a different meaning. 4. Abbreviations are not widely used. When an abbreviation is written (in a domain name), the characters are joined together, which leads to a different word and pronunciation, e.g. ج ب ا ن versus ج ب ان. 5. Two sets of numerals are used: Arabic (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) and Arabic-Indic (U+0660 to U+0669). Numerals are written from left to right.

45 Arabic Language Characteristics 6. Words are separated by spaces. A space character is needed between words to get correct shaping. Connecting words without spaces is not acceptable: it decreases readability, and the result can be confused with other words. Examples: مدارس خيف versus مدارسخيف; خط أدبي versus خطا دبي.

46 Basic Linguistic Issues
Identified a number of linguistic issues with respect to domain names:
Usage of Tashkeel (diacritics).
Usage of Kashidah.
Character folding.
Which numerical digits should be supported.
Connecting multiple words.
Mixing with other languages.
Usage of special characters.
Discussed with more than 60 members of the AINC linguistic committee, reaching final recommendations.
A number of surveys have been collected from the public; more than 550 responses were received.
The collected information has been analyzed and compared with the recommendations of the AINC linguistic committee.
Discussed with Arabic linguists to get their guidance regarding the Arabic linguistic issues in domain names.
Discussed and agreed by the Arab Team for Domain Names (under the Arab League).

47 Basic Linguistic Issues Tashkeel (diacritics) [Survey chart; answer options: not supported (No), supported in the interface only (No effect), full support (Yes), not specified; values shown: 17%, 36%, 3%, 44%.] Recommendation: Tashkeel should not be allowed. However, if there is a need to allow users to enter tashkeel as part of a domain name, then it should be stripped off before IDNA processing.

48 Basic Linguistic Issues Tashkeel (diacritics) From the user's point of view, he/she does not know: whether the label is with or without tashkeel; whether the label uses tashkeel on all of its characters; the correct pronunciation needed to put the correct tashkeel. There are many different pronunciations depending on the local dialect. There will be a huge number of combinations when registering a single label: a good playground for phishing. Examples: جمعية-الحاسبات.السعودية and وزارة-التربية.سورية, each of which can be written in many variants that differ only in where tashkeel is placed.
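The recommendation to strip tashkeel before IDNA processing can be sketched as a small pre-processing step. This is an illustration, not the project's actual tool: the function name `strip_tashkeel` is made up, and the set of code points is the one this deck lists for tashkeel later on (064B-0652 and 0670).

```python
# Tashkeel code points as listed in this deck: U+064B..U+0652 and U+0670.
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0652 + 1)} | {"\u0670"}


def strip_tashkeel(label: str) -> str:
    """Remove tashkeel marks so every vocalized variant maps to one label."""
    return "".join(ch for ch in label if ch not in TASHKEEL)


plain = "\u062c\u0645\u0639\u064a\u0629"  # the word without any marks
# The same word with fatha, sukun, kasra and shadda+fatha added.
vocalized = "\u062c\u064e\u0645\u0652\u0639\u0650\u064a\u0651\u064e\u0629"

print(len(vocalized), "code points before,", len(strip_tashkeel(vocalized)), "after")
print(strip_tashkeel(vocalized) == plain)  # True
```

With this step in front of the IDNA conversion, all vocalized spellings of a label resolve to the single registered form, which removes the combinatorial registration problem described on the slide.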

49 Basic Linguistic Issues Kashidah (Tatweel) [Survey chart; answer options: not supported (No), supported in the interface only (No effect), full support (Yes), not specified; values shown: 30%, 6%, 29%, 35%.] Recommendation: It should not be allowed.

50 Basic Linguistic Issues Character Folding [Survey charts: how should the letters in each group be treated: ه + ة; و + ؤ; ى + ي + ئ; ا + (ا أ إ آ). Answer options: same letter, different letters, not specified; and folding supported (Yes), not supported (No), not specified. Values shown: 29.2%, 47.3%, 12%, 41.3%, 35.2%, 22.5%, 30.6%, 48%, 28.4%, 50%, 21.6%.] Recommendation: Folding should not be allowed.

51 Basic Linguistic Issues Digits [Survey chart; answer options: 0, 1, 2, 3, ..., 9; Arabic-Indic digits; both sets; not specified. Values shown: 46%, 12%, 27%, 15%.] Recommendation: If it is technically possible, it is preferred to support both (Latin and Arabic) sets with folding to one set. Otherwise, the Latin set is sufficient.
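"Folding to one set" can be read as mapping every Arabic-Indic digit to its Latin counterpart before registration or lookup. The sketch below assumes that direction of folding and covers only U+0660 to U+0669; the function name `fold_digits` is illustrative.

```python
# Map Arabic-Indic digits U+0660..U+0669 to ASCII digits 0..9.
ARABIC_INDIC_TO_ASCII = {0x0660 + i: ord("0") + i for i in range(10)}


def fold_digits(label: str) -> str:
    """Fold Arabic-Indic digits to ASCII digits; other characters are kept."""
    return label.translate(ARABIC_INDIC_TO_ASCII)


with_arabic_indic = "\u0645\u0661\u0662\u0645"   # MEEM, digit one, digit two, MEEM
with_latin = "\u0645" + "12" + "\u0645"          # same label typed with 1 and 2

print(fold_digits(with_arabic_indic) == with_latin)  # True
print(fold_digits(with_latin) == with_latin)         # True: already folded
```

After folding, both spellings of a label lead to the same registered name, so a user may type whichever digit set the keyboard produces.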

52 Basic Linguistic Issues Connecting Words [Survey chart; answer options: Dash, Space, Joined, Other, not specified. Values shown: 12%, 3%, 6%, 46%, 33%.] Recommendation: Let technologies serve the language. Spaces should be supported to be part of a domain name. It is recommended that multiple words are separated by the character "-".

53 Arabic Team for Domain Names Recommendations
Tashkeel (diacritics): Tashkeel should not be allowed. However, if there is a need to allow users to enter it as part of a domain name, then it should be stripped off by nameprep.
Kasheeda: Kasheeda should be disallowed.
Character folding (Teh Marbuta + Heh; different forms of Hamzah; Alif Maqsura + Ya): Folding should not be allowed.
Numbers (numerical digits): If it is technically possible, it is preferred to support both (Latin and Arabic) sets with folding to one set. Otherwise, the Latin set is sufficient.
Connecting multiple words: It is recommended that multiple words are separated by the character "-".
Mixing Latin and Arabic characters: It is recommended that Arabic domain names be pure Arabic and not be mixed with other languages.
Special characters (#, $, %, ...): It is recommended that Arabic domain names follow the standard with respect to the use of special characters.

54 Accepted Character Set Table Development Principles
Study and discuss the linguistic issues extensively (more than 3 years) by a team working under the umbrella of the Arab League.
Adhere to the LDH convention.
Follow the inclusion model recommended by the new IDNA standards.
Many characters are disallowed: symbols, punctuation, diacritics (tashkeel), Koranic annotation signs, honorifics, etc.
Arabic labels are not intended for writing sentences or phrases.
The goal is to develop: a simple character set table that includes ONLY needed letters and digits; Linguistic Guidelines for the Use of the Arabic Language in Internet Domains.

55 Accepted Character Set Table Characters from Unicode Arabic Table ( FF)
0621 (ء) Arabic Letter HAMZA
0622 (آ) Arabic Letter ALEF with MADDA above
0623 (أ) Arabic Letter ALEF with HAMZA above
0624 (ؤ) Arabic Letter WAW with HAMZA above
0625 (إ) Arabic Letter ALEF with HAMZA below
0626 (ئ) Arabic Letter YEH with HAMZA above
0627 (ا) Arabic Letter ALEF
0628 (ب) Arabic Letter BEH
0629 (ة) Arabic Letter TEH MARBUTA
062A (ت) Arabic Letter TEH
062B (ث) Arabic Letter THEH
062C (ج) Arabic Letter JEEM
062D (ح) Arabic Letter HAH
062E (خ) Arabic Letter KHAH
062F (د) Arabic Letter DAL
0630 (ذ) Arabic Letter THAL
0631 (ر) Arabic Letter REH
0632 (ز) Arabic Letter ZAIN
0633 (س) Arabic Letter SEEN
0634 (ش) Arabic Letter SHEEN
0635 (ص) Arabic Letter SAD
0636 (ض) Arabic Letter DAD
0637 (ط) Arabic Letter TAH
0638 (ظ) Arabic Letter ZAH
0639 (ع) Arabic Letter AIN
063A (غ) Arabic Letter GHAIN
0641 (ف) Arabic Letter FEH
0642 (ق) Arabic Letter QAF
0643 (ك) Arabic Letter KAF
0644 (ل) Arabic Letter LAM
0645 (م) Arabic Letter MEEM
0646 (ن) Arabic Letter NOON
0647 (ه) Arabic Letter HEH
0648 (و) Arabic Letter WAW
0649 (ى) Arabic Letter ALEF MAKSURA
064A (ي) Arabic Letter YEH
0660 (٠) Arabic-Indic Digit Zero
0661 (١) Arabic-Indic Digit One
0662 (٢) Arabic-Indic Digit Two
0663 (٣) Arabic-Indic Digit Three
0664 (٤) Arabic-Indic Digit Four
0665 (٥) Arabic-Indic Digit Five
0666 (٦) Arabic-Indic Digit Six
0667 (٧) Arabic-Indic Digit Seven
0668 (٨) Arabic-Indic Digit Eight
0669 (٩) Arabic-Indic Digit Nine

56 Accepted Character Set Table Characters from Unicode Basic Latin Table ( F):
0030 (0) Digit Zero
0031 (1) Digit One
0032 (2) Digit Two
0033 (3) Digit Three
0034 (4) Digit Four
0035 (5) Digit Five
0036 (6) Digit Six
0037 (7) Digit Seven
0038 (8) Digit Eight
0039 (9) Digit Nine
002D (-) Hyphen-Minus
002E (.) Full Stop (Dot)
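The two tables above follow an inclusion model: anything not listed is rejected. A registry-side check could look like the sketch below. It assumes the listed code points are contiguous ranges (0621-063A, 0641-064A, 0660-0669, 0030-0039, plus the hyphen), treats the dot only as a label separator, and uses the hypothetical function name `rejected_characters`.

```python
# Code points accepted inside a label, taken from the two tables above.
ACCEPTED = (
    set(range(0x0621, 0x063A + 1))    # HAMZA .. GHAIN
    | set(range(0x0641, 0x064A + 1))  # FEH .. YEH
    | set(range(0x0660, 0x0669 + 1))  # Arabic-Indic digits
    | set(range(0x0030, 0x0039 + 1))  # Latin digits
    | {0x002D}                        # hyphen-minus
)


def rejected_characters(label: str) -> list:
    """List the code points in a label that are not in the accepted table."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) not in ACCEPTED]


def check_domain(domain: str) -> dict:
    """Check each dot-separated label and report rejected code points."""
    return {label: rejected_characters(label) for label in domain.split(".")}


print(rejected_characters("\u0645\u0648\u0642\u0639"))        # []
print(rejected_characters("\u0645\u0640\u0648\u0642\u0639"))  # ['U+0640'] (tatweel)
print(rejected_characters("\u0645\u064e\u0648"))              # ['U+064E'] (fatha)
```

Because the table is a whitelist, tatweel, tashkeel, Latin letters and every symbol are rejected without having to be enumerated.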

57

58 Part VI: Let us expand and look at the whole script. What are the issues?

59 About Arabic Script The 2nd most widely used alphabetic writing system in the world. Used by many languages such as Persian, Urdu, Turkish, Kurdish, Pashto, Jawi, ... Widely used in more than 43 countries; more than one billion potential users could be concerned with using Arabic script domain names.

60 Accepted characters for Arabic, Persian, Urdu, Pashto, Jawi

61 Arabic Script IDN - Major Issues
1. Acceptable/disallowed characters: IDNA200x table (PVALID / DISALLOWED / CONTEXTO); language tables
2. Combining marks
3. Diacritics
4. Word/label separators (space, ZWNJ, ZWJ, hyphen)
5. Digits
6. Confusing similar characters (e.g. variant tables)
7. Bidirectional

62 Issues Need Further Investigations 1. Valid Unicode Codepoints
 ; CONTEXTO # ARABIC NUMBER SIGN..ARABIC SIGN SAFHA
A ; UNASSIGNED # <reserved>..<reserved>
060B..060F ; DISALLOWED # AFGHANI SIGN..ARABIC SIGN MISRA
 ; PVALID # ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..AR
A ; UNASSIGNED # <reserved>..<reserved>
061B ; DISALLOWED # ARABIC SEMICOLON
061C..061D ; UNASSIGNED # <reserved>..<reserved>
061E..061F ; DISALLOWED # ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC Q
0620 ; UNASSIGNED # <reserved>
A ; PVALID # ARABIC LETTER HAMZA..ARABIC LETTER GHAIN
063B..063F ; UNASSIGNED # <reserved>..<reserved>
E ; PVALID # ARABIC TATWEEL..ARABIC FATHA WITH TWO DOTS
065F ; UNASSIGNED # <reserved>
 ; PVALID # ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT
066A..066D ; DISALLOWED # ARABIC PERCENT SIGN..ARABIC FIVE POINTED STA
066E ; PVALID # ARABIC LETTER DOTLESS BEH..ARABIC LETTER HIG
 ; DISALLOWED # ARABIC LETTER HIGH HAMZA ALEF..ARABIC LETTER
D3 ; PVALID # ARABIC LETTER TTEH..ARABIC LETTER YEH BARREE
06D4 ; DISALLOWED # ARABIC FULL STOP
06D5..06DC ; PVALID # ARABIC LETTER AE..ARABIC SMALL HIGH SEEN
06DD ; CONTEXTO # ARABIC END OF AYAH
06DE ; DISALLOWED # ARABIC START OF RUB EL HIZB
06DF..06E8 ; PVALID # ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL
06E9 ; DISALLOWED # ARABIC PLACE OF SAJDAH
06EA..06FC ; PVALID # ARABIC EMPTY CENTRE LOW STOP..ARABIC LETTER
06FD..06FE ; DISALLOWED # ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SI
06FF ; PVALID # ARABIC LETTER HEH WITH INVERTED V
Source: draft-faltstrom-idnbis-tables-05.txt
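Entries in this format ("range ; STATUS # comment") are easy to load into a lookup table. The sketch below parses a few of the entries that survive intact on this slide and answers "what is the status of this code point?". The regular expression and the function names `parse_table` and `status_of` are illustrative, not part of the draft.

```python
import re

# One entry: a code point or range, a semicolon, a status, an optional comment.
ENTRY = re.compile(r"^\s*([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")

SAMPLE = """\
060B..060F ; DISALLOWED # AFGHANI SIGN..ARABIC SIGN MISRA
061B ; DISALLOWED # ARABIC SEMICOLON
06D4 ; DISALLOWED # ARABIC FULL STOP
06D5..06DC; PVALID # ARABIC LETTER AE..ARABIC SMALL HIGH SEEN
06DD ; CONTEXTO # ARABIC END OF AYAH
06DE ; DISALLOWED # ARABIC START OF RUB EL HIZB
"""


def parse_table(text: str) -> list:
    """Return a list of (first, last, status) tuples for well-formed lines."""
    table = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            first = int(match.group(1), 16)
            last = int(match.group(2), 16) if match.group(2) else first
            table.append((first, last, match.group(3)))
    return table


def status_of(code_point: int, table: list):
    """Look up the status of one code point, or None if it is not covered."""
    for first, last, status in table:
        if first <= code_point <= last:
            return status
    return None


TABLE = parse_table(SAMPLE)
print(status_of(0x06DD, TABLE))  # CONTEXTO
print(status_of(0x06D8, TABLE))  # PVALID
print(status_of(0x0041, TABLE))  # None: not covered by the sample
```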

63 Disallowed characters by IDNA200X

64 Recommended Disallowed characters

65 Issues Need Further Investigations 2. Combining Marks The use of combining marks with some base characters can produce a sequence that is confused with another character. Combining maddah and hamza: U+0627 U+0654 (alef followed by combining hamza above) is confusable with the single letter alef with hamza above. Other combining marks: U+062d U+0654 (hah followed by combining hamza above) is confusable with U+0681.
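The two cases on this slide behave differently under Unicode normalization, which is part of why they need separate handling. In the sketch below, NFC composes ALEF plus combining hamza into the precomposed letter U+0623, but it leaves HAH plus combining hamza as two code points, because U+0681 has no canonical decomposition. The variable names are illustrative.

```python
import unicodedata

COMBINING_HAMZA_ABOVE = "\u0654"

alef_sequence = "\u0627" + COMBINING_HAMZA_ABOVE  # ALEF + combining hamza
hah_sequence = "\u062d" + COMBINING_HAMZA_ABOVE   # HAH + combining hamza

alef_nfc = unicodedata.normalize("NFC", alef_sequence)
hah_nfc = unicodedata.normalize("NFC", hah_sequence)

# Case 1: normalization merges the sequence into one precomposed letter.
print([f"U+{ord(c):04X}" for c in alef_nfc])  # ['U+0623']

# Case 2: normalization does not help; the sequence stays distinct from
# U+0681 even though the two may look alike.
print([f"U+{ord(c):04X}" for c in hah_nfc])   # ['U+062D', 'U+0654']
print(hah_nfc == "\u0681")                    # False
```

The second case is the troublesome one: two labels that render alike remain different strings after normalization, so they can only be caught by a rule that disallows the combining mark or by a variant table.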

66 Issues Need Further Investigations 3. Diacritics (Tashkeel) Tashkeel points: about 9 (064B-0652, 0670). From the user's point of view, he/she does not know: whether the label is with or without tashkeel; whether the label uses tashkeel on all of its characters; the correct pronunciation needed to put the correct tashkeel. There are many different pronunciations depending on the local dialect. There will be a huge number of combinations when registering a single label: a good playground for phishing. Example: جمعية, which can be written in many variants that differ only in where tashkeel is placed.

67 Issues Need Further Investigations 4. ZWNJ & ZWJ Control Characters Supporting ZWJ and ZWNJ in domain names would result in some confusion. Their use and concepts are not known to regular users.

68 Issues Need Further Investigations 4. ZWNJ & ZWJ Control Characters ZWNJ: Visually noticed
With ZWNJ: U+062d U+200c U+0628 U+0644
Without ZWNJ (حبل): U+062d U+0628 U+0644

69 Issues Need Further Investigations 4. ZWNJ & ZWJ Control Characters ZWNJ: Visually unnoticed
With ZWNJ: U+0637 U+200c U+0628 U+0644
Without ZWNJ (طبل): U+0637 U+0628 U+0644

70 Issues Need Further Investigations 4. ZWNJ & ZWJ Control Characters
ZWJ: Visually unnoticed (مجمع-الرباط-الدولي)
U+0645 U+062c U+0645 U+0639 U+200d U+002d U+0627 U+0644 U+0631 U+0628 U+0627 U+0637 U+200d U+002d U+0627 U+0644 U+062f U+0648 U+0644 U+064a
ZWJ: Visually noticed
U+0645 U+062c U+0645 U+0639 U+002d U+0627 U+0644 U+0631 U+0628 U+0637 U+002d U+0627 U+0644 U+062f U+0648 U+0644 U+064a
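Because ZWNJ (U+200C) and ZWJ (U+200D) are often invisible, two labels can look identical and still be different strings. One possible registry-side safeguard, shown here as an assumption rather than as the working group's decision, is to detect these characters and compare labels with them removed. The function names are illustrative; the code points are those of the ZWNJ example above.

```python
ZWNJ = "\u200c"
ZWJ = "\u200d"


def find_joiners(label: str) -> list:
    """Return (position, code point) for every ZWNJ or ZWJ in the label."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(label)
            if ch in (ZWNJ, ZWJ)]


def strip_joiners(label: str) -> str:
    """Remove ZWNJ and ZWJ so that look-alike labels compare equal."""
    return label.replace(ZWNJ, "").replace(ZWJ, "")


with_zwnj = "\u0637\u200c\u0628\u0644"  # TAH, ZWNJ, BEH, LAM
plain = "\u0637\u0628\u0644"            # TAH, BEH, LAM

print(with_zwnj == plain)                 # False: different strings
print(find_joiners(with_zwnj))            # [(1, 'U+200C')]
print(strip_joiners(with_zwnj) == plain)  # True
```

TAH does not join to the following letter, so the ZWNJ changes nothing on screen here, which is exactly the "visually unnoticed" case.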

71 Issues Need Further Investigations 5. Digits Arabic-Indic vs. Eastern Arabic-Indic digits
Digits shown: ٩ ٨ ٧ ۶ ۵ ۴ ٣ ٢ ١ ٠
۱۲۳۷۸۹۰ (Eastern Arabic-Indic): U+06f1 U+06f2 U+06f3 U+06f7 U+06f8 U+06f9 U+06f0
١٢٣٧٨٩٠ (Arabic-Indic): U+0661 U+0662 U+0663 U+0667 U+0668 U+0669 U+0660

72 Issues Need Further Investigations 5. Digits Arabic-Indic vs. European-Arabic digits
Windows has supported number substitution by allowing the representation of different cultural shapes for the same digits while keeping the internal storage of these digits unified among different locales; for example, numbers are stored in their well-known hexadecimal values (0x30, 0x31, ...) but displayed according to the selected language. Source:
م 12 م: U+0645 U+0031 U+0032 U+0645
م ١٢ م: U+0645 U+0661 U+0662 U+0645

73 Issues Need Further Investigations
6. Similar Shape Characters
There are a number of groups of characters that have the same shapes, e.g., the Kaf, Heh, Yeh, and Alef groups.

75 Issues Need Further Investigations
6. Similar Shape Characters
كلمني
input[0] = U+0643 input[1] = U+0644 input[2] = U+0645 input[3] = U+0646 input[4] = U+064a
کلمني
input[0] = U+06a9 input[1] = U+0644 input[2] = U+0645 input[3] = U+0646 input[4] = U+064a
کلمنې
input[0] = U+06a9 input[1] = U+0644 input[2] = U+0645 input[3] = U+0646 input[4] = U+06d0

76 Issues Need Further Investigations
6. Similar Shape Characters
کلی
input[0] = U+06a9 input[1] = U+0644 input[2] = U+06cc
كلى
input[0] = U+0643 input[1] = U+0644 input[2] = U+0649
کلۍ
input[0] = U+06a9 input[1] = U+0644 input[2] = U+06cd
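One way to detect such look-alike labels is to fold each character to a representative of its shape group before comparing. The sketch below is only an illustration: the table `VARIANT_FOLD` is a hypothetical mapping built from the characters on these slides, not an official variant table, and the choice of representative is arbitrary.

```python
# Hypothetical variant table covering only the characters shown on the slides.
VARIANT_FOLD = {
    "\u06a9": "\u0643",  # ARABIC LETTER KEHEH         -> ARABIC LETTER KAF
    "\u06cc": "\u064a",  # ARABIC LETTER FARSI YEH     -> ARABIC LETTER YEH
    "\u0649": "\u064a",  # ARABIC LETTER ALEF MAKSURA  -> ARABIC LETTER YEH
    "\u06d0": "\u064a",  # ARABIC LETTER E             -> ARABIC LETTER YEH
    "\u06cd": "\u064a",  # ARABIC LETTER YEH WITH TAIL -> ARABIC LETTER YEH
}


def fold_variants(label: str) -> str:
    """Map every character to the representative of its shape group."""
    return "".join(VARIANT_FOLD.get(ch, ch) for ch in label)


def same_skeleton(a: str, b: str) -> bool:
    """True when two labels are identical after variant folding."""
    return fold_variants(a) == fold_variants(b)


kaf_yeh = "\u0643\u0644\u0645\u0646\u064a"
keheh_yeh = "\u06a9\u0644\u0645\u0646\u064a"
keheh_e = "\u06a9\u0644\u0645\u0646\u06d0"

print(kaf_yeh == keheh_yeh)
print(same_skeleton(kaf_yeh, keheh_yeh))
print(same_skeleton(kaf_yeh, keheh_e))
```

Whether two characters really belong to the same group can depend on position in the word and on the language, so a production table would need input from each language community.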

78 Issues Need Further Investigations
7. Bidirectional Behavior
Arabic script domains will include characters that are LTR (e.g., dash, dot, AE digits).
Existing IDNA2003 (stringprep) requires that a label containing RandALCat characters both begin and end with a RandALCat character.
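The stringprep bidirectional requirement can be approximated in a few lines. This is a minimal sketch, assuming that the bidirectional classes reported by Python's `unicodedata` module (which follow the Unicode version bundled with the interpreter, not the Unicode 3.2 tables that stringprep references) are close enough for illustration. The function name `satisfies_stringprep_bidi` is hypothetical.

```python
import unicodedata


def is_randalcat(ch: str) -> bool:
    """True for characters with bidirectional class R or AL."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


def satisfies_stringprep_bidi(label: str) -> bool:
    """Approximate the stringprep bidi check for a single label."""
    if not any(is_randalcat(ch) for ch in label):
        return True  # no right-to-left characters, nothing to check
    if any(unicodedata.bidirectional(ch) == "L" for ch in label):
        return False  # RandALCat and LCat characters must not be mixed
    # The first and the last character must both be RandALCat.
    return is_randalcat(label[0]) and is_randalcat(label[-1])


print(satisfies_stringprep_bidi("\u0645\u0661\u0662\u0645"))  # digits in the middle
print(satisfies_stringprep_bidi("\u0645\u0661\u0662"))        # label ends with a digit
```

The second call shows the practical consequence raised on the slide: a label that ends with a digit fails the check even though every character in it is Arabic script.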

79 We have a long journey ahead to reach user satisfaction!
According to IGF speakers, the Internet must support the large number of languages in the world at all levels, including content, hardware, software, and IDN, if it is to reach the next billion people.

80 Thank you
شکرا
xn--mgbti28b
input[0] = U+0634 input[1] = U+06a9 input[2] = U+0631 input[3] = U+0627
شكرا
xn--mgbti4d
input[0] = U+0634 input[1] = U+0643 input[2] = U+0631 input[3] = U+0627
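The two ASCII forms on this slide can be reproduced with Python's built-in `punycode` codec. This sketch skips the nameprep step that full IDNA processing performs and simply prepends the `xn--` prefix; the helper name `to_ace` is an assumption for illustration.

```python
def to_ace(label: str) -> str:
    """Punycode-encode one label and add the ACE prefix (no nameprep applied)."""
    return "xn--" + label.encode("punycode").decode("ascii")


with_kaf = "\u0634\u0643\u0631\u0627"    # uses ARABIC LETTER KAF (U+0643)
with_keheh = "\u0634\u06a9\u0631\u0627"  # uses ARABIC LETTER KEHEH (U+06A9)

print(to_ace(with_kaf))
print(to_ace(with_keheh))
```

The two labels look nearly identical on screen yet map to different names in the DNS, which is the point the closing slide makes.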

Proposed Solution for Writing Domain Names in Different Arabic Script Based Languages

Proposed Solution for Writing Domain Names in Different Arabic Script Based Languages Proposed Solution for Writing Domain Names in Different Arabic Script Based Languages TF-AIDN, June/2014 Presented by AbdulRahman I. Al-Ghadir Researcher in SaudiNIC Content What we have done so far? Problem

More information

Writing Domain Names in Different Arabic Script Based Languages

Writing Domain Names in Different Arabic Script Based Languages Writing Domain Names in Different Arabic Script Based Languages Language VS. Script ICANN Regional Meeting, Dubai April 1-3 2008 Dr. Abdulaziz H. Al-Zoman Director of SaudiNIC - CITC Chairman of Steering

More information

Arabic Text Segmentation

Arabic Text Segmentation Arabic Text Segmentation By Dr. Salah M. Rahal King Saud University-KSA 1 OCR for Arabic Language Outline Introduction. Arabic Language Arabic Language Features. Challenges for Arabic OCR. OCR System Stages.

More information

About SaudiNIC. What we have done. What is Next. Lessons learned

About SaudiNIC. What we have done. What is Next. Lessons learned About SaudiNIC What we have done What is Next Lessons learned SaudiNIC is a non-profit unit that is operated by Communication and Information Technology Commission (CITC) which is a semigovernmental entity.

More information

Character Set Supported by Mehr Nastaliq Web beta version

Character Set Supported by Mehr Nastaliq Web beta version Character Set Supported by Mehr Nastaliq Web beta version Sr. No. Character Unicode Description 1 U+0020 Space 2! U+0021 Exclamation Mark 3 " U+0022 Quotation Mark 4 # U+0023 Number Sign 5 $ U+0024 Dollar

More information

Arabic Domain Names (ADN) Pilot Project

Arabic Domain Names (ADN) Pilot Project Joint UNESCO and ITU Global Symposium on Promoting the Multilingual Internet Arabic Domain Names (ADN) Pilot Project Imad Al-Sabouni Advisor to the Minister of Communications and Technology, Syria Vice

More information

Nafees Nastaleeq v1.01 beta

Nafees Nastaleeq v1.01 beta Nafees Nastaleeq v1.01 beta Release Notes November 07, 2007 CENTER FOR RESEARCH IN URDU LANGUAGE PROCESSING NATIONAL UNIVERSITY OF COMPUTER AND EMERGING SCIENCES, LAHORE PAKISTAN Table of Contents 1 Introduction...4

More information

About SaudiNIC. What we have done.. ( ) What is Next.. ( ) Lessons learned

About SaudiNIC. What we have done.. ( ) What is Next.. ( ) Lessons learned About SaudiNIC What we have done.. ( ) What is Next.. ( ) Lessons learned SaudiNIC is a non-profit unit that is operated by Communication and Information Technology Commission (CITC) which is a semi-governmental

More information

1 See footnote 2, below.

1 See footnote 2, below. To: UTC From: Azzeddine Lazrek, Cadi Ayyad University, Marrakesh, Morocco (with Debbie Anderson, SEI, UC Berkeley, and with assistance from Murray Sargent, Laurentiu Iancu, and others) RE: Arabic Math

More information

The GCC Pilot Project for Arabic Domain Names..kw.qa.om.sa.bh.ae

The GCC Pilot Project for Arabic Domain Names..kw.qa.om.sa.bh.ae The GCC Pilot Project for Arabic Domain Names Raed Al-Fayez Head of the GCC Pilot Project Technical Taskforce SaudiNIC raed@isu.net.sa.kw.qa.om.sa.bh.ae Agenda Characteristics of A Domain Name IDN and

More information

Overview & Update. Manager, Regional Relations Middle East. Internet Festival Hammamet, Tunis August 2008

Overview & Update. Manager, Regional Relations Middle East. Internet Festival Hammamet, Tunis August 2008 Internationalized ti Domain Names: Overview & Update Baher Esmat Manager, Regional Relations Middle East Internet Festival Hammamet, Tunis August 2008 1 Introduction to IDNs IDN stands for Internationalized

More information

Proposed keyboard layout for Swahili in Arabic script

Proposed keyboard layout for Swahili in Arabic script أ Proposed keyboard layout for Swahili in Arabic script Kevin Donnelly kevin@dotmon.com Version 0.1, March 2010 Introduction Swahili was originally written in Arabic script in its area of origin (the littoral

More information

Improved Method for Sliding Window Printed Arabic OCR

Improved Method for Sliding Window Printed Arabic OCR th Int'l Conference on Advances in Engineering Sciences & Applied Mathematics (ICAESAM'1) Dec. -9, 1 Kuala Lumpur (Malaysia) Improved Method for Sliding Window Printed Arabic OCR Prof. Wajdi S. Besbas

More information

Arabic Script IDN Working Group (ASIWG)

Arabic Script IDN Working Group (ASIWG) Arabic Script IDN Working Group (ASIWG) ICANN Paris IDN Workshop 26 Jun08 Spot The Difference They Look The Same To Us But Not To A Computer When 1 is not 1 The Arabic Language is only a part of the Arabic

More information

Nafees Nastaleeq v1.02 beta

Nafees Nastaleeq v1.02 beta Nafees Nastaleeq v1.02 beta Release Notes September 24, 2008 CENTER FOR RESEARCH IN URDU LANGUAGE PROCESSING NATIONAL UNIVERSITY OF COMPUTER AND EMERGING SCIENCES, LAHORE PAKISTAN Table of Contents 1 Introduction...4

More information

Proposal for changes to ArabicShaping.txt to allow machine generation of Arabic fonts and glyphs. A. Generating Arabic glyphs from the Schematic Name

Proposal for changes to ArabicShaping.txt to allow machine generation of Arabic fonts and glyphs. A. Generating Arabic glyphs from the Schematic Name Proposal for changes to ArabicShaping.txt to allow machine generation of Arabic fonts and glyphs by Adil Allawi, Diwan Software Limited adil@diwan.com Introduction One of the big problems for Arabic text

More information

SaudiNIC s Proposed Solution. MENOG 8, Khobar, May 14-18, 2011

SaudiNIC s Proposed Solution. MENOG 8, Khobar, May 14-18, 2011 SaudiNIC s Proposed Solution MENOG 8, Khobar, May 14-18, 2011 Arabic Script Major Issues Confusing Similar Characters Proposed solution Characteristics Language-level required tables Language-level required

More information

Internationalized Domain Names

Internationalized Domain Names Internationalized Domain Names Introduction & Update MENOG 1 Bahrain April 3-5, 2007 By: Baher Esmat Middle East Liaison IP and DNS Internet 207.248.168.180 ISP icann.org 192.0.34.163 ISP Backbone ISP

More information

L2/11-033R 1 Introduction

L2/11-033R 1 Introduction To: UTC and ISO/IEC JTC1/SC2 WG2 Title: Proposal to add ARABIC MARK SIDEWAYS NOON GHUNNA From: Lorna A. Priest (SIL International) Date: 10 February 2011 1 Introduction ARABIC MARK SIDEWAYS NOON GHUNNA

More information

Domain Names in Pakistani Languages. IDNs for Pakistani Languages

Domain Names in Pakistani Languages. IDNs for Pakistani Languages ا ہ 6 5 a ز @ ں ب Domain Names in Pakistani Languages س a ی س a ب او اور را < ہ ر @ س a آف ا ر ا 6 ب 1 Domain name Domain name is the address of the web page pg on which the content is located 2 Internationalized

More information

ICANN November Tina Dam Director, IDN Program

ICANN November Tina Dam Director, IDN Program ICANN 33 6 November 2008 Tina Dam Director, IDN Program IDN SLD registrations since 2001 (testbed) 2003(protocol) 2 IDNs what a year! Fast Track Draft Plan for public comments Outstanding key issues: Relation

More information

Umbrella. Branding & Guideline

Umbrella. Branding & Guideline Umbrella. Branding & Guideline OUR LOGO. OUR COLORS. #FFFFFF Font COLOR #2A3942 #64A0C6 mix color C: 75% M: 68% Y: 67% K: 90% H: 320 S:61% B:0 R:0 G:0 B:0 C: 75% M: 68% Y: 67% K: 90% H: 320 S:61% B:0 R:0

More information

Universal Acceptance Technical Perspective. Universal Acceptance

Universal Acceptance Technical Perspective. Universal Acceptance Universal Acceptance Technical Perspective Universal Acceptance Warm-up Exercise According to w3techs, which of the following pie charts most closely represents the fraction of websites on the Internet

More information

Center for Language Engineering Al-Khawarizmi Institute of Computer Science University of Engineering and Technology, Lahore

Center for Language Engineering Al-Khawarizmi Institute of Computer Science University of Engineering and Technology, Lahore Sarmad Hussain Sarmad Hussain Center for Language Engineering Al-Khawarizmi Institute of Computer Science University of Engineering and Technology, Lahore www.cle.org.pk sarmad@cantab.net Arabic Script

More information

It Internationalized ti Domain Names W3C Track: An International Web

It Internationalized ti Domain Names W3C Track: An International Web It Internationalized ti Domain Names W3C Track: An International Web Tina Dam ICANN Director, IDN Program tina.dam@icann.org 17th International World Wide Web Conference, WWW2008 Beijing International

More information

ICANN IDN TLD Variant Issues Project. Presentation to the Unicode Technical Committee Andrew Sullivan (consultant)

ICANN IDN TLD Variant Issues Project. Presentation to the Unicode Technical Committee Andrew Sullivan (consultant) ICANN IDN TLD Variant Issues Project Presentation to the Unicode Technical Committee Andrew Sullivan (consultant) ajs@anvilwalrusden.com I m a consultant Blame me for mistakes here, not staff or ICANN

More information

Trusted Future of Internet with IDN and UA

Trusted Future of Internet with IDN and UA Trusted Future of Internet with IDN and UA GFCE Delhi 12 th Oct 2018 Dr. Ajay Data Co-Chair NBGP Coordinator EAI UASG ccnso Council Member Oct25th 1 ASCII Domain Name Label www.cafe-123.com Third Level

More information

OUR LOGO. SYMBOL LOGO SYMBOL LOGO ORIGINAL STRUCTURE

OUR LOGO. SYMBOL LOGO SYMBOL LOGO ORIGINAL STRUCTURE OUR LOGO. ORIGINAL STRUCTURE SYMBOL LOGO SYMBOL LOGO OUR COLORS. Infographic Color 2A3942 FED708 2A3942 E77525 804D9F CLEAR SPACE. PRINT SAFE AREA MINIMUM SIZE - PRINT H: 30 pt ONLINE SAFE AREA MINIMUM

More information

Minutes of Workshop May 15-16, 2009 Version 0.3

Minutes of Workshop May 15-16, 2009 Version 0.3 The second workshop on Internationalized Domain Names for Local Content Development in Pakistani Languages was organized by the Ministry of IT and Telecom on May 15-16, 2009 at the Center for Research

More information

Others Symbols, Additional characters proposed to Unicode. Azzeddine Lazrek

Others Symbols, Additional characters proposed to Unicode. Azzeddine Lazrek JTC1/SC2/WG2 N 3088 Others Symbols, Additional characters proposed to Unicode Azzeddine Lazrek lazrek@ucam.ac.ma Cadi Ayyad University, Faculty of Sciences P.O. Box 2390, Marrakech, Morocco Phone: +212

More information

CNNIC Contributes in Internationalized Domain Name

CNNIC Contributes in Internationalized Domain Name CNNIC Contributes in Internationalized Domain Name What s Ahead What are IDNs? The need for IDN Pass, present, future of IDN What should we do? What Are IDNs The Concept Internationalized Domain Names

More information

IDN - what s up? Patrik Fältström

IDN - what s up? Patrik Fältström IDN - what s up? Patrik Fältström paf@cisco.com 1 Old stuff (what is IDNA) What is it? What implications do we get? IDNA uses Unicode 3.2 2 Protocol issues Old protocols can only handle a subset of US-

More information

qatar national day 2017 brand guidelines 2017

qatar national day 2017 brand guidelines 2017 2017 brand guidelines 2017 the following guidelines demonstrate how best to apply the brand 2 CONTENTS 3 contents p5. vision & mission p7. logo p8. logo rationale p9. logo clear space p10. logo do s p11.

More information

Modeling Nasta leeq Writing Style

Modeling Nasta leeq Writing Style Modeling Nasta leeq Writing Style Aamir Wali National University of Computer and Emerging Sciences Overview: Urdu اب پ ت ٹ ث ج چ ح خ د ڑ ڈ ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن ه ء ی ے ہ ں و In Urdu, a

More information

Overview. Coordinating with our partners, we help make the Internet work.

Overview. Coordinating with our partners, we help make the Internet work. ICANN Update Champika Wijayatunga Regional Security Engagement Manager Asia Pacific TWNIC OPM / TWNOG 27-28 November 2018 1 Overview Coordinating with our partners, we

More information

Identity Guidelines. December 2012

Identity Guidelines. December 2012 Identity Guidelines December 2012 Identity Guidelines Contents 1.0 Our Logo Our logo Our wordmark Colour treatments Clear space, large and small sizes Correct logo placement Incorrect logo usage 2.0 Colour

More information

ISO/IEC JTC 1/SC 2. Yoshiki MIKAMI, SC 2 Chair Toshiko KIMURA, SC 2 Secretariat JTC 1 Plenary 2012/11/05-10, Jeju

ISO/IEC JTC 1/SC 2. Yoshiki MIKAMI, SC 2 Chair Toshiko KIMURA, SC 2 Secretariat JTC 1 Plenary 2012/11/05-10, Jeju ISO/IEC JTC 1/SC 2 Yoshiki MIKAMI, SC 2 Chair Toshiko KIMURA, SC 2 Secretariat 2012 JTC 1 Plenary 2012/11/05-10, Jeju what is new Work items ISO/IEC 10646 2 nd ed. 3 rd ed. (2012) ISO/IEC 14651 Amd.1-2

More information

REEM READYMIX Brand Guideline

REEM READYMIX Brand Guideline REEM READYMIX Brand Guideline Implementing Reem Readymix brand in communications V.I - February 2018 Introduction Reem Readymix is a leading supplier of all types of readymix concrete and cementbased plastering

More information

Recognition of secondary characters in handwritten Arabic using Fuzzy Logic

Recognition of secondary characters in handwritten Arabic using Fuzzy Logic International Conference on Machine Intelligence (ICMI 05), Tozeur, Tunisia, 2005 Recognition of secondary characters in handwritten Arabic using Fuzzy Logic Mohammed Zeki Khedher1 Ghayda Al-Talib2 1 Faculty

More information

Internationalized Domain Names Variant Issues Project

Internationalized Domain Names Variant Issues Project Internationalized Domain Names Variant Issues Project 1 P a g e 1. Background Internationalized Domain Names Variant Issues Project Arabic Variant TLD Issues and Requirements This document identifies issues

More information

Proposal to encode productive Arabic-script modifier marks

Proposal to encode productive Arabic-script modifier marks Proposal to encode productive Arabic-script modifier marks Date: May 16, 2003 Author: Jonathan Kew, SIL International Kamal Mansour, Agfa Monotype Mark Davis, IBM Address: Horsleys Green High Wycombe Bucks

More information

IDN - the protocol. Patrik Fältström

IDN - the protocol. Patrik Fältström IDN - the protocol Patrik Fältström paf@cisco.com 1 In the beginning 3454 Preparation of Internationalized Strings ("stringprep"). P. Hoffman, M. Blanchet. December 2002. (Format: TXT=138684 bytes) (Status:

More information

Universal Acceptance An Update

Universal Acceptance An Update Universal Acceptance An Update Don Hollander / GDD Summit / May 2016 Universal Acceptance Universal Acceptance Universal Acceptance is the state where all valid domain names and email addresses are accepted,

More information

ICANN PacNOG 11

ICANN PacNOG 11 ICANN Update @ PacNOG 11 Savenaca Vocea Nadi, 2 June 2012 The mission of The Internet Corporation for Assigned Names and Numbers ("ICANN ) To coordinate, at the overall level, the global Internet's systems

More information

THE LOGO Guidelines LOGO. Waste Free Environment Brand Guidelines

THE LOGO Guidelines LOGO. Waste Free Environment Brand Guidelines BRAND GUIDELINES THE LOGO Guidelines LOGO SYMBOL TYPEFACE 2 COLOR SCHEME When do I use the full-color logo? Use the full-color logo as frequently as possible to maximize and strengthen the brand. PRIMARY

More information

Internationalized Domain Names an introduction

Internationalized Domain Names an introduction Internationalized Domain Names an introduction Tina Dam Director, IDN Program 1 March 2009 Agenda Where are we and where are we headed IDN TLD Processes IDN Definitions How does IDNs work including examples

More information

Internationalized Domain Names

Internationalized Domain Names Internationalized Domain Names Fahd Batayneh Middle East DNS Forum 2018 26 April 2018 Agenda 1 2 3 ICANN s IDN Program Universal Acceptance Initiative Task Force on Arabic Script IDNs (TF- AIDN) 2 ICANN

More information

VOL. 3, NO. 7, Juyl 2012 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.

VOL. 3, NO. 7, Juyl 2012 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved. Arabic Hand Written Character Recognition Using Modified Multi-Neural Network Farah Hanna Zawaideh Irbid National University, Computer Information System Department dr.farahzawaideh@inu.edu.jo ABSTRACT

More information

ICANN Overview & Global and Strategic Partnerships

ICANN Overview & Global and Strategic Partnerships ICANN Overview & Global and Strategic Partnerships Theresa Swinehart Vice President Global and Strategic Partnerships ICANN Baher Esmat Manager, Regional Relations Middle East ICANN Dubai, UAE 1 April,

More information

SCALE-SPACE APPROACH FOR CHARACTER SEGMENTATION IN SCANNED IMAGES OF ARABIC DOCUMENTS

SCALE-SPACE APPROACH FOR CHARACTER SEGMENTATION IN SCANNED IMAGES OF ARABIC DOCUMENTS 31 st December 016. Vol.94. No. 005-016 JATIT & LLS. All rights reserved. ISSN: 199-8645 www.jatit.org E-ISSN: 1817-3195 SCALE-SPACE APPROACH FOR CHARACTER SEGMENTATION IN SCANNED IMAGES OF ARABIC DOCUMENTS

More information

DNS and ICANN. Laurent Ferrali. 27th August 2018

DNS and ICANN. Laurent Ferrali. 27th August 2018 Laurent Ferrali 27th August 2018 DNS and ICANN ITU Annual Regional Human Capacity Building Workshop on Strengthening Capacities in Internet Governance in Africa, Abuja, Nigeria 1 DNS? 2 Unique Names and

More information

# ICANN/ISOC cctld workshop # October 2006 # Sofia, Bulgaria. # implementing IDNs. Andrzej Bartosiewicz

# ICANN/ISOC cctld workshop # October 2006 # Sofia, Bulgaria. # implementing IDNs. Andrzej Bartosiewicz # ICANN/ISOC cctld workshop # 24-26 October 2006 # Sofia, Bulgaria # implementing IDNs Andrzej Bartosiewicz andrzejb@nask.pl # schedule for.pl. August the 11 th, 2003: NASK s IETF draft September the 11

More information

2011 International Conference on Document Analysis and Recognition

2011 International Conference on Document Analysis and Recognition 20 International Conference on Document Analysis and Recognition On-line Arabic Handwrittenn Personal Names Recognition System based b on HMM Sherif Abdelazeem, Hesham M. Eraqi Electronics Engineering

More information

Introduction to International Domain Names for Applications (IDNA)

Introduction to International Domain Names for Applications (IDNA) White Paper Introduction to International Domain Names for Applications (IDNA) diamondip.com by Timothy Rooney Product management director BT Diamond IP for Applications (IDNA) By Tim Rooney, Director,

More information

Root Server System Advisory Committee

Root Server System Advisory Committee Root Server System Advisory Committee Jun Murai, Chair of RSSAC ICANN Public meeting June 28, 2002 Bucharest, RO DNS Tree Root Name Servers root (dot) TLD Name Servers jp ro com org ac ad co or kyoto-u

More information

Internationalized Domain Names New gtld Program

Internationalized Domain Names New gtld Program Internationalized Domain Names New gtld Program Doug Brent Chief Operating Officer Hong Kong 24 July 2009 Karla Valente Director New gtld Program 0 Agenda Internationalized Domain Names (IDNs) defined

More information

The IDN Variant TLD Program: Updated Program Plan 23 August 2012

The IDN Variant TLD Program: Updated Program Plan 23 August 2012 The IDN Variant TLD Program: Updated Program Plan 23 August 2012 Table of Contents Project Background... 2 The IDN Variant TLD Program... 2 Revised Program Plan, Projects and Timeline:... 3 Communication

More information

Segmentation and Recognition of Arabic Printed Script

Segmentation and Recognition of Arabic Printed Script Institute of Advanced Engineering and Science IAES International Journal of Artificial Intelligence (IJ-AI) Vol. 2, No. 1, March 2013, pp. 20~26 ISSN: 2252-8938 20 Segmentation and Recognition of Arabic

More information

ICANN Update at PacNOG15

ICANN Update at PacNOG15 ICANN Update at PacNOG15 Save Vocea GSE, RVP Oceania 14 July 2014, Port Vila, VU Overview NTIA-IANA function stewardship transition ICANN update from ICANN 50 Participation in ICANN NTIA IANA stewardship

More information

A MELIORATED KASHIDA-BASED APPROACH FOR ARABIC TEXT STEGANOGRAPHY

A MELIORATED KASHIDA-BASED APPROACH FOR ARABIC TEXT STEGANOGRAPHY A MELIORATED KASHIDA-BASED APPROACH FOR ARABIC TEXT STEGANOGRAPHY Ala'a M. Alhusban and Jehad Q. Odeh Alnihoud Computer Science Dept, Al al-bayt University, Mafraq, Jordan ABSTRACT Steganography is an

More information

Internationalized Domain Names. 21 June 2009 Tina Dam Sr. Director, IDNs

Internationalized Domain Names. 21 June 2009 Tina Dam Sr. Director, IDNs Internationalized Domain Names 21 June 2009 Tina Dam Sr. Director, IDNs Agenda Where are we and where are we headed IDN TLD Processes IDN Definitions How does IDNs work including examples of applications

More information

ISeCure. The ISC Int'l Journal of Information Security. High Capacity Steganography Tool for Arabic Text Using Kashida.

ISeCure. The ISC Int'l Journal of Information Security. High Capacity Steganography Tool for Arabic Text Using Kashida. The ISC Int'l Journal of Information Security July 2010, Volume 2, Number 2 (pp. 107 118) http://www.isecure-journal.org High Capacity Steganography Tool for Arabic Text Using Kashida Adnan Abdul-Aziz

More information

World Summit on the Information Society (WSIS) and the Digital Divide

World Summit on the Information Society (WSIS) and the Digital Divide World Summit on the Information Society (WSIS) and the Digital Divide Dr Tim Kelly, Head, Strategy and Policy Unit International Telecommunication Union KADO/APWINC Digital Opportunity Conference, Seoul,

More information

SHARP-EDGES METHOD IN ARABIC TEXT STEGANOGRAPHY

SHARP-EDGES METHOD IN ARABIC TEXT STEGANOGRAPHY SHARP-EDGES METHOD IN ARABIC TEXT STEGANOGRAPHY 1 NUUR ALIFAH ROSLAN, 2 RAMLAN MAHMOD, NUR IZURA UDZIR 3 1 Department of Multimedia, FSKTM, UPM, Malaysia-43400 2 Prof, Department of Multimedia, FSKTM,

More information

ISSUES PAPER Selection of IDN cctlds associated with the ISO two letter codes

ISSUES PAPER Selection of IDN cctlds associated with the ISO two letter codes ISSUES PAPER Selection of IDN cctlds associated with the ISO 3166-1 two letter codes Background: In the DNS, a cctld string (like.jp,.uk) has been defined to represent the name of a country, territory

More information

1. Brand Identity Guidelines.

1. Brand Identity Guidelines. 1. Brand Identity Guidelines 1.1 HCT Logo 2. Secondary left aligned for English language literature 1. Primary centre aligned stacked formal 3. Secondary right aligned for Arabic language literature 4.

More information

# ICANN/ISOC cctld workshop # October 2006 # Sofia, Bulgaria. # implementing IDNs. Andrzej Bartosiewicz

# ICANN/ISOC cctld workshop # October 2006 # Sofia, Bulgaria. # implementing IDNs. Andrzej Bartosiewicz # ICANN/ISOC cctld workshop # 24-26 October 2006 # Sofia, Bulgaria # implementing IDNs Andrzej Bartosiewicz andrzejb@nask.pl # schedule for.pl. pl August the 11 th, 2003: NASK s IETF draft September the

More information

Plenipotentiary Conference (PP- 14) Busan, 20 October 7 November 2014

Plenipotentiary Conference (PP- 14) Busan, 20 October 7 November 2014 Plenipotentiary Conference (PP- 14) Busan, 20 October 7 November 2014 WORKING GROUP OF THE PLENARY Document DT/76- E 3 November 2014 Original: English WORKING GROUP OF THE PLENARY DRAFT RESOLUTION ITU'S

More information

ICANN Staff Presentation

ICANN Staff Presentation ICANN Staff Presentation CENTR Meeting Pisa 12-10-99 Louis Touton Andrew McLaughlin The Basic Bargain ICANN = Internationalization of Policy Functions for DNS and IP Addressing systems + Private Sector

More information

Related to the Internet

Related to the Internet International Public Policy Issues Related to the Internet and the Role of the Governments Regional Follow-up to the Outcome of the World Summit on the Information Society Marco Obiso ICT Applications

More information

IDN Registrar Perspective

IDN Registrar Perspective IDN Registrar Perspective ccnso TechDay, 20 June 2011 Presented by Janna Lam Copyright 2010 IP Mirror Private Limited. All rights reserved. IDN Fast Track Process First IDN cctld approved on 22 Apr 2010

More information

Internationalized Domain Names. Tina Dam, Director, IDN Program 3 March 2009

Internationalized Domain Names. Tina Dam, Director, IDN Program 3 March 2009 Internationalized Domain Names Tina Dam, Director, IDN Program tina.dam@icann.org 3 March 2009 IDN Discussions this week Community discussions working in the ICANN model At Large, ccnso, GAC, GNSO, constituency,

More information

Internationalized Domain Names for. Applications (IDNA) 12/20/2016 1

Internationalized Domain Names for. Applications (IDNA) 12/20/2016 1 Internationalized Domain Names for Applications (IDNA) 12/20/2016 1 Agenda Understanding basic Domain Name System terms About IDNA protocol IDNs in Indian Language Perspective 12/20/2016 2 Domain Name

More information

Handling of Variants. Lucy Wang (On behalf of CDNC) August 20, 2009

Handling of Variants. Lucy Wang (On behalf of CDNC) August 20, 2009 Handling of Variants Lucy Wang (On behalf of CDNC) August 20, 2009 -Universal Declaration of Human Rights Content The origin and facts of the variant issue How CDNC handles the issue CDNC Support and Petition

More information

Strings 20/11/2018. a.k.a. character arrays. Strings. Strings

Strings 20/11/2018. a.k.a. character arrays. Strings. Strings ECE 150 Fundamentals of Programming Outline 2 a.k.a. character arrays In this lesson, we will: Define strings Describe how to use character arrays for strings Look at: The length of strings Copying strings

More information

Proposed Service. Name of Proposed Service: Technical description of Proposed Service: Addition of IDNs to all Afilias TLDs

Proposed Service. Name of Proposed Service: Technical description of Proposed Service: Addition of IDNs to all Afilias TLDs Proposed Service Name of Proposed Service: Addition of IDNs to all Afilias TLDs Technical description of Proposed Service: Afilias plc, Afilias Technologies Limited, Afilias Domains No. 5 Limited, DotGreen

More information

The right hehs for Arabic script orthographies of Sorani Kurdish and Uighur

The right hehs for Arabic script orthographies of Sorani Kurdish and Uighur The right hehs for Arabic script orthographies of Sorani Kurdish and Uighur Roozbeh Pournader, Google Inc. May 8, 2014 Summary The Arabic letter heh has some variants in the Unicode Standard, which has

More information

Promoting accountability and transparency of multistakeholder partnerships for the implementation of the 2030 Agenda

Promoting accountability and transparency of multistakeholder partnerships for the implementation of the 2030 Agenda 2016 PARTNERSHIP FORUM Promoting accountability and transparency of multistakeholder partnerships for the implementation of the 2030 Agenda 31 March 2016 Dialogue Two (3:00 p.m. 5:45 p.m.) ECOSOC CHAMBER,

More information

ICANN 48 NEWCOMER SESSION

ICANN 48 NEWCOMER SESSION ICANN 48 NEWCOMER SESSION This Is YOUR Day WELCOME! Newcomer Experience ICANN and the Internet Eco-System ICANN and the Multi-Stakeholder Model LUNCH BREAK 1200-1315 ICANN s Work ICANN Meeting Week Staying

More information

U.S. Japan Internet Economy Industry Forum Joint Statement October 2013 Keidanren The American Chamber of Commerce in Japan

U.S. Japan Internet Economy Industry Forum Joint Statement October 2013 Keidanren The American Chamber of Commerce in Japan U.S. Japan Internet Economy Industry Forum Joint Statement 2013 October 2013 Keidanren The American Chamber of Commerce in Japan In June 2013, the Abe Administration with the support of industry leaders

More information

Recent developments in IDNs

Recent developments in IDNs Recent developments in IDNs ICANN 8/3/17 Asmus Freytag Root Zone Label Generation Rules There is an ongoing project at ICANN to define Label Generation Rules (LGRs) for the Root Zone. Label Generation

More information

BRAND GUIDELINES JANUARY 2017

BRAND GUIDELINES JANUARY 2017 BRAND GUIDELINES JANUARY 2017 GETTING AROUND Page 03 05 06 07 08 09 10 12 14 15 Section 01 - Our Logo 02 - Logo Don ts 03 - Our Colors 04 - Our Typeface 06 - Our Art Style 06 - Pictures 07 - Call to Action

More information

Resolution adopted by the General Assembly. [on the report of the Second Committee (A/64/417)]

Resolution adopted by the General Assembly. [on the report of the Second Committee (A/64/417)] United Nations General Assembly Distr.: General 9 February 2010 Sixty-fourth session Agenda item 50 Resolution adopted by the General Assembly [on the report of the Second Committee (A/64/417)] 64/187.

More information

Internationalization of a Distance Exam Web Environment

Internationalization of a Distance Exam Web Environment Internationalization of a Distance Exam Web Environment Radouane Mrabet Ecole Nationale Supérieure d Informatique et d Analyse des Systèmes (Rabat, Morocco). Tele- Teaching Laboratory. Email: mrabet@ensias.ma

More information

Award Winning Typefaces by Linotype

Award Winning Typefaces by Linotype Newly released fonts and Midan awarded coveted design prize Award Winning Typefaces by Linotype Bad Homburg, 23 April 2007. Linotype has once again received critical recognition for their commitment to

More information

L2/ ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC

L2/ ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC 10646 1 Please fill all the sections A, B and C below. Please read Principles and Procedures

More information

Draft Implementation Plan for IDN cctld Fast Track Process

Draft Implementation Plan for IDN cctld Fast Track Process Draft Implementation Plan for IDN cctld Fast Track Process Please note that this is a discussion draft only. Potential IDN cctld requesters should not rely on any of the proposed details in the information

More information

Font Features for Lateef

Font Features for Lateef Font s for Lateef The Lateef font includes a number of optional features that provide alternative rendering that might be preferable for use in some contexts. The chart below enumerates the details of

More information
