National Character Support and the SAS system on UNIX. Jochen Kirsten, SAS Institute GmbH


Abstract

In almost all European countries, National Character Support is a topic of great interest in computing. Problems with National Character Support start when text is entered at a keyboard and continue through to data exchange between different applications on different hardware. This paper gives an overview of the most commonly used character sets and of the differences in character handling between platforms. It gives examples of how the SAS System can be configured for National Character Support, both on its own and in combination with other software products.

What are National Characters?

The answer to this question depends heavily on the point of view. Considering character sets such as Russian or Greek, one comes to the conclusion that these consist entirely of national characters. On the other hand, a Russian could just as well claim that, e.g., the German alphabet consists of national characters only. We do not want to discuss these extreme cases, but will restrict ourselves to the characteristics of alphabets that have most of their characters in common. To do so we take the alphabets of the 'Latin-based' languages, thus ignoring character sets such as Japanese and Chinese. We will not look at special signs from technical, scientific or musical character sets either.

Character Sets

The problems with the representation of characters are connected with the demand for processing written texts with machines. Thus special machine alphabets have been developed. Here are some of the most commonly known:

The telegraph alphabet is one of the oldest among them. From a technical point of view it is a 5-bit code, developed for the telegraphic transfer of information. With a width of five bits per character it consists of only 32 different characters.

When the first microcomputers came out, most of them implemented a character set called ASCII (American Standard Code for Information Interchange). By definition this is a 7-bit code, thus consisting of 128 different characters. This code has all of the characters needed to represent English texts, uppercase and lowercase, as well as the digits and punctuation characters. As all of the manufacturers implemented ASCII on their machines, it became a real standard in this area.
The disadvantage of ASCII became obvious when European users began to use microcomputers: there were no national characters available, such as the German 'umlaute' and the 'sharp s'. Therefore derivatives of ASCII were developed, known as 'National Character Replacement ASCII'. With these character sets some of the characters that were considered not very useful, such as curly and square brackets, were substituted with the desired national characters. The disadvantage of this method was that only one of the national character sets was available at a time, and that the brackets, which are often used in programming, could no longer be used. In addition, texts written with, e.g., the German replacement set could not be reproduced properly on devices that used, e.g., the French replacement set.

These problems, and the fact that most computers use bytes as the smallest addressable unit anyway, led to extending ASCII with the eighth bit. Now it was possible to use 256 different characters. The first 128 are reserved for ASCII; the next 128 could now be used for national characters. Unfortunately the sequence of characters in this extension could not be standardized, so different manufacturers and organisations defined their own extension sequences. Here are some of the more common ones:

o IBM PC Code Tables (IBM specific)
o HP-Roman8 (HP specific)
o DEC Multinational (DEC specific)
o ISO8859-Latin-1 (ISO definition)

On the mainframes there has always been one accepted standard character set: EBCDIC (Extended Binary Coded Decimal Interchange Code), which uses one byte to represent a character. National character handling in this area is an interesting topic as well.

Considering languages such as Japanese, Chinese, or Arabic and the great variety of their characters, it is obvious that not all of them can be represented using only one byte. The demand to assemble all of the characters of all alphabets into one single character set led to thinking about 2-byte codes. These use two bytes to represent one character. This has not yet become a standard.

Transitions

The main problem caused by the different character representations shows up when texts that were created using a certain character set are to be reproduced on a device that uses a different character set. Then it is possible that texts containing national characters do not come out properly. Either the national characters are deleted entirely or they are substituted by other characters that are meaningless in the specific context.

For ease of discussion we distinguish between two main forms of transitions:

o vertical transitions
o horizontal transitions
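The mismatch between character sets described under Transitions can be reproduced with a small sketch. It is an illustration only and not part of the SAS System: it assumes that Python's standard codecs `latin_1`, `hp_roman8` and `cp850` are acceptable stand-ins for ISO8859-Latin-1, HP-Roman8 and the IBM PC code table 850, and the sample word is arbitrary.

```python
# Illustration only: the same text yields different bytes in different
# 8-bit character sets, and bytes read with the wrong set come out wrong.
# The codec names are Python's, used here as stand-ins (an assumption)
# for the character sets named in the text.
CHARSETS = {
    "ISO8859-Latin-1": "latin_1",
    "HP-Roman8": "hp_roman8",
    "PC850": "cp850",
}


def encode_all(text):
    """Return the byte representation of text in each character set."""
    return {name: text.encode(codec) for name, codec in CHARSETS.items()}


def reproduce(text, written_with, shown_with):
    """Encode text with one character set and decode it with another."""
    raw = text.encode(CHARSETS[written_with])
    return raw.decode(CHARSETS[shown_with], errors="replace")


if __name__ == "__main__":
    word = "Grüße"
    for name, raw in encode_all(word).items():
        print(f"{name:16} {raw.hex(' ')}")
    # Written with Latin-1, shown on a PC850 device: the national
    # characters are replaced by unrelated symbols.
    print(reproduce(word, "ISO8859-Latin-1", "PC850"))
```

Plain ASCII text survives such a transition unchanged because all three sets share the first 128 codes; only the characters in the upper half are affected.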

Vertical Transitions

These are transitions where texts are exchanged between different hardware platforms within the same software environment. As an example we take the SAS System on a mainframe computer and on a UNIX machine.

Problems

Mainly there is the problem of code transformation. The EBCDIC characters coming from the mainframe have to be transformed into the character set used on the target machine, e.g. the ISO8859-Latin-1 character set. If some of the target machine's output devices can only show ASCII characters, a transition from 8-bit characters to 7-bit characters has to take place.

Solutions

When converting SAS data in this area, the procedures CPORT and CIMPORT are used. They use tables to do the translation. Users can adapt these tables with the TRANSLATE option. Details and syntax can be looked up in the 'SAS Procedures Guide'. For the transfer of data between different platforms SAS Institute provides the procedures UPLOAD and DOWNLOAD. Both use translation tables to do the code transformation as well. With release 6.07, family 1 and 2, the tables can be adapted to the user's needs with patches. With release 6.09 it will be possible to change the tables using the TRANTAB option.

Horizontal Transitions

The second form of transition takes place on the same hardware platform between different applications, e.g. between the SAS System using the ISO8859-Latin-1 character set and a DBMS product using HP-Roman8 on UNIX.

Problems

The problems are similar to the ones previously mentioned. The code has to be transformed between the different character sets. It may also be necessary to transform from ASCII to an 8-bit representation or vice versa.
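Both kinds of transition come down to a table-driven code transformation. The following sketch shows the idea with a 256-entry translation table. It is not the table format used by CPORT, CIMPORT, UPLOAD or DOWNLOAD: it assumes Python's `cp037` codec as one possible EBCDIC variant and `latin_1` for ISO8859-Latin-1, and the 7-bit fallback spellings are hypothetical choices.

```python
# Sketch of a table-driven code transformation. Assumptions: Python's
# cp037 codec stands in for the mainframe's EBCDIC variant and latin_1
# for ISO8859-Latin-1. These are not the tables shipped with SAS.


def build_table(source_codec, target_codec):
    """Build a 256-byte table mapping source codes to target codes.

    Codes without a counterpart in the target set are mapped to '?'.
    """
    chars = bytes(range(256)).decode(source_codec, errors="replace")
    return chars.encode(target_codec, errors="replace")


EBCDIC_TO_LATIN1 = build_table("cp037", "latin_1")

# Hypothetical 8-bit to 7-bit fallback for devices that show ASCII only.
SEVEN_BIT = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss",
})


def to_latin1(ebcdic_bytes):
    """Translate EBCDIC bytes to Latin-1 bytes, one table lookup each."""
    return ebcdic_bytes.translate(EBCDIC_TO_LATIN1)


def to_ascii(latin1_bytes):
    """Reduce Latin-1 text to 7-bit ASCII; unmapped characters become '?'."""
    text = latin1_bytes.decode("latin_1").translate(SEVEN_BIT)
    return text.encode("ascii", errors="replace")
```

A table of this kind is cheap to apply because each input byte is replaced by a single indexed lookup; adapting the translation then means changing table entries rather than program logic.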

Solutions

In general it is possible to use the Institute-supplied I/O filters 'sasnlsip' and 'sasnlsop'. These are programs that can be used within UNIX pipes to transform and filter data streams. In particular this is possible with the FILENAME statement using the PIPE option. Effect and syntax are described in the technical report P-235 'Using International Character Support with the SAS System Release 6.07 on UNIX Operating Systems and Derivatives'. Filters should only be used on text files. Using them on binary files could easily destroy the internal structures of the files.

When running on X-terminals it is also possible to configure the SAS System with the appropriate font resources:

SAS.DMSFont
SAS.DMSBoldFont

If an ISO8859-Latin-1 font is not used, the keyboard of the X-terminal has to be remapped accordingly with 'xmodmap' before the SAS System is started. An example of how to do so follows later.

Terminal Types

In general there are two different terminal types on which application software like the SAS System behaves differently:

o character oriented terminals
o pixel oriented terminals (X-terminals)

Character oriented Terminals

When running on character oriented terminals, the SAS System makes use of their characteristics. This means that if the terminal supports national characters, the SAS software supports them too, provided the SAS System recognizes the terminal properly. Therefore the environment variables

TERM
TERMINFO
TERMINFOADD

have to be set up correctly, and NLSLANG must have the correct value (e.g. 'german') in order for the SAS System to pick up the appropriate national characters from the 'sasnlsmap' terminal description. This is a data structure similar to the terminfo database and is described in the technical report P-235 'Using International Character Support with the SAS System Release 6.07 on UNIX Operating Systems and Derivatives'.

As a little hint, it can be necessary to use the UNIX command

    stty -istrip

to switch off the stripping of the eighth bit in terminal I/O if 8-bit support is wanted.
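What stripping of the eighth bit does to national characters can be shown in a few lines. This is only an illustration of the effect that 'stty -istrip' switches off, not of the terminal driver itself; the function name is made up for this sketch and the sample input uses Latin-1 byte values.

```python
# Illustration of eighth-bit stripping, the effect that 'stty -istrip'
# switches off. Latin-1 byte values are used as sample input.


def strip_eighth_bit(data):
    """Clear the top bit of every byte, as input stripping would."""
    return bytes(b & 0x7F for b in data)


if __name__ == "__main__":
    typed = "für".encode("latin_1")   # the umlaut is the byte 0xfc
    # With stripping active 0xfc arrives as 0x7c, an ASCII character.
    print(strip_eighth_bit(typed))
```

Characters from the ASCII range pass through unchanged, which is why the problem only becomes visible once national characters are typed.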

Pixel oriented Terminals (X-Terminals)

When running on this type of terminal, the SAS System supports the ISO8859-Latin-1 character set by default. Of course it is possible to load different fonts, like HP Roman8 on HP9000 machines or PC850 on IBM RS/6000 workstations. To get this to work it is necessary, on the one hand, to specify the appropriate font resource; on the other hand, the keyboard has to be remapped using 'xmodmap'. Beginning with release 6.09 it will also be possible to use 'sasnlsmap' for X-terminals. This will be discussed later in this paper. How to specify different font resources is described in the 'SAS Companion for the UNIX Environment and Derivatives'.

Keyboard Mapping using 'xmodmap'

Example 1: The SAS System runs on an IBM RS/6000 X-server using a PC850 font and a German keyboard. It is necessary to remap the keyboard for the 'umlaute' as well as for the 'sharp s'.

    xmodmap -
    keycode 49 = 0x84 0x8e
    keycode 48 = 0x94 0x99
    keycode 35 = 0x81 0x9a
    keycode 20 = 0xe1 0x3f

When using this X-server it is possible to remap the keyboard for all four key levels. The character codes can be looked up in a PC850 table. How to get the keycodes is described after the next example.

Example 2: The SAS System is running on an HP9000 machine using an HP-Roman8 font.

    xmodmap -
    keycode 115 = adiaeresis Adiaeresis
    keycode 116 = odiaeresis Odiaeresis
    keycode 107 = udiaeresis Udiaeresis
    keycode 99 = ssharp ?

When running on X-servers of this type, the SAS System honors only the first two key levels. With release 6.09 it will be possible to access all four levels using 'sasnlsmap' for X-terminals.
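The character codes used in Example 1 can be cross-checked against a PC850 table programmatically. The sketch below assumes that Python's `cp850` codec matches the PC850 font; the keycodes are the ones from the example and depend on the keyboard and X-server in use, and the helper name is hypothetical.

```python
# Cross-check of the character codes in Example 1, assuming Python's
# cp850 codec matches the PC850 font. The keycodes are those of the
# example; they depend on the keyboard and X-server in use.
GERMAN_KEYS = {
    49: ("ä", "Ä"),
    48: ("ö", "Ö"),
    35: ("ü", "Ü"),
    20: ("ß", "?"),
}


def xmodmap_lines(keys, codec):
    """Return 'keycode N = 0x.. 0x..' lines for unshifted/shifted chars."""
    lines = []
    for keycode, chars in keys.items():
        codes = " ".join(f"0x{c.encode(codec)[0]:02x}" for c in chars)
        lines.append(f"keycode {keycode} = {codes}")
    return lines


if __name__ == "__main__":
    print("\n".join(xmodmap_lines(GERMAN_KEYS, "cp850")))
```

Passing a different codec name produces the codes for another font, provided the characters exist in that character set.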

Keycodes

There are several possibilities to get to the keycodes. The most elegant of them is to use

    xev

an X client which displays information about all X-server events on the terminal. Pressing a key is an event and results in corresponding output. Among other information, this output contains the keycode of the key pressed. Unfortunately 'xev' is not available on all X-servers. In this case users may also use

    xmodmap -pk

which gives a complete list of all keys. This list contains the keycodes of the individual keys as well as the setup of the different key levels.

sasnlsmap for X-terminals

Some terminal emulators under X, especially aixterm on IBM's AIX and hpterm on Hewlett-Packard's HP9000 machines, handle national character representation internally by mapping the characters to vendor-specific character sets, and SAS sessions do not handle the character codes emitted by the X server. This leads to data incompatibility between SAS and a vendor-specific terminal emulator. Running SAS sessions as an application under X Windows with the character mapping concept solves this problem. National language support (NLS) under X Windows is implemented in the SAS System in the same manner as NLS for ASCII terminals. During NLS processing under X Windows only one-to-one translations for input and output maps are available. N-to-one or one-to-n translations are ignored.

Mapping Concept

User-written text file(s) describing the I/O translation are compiled with the standalone tool cmc into a small, fast, compressed database, which is loaded during SAS session startup.

Code statement

A new statement has been added to the syntax of the text files used for creating map databases. It specifies the meaning of the symbols used for the internal representation. This statement, if present, must be the first statement in a source file. These are the codes available:

iso8859
roman8
pc850

If this statement is not specified, codename defaults to iso8859. For more information on the symbols used see the appendices.

Keysym statement

Keysyms are a concept developed especially for X. In the international character translation concept used by the SAS System, keysyms are used to identify keys by their keysym definition. In order to translate input sequences like <compose key>ae to the <adiaeresis> or <lowercase a umlaut> character, the compose key is recognized by its keysym. Keysyms are used according to the X Motif keysym definition. When SAS is used as an X application, the keysym definitions in character maps are in effect.
The keysym statement has to be used before any map is defined. The symbol name of the keysym may be used instead of the numeric value. This statement helps to make modifications to keysyms simpler. An arbitrary entry could look like this:

    code    iso8859
    keysym  mute_grave = 0x100000a9;
            compose    = 0xff20;
    Agrave = <mute_grave> "A"  : 0xc0;
    AE     = <compose>    "Ae" : 0xc6;
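The idea behind such a map entry can be sketched as a lookup from an input sequence to a character code. This is not the cmc tool or its database format: the data structures and the function name are hypothetical, and only the two entries from the example above are included.

```python
# Hypothetical in-memory form of the two map entries shown above.
# Keysym values and character codes are taken from the example; the
# layout of the dictionaries is an assumption made for this sketch.
KEYSYMS = {"mute_grave": 0x100000A9, "compose": 0xFF20}

# (keysym name, following characters) -> code in the iso8859 set
INPUT_MAP = {
    ("mute_grave", "A"): 0xC0,   # Agrave
    ("compose", "Ae"): 0xC6,     # AE
}


def translate(keysym_value, typed):
    """Return the mapped character code, or None if no entry matches."""
    for name, value in KEYSYMS.items():
        if value == keysym_value:
            return INPUT_MAP.get((name, typed))
    return None
```

Because the leading key is identified by its keysym value rather than by a keycode, the same map can be used on keyboards whose compose key sits on a different physical key.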

X Resources and Environment Variables

When a SAS session is started, the SAS System attempts to find the appropriate translation table for the current terminal configuration. Three variables provide the information needed by the SAS System to find the table. In keeping with the X environment, the SAS System queries X resources, specified either through the command line option "-xrm" or in the $HOME/.Xdefaults or /usr/lib/X11/app-defaults/sas file, to retrieve the information needed to load the mapping tables and set the appropriate mapping. There are three X resources for specifying the map, the language and the terminfo addendum directory, where the appropriate map can be found.

SAS.NlsDir: <path>

NlsDir specifies the directory in which SAS searches for the mapping tables. If it is not given, the environment variable "TERMINFOADD" is used. The default for this directory is /usr/lib/sas/terminfo.

SAS.NlsMap: <nls map descriptor>

NlsMap specifies the NLS map table to be used in X NLS. If it is specified neither on the command line, nor in .Xdefaults, nor in the /usr/lib/X11/app-defaults/sas file, no translation is performed. Note that no UNIX environment variables are checked here. If NlsMap is set to NOMAP, all translation is suppressed.

SAS.NlsLang: <language>

NlsLang specifies the subset of the map table specified in NlsMap. It is the name of the character set for which the current terminal is configured. If NlsLang is not given as an X resource, the environment variable "NLSLANG" is checked. If this is also not specified, the language subset defaults to "extended".

The mapping mechanism is only activated if the SAS System has found a valid translation map specified by these resources.
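The lookup order for the three settings can be summarized in a short sketch. It only restates the rules given above and is not how the SAS System is implemented: `resources` stands for the merged X resources and `env` for the UNIX environment, both modelled as plain dictionaries, and the function name is hypothetical.

```python
# Sketch of the lookup order described above. 'resources' stands for the
# merged X resources (-xrm, .Xdefaults, app-defaults) and 'env' for the
# UNIX environment; both are plain dictionaries in this illustration.
DEFAULT_NLS_DIR = "/usr/lib/sas/terminfo"
DEFAULT_NLS_LANG = "extended"


def resolve_nls(resources, env):
    """Return (directory, map, language), or None if nothing is translated."""
    nls_map = resources.get("SAS.NlsMap")   # no environment fallback
    if nls_map is None or nls_map == "NOMAP":
        return None
    nls_dir = (resources.get("SAS.NlsDir")
               or env.get("TERMINFOADD")
               or DEFAULT_NLS_DIR)
    nls_lang = (resources.get("SAS.NlsLang")
                or env.get("NLSLANG")
                or DEFAULT_NLS_LANG)
    return nls_dir, nls_map, nls_lang
```

For example, with only SAS.NlsMap set as a resource and NLSLANG set to 'german' in the environment, the directory falls back to the default while the language comes from the environment variable.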

Graphics

In this area there are structures available that support national characters; these were introduced in 6.03:

o devmaps
o keymaps

Behind these lies a SAS-specific table of different national characters. Keyboard input is translated into this code table with the help of 'keymaps'. Output to special devices is translated into the hardware-specific character set of the device using 'devmaps'. Details about these structures and how to use them are described in 'Technical Report P-170'.

Appendix A

The ISO 8859 Latin 1 Character Set

Symbol          hex   dec  oct
space           0x20   32   40
exclam          0x21   33   41
quotedbl        0x22   34   42
numbersign      0x23   35   43
dollar          0x24   36   44
percent         0x25   37   45
ampersand       0x26   38   46
quoteright      0x27   39   47
parenleft       0x28   40   50
parenright      0x29   41   51
asterisk        0x2a   42   52
plus            0x2b   43   53
comma           0x2c   44   54
minus           0x2d   45   55
period          0x2e   46   56
slash           0x2f   47   57
0               0x30   48   60
1               0x31   49   61
2               0x32   50   62
3               0x33   51   63
4               0x34   52   64
5               0x35   53   65
6               0x36   54   66
7               0x37   55   67
8               0x38   56   70
9               0x39   57   71
colon           0x3a   58   72
semicolon       0x3b   59   73
less            0x3c   60   74
equal           0x3d   61   75
greater         0x3e   62   76
question        0x3f   63   77
at              0x40   64  100
A               0x41   65  101
B               0x42   66  102
C               0x43   67  103
D               0x44   68  104
E               0x45   69  105
F               0x46   70  106
G               0x47   71  107
H               0x48   72  110
I               0x49   73  111
J               0x4a   74  112
K               0x4b   75  113
L               0x4c   76  114
M               0x4d   77  115
N               0x4e   78  116
O               0x4f   79  117
P               0x50   80  120
Q               0x51   81  121
R               0x52   82  122
S               0x53   83  123
T               0x54   84  124
U               0x55   85  125
V               0x56   86  126
W               0x57   87  127
X               0x58   88  130
Y               0x59   89  131
Z               0x5a   90  132
bracketleft     0x5b   91  133
backslash       0x5c   92  134
bracketright    0x5d   93  135
asciicircum     0x5e   94  136
underscore      0x5f   95  137
quoteleft       0x60   96  140
a               0x61   97  141
b               0x62   98  142
c               0x63   99  143
d               0x64  100  144
e               0x65  101  145
f               0x66  102  146
g               0x67  103  147
h               0x68  104  150
i               0x69  105  151
j               0x6a  106  152
k               0x6b  107  153
l               0x6c  108  154
m               0x6d  109  155
n               0x6e  110  156
o               0x6f  111  157
p               0x70  112  160
q               0x71  113  161
r               0x72  114  162
s               0x73  115  163
t               0x74  116  164
u               0x75  117  165
v               0x76  118  166
w               0x77  119  167
x               0x78  120  170
y               0x79  121  171
z               0x7a  122  172
braceleft       0x7b  123  173
bar             0x7c  124  174
braceright      0x7d  125  175
asciitilde      0x7e  126  176
nobreakspace    0xa0  160  240
exclamdown      0xa1  161  241
cent            0xa2  162  242
sterling        0xa3  163  243
currency        0xa4  164  244
yen             0xa5  165  245
brokenbar       0xa6  166  246
section         0xa7  167  247
diaeresis       0xa8  168  250
copyright       0xa9  169  251
ordfeminine     0xaa  170  252
guillemotleft   0xab  171  253
notsign         0xac  172  254
hyphen          0xad  173  255
registered      0xae  174  256
macron          0xaf  175  257
degree          0xb0  176  260
plusminus       0xb1  177  261
twosuperior     0xb2  178  262
threesuperior   0xb3  179  263
acute           0xb4  180  264
mu              0xb5  181  265
paragraph       0xb6  182  266
periodcentered  0xb7  183  267
cedilla         0xb8  184  270
onesuperior     0xb9  185  271
ordmasculine    0xba  186  272
guillemotright  0xbb  187  273
onequarter      0xbc  188  274
onehalf         0xbd  189  275
threequarters   0xbe  190  276
questiondown    0xbf  191  277
Agrave          0xc0  192  300
Aacute          0xc1  193  301
Acircumflex     0xc2  194  302
Atilde          0xc3  195  303
Adiaeresis      0xc4  196  304
Aring           0xc5  197  305
AE              0xc6  198  306
Ccedilla        0xc7  199  307
Egrave          0xc8  200  310
Eacute          0xc9  201  311
Ecircumflex     0xca  202  312
Ediaeresis      0xcb  203  313
Igrave          0xcc  204  314
Iacute          0xcd  205  315
Icircumflex     0xce  206  316
Idiaeresis      0xcf  207  317
Eth             0xd0  208  320
Ntilde          0xd1  209  321
Ograve          0xd2  210  322
Oacute          0xd3  211  323
Ocircumflex     0xd4  212  324
Otilde          0xd5  213  325
Odiaeresis      0xd6  214  326
multiply        0xd7  215  327
Ooblique        0xd8  216  330
Ugrave          0xd9  217  331
Uacute          0xda  218  332
Ucircumflex     0xdb  219  333
Udiaeresis      0xdc  220  334
Yacute          0xdd  221  335
Thorn           0xde  222  336
ssharp          0xdf  223  337
agrave          0xe0  224  340
aacute          0xe1  225  341
acircumflex     0xe2  226  342
atilde          0xe3  227  343
adiaeresis      0xe4  228  344
aring           0xe5  229  345
ae              0xe6  230  346
ccedilla        0xe7  231  347
egrave          0xe8  232  350
eacute          0xe9  233  351
ecircumflex     0xea  234  352
ediaeresis      0xeb  235  353
igrave          0xec  236  354
iacute          0xed  237  355
icircumflex     0xee  238  356
idiaeresis      0xef  239  357
eth             0xf0  240  360
ntilde          0xf1  241  361
ograve          0xf2  242  362
oacute          0xf3  243  363
ocircumflex     0xf4  244  364
otilde          0xf5  245  365
odiaeresis      0xf6  246  366
division        0xf7  247  367
oslash          0xf8  248  370
ugrave          0xf9  249  371
uacute          0xfa  250  372
ucircumflex     0xfb  251  373
udiaeresis      0xfc  252  374
yacute          0xfd  253  375
thorn           0xfe  254  376
ydiaeresis      0xff  255  377

Appendix B

The Roman 8 Character Set

Symbol                       hex   dec  oct
space                        0x20   32   40
exclamation_point            0x21   33   41
quotation_mark               0x22   34   42
number_sign                  0x23   35   43
dollar_sign                  0x24   36   44
percent_sign                 0x25   37   45
ampersand                    0x26   38   46
apostrophe                   0x27   39   47
opening_parenthesis          0x28   40   50
closing_parenthesis          0x29   41   51
asterisk                     0x2a   42   52
plus                         0x2b   43   53
comma                        0x2c   44   54
minus                        0x2d   45   55
period                       0x2e   46   56
slant                        0x2f   47   57
zero                         0x30   48   60
one                          0x31   49   61
two                          0x32   50   62
three                        0x33   51   63
four                         0x34   52   64
five                         0x35   53   65
six                          0x36   54   66
seven                        0x37   55   67
eight                        0x38   56   70
nine                         0x39   57   71
colon                        0x3a   58   72
semicolon                    0x3b   59   73
less_than_sign               0x3c   60   74
equal_sign                   0x3d   61   75
greater_than_sign            0x3e   62   76
question_mark                0x3f   63   77
commercial_at                0x40   64  100
uppercase_a                  0x41   65  101
uppercase_b                  0x42   66  102
uppercase_c                  0x43   67  103
uppercase_d                  0x44   68  104
uppercase_e                  0x45   69  105
uppercase_f                  0x46   70  106
uppercase_g                  0x47   71  107
uppercase_h                  0x48   72  110
uppercase_i                  0x49   73  111
uppercase_j                  0x4a   74  112
uppercase_k                  0x4b   75  113
uppercase_l                  0x4c   76  114
uppercase_m                  0x4d   77  115
uppercase_n                  0x4e   78  116
uppercase_o                  0x4f   79  117
uppercase_p                  0x50   80  120
uppercase_q                  0x51   81  121
uppercase_r                  0x52   82  122
uppercase_s                  0x53   83  123
uppercase_t                  0x54   84  124
uppercase_u                  0x55   85  125
uppercase_v                  0x56   86  126
uppercase_w                  0x57   87  127
uppercase_x                  0x58   88  130
uppercase_y                  0x59   89  131
uppercase_z                  0x5a   90  132
opening_square_bracket       0x5b   91  133
reverse_slant                0x5c   92  134
closing_square_bracket       0x5d   93  135
caret                        0x5e   94  136
underscore                   0x5f   95  137
opening_single_quote         0x60   96  140
lowercase_a                  0x61   97  141
lowercase_b                  0x62   98  142
lowercase_c                  0x63   99  143
lowercase_d                  0x64  100  144
lowercase_e                  0x65  101  145
lowercase_f                  0x66  102  146
lowercase_g                  0x67  103  147
lowercase_h                  0x68  104  150
lowercase_i                  0x69  105  151
lowercase_j                  0x6a  106  152
lowercase_k                  0x6b  107  153
lowercase_l                  0x6c  108  154
lowercase_m                  0x6d  109  155
lowercase_n                  0x6e  110  156
lowercase_o                  0x6f  111  157
lowercase_p                  0x70  112  160
lowercase_q                  0x71  113  161
lowercase_r                  0x72  114  162
lowercase_s                  0x73  115  163
lowercase_t                  0x74  116  164
lowercase_u                  0x75  117  165
lowercase_v                  0x76  118  166
lowercase_w                  0x77  119  167
lowercase_x                  0x78  120  170
lowercase_y                  0x79  121  171
lowercase_z                  0x7a  122  172
opening_brace                0x7b  123  173
vertical_line                0x7c  124  174
closing_brace                0x7d  125  175
tilde                        0x7e  126  176
uppercase_a_grave_accent     0xa1  161  241
uppercase_a_circumflex       0xa2  162  242
uppercase_e_grave_accent     0xa3  163  243
uppercase_e_circumflex       0xa4  164  244
uppercase_e_umlaut           0xa5  165  245
uppercase_i_circumflex       0xa6  166  246
uppercase_i_umlaut           0xa7  167  247
acute_accent                 0xa8  168  250
grave_accent                 0xa9  169  251
circumflex_accent            0xaa  170  252
umlaut_accent                0xab  171  253
tilde_accent                 0xac  172  254
uppercase_u_grave_accent     0xad  173  255
uppercase_u_circumflex       0xae  174  256
italian_lira_symbol          0xaf  175  257
over_line                    0xb0  176  260
degree                       0xb3  179  263
uppercase_c_cedilla          0xb4  180  264
lowercase_c_cedilla          0xb5  181  265
uppercase_n_tilde            0xb6  182  266
lowercase_n_tilde            0xb7  183  267
inverse_exclamation_mark     0xb8  184  270
inverse_question_mark        0xb9  185  271
general_currency_symbol      0xba  186  272
british_pound_sign           0xbb  187  273
japanese_yen_symbol          0xbc  188  274
section_sign                 0xbd  189  275
dutch_guilder_symbol         0xbe  190  276
us_cent_symbol               0xbf  191  277
lowercase_a_circumflex       0xc0  192  300
lowercase_e_circumflex       0xc1  193  301
lowercase_o_circumflex       0xc2  194  302
lowercase_u_circumflex       0xc3  195  303
lowercase_a_acute_accent     0xc4  196  304
lowercase_e_acute_accent     0xc5  197  305
lowercase_o_acute_accent     0xc6  198  306
lowercase_u_acute_accent     0xc7  199  307
lowercase_a_grave_accent     0xc8  200  310
lowercase_e_grave_accent     0xc9  201  311
lowercase_o_grave_accent     0xca  202  312
lowercase_u_grave_accent     0xcb  203  313
lowercase_a_umlaut           0xcc  204  314
lowercase_e_umlaut           0xcd  205  315
lowercase_o_umlaut           0xce  206  316
lowercase_u_umlaut           0xcf  207  317
uppercase_a_degree           0xd0  208  320
lowercase_i_circumflex       0xd1  209  321
uppercase_o_crossbar         0xd2  210  322
uppercase_ae_ligature        0xd3  211  323
lowercase_a_degree           0xd4  212  324
lowercase_i_acute_accent     0xd5  213  325
lowercase_o_crossbar         0xd6  214  326
lowercase_ae_ligature        0xd7  215  327
uppercase_a_umlaut           0xd8  216  330
lowercase_i_grave_accent     0xd9  217  331
uppercase_o_umlaut           0xda  218  332
uppercase_u_umlaut           0xdb  219  333
uppercase_e_acute_accent     0xdc  220  334
lowercase_i_umlaut           0xdd  221  335
sharp_s                      0xde  222  336
uppercase_o_circumflex       0xdf  223  337
uppercase_a_acute_accent     0xe0  224  340
uppercase_a_tilde            0xe1  225  341
lowercase_a_tilde            0xe2  226  342
uppercase_d_with_stroke      0xe3  227  343
lowercase_d_with_stroke      0xe4  228  344
uppercase_i_acute_accent     0xe5  229  345
uppercase_i_grave_accent     0xe6  230  346
uppercase_o_acute_accent     0xe7  231  347
uppercase_o_grave_accent     0xe8  232  350
uppercase_o_tilde            0xe9  233  351
lowercase_o_tilde            0xea  234  352
uppercase_s_with_caron       0xeb  235  353
lowercase_s_with_caron       0xec  236  354
uppercase_u_acute_accent     0xed  237  355
uppercase_y_umlaut           0xee  238  356
lowercase_y_umlaut           0xef  239  357
uppercase_thorn              0xf0  240  360
lowercase_thorn              0xf1  241  361
long_dash                    0xf6  246  366
one_fourth                   0xf7  247  367
one_half                     0xf8  248  370
feminine_ordinal_indicator   0xf9  249  371
masculine_ordinal_indicator  0xfa  250  372
opening_guillemets           0xfb  251  373
solid                        0xfc  252  374
closing_guillemets           0xfd  253  375
plus_minus_sign              0xfe  254  376


The PC 850 Character set (continued)

Symbol                        hex    dec   oct
H                             0x48    72   110
I                             0x49    73   111
J                             0x4a    74   112
K                             0x4b    75   113
L                             0x4c    76   114
M                             0x4d    77   115
N                             0x4e    78   116
O                             0x4f    79   117
P                             0x50    80   120
Q                             0x51    81   121
R                             0x52    82   122
S                             0x53    83   123
T                             0x54    84   124
U                             0x55    85   125
V                             0x56    86   126
W                             0x57    87   127
X                             0x58    88   130
Y                             0x59    89   131
Z                             0x5a    90   132
bracketleft                   0x5b    91   133
backslash                     0x5c    92   134
bracketright                  0x5d    93   135
asciicircum                   0x5e    94   136
underscore                    0x5f    95   137
quoteleft                     0x60    96   140
a                             0x61    97   141
b                             0x62    98   142
c                             0x63    99   143
d                             0x64   100   144
e                             0x65   101   145
f                             0x66   102   146
g                             0x67   103   147
h                             0x68   104   150
i                             0x69   105   151
j                             0x6a   106   152
k                             0x6b   107   153
l                             0x6c   108   154
m                             0x6d   109   155
n                             0x6e   110   156
o                             0x6f   111   157

Symbol                        hex    dec   oct
p                             0x70   112   160
q                             0x71   113   161
r                             0x72   114   162
s                             0x73   115   163
t                             0x74   116   164
u                             0x75   117   165
v                             0x76   118   166
w                             0x77   119   167
x                             0x78   120   170
y                             0x79   121   171
z                             0x7a   122   172
braceleft                     0x7b   123   173
bar                           0x7c   124   174
braceright                    0x7d   125   175
asciitilde                    0x7e   126   176
delta                         0x7f   127   177
Ccedilla                      0x80   128   200
udiaeresis                    0x81   129   201
eacute                        0x82   130   202
acircumflex                   0x83   131   203
adiaeresis                    0x84   132   204
agrave                        0x85   133   205
aring                         0x86   134   206
ccedilla                      0x87   135   207
ecircumflex                   0x88   136   210
ediaeresis                    0x89   137   211
egrave                        0x8a   138   212
idiaeresis                    0x8b   139   213
icircumflex                   0x8c   140   214
igrave                        0x8d   141   215
Adiaeresis                    0x8e   142   216
Aring                         0x8f   143   217
Eacute                        0x90   144   220
ae                            0x91   145   221
AE                            0x92   146   222
ocircumflex                   0x93   147   223
odiaeresis                    0x94   148   224
ograve                        0x95   149   225
ucircumflex                   0x96   150   226
ugrave                        0x97   151   227
ydiaeresis                    0x98   152   230
Odiaeresis                    0x99   153   231
Udiaeresis                    0x9a   154   232
oslash                        0x9b   155   233
sterling                      0x9c   156   234
Ooblique                      0x9d   157   235
multiply                      0x9e   158   236
florin                        0x9f   159   237
aacute                        0xa0   160   240
iacute                        0xa1   161   241
oacute                        0xa2   162   242
uacute                        0xa3   163   243
ntilde                        0xa4   164   244
Ntilde                        0xa5   165   245
ordfeminine                   0xa6   166   246
ordmasculine                  0xa7   167   247
questiondown                  0xa8   168   250
registered                    0xa9   169   251
notsign                       0xaa   170   252
onehalf                       0xab   171   253
onequarter                    0xac   172   254
exclamdown                    0xad   173   255
guillemotleft                 0xae   174   256
guillemotright                0xaf   175   257
quarter_hashed                0xb0   176   260
half_hashed                   0xb1   177   261
full_hashed                   0xb2   178   262
vertical_bar                  0xb3   179   263
vertical_bar_w_left_arm       0xb4   180   264
Aacute                        0xb5   181   265
Acircumflex                   0xb6   182   266
Agrave                        0xb7   183   267
copyright                     0xb8   184   270
                              0xb9   185   271
                              0xba   186   272
                              0xbb   187   273
                              0xbc   188   274
cent                          0xbd   189   275
yen                           0xbe   190   276
upper_right_corner            0xbf   191   277
lower_left_corner             0xc0   192   300
inverted_tee                  0xc1   193   301
upright_tee                   0xc2   194   302
vertical_bar_w_right_arm      0xc3   195   303

Symbol                        hex    dec   oct
horizontal_bar                0xc4   196   304
crossed_bars                  0xc5   197   305
atilde                        0xc6   198   306
Atilde                        0xc7   199   307
                              0xc8   200   310
                              0xc9   201   311
                              0xca   202   312
                              0xcb   203   313
                              0xcc   204   314
                              0xcd   205   315
                              0xce   206   316
international_currency        0xcf   207   317
eth                           0xd0   208   320
Eth                           0xd1   209   321
Ecircumflex                   0xd2   210   322
Ediaeresis                    0xd3   211   323
Egrave                        0xd4   212   324
idotless                      0xd5   213   325
Iacute                        0xd6   214   326
Icircumflex                   0xd7   215   327
Idiaeresis                    0xd8   216   330
lower_right_corner            0xd9   217   331
upper_left_corner             0xda   218   332
bright_character_cell         0xdb   219   333
bottom_bright_character_cell  0xdc   220   334
vertical_line_broken          0xdd   221   335
Igrave                        0xde   222   336
top_bright_character_cell     0xdf   223   337
Oacute                        0xe0   224   340
ssharp                        0xe1   225   341
Ocircumflex                   0xe2   226   342
Ograve                        0xe3   227   343
otilde                        0xe4   228   344
Otilde                        0xe5   229   345
mu                            0xe6   230   346
thorn                         0xe7   231   347

Symbol                        hex    dec   oct
Thorn                         0xe8   232   350
Uacute                        0xe9   233   351
Ucircumflex                   0xea   234   352
Ugrave                        0xeb   235   353
yacute                        0xec   236   354
Yacute                        0xed   237   355
macron                        0xee   238   356
acute                         0xef   239   357
hyphen                        0xf0   240   360
plusminus                     0xf1   241   361
double_underscore             0xf2   242   362
threequarters                 0xf3   243   363
paragraph                     0xf4   244   364
section                       0xf5   245   365
division                      0xf6   246   366
cedilla                       0xf7   247   367
degree                        0xf8   248   370
diaeresis                     0xf9   249   371
periodcentered                0xfa   250   372
onesuperior                   0xfb   251   373
threesuperior                 0xfc   252   374
twosuperior                   0xfd   253   375
filled_vertical_rectangle     0xfe   254   376
fox_space                     0xff   255   377
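Because the same national character sits at different code values in Roman 8 and PC 850 (for example, sharp s is 0xde in Roman 8 but 0xe1 in PC 850), data moved between the two environments has to be translated byte by byte. The following Python fragment is a rough sketch of such a translation, not the mechanism used by the SAS System. It assumes that Python's built-in `hp_roman8` and `cp850` codecs correspond to the two tables in this appendix; the function name `roman8_to_pc850` is hypothetical.

```python
# Sketch only: roman8_to_pc850 is an illustrative name, not a SAS facility.
# Assumption: Python's "hp_roman8" and "cp850" codecs match the tables above.

def roman8_to_pc850(data: bytes, errors: str = "strict") -> bytes:
    """Translate Roman 8 encoded bytes to PC 850 encoded bytes.

    Each byte is decoded to Unicode and re-encoded. Characters with no
    PC 850 equivalent are handled according to `errors`
    ("strict" raises UnicodeEncodeError, "replace" substitutes "?").
    """
    return data.decode("hp_roman8").encode("cp850", errors=errors)


if __name__ == "__main__":
    # lowercase_a_umlaut, sharp_s, british_pound_sign in Roman 8
    for code in (0xCC, 0xDE, 0xBB):
        target = roman8_to_pc850(bytes([code]))
        print(f"Roman 8 0x{code:02x} -> PC 850 0x{target[0]:02x}")
```

Not every Roman 8 character exists in PC 850 (uppercase_s_with_caron at 0xeb is one example), so a real translation table has to decide what to do with such characters; passing `errors="replace"` is only one possible choice.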