D16 Code sets, NLS and character conversion vs. DB2

1 D16 Code sets, NLS and character conversion vs. DB2
Roland Schock, ARS Computer und Consulting GmbH
:45 a.m. 12:45 p.m.
Platform: DB2 for Linux, Unix, Windows

Code sets and character conversion are usually neglected during database design and usage. Everybody expects them to work correctly without any effort. Practice shows, however, that the details and their impact are often misunderstood, and that a few facts can help administrators and database developers to do the right thing. After some necessary definitions, this presentation describes how you can specify the code page used. You will see what character conversion is and how to avoid common problems. At the end we will briefly discuss performance impacts.

2 Overview
- What are character sets, encoding schemes and code pages?
- Where can I define the code page used?
- What is code page conversion and where does it happen?
- What problems can arise and how can I avoid them?
- Performance considerations

On the next few slides we define basic terms frequently used for this topic. The terms are widely used, but often only partially understood. When problems occur, it is essential to understand the concepts in order to deduce the origin of the problem.

3 Character Sets
- Basically, a character set is just a collection of entities or graphical symbols with a meaning.
- Examples of character sets are the Latin alphabet, digits, naval flag signs or other symbols: A, B, C, ... α γ π ξ ᇹ ぁゆヘルツヘ ンス 亹怔떟떥

Here we use the word 'set' in the mathematical sense: an unordered collection of elements. One of the most used character sets in Europe is the Latin alphabet, but this is just a very small subset of the character sets needed for the most common languages. Other, less obvious character sets are naval flag signs, symbols for the sign language of the deaf, Japanese, Chinese or other Asian characters, etc.

4 Character Encoding
- A character encoding or code page is a mapping of the symbols of a character set to bit patterns, which are also referred to as code points: A → 17, B → 23, C → 42, ...
- Typical examples of encodings are ASCII, EBCDIC or Unicode.
- Part of the encoding scheme is also the definition of a serialisation scheme to convert the code point into a sequence of bytes.

The symbols of a character set are now put in a sequence and numbered. The ordinal number is then used as the code point for this symbol. If we have more than 256 symbols, a single byte isn't enough to encode a character and we have to think about an encoding scheme.
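The distinction between a code point and its serialisation into bytes can be illustrated with a small sketch. It relies on Python's built-in codecs; the choice of sample characters and of the codec names (`cp500` is Python's name for one EBCDIC code page) is an assumption made for illustration, not something taken from the slides.

```python
# Sketch: one character, one code point, several byte serialisations.
# The codec list is an illustrative assumption.
def describe(ch, encodings=("ascii", "latin-1", "cp500", "utf-8")):
    """Return the Unicode code point of ch and its bytes (hex) per encoding.

    An encoding that cannot represent the character maps to None.
    """
    result = {"code_point": ord(ch)}
    for enc in encodings:
        try:
            result[enc] = ch.encode(enc).hex()
        except UnicodeEncodeError:
            result[enc] = None
    return result


for ch in "Aä€":
    print(repr(ch), describe(ch))
```

The letter A has the same code point everywhere in this sketch, but its byte value differs between the ASCII-based codecs and the EBCDIC one, and the Euro symbol cannot be represented at all in some of them.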

5 ASCII
- Sample of an encoding scheme
- First version 1963, standardized 1968
- Ordered mapping to 7-bit numbers

6 Single Byte Char Sets (SBCS)
- Extensions from 7-bit ASCII to 8-bit code pages
- ISO-8859-x: ASCII + special characters for some languages
- Platform specific charsets: Windows ANSI or MacRoman

ISO 8859-1 (Latin 1): ASCII + special characters for Western European languages
ISO 8859-2 (Latin 2): ASCII + special characters for Eastern European languages
Further parts of ISO 8859 (-4, ..., -14): ASCII + special characters for Arabic, Greek, Turkish, Hebrew, Thai or Baltic languages
ISO 8859-15: modified ISO 8859-1 including the Euro symbol (€)

7 Double Byte Char Sets (DBCS)
- Expansion of the SBCS concept from one byte to two bytes per character
- Mainly used for Asian languages with more than 256 characters to encode
- Latin text is expanded to twice the size of SBCS

8 EUC (Extended Unix Code)
- Multi Byte Char Set (MBCS): 2 or 4 bytes/char
- Only used for Japanese, Korean, Traditional and Simplified Chinese on Unix platforms
- Uses single shift characters to switch to another code group to build a multi-byte character

9 Unicode
- Intended to simplify and unify the different definitions of code pages and hence conversion.
- The first definition encoded its characters in 16 bits (1991, UCS-2).
- Version 2.0 extended the character set with 16 planes (32-bit, 1996, UCS-4).
- Today, in Unicode Version 4.0, more characters are assigned to code points than fit into 16 bits.

10 Unicode char sets and encodings
- UCS-2: two bytes per character
- UCS-4: four bytes per character
- UTF-16: encoding of UCS-4 into one or two words: the first 64k code points use two bytes per character, all others four bytes
- UTF-8: variable length encoding of characters with one to four bytes
- Possible problems with UCS-2, UCS-4, UTF-16: byte order differences (big-endian vs. little-endian) between different processor architectures.

Besides a mapping of characters to numbers, an encoding scheme is essential to store the data as a sequence of bytes. The simplest encoding is to store a 16-bit or 32-bit wide code point in 2 or 4 bytes. This is used in UCS-2 and UCS-4. But this encoding scheme is not very efficient for Latin texts, which mainly consist of ASCII characters: a text string would consist mainly of 00 bytes. This would also cause problems for the string functions of the C programming language, which uses a null byte as the termination character. UTF-8 is an encoding scheme which distributes the bits needed over one or more bytes. This requires a more sophisticated routine to read and write strings, but it allows the C string functions to remain in use. Details of the UTF-8 encoding are on the next slide.
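The points about byte order and embedded 00 bytes can be made visible with a short sketch using Python's standard codecs. The sample string and the helper name `null_bytes` are illustrative assumptions.

```python
# Sketch: the same ASCII-only text in several Unicode encoding schemes.
def null_bytes(data: bytes) -> int:
    """Count the 00 bytes that would terminate a C string prematurely."""
    return data.count(b"\x00")


text = "DB2"
for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be"):
    data = text.encode(codec)
    print(f"{codec:10} {data.hex():26} null bytes: {null_bytes(data)}")

# A character outside the first 64k code points needs two 16-bit words
# (a surrogate pair) in UTF-16.
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
print(clef.encode("utf-16-be").hex())
```

The big-endian and little-endian forms contain the same bytes in swapped order, which is why both sides of a connection have to agree on the byte order, while the UTF-8 form of ASCII text contains no 00 bytes at all.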

11 UTF-8
- Encoding in a variable length sequence of bytes
- Simple recognition of multibyte chars
- Compact storage of text in Latin chars
- Only the shortest encoding is allowed
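A hand-written encoder makes the variable-length scheme concrete. The following is a minimal sketch of the standard UTF-8 bit layout, not code from the presentation; the function names `utf8_encode` and `is_valid_utf8` are made up for this example, and the result is cross-checked against Python's built-in codec.

```python
# Sketch: UTF-8 encoding of a single code point, written out by hand.
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point using the shortest possible form."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if cp < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def is_valid_utf8(data: bytes) -> bool:
    """Use Python's strict decoder to check a byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for ch in "Aä€\U0001D11E":
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)")

# An overlong two-byte form of "/" is rejected: only the shortest form is valid.
print(is_valid_utf8(b"\xc0\xaf"))
```

Multibyte characters are easy to recognise because every lead byte starts with 11 and every continuation byte with 10, while plain ASCII bytes start with 0.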


13 Usage of a code page
Code pages can be specified at different levels:
- At the operating system where the application runs
- At the operating system where the server runs
- At the operating system where the application is prepared/bound
- At the database level

In a client/server environment, the code page used on a client need not be the same as the code page used on the server. Local applications tend to use the locally defined code page of the operating system as a default. A special situation can occur in a multiplatform environment, where clients, server and the application developers generating code with static SQL use different code pages on their machines. Compiling programs with embedded static SQL involves a precompile pass, which needs a database connection. By default the local code page is used, which can differ from the code pages of other users. If a user later accesses the static SQL, a code page conversion can take place to convert the data first to the code page used for the static SQL. When creating a database, the administrator can specify the code page of the database. This cannot be changed afterwards.

14 Default code page
By default, the DB2 server and clients use the local settings of the operating system or user:
- Windows: The server process uses the default regional settings of the operating system.
- Linux/Unix: The code page is derived from the locale setting of the instance user (i.e. the user running the database processes).
- Client (LUW): The current locale settings of the user determine the code page used during CONNECT.
- Programming language: Java always uses Unicode when connecting to a database via JDBC.

15 Specifying a code page: OS level
- Windows: Control Panel → Regional and Language settings, chcp command
- Linux/Unix: locale command
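To see which code set a program would pick up from the operating system settings, the locale can also be queried programmatically. The sketch below shows Python's view of the active locale only; it is an assumption that this matches what a DB2 client derives from the same settings, and the function name `current_code_set` is made up for this example.

```python
import codecs
import locale


def current_code_set() -> str:
    """Return the canonical codec name Python derives from the active locale."""
    name = locale.getpreferredencoding(False)
    return codecs.lookup(name).name


print(current_code_set())
```

Running this under different locale settings (for example after changing LANG on Linux/Unix) is a quick way to check what a locally started application would see.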

16 At prepare/bind time
- Special case during development of database software with static, embedded SQL.
- Embedded SQL needs a prepare phase before compilation of the source code.
- Later the prepared package needs to be bound to the database with the bind command.
- Both commands need a database connection; at connect time, the current setting of the locale is used.

17 Defining a database w/ code page
- Explicitly set the code page at creation time:
  CREATE DB test USING CODESET codeset TERRITORY territory COLLATE USING collatingseq
- Otherwise the current locale is used to determine the database code set.
- The chosen code page cannot be changed later.
- In DB2 for iSeries and for z/OS you can also define single columns of a table in a different code set (not detailed here).


19 Code page conversion
- If application and server use different code pages, code page conversion takes place.
- Code page conversion is always done on the receiver's side:
  - on the server's side for data sent from client to server
  - on the client's side for data sent from server to client
- Exception: importing IXF files generated on a different system with another code page
- If conversion tables are missing, an error SQLCODE is returned.

In some rare cases a code page conversion is done more than once. If you import IXF files on a client machine and the files were generated on another machine with a different code page (e.g. data exported to IXF on a Windows machine and imported on a Linux machine), a local code page conversion is applied. When this data is then sent to the server into a database with yet another code page, it has to be converted a second time, at the server, from the client's code page to the server's code page.

20 Client to server conversion
Client uses code page X; server uses code page Y.
1. Client: send data using code page X.
2. Server: receive data, convert to code page Y, process data, return result in code page Y.
3. Client: receive data in Y, convert to code page X.
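The flow on this slide can be simulated in a few lines. This is only a sketch of the principle that the receiver converts; the code pages chosen for X and Y (Windows 1252 and 850) and the function names are assumptions for illustration and say nothing about how DB2 implements the conversion internally.

```python
CLIENT_CP = "cp1252"  # assumed client code page (X)
SERVER_CP = "cp850"   # assumed server code page (Y)


def receive(data: bytes, sender_cp: str, receiver_cp: str) -> bytes:
    """Receiver-side conversion: interpret bytes in the sender's code page
    and re-encode them in the receiver's code page."""
    return data.decode(sender_cp).encode(receiver_cp)


def round_trip(text: str) -> str:
    """Client sends text to the server and gets it back."""
    sent = text.encode(CLIENT_CP)                      # client sends in X
    stored = receive(sent, CLIENT_CP, SERVER_CP)       # server converts to Y
    returned = receive(stored, SERVER_CP, CLIENT_CP)   # client converts to X
    return returned.decode(CLIENT_CP)


name = "Müller"
print(name.encode(CLIENT_CP).hex())
print(receive(name.encode(CLIENT_CP), CLIENT_CP, SERVER_CP).hex())
print(round_trip(name) == name)
```

The byte value of ü differs between the two code pages, so the data changes on the wire although the text stays the same from the user's point of view.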

21 Using DB2 Connect
Client uses code page X; gateway uses code page Y; server uses code page Z.
1. Client: send data using code page X.
2. Gateway: receive data, convert to code page Y, send data in Y.
3. Server: receive data, convert to code page Z, return result in code page Z.
4. Gateway: receive data in Z, convert to Y, return result in Y.
5. Client: receive data in Y, convert to code page X.


23 Other considerations
- Mapping of characters (injective): If a character in the source code page is not contained in the target code page, it is replaced by a substitution character.
- Round trip conversion (bijective): If no substitution needs to take place between source and target code pages, a round trip conversion does not lose information.
- Encoding/decoding can change the number of bytes needed to store the data.

Details on character substitution can be found in "Application Development Guide: Programming Client Applications", Chapter 29.

After a successful connect to the database, the user/application gets some information returned in the SQLCA data area:
- The second token in the SQLERRMC field (tokens are separated by X'FF') indicates the code page of the database. The ninth token indicates the code page of the application. If they differ, code page conversion takes place.
- The first and second entries in the SQLERRD array: SQLERRD(1) contains an integer value equal to the maximum expected expansion or contraction factor for the length of mixed character data when converted from the application's code page to the database code page. SQLERRD(2) contains this value for conversions from the database code page to the application code page. A value of 0 or 1 indicates no expansion. A value greater than 1 indicates a possible expansion in length; a negative value indicates a possible contraction.
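Substitution, round-trip safety and length changes can be tried out with Python's codecs. This is a sketch under assumptions: Python substitutes "?" when encoding with `errors="replace"`, which is not necessarily the substitution character DB2 uses, and the helper names and sample strings are made up for this example.

```python
def convert(text: str, target_cp: str) -> bytes:
    """Encode text, replacing unmappable characters with '?'."""
    return text.encode(target_cp, errors="replace")


def is_round_trip_safe(text: str, target_cp: str) -> bool:
    """True if converting to target_cp and back loses no information."""
    return convert(text, target_cp).decode(target_cp) == text


def byte_lengths(text: str, source_cp: str, target_cp: str):
    """Byte length of text in the source and in the target code page."""
    return len(text.encode(source_cp)), len(convert(text, target_cp))


print(convert("Łódź", "cp1252"))             # two characters are substituted
print(is_round_trip_safe("Łódź", "cp1252"))
print(is_round_trip_safe("Müller", "cp1252"))
print(byte_lengths("Müller", "cp1252", "utf-8"))
```

The last line shows the expansion case: a string that needs one byte per character in a single-byte code page can need more bytes after conversion to UTF-8.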

24 More considerations
- Using different conversion tables and the Euro symbol: the Microsoft ANSI code page and the official code page 850 have a different code point for the Euro symbol. If needed, code conversion tables can be replaced (ref. Administration Guide, Planning).
- Unicode support: DB2 supports the UCS-2 character set with UTF-8 and UCS-2 encoding for Unicode databases.
- For pureXML (V9.x) a UTF-8 database is needed.

When a Unicode database is created, CHAR, VARCHAR, LONG VARCHAR and CLOB data are stored in UTF-8 form, and GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC and DBCLOB data are stored in UCS-2 big-endian form.
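How much the position of the Euro symbol varies between code pages can be checked with Python's codec tables. Note the assumption: these are Python's conversion tables, which are not necessarily identical to the tables shipped with DB2. In Python, `cp850` has no Euro symbol at all, while the variant `cp858` carries it.

```python
def euro_bytes(codec: str):
    """Return the bytes of the Euro symbol in the codec, or None if unmappable."""
    try:
        return "€".encode(codec)
    except UnicodeEncodeError:
        return None


for codec in ("cp1252", "cp850", "cp858", "iso8859_15", "utf-8"):
    value = euro_bytes(codec)
    print(f"{codec:11} {value.hex() if value else 'not representable'}")
```

Whenever two systems disagree on such a table, the symbol is either substituted or silently turned into a different character, which is why replacing conversion tables can become necessary.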

25 More considerations
- To change the code page of a database, you have to use db2move (Export/Import). Backup/Restore cannot be used.
- So choosing the right database code page during database creation is crucial.
- Binary data (BLOB, FOR BIT DATA) is internally stored with code page 0, so no character conversion is applied.

26 Troubleshooting
- Identify the code pages in use: db2 get db cfg for sample retrieves the database code page.
- Display the SQLCA area during CONNECT with the CLP: when connecting to a database via the CLP, the option "-a" displays the SQLCA data area, which shows the code page of the database and of the connecting client.
- If connecting to iSeries or zSeries machines from DB2 LUW, check whether conversion tables are available.
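In the SQLCA returned on CONNECT, the SQLERRMC field carries tokens separated by X'FF'; the second token is the database code page and the ninth the application code page. A minimal sketch of pulling out these two tokens follows. The function name and the sample value are invented for illustration: the sample is not real DB2 output, and the placeholder tokens carry no meaning.

```python
def code_pages_from_sqlerrmc(sqlerrmc: bytes):
    """Return (database code page, application code page) from SQLERRMC.

    Tokens are separated by X'FF'; token 2 is the database code page and
    token 9 the application code page.
    """
    tokens = sqlerrmc.split(b"\xff")
    if len(tokens) < 9:
        raise ValueError("expected at least nine tokens in SQLERRMC")
    database_cp = int(tokens[1].decode("ascii").strip())
    application_cp = int(tokens[8].decode("ascii").strip())
    return database_cp, application_cp


# Hypothetical SQLERRMC content, constructed only for this example.
sample = b"\xff".join(
    [b"t1", b"1208", b"t3", b"t4", b"t5", b"t6", b"t7", b"t8", b"1252"]
)
db_cp, app_cp = code_pages_from_sqlerrmc(sample)
print(db_cp, app_cp, "conversion needed" if db_cp != app_cp else "no conversion")
```

If the two values differ, every string sent to or received from the database passes through code page conversion.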


28 Performance considerations
- Try to avoid unnecessary conversions.
- Create databases with the code page your applications need from the start.
- For international databases prefer UTF-8, especially when used with Java programs.
- Conversion takes time.

Every unnecessary conversion costs some time during data access. The cost is quite small, but it can add up. If you use Java to access data, prefer UTF-8 databases, as Java internally uses Unicode for character encoding. Keep in mind that a CHAR(10) field can contain 10 ASCII characters, i.e. 10 bytes, but not necessarily 10 Unicode characters. So for international applications prefer VARCHAR fields with some extra bytes left for expansion due to conversion.
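The CHAR(10) remark is easy to verify: column lengths count bytes, and the byte length depends on the code set. The sketch below assumes a column length given in bytes; the helper name `fits_in_char` and the sample value are made up for this example.

```python
def fits_in_char(text: str, byte_length: int, codeset: str = "utf-8") -> bool:
    """True if text fits into a character column of byte_length bytes
    when stored in the given code set."""
    return len(text.encode(codeset)) <= byte_length


city = "Düsseldorf"  # 10 characters
print(len(city))
print(fits_in_char(city, 10, "latin-1"))  # 10 bytes in a single-byte code page
print(fits_in_char(city, 10, "utf-8"))    # the umlaut needs 2 bytes in UTF-8
```

A value that fits exactly in a single-byte database can therefore overflow the same column definition in a UTF-8 database, which is the reason for leaving some extra bytes in VARCHAR columns.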

29 Links
- IBM developerWorks white paper
- DB2 Infocenter
- Unicode
- UTF-8 article at Wikipedia



More information

Tex with Unicode Characters

Tex with Unicode Characters Tex with Unicode Characters 7/10/18 Presented by: Yuefei Xiang Agenda ASCII Code Unicode Unicode in Tex Old Style Encoding -Inputenc, -ucs Morden Encoding -XeTeX -LuaTeX Unicode bi-direction in Tex -Emacs-AucTeX

More information

Long Filename Specification

Long Filename Specification Long Filename Specification by vindaci fourth release First Release: November 18th, 1996 Last Update: January 6th, 1998 (Document readability update) Compatibility Long filename (here on forth referred

More information

MythBusters Globalization Support

MythBusters Globalization Support MythBusters Globalization Support Avoid Data Corruption Christian Gohmann @CGohmannDE nloug_tech18 About me Christian Gohmann Senior Consultant at Trivadis GmbH, Düsseldorf (Germany) Instructor since 2014

More information

The Unicode Standard Version 11.0 Core Specification

The Unicode Standard Version 11.0 Core Specification The Unicode Standard Version 11.0 Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers

More information

Data encoding. Lauri Võsandi

Data encoding. Lauri Võsandi Data encoding Lauri Võsandi Binary data Binary can represent Letters of alphabet, plain-text files Integers, floating-point numbers (of finite precision) Pixels, images, video Audio samples Could be stored

More information

Worksheet - Storing Data

Worksheet - Storing Data Unit 1 Lesson 12 Name(s) Period Date Worksheet - Storing Data At the smallest scale in the computer, information is stored as bits and bytes. In this section, we'll look at how that works. Bit Bit, like

More information

M1 Computers and Data

M1 Computers and Data M1 Computers and Data Module Outline Architecture vs. Organization. Computer system and its submodules. Concept of frequency. Processor performance equation. Representation of information characters, signed

More information

Thus needs to be a consistent method of representing negative numbers in binary computer arithmetic operations.

Thus needs to be a consistent method of representing negative numbers in binary computer arithmetic operations. Signed Binary Arithmetic In the real world of mathematics, computers must represent both positive and negative binary numbers. For example, even when dealing with positive arguments, mathematical operations

More information

Computer Science Applications to Cultural Heritage. Introduction to computer systems

Computer Science Applications to Cultural Heritage. Introduction to computer systems Computer Science Applications to Cultural Heritage Introduction to computer systems Filippo Bergamasco (filippo.bergamasco@unive.it) http://www.dais.unive.it/~bergamasco DAIS, Ca Foscari University of

More information

III-16Text Encodings. Chapter III-16

III-16Text Encodings. Chapter III-16 Chapter III-16 III-16Text Encodings Overview... 410 Text Encoding Overview... 410 Text Encodings Commonly Used in Igor... 411 Western Text Encodings... 412 Asian Text Encodings... 412 Unicode... 412 Unicode

More information

Portability. Professor Jennifer Rexford

Portability. Professor Jennifer Rexford Portability Professor Jennifer Rexford http://www.cs.princeton.edu/~jrex The material for this lecture is drawn, in part, from The Practice of Programming (Kernighan & Pike) Chapter 8 1 Goals of this Lecture

More information

Java Notes. 10th ICSE. Saravanan Ganesh

Java Notes. 10th ICSE. Saravanan Ganesh Java Notes 10th ICSE Saravanan Ganesh 13 Java Character Set Character set is a set of valid characters that a language can recognise A character represents any letter, digit or any other sign Java uses

More information

Understanding Unicode

Understanding Unicode Understanding Unicode in Sage SalesLogix Version 8.0 Developed by Sage SalesLogix User Assistance Copyright 1997-2012, Sage Software, Inc. All Rights Reserved. This product and related documentation are

More information

SQL return codes that are preceded by a minus sign (-) indicate that the SQL statement SOME SYMBOLS THAT MIGHT BE LEGAL ARE: token-list, -105

SQL return codes that are preceded by a minus sign (-) indicate that the SQL statement SOME SYMBOLS THAT MIGHT BE LEGAL ARE: token-list, -105 Complete List Of Error Codes In Db2 9 INFORMATION RETURNED FOR THE ERROR INCLUDES SQLCODE sqlcode SQLSTATE sqlstate AND MESSAGE TOKENS token-list. regeneration of a function, 9: implicit regeneration of

More information

This is a list of questions and answers about Unicode in Perl, intended to be read after perlunitut.

This is a list of questions and answers about Unicode in Perl, intended to be read after perlunitut. NAME Q and A perlunifaq - Perl Unicode FAQ This is a list of questions and answers about Unicode in Perl, intended to be read after perlunitut. perlunitut isn't really a Unicode tutorial, is it? No, and

More information

Creating an Oracle Database Using DBCA. Copyright 2009, Oracle. All rights reserved.

Creating an Oracle Database Using DBCA. Copyright 2009, Oracle. All rights reserved. Creating an Oracle Database Using DBCA Objectives After completing this lesson, you should be able to do the following: Create a database by using the Database Configuration Assistant (DBCA) Generate database

More information

SAP NetWeaver BI. Unicode Compliance. Product Management SAP NetWeaver BI. Version 7.0 December, 2008

SAP NetWeaver BI. Unicode Compliance. Product Management SAP NetWeaver BI. Version 7.0 December, 2008 SAP NetWeaver BI Unicode Compliance Product Management SAP NetWeaver BI Version 7.0 December, 2008 Agenda 1. Unicode in General 2. Excursus: MDMP 3. Unicode support of SAP NetWeaver BI 4. Interfaces to

More information

St. Benedict s High School. Computing Science. Software Design & Development. (Part 2 Computer Architecture) National 5

St. Benedict s High School. Computing Science. Software Design & Development. (Part 2 Computer Architecture) National 5 Computing Science Software Design & Development (Part 2 Computer Architecture) National 5 DATA REPRESENTATION Numbers Binary/Decimal Conversion Example To convert 69 into binary: write down the binary

More information

Code Page Settings and Performance Settings for the Data Validation Option

Code Page Settings and Performance Settings for the Data Validation Option Code Page Settings and Performance Settings for the Data Validation Option 2011 Informatica Corporation Abstract This article provides general information about code page settings and performance settings

More information

IBM DB Developing Embedded SQL Applications SC

IBM DB Developing Embedded SQL Applications SC IBM DB2 9.7 for Linux, UNIX, and Windows Developing Embedded SQL Applications SC27-2445-00 IBM DB2 9.7 for Linux, UNIX, and Windows Developing Embedded SQL Applications SC27-2445-00 Note Before using

More information

Project 3: Base64 Content-Transfer-Encoding

Project 3: Base64 Content-Transfer-Encoding CMSC 313, Computer Organization & Assembly Language Programming Section 0101 Fall 2001 Project 3: Base64 Content-Transfer-Encoding Due: Tuesday November 13, 2001 Objective The objectives of this programming

More information

Introduction to Informatics

Introduction to Informatics Introduction to Informatics Lecture : Encoding Numbers (Part II) Readings until now Lecture notes Posted online @ http://informatics.indiana.edu/rocha/i The Nature of Information Technology Modeling the

More information

CHAPTER 2 Data Representation in Computer Systems

CHAPTER 2 Data Representation in Computer Systems CHAPTER 2 Data Representation in Computer Systems 2.1 Introduction 37 2.2 Positional Numbering Systems 38 2.3 Decimal to Binary Conversions 38 2.3.1 Converting Unsigned Whole Numbers 39 2.3.2 Converting

More information

DB2 for z/os Stored Procedures Update

DB2 for z/os Stored Procedures Update Robert Catterall, IBM rfcatter@us.ibm.com DB2 for z/os Stored Procedures Update Michigan DB2 Users Group May 15, 2013 Information Management Agenda A brief review of DB2 for z/os stored procedure enhancements

More information

CHAPTER 2 Data Representation in Computer Systems

CHAPTER 2 Data Representation in Computer Systems CHAPTER 2 Data Representation in Computer Systems 2.1 Introduction 37 2.2 Positional Numbering Systems 38 2.3 Decimal to Binary Conversions 38 2.3.1 Converting Unsigned Whole Numbers 39 2.3.2 Converting

More information

Japanese utf 8 font. Japanese utf 8 font.zip

Japanese utf 8 font. Japanese utf 8 font.zip Japanese utf 8 font Japanese utf 8 font.zip 22/11/2010 Japanese: 私はガラスを (Literal UTF-8) Representing Middle English on the Web with UTF-8; The Kermit Bibliography (in UTF-8)What I'd like to do is save

More information

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions Structure of von Nuemann machine Arithmetic and Logic Unit Input Output Equipment Main Memory Program Control Unit 1 1 Instruction Set - the type of Instructions Arithmetic + Logical (ADD, SUB, MULT, DIV,

More information

Database Database administration

Database Database administration System i Database Database administration Version 6 Release 1 System i Database Database administration Version 6 Release 1 Note Before using this information and the product it supports, read the information

More information

Logic, Words, and Integers

Logic, Words, and Integers Computer Science 52 Logic, Words, and Integers 1 Words and Data The basic unit of information in a computer is the bit; it is simply a quantity that takes one of two values, 0 or 1. A sequence of k bits

More information

Oracle Database. Migration Assistant for Unicode Guide Release 1.1 E

Oracle Database. Migration Assistant for Unicode Guide Release 1.1 E Oracle Database Migration Assistant for Unicode Guide Release 1.1 E26097-01 September 2011 Oracle Database Migration Assistant for Unicode Guide, Release 1.1 E26097-01 Copyright 2011, Oracle and/or its

More information

LBSC 690: Information Technology Lecture 05 Structured data and databases

LBSC 690: Information Technology Lecture 05 Structured data and databases LBSC 690: Information Technology Lecture 05 Structured data and databases William Webber CIS, University of Maryland Spring semester, 2012 Interpreting bits "my" 13.5801 268 010011010110 3rd Feb, 2014

More information

Portability. Goals of this Lecture. The Real World is Heterogeneous

Portability. Goals of this Lecture. The Real World is Heterogeneous Portability The material for this lecture is drawn, in part, from The Practice of Programming (Kernighan & Pike) Chapter 8 1 Goals of this Lecture Learn to write code that works with multiple: Hardware

More information

THE INTEGER DATA TYPES. Laura Marik Spring 2012 C++ Course Notes (Provided by Jason Minski)

THE INTEGER DATA TYPES. Laura Marik Spring 2012 C++ Course Notes (Provided by Jason Minski) THE INTEGER DATA TYPES STORAGE OF INTEGER TYPES IN MEMORY All data types are stored in binary in memory. The type that you give a value indicates to the machine what encoding to use to store the data in

More information

Easy-to-see Distinguishable and recognizable with legibility. User-friendly Eye friendly with beauty and grace.

Easy-to-see Distinguishable and recognizable with legibility. User-friendly Eye friendly with beauty and grace. Bitmap Font Basic Concept Easy-to-read Readable with clarity. Easy-to-see Distinguishable and recognizable with legibility. User-friendly Eye friendly with beauty and grace. Accordance with device design

More information

Database Design on Construction Project Cost System Nannan Zhang1,a, Wenfeng Song2,b

Database Design on Construction Project Cost System Nannan Zhang1,a, Wenfeng Song2,b 3rd International Conference on Materials Engineering, Manufacturing Technology and Control (ICMEMTC 2016) Database Design on Construction Project Cost System Nannan Zhang1,a, Wenfeng Song2,b 1 School

More information

Chapter 2. Data Representation in Computer Systems

Chapter 2. Data Representation in Computer Systems Chapter 2 Data Representation in Computer Systems Chapter 2 Objectives Understand the fundamentals of numerical data representation and manipulation in digital computers. Master the skill of converting

More information

Chapter 7. Representing Information Digitally

Chapter 7. Representing Information Digitally Chapter 7 Representing Information Digitally Learning Objectives Explain the link between patterns, symbols, and information Determine possible PandA encodings using a physical phenomenon Encode and decode

More information

SAS. Social Network Analysis Server 6.2: Installation and Configuration Guide, Third Edition. SAS Documentation

SAS. Social Network Analysis Server 6.2: Installation and Configuration Guide, Third Edition. SAS Documentation SAS Social Network Analysis Server 6.2: Installation and Configuration Guide, Third Edition SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2016.

More information

DB2 for z/os: Continuous Delivery of New Features (part 2) Chris Crone DE DB2 Development Presented by Mark Rader WSC: DB2 for z/os

DB2 for z/os: Continuous Delivery of New Features (part 2) Chris Crone DE DB2 Development Presented by Mark Rader WSC: DB2 for z/os DB2 for z/os: Continuous Delivery of New Features (part 2) Chris Crone DE DB2 Development Presented by Mark Rader WSC: DB2 for z/os Applications Static SQL, DDL, and DCL In DB2 11, Static SQL is controlled

More information

Using dynamic SQL in COBOL

Using dynamic SQL in COBOL Using dynamic SQL in COBOL You can use all forms of dynamic SQL in all supported versions of COBOL. For a detailed description and a working example of the method, see Sample COBOL dynamic SQL program

More information

Java Multilingual Elementary Tool

Java Multilingual Elementary Tool November 28, 2004 Outline Designing Outline Multilingual system: refer to computer programs which permit user interaction with the computer in one or more languages A Java multilingual elementary tool

More information

XML is a popular multi-language system, and XHTML depends on it. XML details languages

XML is a popular multi-language system, and XHTML depends on it. XML details languages 1 XML XML is a popular multi-language system, and XHTML depends on it XML details languages XML 2 Many of the newer standards, including XHTML, are based on XML = Extensible Markup Language, so we will

More information

But first, encode deck of cards. Integer Representation. Two possible representations. Two better representations WELLESLEY CS 240 9/8/15

But first, encode deck of cards. Integer Representation. Two possible representations. Two better representations WELLESLEY CS 240 9/8/15 Integer Representation Representation of integers: unsigned and signed Sign extension Arithmetic and shifting Casting But first, encode deck of cards. cards in suits How do we encode suits, face cards?

More information

COSC431 IR. Compression. Richard A. O'Keefe

COSC431 IR. Compression. Richard A. O'Keefe COSC431 IR Compression Richard A. O'Keefe Shannon/Barnard Entropy = sum p(c).log 2 (p(c)), taken over characters c Measured in bits, is a limit on how many bits per character an encoding would need. Shannon

More information