Unicode Support. Chapter 2:

Similar documents
Introduction. Chapter 1:

Enterprise COBOL. B Batch Compilation...7:14-15 BINARY... 9:41 BINARY (COMP or COMP-4)...9:39-40 Bit Manipulation Routines... 7:45

Chapter 1 INTRODUCTION. SYS-ED/ Computer Education Techniques, Inc.

Enterprise COBOL V6.1: What s New? Tom Ross Captain COBOL February 29

IBM COBOL for AIX, V2.0 provides a powerful development environment for building COBOL applications

Chapter 4: Computer Codes. In this chapter you will learn about:

GnuCOBOL Quick Reference

Copyright Network Management Forum

COMPUTER EDUCATION TECHNIQUES, INC. (COBOL_QUIZ- 4.8) SA:

Chapter 2 SYSTEM OVERVIEW. SYS-ED/ Computer Education Techniques, Inc.

Data types String data types Numeric data types Date, time, and timestamp data types XML data type Large object data types ROWID data type

Chapter 2 INTERNAL SORTS. SYS-ED/ Computer Education Techniques, Inc.

Release Bulletin Mainframe Connect Client Option for CICS 15.0

COBOL for AIX, Version 4.1

DEFINING DATA CONSTANTS AND SYMBOLS

Navigating the pitfalls of cross platform copies

Full Speed Ahead with COBOL Into the Future

Presentation Outline

Data Definition, Movement, and Comparison

COBOL performance: Myths and Realities

The process of preparing an application to support more than one language and data format is called internationalization. Localization is the process

JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1)

D16 Code sets, NLS and character conversion vs. DB2

APPENDIX E SOLUTION TO CHAPTER SELF-TEST CHAPTER 1 TRUE-FALSE FILL-IN-THE-BLANKS

8/16/12. Computer Organization. Architecture. Computer Organization. Computer Basics

X/Open CAE Specification

Compiler and Runtime Migration Guide

The Unicode Standard Version 11.0 Core Specification

Micro Focus RM/COBOL. RM/COBOL Syntax Summary

IT 1204 Section 2.0. Data Representation and Arithmetic. 2009, University of Colombo School of Computing 1

Visual C# Instructor s Manual Table of Contents

CGI Subroutines User's Guide

Casabac Unicode Support

B.V. Patel Institute of BMC & IT, UTU 2014

Appendix C WORKSHOP. SYS-ED/ Computer Education Techniques, Inc.

RM/COBOL to RM/COBOL-85

Decision Making using the IF Statement. Logical Control Structures

Full file at

COBOL MOCK TEST COBOL MOCK TEST III

WSDL2RPG FAQ. FAQ How to Send Base64 Encoded Passwords

1 class Lecture2 { 2 3 "Elementray Programming" / References 8 [1] Ch. 2 in YDL 9 [2] Ch. 2 and 3 in Sharan 10 [3] Ch.

Debug Tool: Introduction. Performance Objectives

ASG-Rochade SCANCOB Release Notes

About these Release Notes. This document contains important information about Pro*COBOL 12c Release 2 (12.2).

TABLE 1 HANDLING. Chapter SYS-ED/ COMPUTER EDUCATION TECHNIQUES, INC.

Legac-E Education. Passing Parameters. to COBOL sub-routines

Moving Data and Printing Information. The Move Statement has The following Format: Move Identifier-1 To Identifier-2. Move Literal-1 To Identifier-2

About these Release Notes. Documentation Accessibility. New Features in Pro*COBOL

zcobol System Programmer s Guide v1.5.06

Computers Programming Course 5. Iulian Năstac

CS313D: ADVANCED PROGRAMMING LANGUAGE

MetaMap Manager User Guide

Chapter 1 GETTING STARTED. SYS-ED/ Computer Education Techniques, Inc.

Zheng-Liang Lu Java Programming 45 / 79

ADVANTAGES. Via PL/SQL, all sorts of calculations can be done quickly and efficiently without use of Oracle engine.

The PCAT Programming Language Reference Manual

IBM Enterprise PL/I for z/os V3.6 delivers performance, usability, and quality enhancements

$DXM DBX Dictionary Maintenance Utility

FRAC: Language Reference Manual

UNIT- 3 Introduction to C++

TIBCO ActiveMatrix BusinessWorks Plug-in for Data Conversion Release Notes

d-file Language Reference Manual

Java Notes. 10th ICSE. Saravanan Ganesh

Using the PowerExchange CallProg Function to Call a User Exit Program

Data Types, Variables and Arrays. OOC 4 th Sem, B Div Prof. Mouna M. Naravani

Self-test Programming Fundamentals

Type of Cobol Entries

enterprise product suite 2.2.2

Language Basics. /* The NUMBER GAME - User tries to guess a number between 1 and 10 */ /* Generate a random number between 1 and 10 */

Features of C. Portable Procedural / Modular Structured Language Statically typed Middle level language

SAS Viya 3.1 FAQ for Processing UTF-8 Data

Jet Data Manager 2014 SR2 Product Enhancements

Chapter 1 RUNNING A SIMPLE JOB. SYS-ED/ Computer Education Techniques, Inc.

The Warhol Language Reference Manual

Chapter 18. Generating DB2 High Performance Unload jobs

File Commands. Objectives

The manuals listed below are also recommended:

S Coding in COBOL for optimum performance

CICS Transaction Server for VSE/ESA

Contents I Introduction 1 Introduction to PL/SQL iii

UTF and Turkish. İstinye University. Representing Text

Chapter 1 GETTING STARTED. SYS-ED/ Computer Education Techniques, Inc.

INTRODUCTION TO DATABASE

Princeton University. Computer Science 217: Introduction to Programming Systems. Data Types in C

FLICONV-API. Generated by Doxygen

IBM WebSphere Development Studio for iseries V5R4 provides tools to create modern IBM iseries solutions

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

Representing Characters and Text

3. Java - Language Constructs I

The SPL Programming Language Reference Manual

Exercise: Using Numbers

Introduction to Visual Basic and Visual C++ Arithmetic Expression. Arithmetic Expression. Using Arithmetic Expression. Lesson 4.

String Instructions In Cobol Delimited By Spaces

Class Library java.lang Package. Bok, Jong Soon

CMPS 10 Introduction to Computer Science Lecture Notes

Programming. Syntax and Semantics

UNIT 3 OPERATORS. [Marks- 12]

JavaScript CS 4640 Programming Languages for Web Applications

CS266 Software Reverse Engineering (SRE) Reengineering and Reuse of Legacy Software

Transcription:

Unicode Support Chapter 2: SYS-ED/Computer Education Techniques, Inc. Ch 2: 1 SYS-ED/Computer Education Techniques, Inc. Ch 2: 1

Objectives You will learn: Unicode features. How to use literals and data items. Coding commands to manipulate Unicode fields. SYS-ED/Computer Education Techniques, Inc. Ch 2: 2 SYS-ED/Computer Education Techniques, Inc. Ch 2: 2

Unicode: What is it? It is the industry standard for the coded character set: It is defined by Unicode Consortium and ISO. Covers all commonly used characters in the world in one code page. As compared to one "language" per ASCII, EBCDIC, or EUC code page. Characters: text, digits, special characters, symbols, control characters,... Multiple Unicode encoding formats: UTF-8, UTF-16, UTF-32. SYS-ED/Computer Education Techniques, Inc. Ch 2: 3 SYS-ED/Computer Education Techniques, Inc. Ch 2: 3

Unicode: Why Support global e-business environment: Applications for multi-cultural/multi-geographic businesses. Networks of heterogeneous systems. Enables a common implementation for global applications versus a separate code page for each geographic area or system platform. Supported by all key operating system and middleware platforms. Required by: XML, HTML, Java, and other software. SYS-ED/Computer Education Techniques, Inc. Ch 2: 4 SYS-ED/Computer Education Techniques, Inc. Ch 2: 4

Unicode Support in for z/os Enable Unicode processing for COBOL applications. Consistent with COBOL 2002 standard. Interoperate with: DB2 Unicode support. Java. COBOL XML support. SYS-ED/Computer Education Techniques, Inc. Ch 2: 5 SYS-ED/Computer Education Techniques, Inc. Ch 2: 5

Unicode Support: Overview Unicode literal and value clause. Unicode data type. New compiler options: CODEPAGE() NSYMBOL() Collation: binary. Implicit conversions for EBCDIC data assigned to or compared with Unicode data. Explicit conversions via intrinsic functions. SYS-ED/Computer Education Techniques, Inc. Ch 2: 6 SYS-ED/Computer Education Techniques, Inc. Ch 2: 6

Unicode Literals N αβγ, N Straße Literal value in source is encoded in some EBCDIC code pages. Value is converted to UTF-16 for execution. Value limited to those represented by the source program code page. NX 03B103B203B3 Can be used for characters: not supported by editor. or not in code page of source program. SYS-ED/Computer Education Techniques, Inc. Ch 2: 7 SYS-ED/Computer Education Techniques, Inc. Ch 2: 7

Unicode Data Type USAGE NATIONAL, Picture character N: 01 Japan pic N(20) usage national. One UTF-16 encoding unit (2 bytes) per PICTURE N character. "Character" defined in terms of PICTURE symbol positions, for reference modification, character counts, etc. SYS-ED/Computer Education Techniques, Inc. Ch 2: 8 SYS-ED/Computer Education Techniques, Inc. Ch 2: 8

Compiler Options CODEPAGE ( ccsid ) Specifies EBCDIC CCSID for: literals in the source program. contents of alphanumeric and DBCS data items. Shipped default is 1140 (Latin-1 with Euro) NSYMBOL (DBCS NATIONAL). 01 X PIC NN. and N. are ambiguous: Unicode or DBCS? NSYMBOL option controls default interpretation. PICTURE G and G... are treated as DBCS regardless of NSYMBOL. SYS-ED/Computer Education Techniques, Inc. Ch 2: 9 SYS-ED/Computer Education Techniques, Inc. Ch 2: 9

Assignment NATIONAL, DISPLAY or DISPLAY-1 item may be assigned to NATIONAL item: 01 Country pic N(20) usage national. 01 USA pic X(13) value United States. 01 Greece pic X(6) value Ελλάδα. Move USA to Country. Move Greece to Country. Numeric integer may be assigned to NATIONAL. Padding with Unicode space character: X 0020. Truncation by 2-byte encoding units: Application logic responsible for avoiding partial character truncation when dealing with characters represented in two encoding units. SYS-ED/Computer Education Techniques, Inc. Ch 2: 10 SYS-ED/Computer Education Techniques, Inc. Ch 2: 10

Unicode Compares National item may be compared with: national, alphanumeric, DBCS, or numeric integer operand. If Country = N' 日本 ' Non-Unicode operand is converted to Unicode. Shorter operand values are padded with Unicode blanks. Byte for byte comparison in binary order: No culturally sensitive compares: e.g. N'ç' is not equal N'c', regardless of locale No normalization: e.g. á (composed) is not equal to a (decomposed)sd SYS-ED/Computer Education Techniques, Inc. Ch 2: 11 SYS-ED/Computer Education Techniques, Inc. Ch 2: 11

Other Language Syntax Supporting Unicode Statements involving comparisons: EVALUATE, IF, INSPECT, PERFORM. SEARCH, STRING, UNSTRING. SORT, MERGE, indexed file keys. Class condition on Unicode data: NUMERIC, ALPHABETIC, ALPHABETIC-LOWER, ALPHABETIC-UPPER, class-name. Unicode arguments for CALL or INVOKE. INITIALIZE... REPLACING NATIONAL... Reference modification. SYS-ED/Computer Education Techniques, Inc. Ch 2: 12 SYS-ED/Computer Education Techniques, Inc. Ch 2: 12

Intrinsic Conversion Functions FUNCTION DISPLAY-OF(national-data [ccsid]) Returns alphanumeric (EBCDIC) representation of NATIONAL argument. FUNCTION NATIONAL-OF(ebcdic-data [ccsid]) Returns UTF-16 representation of EBCDIC (DISPLAY or DISPLAY-1) argument. If ccsid omitted, it defaults to value from CODEPAGE() compiler option. ccsid may represent an EBCDIC, ASCII, EUC or UTF-8 code page. Recommendation: Use only one EBCDIC code page in a program. SYS-ED/Computer Education Techniques, Inc. Ch 2: 13 SYS-ED/Computer Education Techniques, Inc. Ch 2: 13

Unicode: Using in DB2 COBOL Programs Consistent support for Unicode in DB2 and COBOL: Same Unicode conversion facility. Binary collation. CCSID information for NATIONAL, DISPLAY, or DISPLAY-1 host variables is automatically coordinated between COBOL and DB2. e.g. EXEC SQL DECLARE:X VARIABLE CCSID 1140 END-EXEC Is no longer required. SYS-ED/Computer Education Techniques, Inc. Ch 2: 14 SYS-ED/Computer Education Techniques, Inc. Ch 2: 14

COBOL Unicode Support and XML Processing COBOL now contains built-in syntax for processing XML documents. XML PARSE statement parses XML documents, drives processing procedure for each event. XML documents may be encoded in UTF-16 Unicode. XML documents encoded in UTF-8 may be converted to UTF-16 using the NATIONAL-OF function, then parsed. XML-NTEXT special register returns to the program the Unicode content from the document, that is associated with each event. SYS-ED/Computer Education Techniques, Inc. Ch 2: 15 SYS-ED/Computer Education Techniques, Inc. Ch 2: 15

Unicode Example IDENTIFICATION DIVISION. PROGRAM-ID. NATSAMP. DATA DIVISION. WORKING-STORAGE SECTION. 01 f1 pic n(5) usage national. 01 f2 pic x(5) value '12345'. PROCEDURE DIVISION. move 'abc' to f1 display f1 move function NATIONAL-OF(f2) to f1 display f1 move function DISPLAY-OF(f1) to f2 display f2 goback. SYS-ED/Computer Education Techniques, Inc. Ch 2: 16 SYS-ED/Computer Education Techniques, Inc. Ch 2: 16