Web Technology and Applications. XML Basics. 2nd Semester, 2018. Instructor: Prof. Young-guk Ha, Dept. of Computer Science & Engineering


Contents
- Introduction to XML
- XML Document Structure and Basic Syntax

Introduction to XML

XML (eXtensible Markup Language) Overview (1)
- What "extensible" means in XML
  - Capable of being extended
  - You can define your own markup
- Markup (tags)
  - Information added to the content of a text that enhances its meaning
    - Demarcates or labels parts of a text
  - Types of markup in HTML
    - Semantic markup: describes the meaning of content, e.g., <TITLE>, <BODY>
    - Stylistic markup: describes how to present the content, e.g., <FONT>, <B>
    - Structural markup: describes the structure of content, e.g., <P>

XML (eXtensible Markup Language) Overview (2)
- Markup language
  - A set of markups that can be placed in a text for a specific purpose
  - E.g., HTML, WML, VRML, SensorML, MathML, VoiceXML, ...
- XML
  - Extensible markup language = meta-markup language
  - A set of rules for building a markup language and handling its documents
    - I.e., a family of technologies that describe how to define tags, transform documents, retrieve data, present data, and so on
- XML document
  - A document whose content is demarcated by XML tags
  - A set of new tag definitions with XML tags

History of XML
- Timeline: 1970 GML (IBM); 1986 SGML; 1991 HTML, WWW; 1998 XML
- 1986: SGML (Standard Generalized Markup Language) → international standard (ISO)
- 1998: XML 1.0 → de facto standard (W3C)
- 2004: XML 1.1
- 2006: XML 1.1 (2nd Edition)
- 2008: XML 1.0 (5th Edition)

Example of XML Document (1)
- All XML documents are made up of markup and content
  - Semi-structured documents
  - Markup and content complement each other
  - Markup creates an information entity with partitions
  - Markup creates labeled data in a handy package

    <?xml version="1.0"?>
    <letter priority="important">
      <to>john</to>
      <subject>cs760</subject>
      <message>
        Don't forget to attend the class <emphasis>on Friday</emphasis>
        Good luck to you.
      </message>
      <from>tomas</from>
    </letter>
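The letter document above can be read back by any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the variable names are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

# The letter document from the slide, with the quotes restored.
LETTER = """<?xml version="1.0"?>
<letter priority="important">
  <to>john</to>
  <subject>cs760</subject>
  <message>
    Don't forget to attend the class <emphasis>on Friday</emphasis>
    Good luck to you.
  </message>
  <from>tomas</from>
</letter>"""

root = ET.fromstring(LETTER)

priority = root.get("priority")                 # attribute of the root element
recipient = root.findtext("to")                 # text content of <to>
emphasis = root.find("message/emphasis").text   # nested element inside <message>

print(root.tag, priority, recipient, emphasis)
```

The markup labels each piece of content, so a program can ask for "the recipient" or "the emphasized part" by name instead of guessing from position.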

Example of XML Document (2)
[Figure: (1) a BMW car in the real world; (2) an XML authoring tool used to write an XML document about the BMW car; (3) the resulting XML document about the BMW car]

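XML vs. HTML (1)
- HTML uses only predefined tags; XML allows the tag set to be extended
- HTML tags mainly provide ways to display content on screen; XML tags provide ways to structure a document or label its content
- XML tag names are case-sensitive
[Figure: HTML rendering of an address in Hwayang-dong; from the HTML alone it is hard to tell that a value is a zip code]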

XML vs. HTML (2)
- XML documents
  - Labeling content with XML tags makes it possible to express what the content means
    <zip>450-3490</zip> Hwayang-dong

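XML vs. Other Electronic Documents
- HWP and MS Word documents
  - Stored as non-standardized, proprietary binary files
  - Carry no document structure information; content and style are mixed together
  - Hard for external programs to use the documents or automate their processing
- XML documents
  - Stored as plain text files, so they are readable on every computing platform
  - Structure, content, and style are managed separately
    - Structure: defined with a DTD or XML Schema (document model)
    - Content: written to conform to the document model (valid XML document)
    - Style: style definitions for presenting the content (XSL, CSS)
  - Easy for external programs to use the documents and process them automatically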

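Benefits of XML Documents
- Advantages of XML documents over other electronic documents
  - Data independence
    - Document structure (DTD, XML Schema) is separated from content (document)
  - Multiple presentations
    - The same content can be presented in different ways (CSS, XSL)
  - Easy data exchange
    - Based on text and open web standards
  - Stronger data retrieval
    - As semi-structured documents, XML documents are easy to search (XPath, XQuery)
  - Easy transformation of document structure
    - E.g., XML document → HTML document (XSLT)
    - E.g., XML document → binary documents such as MS Word, HWP, PDF (XSL-FO)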

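XML Technology Family
- Document structure: DTD, XML Schema, SOX
- Document style: XSLT, XSL-FO, XSL, CSS
- Document API: SAX, DOM, JDOM
- Document linking: XPath, XPointer, XLink
- Services: SOAP, WSDL, UDDI
- Derived languages: WML, XHTML, MathML
- Security: Encryption, Signature
- Storage and retrieval: XML-DBMS, NXD, XQuery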

XML Document Structure and Basic Syntax

XML Basic Terminology (1)
- Element
  - Labeled container of content
  - Basic building block of XML documents
- Anatomy of an element: <to type="name">Hong Gildong</to>
  - Start tag: <to type="name">
  - Attribute: type="name"
  - Content: Hong Gildong
  - End tag: </to>

XML Basic Terminology (2)
- Well-formed document
  - A document that follows the basic XML syntax, the minimum set of rules that allows browsers and other programs to process it
    1) It contains only properly encoded, legal Unicode characters
    2) None of the special syntax characters such as "<" and "&" appear except when performing their markup-delineation roles
    3) The begin, end, and empty-element tags that delimit the elements are correctly nested, with none missing and none overlapping
    4) The element tags are case-sensitive; the start and end tags must match exactly
    5) There is a single root element that contains all the other elements
- Valid document
  - A document that conforms to its document model
    - DTD (Document Type Definition)
    - XML Schema

Examples of Well-Formed Documents
- A document must have exactly one top-level (root) element
  - Well-formed: <jumin> </jumin>
- Tags must be correctly nested
  - Well-formed: <jumin><name>kim</name></jumin>
  - Not well-formed: <jumin><name>kim</jumin></name>
- Every element must have both a start tag and an end tag
  - Not well-formed: <name>kim or kim</name>
- The start-tag name and the end-tag name must be identical (including letter case)
  - Well-formed: <name>kim</name>
  - Not well-formed: <name>kim</age>, <name>kim</Name>
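The well-formedness rules above can be checked mechanically: a non-validating parser either accepts the text or reports a syntax error. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Samples taken from the slide, plus one document with two root elements.
samples = [
    "<jumin><name>kim</name></jumin>",   # correctly nested
    "<jumin><name>kim</jumin></name>",   # overlapping tags
    "<name>kim",                         # missing end tag
    "<name>kim</Name>",                  # case mismatch between tags
    "<a/><b/>",                          # more than one root element
]
for sample in samples:
    print(sample, is_well_formed(sample))
```

This only tests well-formedness. Checking validity against a DTD or XML Schema needs a validating parser, which this standard-library module does not provide.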

Checking Well-Formed and Valid Documents

XML Document Structure
- Prolog (optional)
  - XML declaration
      <?xml version="1.0" encoding="euc-kr"?>
  - Document type declaration (optional)
      <!DOCTYPE memo [
        <!ELEMENT memo (to, ...)>
      ]>
- Elements (contents)
    <memo>
      <to what="name">홍길동</to>
      <date>2002/04/05</date>
      <contents>전화요망</contents>
      <from>허준</from>
    </memo>

Example of a Simple XML Document Structure
[Figure labels: XML declaration; XML document content (elements)]

Example of XML Document
[Figure labels: XML declaration; XML document content (elements)]

Tree View of the Example Document Structure
[Figure labels: root element; element; attribute; content]
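A tree view like this can be produced by walking the parsed document from the root element down. The sketch below reuses the memo document from the document-structure slide, with its Korean text rendered in English as an assumption for readability; the function name `outline` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Memo document modeled on the earlier slide (English sample text is an assumption).
MEMO = """<memo>
  <to what="name">Hong Gildong</to>
  <date>2002/04/05</date>
  <contents>Please call</contents>
  <from>Heo Jun</from>
</memo>"""

def outline(elem, depth=0):
    """Return one indented line per element: tag, attributes, then text content."""
    attrs = "".join(f" @{k}={v!r}" for k, v in elem.attrib.items())
    text = (elem.text or "").strip()
    lines = ["  " * depth + elem.tag + attrs + (f": {text}" if text else "")]
    for child in elem:                      # children appear in document order
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(MEMO))
print("\n".join(tree_lines))
```

The root element sits at depth 0, and each nesting level adds one step of indentation, mirroring the root/element/attribute/content labels of the figure.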

Structure of XML Documents
- XML Document := Prolog? Element
- Prolog
  - Tips off the world that the document is marked up in XML
- Element
  - Root element (document element)
  - Other elements

Prolog
- Prolog := XMLDecl DocTypeDecl?
- The top of an XML document is graced with special information
  - XML declaration
    - States that the document is marked up in XML
    - Example: <?xml version="1.0"?>
  - Document type declaration
    - Defines the name of the root element
    - Defines the DTD (Document Type Definition) reference → document model

XML Declaration
- XMLDecl := <?xml VersionInfo EncodingInfo? StandaloneInfo? ?>
  - version
    - E.g., version="1.0"
  - encoding
    - euc-kr: Korean encoding
    - UTF-8: 8-bit Unicode (default)
  - standalone
    - yes: no external file to load
    - no: some files to load (default)
      - When there is an external entity
      - When the DTD is in an external file
- Note: the <? ... ?> syntax comes from SGML
- Examples
    <?xml version="1.0"?>
    <?xml version="1.0" encoding="euc-kr"?>

Document Type Declaration
- DocTypeDecl := <!DOCTYPE root-element extid-of-dtd? ([ internal-subset ])? >
  - Note: the <! ... > and [ ... ] syntax comes from SGML
- Document type declaration
  - Defines the name of the root element
  - Defines the DTD (internal subset)
    - For document validity checking
    - Defines ELEMENT and ENTITY declarations
- External subset reference
  - extid-of-dtd refers to an external subset for the document type declaration

Document Type Declaration Example (1)
[Figure labels: root element; DTD]

Document Type Declaration Example (2)
[Figure label: external ID of DTD]

Element: Building Block of XML Documents
- Element := <name (att1="value1" att2="value2" ...)?> content </name>
- Empty Element := <name (att1="value1" att2="value2" ...)? />
- Example
    <Caution class="info">
      Start and end tags must come in pairs!
      Names are case-sensitive!
      Whitespace in content is preserved!
      The following element is an empty element.
      <EmptyElement/>
    </Caution>

Element: Building Block of XML (cont'd)
- Naming rules
  - A name starts with a letter or an underscore (_)
  - A name should not start with "xml" in any letter case (xml, Xml, ..., XML)
  - A name contains letters, numbers, hyphens (-), periods (.), and underscores (_)
- Positioning rules for well-formed documents
  - The end tag must come after the start tag
  - Elements should be correctly nested
    - There should be no overlapping elements
    - An element's start and end tags must both reside in the same parent

Element: Building Block of XML (cont'd)
- Element definition examples
  - <Err>Case-sensitive</err> → <Err>just do it</Err>
  - <1st>Don't start with a number</1st> → <first>...</first>
  - <Xml_tag>Don't start with xml</Xml_tag>
  - < err></err> → <err></err>
  - <e rr></err> → <err></err>
  - <emptyelement/>
    - Is equal to <emptyelement></emptyelement>
    - Is not equal to <emptyelement> </emptyelement>, because whitespace is preserved in XML content
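The naming rules from these slides can be written as a small checker. This is a simplified sketch of the slide's rules only: it accepts ASCII letters, whereas the XML specification also allows many non-ASCII letters and the colon. The function name and the pattern are assumptions made for illustration.

```python
import re

# First character: letter or underscore; rest: letters, digits, "_", ".", "-".
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def follows_slide_naming_rules(name: str) -> bool:
    """Check an element name against the simplified rules listed on the slide."""
    if name.lower().startswith("xml"):      # "xml" prefix in any letter case
        return False
    return _NAME.fullmatch(name) is not None

for candidate in ["first", "1st", "Xml_tag", "e rr", "_part-1.a"]:
    print(candidate, follows_slide_naming_rules(candidate))
```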

Attribute: More Muscle for Elements
- Attribute := name="value" | name='value'
  - Gives elements unique properties
  - An element can have many attributes (unordered)
  - Attributes are separated by whitespace (not commas)
  - Attribute names must be unique within an element
  - If the attribute value itself contains double (or single) quotes, use single (or double) quotes around it
- Examples
  - <letter priority="high" type="1"/> == <letter type="1" priority="high"/>
  - <choice test='msg="hi"'> or <choice test="msg='hi'">
  - <team person="sue" person="joe"> → <team person1="sue" person2="joe">
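The three attribute examples above can be confirmed with a parser. A minimal sketch using Python's standard `xml.etree.ElementTree`; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Attribute order does not matter: both elements carry the same attribute set.
a = ET.fromstring('<letter priority="high" type="1"/>')
b = ET.fromstring('<letter type="1" priority="high"/>')
same_attributes = a.attrib == b.attrib

# A value containing double quotes is wrapped in single quotes.
nested = ET.fromstring("""<choice test='msg="hi"'/>""")
nested_value = nested.get("test")

# Repeating an attribute name inside one element is a well-formedness error.
try:
    ET.fromstring('<team person="sue" person="joe"/>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True

print(same_attributes, nested_value, duplicate_rejected)
```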

Attribute: More Muscle for Elements (cont'd)
- Attribute value types (in DTD)
  - ID
    - A validating XML parser warns you if the ID does not have a unique value throughout the document (attribute "no" in the example below)
  - IDREF(S)
    - A validating XML parser warns you if the IDREF points to a nonexistent element (attribute "with" in the example below)
  - Other types: ENUMERATED, CDATA, ENTITY/ENTITIES, NMTOKEN(S)
- Example
    <part no="bolt-100"/>
    <part no="bolt-100"/>
    <part no="bolt-123"/>
    <part no="nut-123">
      <compatible with="bolt-123"/>
      <compatible with="bolt-456"/>
    </part>
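The ID and IDREF checks described above are done by a validating parser against a DTD. Python's standard `xml.etree.ElementTree` does not validate, so the sketch below imitates the two checks by hand. The `<parts>` wrapper element and the function `check_ids` are assumptions added for illustration; `no` is treated as the ID attribute and `with` as the IDREF attribute, as on the slide.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The slide's parts list, wrapped in a hypothetical <parts> root element.
PARTS = """<parts>
  <part no="bolt-100"/>
  <part no="bolt-100"/>
  <part no="bolt-123"/>
  <part no="nut-123">
    <compatible with="bolt-123"/>
    <compatible with="bolt-456"/>
  </part>
</parts>"""

def check_ids(root, id_attr="no", idref_attr="with"):
    """Return (duplicate ID values, IDREF values that match no ID)."""
    ids = Counter(e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None)
    duplicates = sorted(value for value, count in ids.items() if count > 1)
    dangling = sorted({
        e.get(idref_attr) for e in root.iter()
        if e.get(idref_attr) is not None and e.get(idref_attr) not in ids
    })
    return duplicates, dangling

duplicates, dangling = check_ids(ET.fromstring(PARTS))
print("duplicate IDs:", duplicates)
print("dangling IDREFs:", dangling)
```

In the slide's data, `bolt-100` is declared twice and `bolt-456` is referenced without being declared, which are the two violations a validating parser would report.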

Entity: Placeholder for Content
- Entity
  - Contains a part of an XML document
  - Something like a macro in C (#define): declare once, use many times
  - Does not add anything semantically to the markup
  - Always eliminates an inconvenience
    - From standing in for impossible-to-type characters
    - To marking the place where a file should be imported (external entity)
- Example in the internal subset
    <!DOCTYPE letter ... [
      <!ENTITY w3url "http://www.w3.org/">
    ]>
    <letter>
      <message>Hi. John. W3 URL is &w3url;</message>
    </letter>
  After the entity reference is expanded:
    <message>Hi. John. W3 URL is http://www.w3.org/</message>
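The expansion of `&w3url;` can be observed with a parser that reads the internal subset. A minimal sketch using Python's standard `xml.etree.ElementTree`, whose underlying parser expands internal entities; the elided part of the slide's DOCTYPE line is simply left out here, which is an assumption.

```python
import xml.etree.ElementTree as ET

# Internal subset declares the entity once; the content refers to it by name.
DOC = """<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ENTITY w3url "http://www.w3.org/">
]>
<letter>
  <message>Hi. John. W3 URL is &w3url;</message>
</letter>"""

root = ET.fromstring(DOC)
message = root.findtext("message")   # the reference is already replaced here
print(message)
```

Because entity expansion happens inside the parser, documents from untrusted sources deserve care: deeply nested entity definitions can be abused to exhaust memory.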

Entity: Placeholder for Content (cont'd)
[Figure label: used in DTD]

Entity: Placeholder for Content (cont'd)
- Character entity
  - Predefined
    - Ampersand (&): &amp;
    - Apostrophe ('): &apos;
    - Greater than (>): &gt;
    - Less than (<): &lt;
    - Quotation mark ("): &quot;
  - Numbered (by Unicode code point; #0 to #65535 covers the 16-bit range)
    - E.g., cedilla (ç): &#231;
    - Covers alphabetic, syllabic, and ideographic scripts: Latin, Greek, 20,000 Han ideographs, 11,000 Hangul syllables, ...
  - Named (user defined)
    - E.g., <!ENTITY cedilla "ç">
            <!ENTITY name "Kim">
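The predefined and numbered character entities can be seen from both directions: a parser decodes them when reading, and a writer must escape the special characters when producing XML text. A minimal sketch with Python's standard library; the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser turns entity and character references back into characters.
decoded = ET.fromstring(
    "<t>&lt;b&gt; &amp; &apos;x&apos; &quot;y&quot; &#231;</t>"
).text

# Writing: escape() replaces &, < and > so the text is safe as element content.
escaped = escape("a < b & c > d")

print(decoded)
print(escaped)
```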

Entity: Placeholder for Content (cont'd)
- Mixed-content entity
  - Contains content of unlimited length
  - Can include markup as well as text
    - Internal entity
        E.g., <!ENTITY phone "<number>042-999-9999</number>">
    - External entity
        E.g., <!ENTITY signature SYSTEM "./signature.xml">

Entity: Placeholder for Content (cont'd)
- Example → external entity

Entity: Placeholder for Content (cont'd)
[Figure label: external entity imported from ./signature.xml]

Entity: Placeholder for Content (cont'd)
- External entity example
    <!ENTITY part1 SYSTEM "./p1.xml">                          → local file
    <!ENTITY part2 SYSTEM "http://www.bobsbolts.com/p2.xml">   → www.bobsbolts.com
    <!ENTITY part3 SYSTEM "http://www.tomsnuts.com/p3.xml">    → www.tomsnuts.com

Entity: Placeholder for Content (cont'd)
- Unparsed entity
  - Should not be parsed by the XML parser
    - Tells the parser not to load the entity's content
    - Normally used for applications
  - May contain something other than text
    - E.g., binary image files
        <!ENTITY mypic SYSTEM "./erik.gif" NDATA GIF>
      → GIF is the name of the notation data (NDATA), declared as
        <!NOTATION GIF SYSTEM "image/gif">

Entity: Placeholder for Content (cont'd)
- Parameter entity
  - Occurs only in the document type declaration section
    - Preceded by "%" (not by "&")
  - Parameter entity references are expanded immediately in the document type declaration
    - E.g., without a parameter entity
        <!ELEMENT burns (#PCDATA | quote)*>
        <!ELEMENT allen (#PCDATA | quote)*>
    - E.g., with a parameter entity
        <!ENTITY % pcont "#PCDATA | quote">
        <!ELEMENT burns (%pcont;)*>
        <!ELEMENT allen (%pcont;)*>
  - In XML 1.0, a reference placed inside another declaration, as in the last two lines, is permitted only in the external DTD subset

Miscellaneous Markups
- Comment := <!-- any_text_and_markup -->
  - Tells the parser to ignore those regions
  - Within comments, "--" should not occur
  - E.g., <!-- <address>59 Sunspot Avenue</address> -->
- Processing Instruction := <?keyword data? ?>
  - Container for data targeted toward specific applications or parsers
  - E.g., <?linebreak?>, <?xml version="1.0"?>
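How a parser treats comments and processing instructions can be checked directly. The sketch below assumes Python's standard `xml.etree.ElementTree` with its default settings, which leaves comments and processing instructions out of the element tree; the sample document is illustrative.

```python
import xml.etree.ElementTree as ET

# A comment that hides markup, a processing instruction, and one real element.
DOC = ("<letter>"
       "<!-- <address>59 Sunspot Avenue</address> -->"
       "<?linebreak?>"
       "<to>john</to>"
       "</letter>")

root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]   # only real elements remain

# "--" inside a comment makes the document not well-formed.
try:
    ET.fromstring("<a><!-- bad -- comment --></a>")
    double_hyphen_rejected = False
except ET.ParseError:
    double_hyphen_rejected = True

print(child_tags, double_hyphen_rejected)
```

The commented-out `<address>` element never reaches the application, which is why comments are a convenient way to disable a region of a document.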

Miscellaneous Markups (cont'd)
- CDATA Section := <![CDATA[ any_text_and_markup ]]>
  - Tells the parser that the section contains no markup
    - It should be treated as regular text
  - Within a CDATA section, "]]>" should not occur
    - You can use "]]&gt;" instead of "]]>"
  - E.g., using < and > in a CDATA section
[Figure labels: text containing "]]>"; the same text with a CDATA section]
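A CDATA section and entity escaping are two ways of writing the same text content. A minimal sketch with Python's standard `xml.etree.ElementTree`; the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, "<", ">" and "&" are plain characters, not markup.
cdata_text = ET.fromstring(
    "<code><![CDATA[if (a < b && b > c) {}]]></code>"
).text

# The same content written with predefined entities instead of CDATA.
escaped_text = ET.fromstring(
    "<code>if (a &lt; b &amp;&amp; b &gt; c) {}</code>"
).text

# Outside CDATA, the sequence "]]>" is written as "]]&gt;".
bracket_text = ET.fromstring("<t>a ]]&gt; b</t>").text

print(cdata_text)
print(cdata_text == escaped_text, bracket_text)
```

After parsing, the application sees identical strings either way, so CDATA is a convenience for authors rather than a different kind of content.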

References
- XML 1.0 (Fifth Edition)
  - W3C Recommendation, 26 Nov. 2008
  - http://www.w3.org/TR/xml