Information. Introduction. Description. Outcome

Web Data Management and XML L1: Introduction
Leonidas Fegaras, University of Texas at Arlington

Information
- Class: TuTh 5:30-6:50pm
- Instructor: Leonidas Fegaras
  - Office: ERB 653 (Engineering Research Bldg)
  - Phone: 817-272-3629
  - Email: fegaras@cse.uta.edu
  - Office hours: Tuesday and Thursday 4:00-5:20pm
- GTA: Upa Gupta, upa.gupta@mavs.uta.edu
- Class Web: https://lambda.uta.edu/cse5335/
  - Visit the class web page often. It will contain reading assignments, project description, class notes, grades, etc.

Description
XML has become an important standard for data representation and information exchange among cooperating Internet applications. This course provides an in-depth study of web data management, with an emphasis on XML standards and technologies. It covers the state of the art in designing and building web applications and services, focusing on the issues and challenges of managing and processing XML data.

Outcome
Upon successful completion of this course, students will be able to:
- use multiple programming languages and current web technologies to develop dynamic web sites
- develop web services and dynamic web applications that use web services
- use XML technologies to generate and process complex XML data
- store, index, and analyze XML data using relational databases
- use data analysis tools and cloud computing to analyze web data

Prerequisites
- Prerequisite: CSE 3330/CSE 5330 (Database Systems I) or equivalent
- Students are expected to have a working knowledge of
  - Java
  - SQL
  - basic HTML
- Students without adequate preparation are at substantial risk of failing this course

Grading
- The final grade will be based on
  - 40%: 8 small programming assignments
  - 25%: midterm exam
  - 35%: final exam (comprehensive)
- Final grades will be assigned according to the following scale: A: score >= 90, B: 80 <= score < 90, C: score < 80
- Sometimes I use lower cutoff points, depending on the overall performance of the class
- Your grades will be available online on the course web page

Reading Material
- There is no required textbook, but you are expected to read many online tutorials and references
  - Links will be given out in class
- Many good online tutorials, eg, http://www.w3schools.com/default.asp
- Many books on web programming and XML standards
  - Programming the World Wide Web, by Robert W. Sebesta (7th Edition), covers the first part of the course (web programming)
  - See the syllabus for more recommended books

Exams
- Both exams are open notes
  - all notes must be securely bound in one notebook
  - no books are allowed
- The final exam will cover the material from the first lecture up to and including the last lecture
- Once the exam grades are posted, you will have 10 business days to dispute your grade and get your exam re-evaluated
  - no re-evaluation will be entertained after the 10-day period
- No makeup exams will be given unless there is a justifiable reason (such as illness or a death in the family)
- If you miss an exam and can prove that your reason is justifiable, you should arrange with the instructor to take the makeup exam within a week of the regular exam time. In any other case, you will get a zero grade for the missed exam.

Programming Assignments
- There will be eight small programming assignments
- Each assignment must be done individually
- Details will be given out in class
- Late projects will be penalized 20 points per day (out of 100 max)
  - So there is no point in submitting a project more than 4 days late!
- This penalty cannot be waived unless there was a case of illness or other substantial impediment beyond your control, with documented proof from the school

Software
- Most projects will be done in Java (using JDK 6), but some will be done in JavaScript, PHP, and XQuery
  - you are expected to have a working knowledge of Java, SQL, and HTML
- The software used for the projects is open-source, free, platform-independent, and well-suited for Java:
  - Java/web development platform: Eclipse Java EE IDE for Web Developers (version: Indigo)
- You can do most of the projects on your PC/laptop under any platform: Linux, Mac OS X, MS Windows, etc
  - directions on how to download the required software will be given out in class
- Although we will briefly talk about it, we will not use Microsoft ASP.NET (Visual Studio, C#, etc), since this framework is platform-dependent (for IIS only)

Cheating
- All work in this class must be done individually. No copying is permitted
- Cheating involves giving assistance to or receiving assistance from other students or other individuals, copying material from the web, etc
- I strictly adhere to the University of Texas at Arlington rules and guidelines for handling violations of academic honesty
  - Please refer to the pamphlet "CHEATING: Definitions and Consequences" for additional information
- You are required to sign and return the statement about academic dishonesty
- Anyone caught cheating, or engaging in plagiarism or collusion, on a programming assignment or an exam will receive an automatic failing grade (F) for the entire course

Miscellaneous
- Special Accommodations: If you require an accommodation based on disability, I would like to meet with you in the privacy of my office, during the first week of the semester, to make sure you are appropriately accommodated.

Tentative Schedule
- Introduction and motivation
- Web application development
  - Dynamic web pages
  - HTTP GET/POST requests
  - HTML forms
- Client-side programming (JavaScript)
  - XHTML and CSS stylesheets
  - The document object model (DOM) and dynamic HTML
  - Asynchronous server requests (AJAX) and XmlHttpRequest
- Server-side programming:
  - PHP scripts
  - Cookies and sessions
  - Web mashups
  - Servlets (Tomcat)
  - Java Server Pages (JSP)
  - Database connectivity (JDBC)

Tentative Schedule (cont.)
- Relational databases and XML
  - XML shredding
  - XML publishing
  - XML on commercial databases (Oracle XML DB, SQL Server SQLXML)
- XML data management
  - Query processing and optimization
  - Updates and view maintenance
  - Integrity constraints
- XML search engines
  - Information retrieval
  - Web search engines
  - XML ranking
- Web services
  - RESTful vs SOAP-based web services
  - Standards: SOAP, WSDL, UDDI
  - Axis and JAX-WS

Tentative Schedule (cont.)
- Cloud computing
  - Distributed file systems (HDFS, Cassandra)
  - The Map-Reduce framework (Hadoop, Hive, Pig)
  - Amazon Web Services and Elastic Compute Cloud (EC2)
- XML standards
  - DTD and XML Schema
  - XPath
  - XML programming (DOM, SAX, StAX)
  - XSLT
  - XQuery
  - Java/XML data binding (JAXB)
- XML data modeling
  - Native XML storage management
  - Indexing techniques
  - Xindice and Berkeley DB XML

Traditional DB Applications
- Typically business oriented
- Large amount of data
- Data is well-structured, normalized, with predefined schema
- Large number of concurrent users (transactions)
- Simple data, simple queries, and simple updates
- Typically update intensive
- Small transactions
- High performance, high availability, scalability
- Data integrity and security are of major importance
- Good administrative support, nice GUIs

Document Applications
- Human friendly: what-you-see-is-what-you-get paradigm
- Focus on presentation
- Information is divided into multiple small documents
- Mostly static
- Implicit structure: section, subsection, paragraph, etc
- Meta-data: title, author, date, indexing keywords, etc
- Content structure: form/layout, inter-relationships, references
- Tagging: eg, <p> for a new paragraph
  - markup languages: HTML, XML, ...
- Operations: retrieving, editing, spell-checking, printing, etc
- Information Retrieval: simple keyword search
  - most successful in web search engines (eg, Google)

Internet Applications
Internet applications
- use heterogeneous, complex, hierarchical, fast-evolving, unstructured/semistructured data
- access mostly read-only data
- require long transactions (business processes)
- need 100% availability
- manage millions of users world-wide
- have high-performance requirements
- are concerned with security (encryption)
- are concerned with presentation/interaction on a web browser
- like to customize data in a personalized manner
- expect to gain the user's trust for business-to-consumer transactions

Electronic Commerce
- Both business-to-business (B2B) and business-to-consumer (B2C) interactions
- Focus on selling and buying:
  - Order management
  - Product catalogs
  - Product configuration
  - Sales and marketing
  - Education and training
  - Web services

Other Web Applications
- Web services
  - Resource-oriented: RESTful web services
  - SOAP-based: SOAP, WSDL, UDDI
- Web integration
  - Heterogeneous data sources and types
  - Thousands of web-accessible data sources
  - Dynamic data
  - Data warehouses
- Web publishing
  - Access different types of content from browsers (PDF, HTML, XML)
  - Structured, dynamic, customized/personalized content
  - Integration with applications
  - Accessible via major gateways and search engines
- Application integration
  - Transformation between different data formats (eg, XML, HTML)
  - Integration of multiple applications
- Web mashups

Current Internet Application Architectures
- Architecture:
  - Server-Tier: relational databases and gateways to diverse data sources, such as files, OLE/DB, etc. Use of enterprise servers
  - Middle-Tier: provides data integration & distribution, query, etc. Consists of a web server or an application server
  - Client-Tier: mostly a web browser, may run scripts
- Characteristics:
  - Customization is achieved at the server site (customer data in a database) with some data at the client site (cookies)
  - Load balancing is typically hardware based (multiple servers, DNS routers)

HTML

    <html>
    <head><title>My Web Page</title></head>
    <body>
    <h1>Introduction</h1>
    Look at <a href="http://lambda.uta.edu/index.html">this document</a>
    <img src="image.jpg" width=100 height=50>
    </body>
    </html>

(Callouts on the slide label an opening tag, a closing tag, a hypertext link, an attribute name, and an attribute value.)
- It's a markup language: text (content) + tags (control marks)
- It is very simple: human readable, can be edited by any editor
- It reflects document presentation (layout), not the semantics or structure of data
- Universal: portable to any platform
- HTML pages are connected through hypertext links
- HTML pages can be located using web search engines
- Great for human-to-human and human-to-machine interactions

Is HTML Appropriate for Web Interactions?
- For machine-to-machine interactions, you want to exchange data
  - you are not interested in data presentation
- Need to be able to extract data fragments and construct new ones
  - difficult to do this in HTML (see XHTML and Ajax)
- Need to be able to update and transform HTML
- Need a universal data representation that is:
  - Good for data exchange among web applications
    - sent through the Internet without transformation (no data marshaling)
  - Powerful enough to capture complex web data
  - Suitable for storage on a database server
  - Amenable to querying and updating
  - Described by powerful schema languages
    - required for validation of web services
  - Adopted by industry
    - supported by standards
    - platform and vendor independent

XML
- XML (eXtensible Markup Language) is a textual language for representing and exchanging data on the web
- It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification
- Based on SGML
- XML was developed around 1996
- It is called extensible because it is not a fixed format like HTML (a single, predefined markup language)
  - it is actually a metalanguage (a language for describing other languages), which lets you design your own customized markup languages for limitless different types of documents
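The remark that extracting data fragments from HTML is awkward can be made concrete with a small sketch. It uses Python's standard html.parser module to pull the hypertext link out of the sample page from the HTML slide; the class name LinkCollector and the function collect_links are hypothetical names chosen for this illustration, not part of the course material.

```python
from html.parser import HTMLParser

# The sample page from the HTML slide (attribute quotes restored).
PAGE = """<html>
<head><title>My Web Page</title></head>
<body>
<h1>Introduction</h1>
Look at <a href="http://lambda.uta.edu/index.html">this document</a>
<img src="image.jpg" width=100 height=50>
</body>
</html>"""


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> opening tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute name, attribute value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def collect_links(html_text):
    """Return all hyperlink targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    print(collect_links(PAGE))
```

Even for this tiny page, the program has to know that links live in the href attribute of a tags; nothing in the markup says what the data means, which is the gap XML is meant to fill.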

XML (cont.)
- XML can be untyped (semistructured), but there are standards for schema conformance
  - DTD
  - XML Schema
- Without a schema, an XML document is well-formed if it satisfies simple syntactic constraints:
  - tags always come in matching start/end pairs
  - proper nesting of start and end tags
  - a single root element
- With a schema, an XML document is valid if its structure conforms to a DTD or an XML Schema

Example

    <people>
      <person>
        <name> Leonidas Fegaras </name>
        <tel> (817) 272-3629 </tel>
        <email> fegaras@cse.uta.edu </email>
      </person>
      <person>
        <name> Ramez Elmasri </name>
        <tel> (817) 272-2348 </tel>
      </person>
    </people>

Why Is XML So Popular?
- It looks like HTML
  - simple, human-readable, machine-readable, easy to learn, universal
- Flexible & extensible, since you can represent any kind of data
  - unlike HTML: HTML describes presentation while XML describes content
- Precise
  - well-formed: properly nested XML tags
  - valid: its structure may conform to a DTD or an XML Schema
- Supported by the W3C
  - trusted and adopted by industry
- Many standards around XML: schemas, query languages, etc

Where Does XML Data Come From?
- Mostly generated; few XML documents are hand-written
  - web services (SOAP messages, WSDL descriptions)
  - XHTML
  - dumps from relational databases (data publishing)
  - desktop applications (MS Office XML formats: docx, pptx, ...)
  - configuration files for various apps (eg, for the GNOME desktop)
  - metadata (eg, MPEG-7 metadata)
  - logs, blogs, RSS, stock feeds, news feeds, ...
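The well-formedness rules listed under "XML (cont.)" (matching tags, proper nesting, a single root) are what any XML parser enforces. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the helper names is_well_formed and person_names are hypothetical. Note that ElementTree checks well-formedness only: it does not validate against a DTD or an XML Schema.

```python
import xml.etree.ElementTree as ET

# The people document from the Example slide.
WELL_FORMED = """<people>
  <person>
    <name> Leonidas Fegaras </name>
    <tel> (817) 272-3629 </tel>
    <email> fegaras@cse.uta.edu </email>
  </person>
  <person>
    <name> Ramez Elmasri </name>
    <tel> (817) 272-2348 </tel>
  </person>
</people>"""

# Improper nesting: </person> closes before </name>.
BAD_NESTING = "<people><person><name>Ramez Elmasri</person></name></people>"

# Two top-level elements, so there is no single root.
TWO_ROOTS = "<person/><person/>"


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def person_names(text):
    """Return the (whitespace-trimmed) name of every person under the root."""
    root = ET.fromstring(text)
    return [p.findtext("name").strip() for p in root.findall("person")]


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED), is_well_formed(BAD_NESTING))
    print(person_names(WELL_FORMED))
```

Because the second person has no email element, the document also illustrates the "semistructured" point: without a schema, elements of the same kind need not have the same shape.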

What Does XML Have to Do with Databases?
- XML is an important standard for data representation and exchange, but we still need to store and query large repositories of XML data
  - data models and schema representations
  - query languages, data indexing, query optimizers
  - updates, view maintenance
  - concurrency, distribution, security, etc
- Need both heavy-duty databases at the server side for storing data, and the XML format for exchanging data between applications

XML Syntax
- XML consists of tags and text
- Text is bounded by tags. CDATA: character data

      <title>The Big Sleep</title> <year>1935</year>

- Character data is text without markup (tags)
  - you can't use < or & directly, but you may use &lt; or &amp;
- You may wrap the text in a CDATA section, between <![CDATA[ and ]]>

      <![CDATA[<sender>John Smith</sender>]]>

  same as:

      &lt;sender&gt;John Smith&lt;/sender&gt;

- Comments: <!-- declarations for <head> & <body> -->

XML Syntax (cont.)
- Special characters in CDATA: &#nnn; where nnn is the decimal Unicode code point of the character (or &#xhhh; with the code point in hexadecimal)
  - eg, &#160; for a non-breaking space
- You may define special characters as entities (in a DTD):

      <!ENTITY nbsp "&#160;">

  Use in CDATA: &nbsp;
- Some predefined entities: &amp; &lt; &gt; &apos; &quot;
- Processing instructions: <?application-name param="value" ...?>
- PCDATA (Parsed Character Data): characters to be parsed by the XML parser
  - contains both character data and markup that constitute valid XML syntax

XML Syntax (cont.)
- Tags come in matching pairs: <date>8/25/2004</date>
- A tag name must start with a letter or underscore and can contain only letters, numbers, hyphens, periods, and underscores (the grammar also admits the colon, which is reserved for namespaces)

      Name          ::= NameStartChar (NameChar)*
      NameStartChar ::= ":" | [A-Z] | "_" | [a-z]
      NameChar      ::= NameStartChar | "-" | "." | [0-9]

- For each opening tag there must be a matching closing tag
- Tags must be properly nested:
  - valid nesting: <person> <name> ... </name> ... </person>
  - invalid nesting: <person> <name> ... </person> ... </name>
- An XML doc must have exactly one top-level element, called the root
- XML declaration: optional, at the beginning of the XML doc

      <?xml version="1.0" encoding="utf-8"?>
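The syntax rules above (escaped text versus CDATA sections, numeric character references, and the simplified Name grammar) can be checked with a short sketch. It uses Python's standard xml.etree.ElementTree and re modules; the names text_of, NAME_RE, and is_simple_name are hypothetical, and the regular expression covers only the ASCII subset of names shown in the grammar on the slide, not the full Unicode ranges of the XML specification.

```python
import re
import xml.etree.ElementTree as ET


def text_of(xml_text):
    """Parse a one-element document and return its character data."""
    return ET.fromstring(xml_text).text


# The same character data written two ways: with entity references,
# and wrapped in a CDATA section.
ESCAPED = "<msg>&lt;sender&gt;John Smith&lt;/sender&gt;</msg>"
CDATA = "<msg><![CDATA[<sender>John Smith</sender>]]></msg>"

# Numeric character references: decimal (&#160;) and hexadecimal (&#xA0;)
# both denote the code point U+00A0, the non-breaking space.
CHAR_REFS = "<t>a&#160;b&#xA0;c</t>"

# ASCII-only rendering of the Name grammar from the slide:
#   NameStartChar ::= ":" | [A-Z] | "_" | [a-z]
#   NameChar      ::= NameStartChar | "-" | "." | [0-9]
NAME_RE = re.compile(r"[:A-Z_a-z][:A-Z_a-z\-.0-9]*")


def is_simple_name(candidate):
    """Return True if candidate matches the simplified Name grammar."""
    return NAME_RE.fullmatch(candidate) is not None


if __name__ == "__main__":
    print(text_of(ESCAPED) == text_of(CDATA))
    print(repr(text_of(CHAR_REFS)))
    print(is_simple_name("date"), is_simple_name("1date"))
```

After parsing, an application cannot tell whether the text arrived escaped or inside a CDATA section; the two spellings are purely a convenience for whoever writes the document.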

XML Elements
- An element is a segment of an XML document between an opening tag and the matching closing tag

      <person>
        <name> Ramez Elmasri </name>
        <tel> (817) 272-2348 </tel>
      </person>

- An element may contain a mixture of sub-elements and CDATA

      <title>An <em>element</em> is a segment</title>

- An abbreviation: for an element with empty content, we can use <tagname/> instead of <tagname></tagname>

Representing Data Using XML
- Nesting tags can be used to express various structures, such as a record:

      <person>
        <name> Ramez Elmasri </name>
        <tel> (817) 272-2348 </tel>
      </person>

- We can represent a list by using the same tag repeatedly:

      <addresses>
        <person> ... </person>
        <person> ... </person>
        <person> ... </person>
        ...
      </addresses>

XML structure
- XML:

      <person>
        <name> Ramez Elmasri </name>
        <tel> (817) 272-2348 </tel>
        <email> elmasri@cse.uta.edu </email>
      </person>

- in Lisp:

      (person (name "Ramez Elmasri")
              (tel "(817) 272-2348")
              (email "elmasri@cse.uta.edu"))

- as a tree data structure:

      person
        name:  Ramez Elmasri
        tel:   (817) 272-2348
        email: elmasri@cse.uta.edu

Attributes
- An opening tag may contain attributes
  - typically used to describe the content of an element

      <author ssn="2787901">
        <name>Ramez Elmasri</name>
      </author>

- You may have multiple attributes in an opening tag
  - but each attribute name must be different
- It's not always clear when to use attributes

      <author>
        <ssn>2787901</ssn>
        <name>Ramez Elmasri</name>
      </author>

- ID attributes are special: their values must be unique within the document
- An IDref attribute must refer to an existing ID in the same doc
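The correspondence between an XML element, a Lisp-style expression, and a tree can be sketched in a few lines. The example below uses Python's standard xml.etree.ElementTree; the function names to_sexpr and parses are hypothetical, and the conversion is deliberately simplified: it ignores attributes and any text that follows a child element (mixed content).

```python
import xml.etree.ElementTree as ET

# The person record from the "XML structure" slide.
PERSON = """<person>
  <name> Ramez Elmasri </name>
  <tel> (817) 272-2348 </tel>
  <email> elmasri@cse.uta.edu </email>
</person>"""

# The attribute variant from the "Attributes" slide.
AUTHOR = '<author ssn="2787901"><name>Ramez Elmasri</name></author>'


def to_sexpr(elem):
    """Render an element as a Lisp-style expression: (tag "text" child ...)."""
    parts = [elem.tag]
    text = (elem.text or "").strip()
    if text:
        parts.append('"%s"' % text)
    # Each child element becomes a nested expression (a subtree).
    for child in elem:
        parts.append(to_sexpr(child))
    return "(" + " ".join(parts) + ")"


def parses(text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(to_sexpr(ET.fromstring(PERSON)))
    # Attributes are a name -> value mapping on the opening tag.
    print(ET.fromstring(AUTHOR).get("ssn"))
    # Repeating an attribute name in one tag is a well-formedness error.
    print(parses('<author ssn="1" ssn="2"/>'))
```

The recursion in to_sexpr mirrors the tree view directly: each element is a node, its tag is the node label, and its sub-elements are its children.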

Referencing Elements Using IDs/IDrefs

    <family>
      <person id="jane" mother="mary" father="john">
        <name> Jane Doe </name>
      </person>
      <person id="john" children="jane jack">
        <name> John Doe </name>
        <mother/>
      </person>
      <person id="mary" children="jane jack">
        <name> Mary Doe </name>
      </person>
      <person id="jack" mother="mary" father="john">
        <name> Jack Doe </name>
      </person>
    </family>
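How ID and IDref attributes tie elements together can be sketched by building an index from ID values to elements and then following the references. The code below uses Python's standard xml.etree.ElementTree, which knows nothing about DTD attribute types, so the choice of id as the ID attribute and of mother, father, and children as IDref(s) attributes is an assumption hard-coded for this document; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The family document from the slide.
FAMILY = """<family>
  <person id="jane" mother="mary" father="john">
    <name> Jane Doe </name>
  </person>
  <person id="john" children="jane jack">
    <name> John Doe </name>
    <mother/>
  </person>
  <person id="mary" children="jane jack">
    <name> Mary Doe </name>
  </person>
  <person id="jack" mother="mary" father="john">
    <name> Jack Doe </name>
  </person>
</family>"""

# Assumption: these attributes hold IDrefs (space-separated when plural).
REF_ATTRS = ("mother", "father", "children")


def build_id_index(root):
    """Map each id value to its element; reject duplicate ids."""
    index = {}
    for elem in root.iter():
        key = elem.get("id")
        if key is not None:
            if key in index:
                raise ValueError("duplicate ID: " + key)
            index[key] = elem
    return index


def dangling_refs(root, index):
    """List (source id, attribute, target) for references with no target."""
    missing = []
    for elem in root.iter():
        for attr in REF_ATTRS:
            for ref in elem.get(attr, "").split():
                if ref not in index:
                    missing.append((elem.get("id"), attr, ref))
    return missing


def names_of(index, refs):
    """Resolve a space-separated list of IDrefs to person names."""
    return [index[r].findtext("name").strip() for r in refs.split()]


if __name__ == "__main__":
    root = ET.fromstring(FAMILY)
    index = build_id_index(root)
    print(dangling_refs(root, index))
    print(names_of(index, index["john"].get("children")))
```

A validating parser with a DTD would perform the uniqueness and dangling-reference checks itself; the sketch only shows what those two constraints amount to.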