Evaluation of XML Schema Quality in multimedia content publishing domain

Similar documents
29-Jan-15. Faculty of Electrical Engineering and Computer Science. University of Maribor

Two-dimensional Extensibility of SSQSA Framework

Toward Language Independent Worst-Case Execution Time Calculation

If looking for a ebook XML Programming Success in a Day: Beginner's Guide to Fast, Easy, and Efficient Learning of XML Programming (XML, XML

Data science How to prepare engineers for this field

DOWNLOAD OR READ : XML AND XSL TWO 1 HOUR CRASH COURSES QUICK GLANCE PDF EBOOK EPUB MOBI

Course Structure A : General Education Course B : Major Course C : Free Elective Course

Multi-Channel Publishing for AllFusion Gen

Many of these have already been announced via JIT weekly updates on the KB-L listserv, but are gathered here for a more complete look.

A Web-based XML Schema Visualizer José Paulo Leal & Ricardo Queirós CRACS INESCPORTO LA

2nd Semester Examination 2016 Higher National Diploma In Accountancy First Year. Subject Name Morning Date Subject Code Subject Name Evening

Informatics 1: Data & Analysis

1 Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Introduction on ETSI TC STQ Work

Publishing Technology 101 A Journal Publishing Primer. Mike Hepp Director, Technology Strategy Dartmouth Journal Services

Survey Introduction. Thank you for participating in the WritersUA Skills and Technologies survey!

Delivery Options: Attend face-to-face in the classroom or via remote-live attendance.

INFORMATION SECURITY MANAGEMENT SYSTEMS CERTIFICATION RESEARCH IN THE ROMANIAN ORGANIZATIONS

TextProc a natural language processing framework

Evolution of XML Applications

Bachelor of Arts Program in Information Science

Shankersinh Vaghela Bapu Institue of Technology

Delivery Options: Attend face-to-face in the classroom or remote-live attendance.

Searching within EBSCO databases. Urszula Nowicka Training Specialist EBSCO Information Service Phone: (+48)

1. Survey Introduction

TASK FORCE ON SERVICES STATISTICS

Program Generators and Control System Software Development

Digital Humanities Digital Economics. Digital Arts. 05. One elective course from the courses offered at the university faculties

A Holistic Approach to Cyber Security

Growing Our Own Through Collaboration

COMMUNICATIONS Department Chair: Jamie Durler , ext. 240

Android How To Validate Xml Against Schema In Xmlspy

Academic Reference Standards (ARS) for Electronics and Electrical Communications Engineering, B. Sc. Program

INDEX ABOUT US 3 ARAB CERTIFIED QUALITY MANAGER PROGRAM. Body of Knowledge 6 UNESCO ICT INDICATORS 8 MESSAGE FROM THE CHAIRM AN

BS Electrical Engineering Program Assessment Plan By Dan Trudnowski Spring 2018

Sponsoring of Continuing Medical Education in Germany

Internet Traffic Classification Using Machine Learning. Tanjila Ahmed Dec 6, 2017

Master degree program Technical legislation, standardization and quality management

Continuing Professional Development. Standards, principles, and practices

Bachelor of Science in Software Engineering (BSSE) Scheme of Studies ( )

Towards the Semantic Desktop. Dr. Øyvind Hanssen University Library of Tromsø

Certiport Certification Programs

in Wireless Application Protocol World

Sparse Matrices. Mathematics In Science And Engineering Volume 99 READ ONLINE

STUDIES IN DIGITAL SYSTEMS INVESTMENT ON KNOWLEDGE OF DIGITAL SYSTEMS

UCD APPLICATION INSTRUCTIONS. Non-EU exchange students 2016/17. Please consider the environment before printing this document.

BRA BIHAR UNIVERSITY, MUZAFFARPUR DIRECTORATE OF DISTANCE EDUCATION

1. BACHELOR OF SCIENCE IN COMPUTER SCIENCE EDUCATION

Cleveland State University Department of Electrical and Computer Engineering. CIS 408: Internet Computing

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

A Study on Metadata Extraction, Retrieval and 3D Visualization Technologies for Multimedia Data and Its Application to e-learning

MENTA Mobile Health Platform for a Better Care. ERRIN Health Project Development Session 25 th November 2014

Kelly Morris, CHRL x342. Exams Manager Human Resources Professionals Association

CONTINUING PROFESSIONAL DEVELOPMENT RULES

The Replication Technology in E-learning Systems

Automating Publishing Workflows through Standardization. XML Publishing with SDL

CFASAA231 Use IT to support your role

Illinois State University 2012 Alumni Survey Institution Report

Illinois State University 2014 Alumni Survey Institution Report

CURRICULUM The Architectural Technology and Construction. programme

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

Research Data Management

Some experiences in adding a new language to SSQSA architecture a part of SSQSA front-end. Jozef Kolek, Gordana Rakić, Zoran Budimac

Academic Course Description

How Often and What StackOverflow Posts Do Developers Reference in Their GitHub Projects?

B.V.Patel Institute of Business Management, Computer & Information Technology, UTU

EBS goes social - The Triumvirate Liferay, Application Express and EBS

BTEC LEVEL 4 Higher National Certificate in Business

JAVA CREATE XML DOCUMENT EXAMPLE

Communication (COMM) Communication (COMM) 1

Information Architecture of University Web portal

Curriculum for B.Sc. in Business Information Systems

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

What's New in ActiveVOS 9.0

RESEARCH ANALYTICS From Web of Science to InCites. September 20 th, 2010 Marta Plebani

Energy efficiency of ICTs: EU initiatives and ETSI standards for its assessment

Direct Submit to SafeAssign

Prof.Dr. Sotiraq Dhamo Doc. Julian Naqellari The University of Tirana Accounting Department

DOWNLOAD OR READ : TOMTOM ONE 3RD EDITION EUROPE MAPS PDF EBOOK EPUB MOBI

Informatics 1: Data & Analysis

Introduction to XML. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University

XML. Objectives. Duration. Audience. Pre-Requisites

DOWNLOAD OR READ : MONTHLY REVIEW OF THE U S BUREAU OF LABOR STATISTICS VOLUME 3 PDF EBOOK EPUB MOBI

XF Rendering Server 2008

Lukáš Plch at Mendel university in Brno

University of Richmond Chief Operating Officer Organization Chart August 2017

LEVENT V. ORMAN. Cornell University S.C. Johnson Graduate School of Management Ithaca, NY Telephone: (607)

A Guide for Proposal Writing Call for Proposals of projects using the HPCI System (General Trial Use project using K computer)

Jim Mains Director of Business Strategy and Media Services Media Solutions Group, EMC Corporation

Sharepoint Build Template Homepage

C++: The Complete Beginner's Guide To Learn C++ Programming (computer Coding) By Bruce Berke

Global cybersecurity and international standards

The 2018 (14th) International Conference on Data Science (ICDATA)

Bismarck State College

S.No Description 1 Allocation of subjects to the faculty based on their specialization by the HoD 2 Preparation of college Academic Calendar and

Introduction to XML 3/14/12. Introduction to XML

Claude Balthazard, PhD, C.Psych., CHRL x327

DOWNLOAD OR READ : MULTIMEDIA AND IMAGING DATABASES PDF EBOOK EPUB MOBI

Finite Fields And Applications: 7th International Conference, Fq7, Toulouse, France, May 5-9, 2003, Revised Papers (Lecture Notes In Computer

DOWNLOAD OR READ : WEBSITE LOCALIZATION STANDARD REQUIREMENTS PDF EBOOK EPUB MOBI

Transcription:

Evaluation of XML Schema Quality in multimedia content publishing domain Maja Pušnik, Boštjan Šumak University of Maribor Faculty of Electrical Engineering and Computer Science maja.pusnik@um.si

Motivation 1: combining efforts from projects and research work Optimization of business processes (for the publishing domain) Integration of existing IT systems with new solutions (supported by XML technologies) Measuring quality of IT solutions (adjusted software metrics)

Motivation 2: integration in the learning process XML schemas in the learning process Academic study program (1st year): Basics of web technologies Basics of XML and its connection to HTML Academic study program (3rd year): System convergence and integration Specifics of XML and its use in Java web services and Java applications Professional study program (3rd year) Development of information services Specifics of XML and its use in C# web services ands C# applications Optimization and quality management Second Cycle Bologna Study Programmes (1st year): Business process optimization Business process modeling, simulation, optimization and reporting Second Cycle Bologna Study Programmes (2nd year): Operational research Use of operational research methods for optimization of business process and IT solution optimization

Agenda The multimedia content publishing domain The role of XML technologies XML Schemas and the metric system Quality index Evaluation of the publishing domain

Publishing process

The problems of the publishing process Publishing organizations must provide the same content in various formats in order to meet the needs of their clients large amounts of data poor organization (of knowledge databases) Increasing involvement of multimedia contents The publishing process is performed in both printed and electronic form, however there is an increasing number ebooks (Shaffer, 2012). Little IT support Almost no automation

Multimedia content publishing process quality The publishing process can be optimized, simplified and become more efficient with XML technologies support regardless of the document origin and content Optimization Modeling PUŠNIK, Maja. Using XML technologies for various data format transformations : lecture presented at Workshop in Bohinj (project Software engineering - Computer science education and research cooperation), August 24-29, 2015. 2015 Simulation

The role of XML technologies

XML family of technologies connected XML XML XML HTML XSD HTML PDF ODT DOC DOCX INPUT XSLT XPath XSL-FO CSS OUTPUT epub PDF ODT DOCX

Why use XML technologies Enabling a (semi) process automation Addressing different challenges: technical organizational financial Finding balance between manual and automatic steps

XML technologies in the publishing process (architecture design) Data/ XML content XSLT PDF XSLT HTML PDF.docx Content recognition Transformations (XSLT) Program components External document generating HTML epub

XML schemas in the publishing process XML Schema XML Schema Data/ XML content XSLT PDF XSLT HTML PDF XML Schema XML Schema XML Schema.docx Content recognition Transformations (XSLT) Program components External document generating HTML epub XML Schema XML Schema

Inlining Structured data, within different Global data types formats Global elements

Survey among developers Poor quality of generated or existing XML schemas Quality aspects analysis Need for data types definition Conflict and inconsistencies risk Need for reuse Poor understanding of complex XML schemas Need for multiple schema usage Need for group and reference usage Need for supporting program languages Need for additional technologies Transparency and documentation Structural quality aspect 0 2 4 6 8 10 12 14 16 Cmax Complexity/quality dependence analysis C = f(q) Optimality Focused documentation and categorization Building blocks employment balance Building blocks balance XML Schema Quality Balance between obligatory and supplementary Ability to refactor without disturbing the original skeleton Ability to reapply building blocks Flexibility Cavg Minimalism Reuse Cmin Qmin Qavg Qmax

Optimality XML Schemas Quality index parameters Transparency and documentation Focused documentation and categorization Building blocks employment balance Minimalism XML Schema Quality Balance between obligatory and supplementary Building blocks balance Structural quality aspect Ability to refactor without disturbing the original skeleton Ability to reapply building blocks Reuse Flexibility Aspect 1 Aspect 2 Aspect 3 Aspect 4 Aspect 5 N an = number of annotations X X X N ri_all = number of external XML schemas X X X N E = number of elements X X X X X X N E_g = number of global elements X N E_l = number of local elements X N E_s = number of simple elements X N E_gc = number of global complex elements X N E_gs = number of global simple elements N at = number of all attributes X X X X X X N at_l = number of local attributes X LOC = lines of code X N g = number of all groups X X X N E_group = number of element groups X X N A_group = number of attribute groups X X N re_all = number of references on elements X X N ra_all = number of attribute references X X N rg_all = number of group references X X N r = number of restrictions X N t_i = number of derived data types X X N t = number of all data types X X X X N rt_all = number of all used data types X X X N t_s = number of simple data types N t_c = number of complex data types X X N E_U = number of unbounded elements X X X X Aspect 6

Structural quality aspect (QA1) QQQQ 1 = NN rrrr_aaaaaa + NN EE NN aaaa + NN rr NN tt_ss + NN tt_ss NN tt_cc + NN aaaa + NN tt_ii + NN EE_UU NN EE Transparency and documentation of the XML Schema (QA2) QQQQ 2 = NN aaaa + NN EE gggggggggg + NN AA gggggggggg + NN gg NN EE + NN aaaa NN EE NN aaaa NN EE XML schema optimality quality aspect (QA3) QQQQ 3 = 1 7 (NN EE_ll NN EE + NN aaaa_ll NN aaaa + 1 NN EE_gggg NN EE NN EE_ss + NN EE_gggg NN EE_ss + NN tt NN EE + NN aaaa + NN gg NN EE_gggg + 1 NN EE_UU NN EE XML schema minimalism quality aspect (QA4) QQQQ4 = NN aaaa + NN EE + NN aaaa LLLLLL + NN rrrr_aaaaaa NN tt XML schema reuse quality aspect (QA5) QQQQ5 = NN rrrr_aaaaaa + NN rrrr_aaaaaa + NN rrrr_aaaaaa + NN rrrr_aaaaaa + NN rrrr_aaaaaa + NN tt_ii NN EE + NN aaaa + NN gg + NN tt XML schema flexibility quality aspect (QA6) QA6 = NN EE gggggggggg +NN AAgggggggggg +NN gg+nn rreeaaaaaa +NN rraaaaaaaa + NN rrggaaaaaa +NN rriiaaaaaa + NN AAAA + NN rrrr_aaaaaa NN EE_UU NN EE + NN aaaa + NN tt + NN gg XML Schemas Quality aspects and quality index Q i = 1/6( QA 1 + QA 2 + QA 3 + QA 4 +QA 5 +QA 6 )

Set of domains, using XML schemas D1 - Mathematics and Physics D2 - Materials Science D3 - Telecommunications D4 - Manufacturing D5 - Energy and Electronics D6 - Engineering D7 - IT architecture and design D8 - Traffic D9 - Communications D10 Computer Science D11 Decision Science D12 Medicine D13 - Economics and finance D14 - Law D15 - Social science D16 Health and sport D17 Construction D18 - Librarianship (Library) D19 - Landscape and geography D20 Media, journalism, newspapers D21 - The publishing domain

Comparing publishing domain with average results (of existing set of domains) PARAMETERS Publishing domain - average All domain - average Number of imports 1,400 0,795 number of all elements 86,600 77,727 number of global elements 48,000 26,755 number of local elements 38,600 50,973 number of simple elements 31,800 27,691 number of complex elements 54,800 50,036 number of global complex elements 35,700 19,250 number of global simple elements 12,700 7,514 number of all attributes 26,600 47,655 number of local attributes 26,600 47,091 number of global attributes 0,000 0,564 Lines of code 14969,200 3188,618 number of element groups 1,000 4,364 number of attribute groups 1,300 1,377 number of element references 65,400 69,118 number of references on simple elements 9,000 3,177 number of references on complex elements 56,400 65,941 number of references on attributes 0,000 1,927 number of references on element groups 0,200 7,114 number of references on attribute groups 3,700 11,664 Number of annotations 0,900 0,977 Number of restrictions 30,100 75,118 Number of derived (extended) types 41,800 35,309

QA 6 QA 5 QA 4 QA 3 QA 2 QA 1 Quality aspects: Publishing vs. Average domain 0 0,2 0,4 0,6 0,8 1 Evaluation of the publishing domain Publishing Average QUALITY APSPECTS Publishing domain - average All domain - average Structural quality aspect (QA1) 0,387 0,374 Transparency and documentation of the XML Schema (QA2) 0,005 0,022 XML schema optimality quality aspect (QA3) 0,280 0,214 XML schema minimalism quality aspect (QA4) 0,071 0,072 XML schema reuse quality aspect (QA5) 0,667 0,117 XML schema flexibility quality aspect (QA6) 0,908 0,950

Evaluation of the publishing domain XML schemas from the publishing field are above average: the publishing domain does use XML schemas, the quality of them is above average however they still need to be improved mostly in the quality aspect of: transparency, documentation flexibility

Research questions Does the publishing domain use XML documents and what standard XML schemas are being used? Several XML schemas were found, connected to the publishing field (respectively publishing process) through active research. What is the quality level of XML schemas in the publishing domain? Average quality of XML schema in the publishing field is 39%. How are they compared to XML schemas in other domains such as computer science and other? How can the level of quality be improved? The quality index of 39% is higher than by the average quality index (of all 20 domains, where XML schemas are most common) which is 29% based on an experiment in 2014. Comparing to average XML schemas, the publishing field had lower results only at transparency and documentation quality aspect, all other quality aspects were above average.

Work in progress Publishied work PUŠNIK, Maja, HERIČKO, Marjan, BUDIMAC, Zoran, ŠUMAK, Boštjan. XML Schema metrics for quality evaluation. Computer science and information systems, 2014, vol. 11, no. 4, str. 1271-1289 PUŠNIK, Maja, RAKIĆ, Gordana, BUDIMAC, Zoran, HERIČKO, Marjan. Different approaches for measuring XML Schemas. Collaboration, software and services in information society : proceedings of the 18th International Multiconference Information Society - IS 2015, October 12th, 2015, Ljubljana, Slovenia : volume D. Ljubljana: Institut Jožef Stefan, 2015, str. 17-20. Sent work Maja Pušnik, Marjan Heričko And Boštjan Šumak, Gordana Rakić, XML Schema Quality index in the multimedia content publishing domain, SQAMIA 2016 Gordana Rakić, Zoran Budimac, Marjan Heričko, Maja Pušnik. Towards the XML Schema Measurement Based on Mapping Between XML and OO Domain. SCLIT 2016

Discussion What are the (unpremeditated) problems in the publishing process? What is the existing level of IT support in similar processes? What is the quality of the publishing process? What is the user experience of all involved? How often errors occur and how critical they are? Are XML Technologies the only solution? Are XML Technologies the best solution? Are XML Technologies the most suitable solution for the publishing domain?

Thank you for listening! QUESTIONS? Maja.Pusnik@um.si Bostjan.Sumak@um.si