Standards in Industry
Editor: John R. Smith, IBM

The MPEG Open Access Application Format

Florian Schreiner, Klaus Diepold, and Mohamed Abo El-Fotouh, Technische Universität München
Taehyun Kim, Sungkyunkwan University

Editor's Note
The lack of a unified format for digital-content metadata and associated license information is impeding open access to and dissemination of digital content. The MPEG Open Access Application Format is a new standard from MPEG that standardizes the packaging of digital content and associated rights information into an application format to facilitate open access and interoperable exchange. The standard builds on MPEG-21 and MPEG-7 components for file format, package description, metadata, and rights information. This article describes the motivation and composition of the MPEG Open Access Application Format.
John R. Smith

For many years, the creation and dissemination of freely distributable digital content have been growing remarkably. Ideas emerging from the Open Source Initiative and elsewhere have helped create the concept of open content, which is analogous to open source software. Open content is not limited to a specific content type. It can be any creative work that allows for the free distribution and modification of the work.

The significance of open content can be observed in the success of social networks and content-sharing sites such as Wikimedia Commons. These sites typically support the management of different content licenses, such as those available through Creative Commons (see http://creativecommons.org/). Creative Commons has defined a set of licenses that allow for the free distribution of content but differ in how the content can be used. An author simply selects the license that contains the conditions that he or she wants to apply to the work. Creative Commons licenses range from liberal to more restrictive terms of use.

One initiative that promotes free distribution of content but with more restrictive license constraints is Open Access. Open Access promotes open publication of publicly funded scientific research results. This content is distributed and accessed openly, but is not necessarily open content because, for example, commercial use might not be allowed. Thus, an Open Access publication might contain some content whose use is more restricted. Open Access is an international movement that fosters the free-of-charge publication of scientific literature. In 2003, 255 scientific institutions and organizations worldwide developed and signed a declaration to state their support for the Open Access movement. [1]

Despite these and other similar initiatives, several problems hinder the exchange and use of open content, and they could be solved by attaching metadata to the content. One problem is that the legal license is only loosely connected to the content itself. For example, in some cases the license is indicated on the Web page only, so the license information is lost after download. One way around this issue is to insert the license as text or as a reference in the content itself. However, the user must then read through the whole license text for each content item to understand the terms.

The lack of a machine-readable description attached to downloaded content prevents the automatic parsing, processing, searching, and indexing of the license information. Consumers would greatly benefit from simplified management and exchange of content if the information were in a machine-readable format. This would allow for the development of a search engine for open content, independent of the content type or the license used. Many types of free, distributable content are already available, but the different software and operating system platforms in use and the lack of a common, machine-readable format hinder the exchange and promotion of this content.
1070-986X/09/$26.00 © 2009 IEEE. Published by the IEEE Computer Society.
[Figure 1 diagram labels: Creation, Distribution, Consumption, Adaptation and aggregation, Author, Open Access file, Community, Feedback]

The MPEG group identified this problem and developed the Open Access Application Format as part of the MPEG-A standards to provide a general solution for the publication of free, distributable content. [2]

Background

Formerly, the standards in MPEG-A were known as Multimedia Application Formats. Diepold, Pereira, and Chang describe these standards as essentially superformats that integrate selected technologies from MPEG and eventually other standards to provide a comprehensive technical solution for one or more specific application scenarios. [3] During the development of the MPEG-A standards, the MPEG group realized that even though these standards originate in the multimedia sector, some could also be applied in other areas. To reflect this realization, the name of the MPEG-A standards changed to Application Formats. Examples of Application Formats are the ISO/IEC 23000-6 Professional Archival Application Format and the ISO/IEC 23000-7 Open Access Application Format. [2]

The main purpose of an Application Format is to select existing standards and combine them in a single standard. As a result, an Application Format is a concise set of selected technologies that are precisely defined and aligned to each other within the specification. The Application Formats provide interoperability within a specific application scenario and integrate different existing technologies into one extendable solution.

One example of these Application Formats is the Open Access Application Format, which is an extendable packaging format designed to ease and promote the use of freely accessible content. The content can be of any type, for example a video file, an image, or a presentation.
The author can use the Open Access Application Format to package multimedia content into a single file and enrich the content with descriptive metadata that contains information about the content, author, and publication. A file that conforms to the Open Access Application Format is called an Open Access file.

Figure 1 shows a scenario for the use of the Open Access Application Format. The figure illustrates the three basic steps in the life of an Open Access file: creation, distribution, and consumption of the content. One use case for the standard is the release of a presentation together with the recorded video file of the presentation. The author starts with the creation, in which he or she packages content and XML metadata in a single Open Access file. In the second step, the author releases the created file for distribution. Because the attached metadata is machine-readable, it can help improve the visibility of the content in search engines and repositories. When a user views the presentation, the content is consumed, as shown in the third step in Figure 1. However, before consumption, the consumer must view the license information about the permitted use of the content. Other example applications of the format include the publication of e-learning material or publicly funded research results.

Figure 1. Scenario of the Open Access Application Format.

IEEE MultiMedia, July-September 2009

Benefits and technologies

There are many available file formats for publishing content. However, most file formats are designed for a specific type of content. For example, there are separate file formats for documents, presentations, and audio. Although different files might have some common information, this information is not usable without support from the underlying file format. The Open Access Application Format specifies a basis for interoperability for this common information. The following properties help describe the standard: standardized and open file format, packaging of arbitrary data independent of content type, global identification of the published content, legal license information, author and creation information, machine-readable rights expressions, adaptation and aggregation of content, feedback mechanism for the author, and support for cryptographic signatures.

The technologies used in the Open Access Application Format are based on the standards developed in the MPEG group, in particular the standards in MPEG-21 and MPEG-7. As shown in Burnett et al., MPEG-21 is a framework of standards developed for the delivery and consumption of multimedia content. [4] The MPEG-7 standards specify a rich set of tools for completely describing multimedia content. From MPEG-21 and MPEG-7, the following standards are used in the Open Access Application Format: MPEG-21 Part 2 Digital Item Declaration, MPEG-21 Part 3 Digital Item Identification, MPEG-21 Part 5 Rights Expression Language (REL), MPEG-21 Part 9 File Format, MPEG-21 Part 15 Event Reporting, and MPEG-7 Part 5 Multimedia Description Schemes.

Figure 2 shows the architecture of the file format with the relationships between the integrated standards. The basic file format is as defined in the MPEG-21 Part 9 File Format standard, which requires packaging the resources and the ISO/IEC 21000-2 Digital Item Declaration in a single file. The File Format is based on the ISO file format, a generic and abstract format that is already widely used. As shown in Figure 2, the resources are attached at the end of the file and contain the content as binary data. The resource metadata is defined within the Digital Item Declaration, which declares a Digital Item for each resource. A Digital Item contains the content metadata and a resource reference. Thus, the Digital Item is a higher-level structure that specifies a re-identifiable object comprising both the metadata and the content.

Figure 2. Generalized file architecture. [Diagram labels: File Format; Digital Item Declaration; Item 1 and Item 2, each with Digital Item Identification, MPEG-7 Multimedia Description Schemes, Rights Expression Language, and Event Reporting; Resource 1; Resource 2]

Using Open Access

Here we describe the most important benefits of the standard and their technical realization.

Global identification of the content

Recognizing and identifying distributed content is a central aspect of the Open Access Application Format, but the approach has to be flexible enough to allow independent content providers to assign globally unique identifiers to the content. The Open Access Application Format uses Uniform Resource Identifiers (URIs) as defined in the ISO/IEC 21000-3 Digital Item Identification standard to satisfy this requirement.

Legal license and author information

The legal license is a vital part of the publication, because it informs the user about the permissions for the content's use with a clear definition of the type and version of license applied to the content. Informing the user is important because legal licenses are only binding in a human-readable form. Moreover, given the importance of the license being human-readable,
the author should be able to explicitly specify the license applied to the published content. The Open Access Application Format supports the assignment of one or more legal licenses by three methods: as text, URI, or Web page. Inserting the license text into the metadata is a direct approach that lets the user view the license text instantly. However, this might not be efficient, because the consumer has to read, identify, and understand the license. To address this issue, globally unique URIs can identify the license in a machine-readable way. As with content-identification systems, a registration authority is needed to relate the licenses to unique URIs. An example of such a URI could be http://creativecommons.org/licenses/by/3.0/, which is the identifier of one of the licenses defined by Creative Commons.

In the Open Access Application Format, all information about the legal license is specified using the MPEG-7 Part 5 Multimedia Description Schemes. This description additionally contains information about the author and the content-creation date. The author information includes, for example, name, address, email, and Web site. Beyond this information, the description attributes the creation of the content to the author, which satisfies the many legal licenses that require attribution to the original author of the content.

Rights expressions

The rights expressions used in the Open Access Application Format are defined in the ISO/IEC 21000-5/Amd3 REL Open Access Content (OAC) profile, [5] which is a profile of the ISO/IEC 21000-5 REL. [6] The REL is a machine-readable format describing what actions are allowed to be exercised on the content. Generally, the Open Access Application Format doesn't specify how the rights information is used in an implementation of the standard.
However, one example of the use of the REL is notifying the consumer whether he or she is allowed to perform a certain action.

The OAC profile is based on the requirements of the Open Access Application Format. The profile consists of a limited set of elements taken from the REL and several extension elements that support publication of content with the Open Access Application Format. The profile includes support for adaptation, copying, and notifications about copyright and commercial use. The MPEG group created a separate profile for this set of rights expressions to ease the integration of the profile in other applications. One other application using the OAC profile is the MPEG-A standard Media Streaming Application Format. [7]

The main goal of the OAC profile is to support open licenses such as the ones from Creative Commons. Rodriguez and Delgado showed the possibility of interoperability between Creative Commons licenses and the REL; their result is included in the OAC profile. [8] Their work explains how Creative Commons licenses can be expressed with the machine-readable rights expressions of the REL. The mapping allows for representing the basic parameters of the Creative Commons licenses as interpreted by the content author, but the rights expressions have no legal relationship with the Creative Commons licenses. This is because of the general difficulties in the juridical interpretation of legal licenses, which depends on many parameters such as the country of jurisdiction or extraordinary license terms.

Adaptation and aggregation of content

The Open Access Application Format explicitly supports the adaptation and aggregation of content to enable an author to freely combine created, adapted, and aggregated items into a single Open Access file and publish it.
An aggregation is performed when an author copies published content to another file and combines it with other items. The adaptation of content means that an author chooses to modify content from another author and publish the modified content. Figure 3 shows an example of adaptation and aggregation in the Open Access Application Format.

Two facilities in the Open Access Application Format support adaptations: rights expressions and related identifiers. The OAC profile contains two rights that specify whether content may be adapted and under which conditions the adaptation may be performed. The related identifiers declare a relationship between the adapted and the original item after an adaptation. Figure 3 also demonstrates the related identifiers; they allow a user to retrieve related content and enable the legal attribution to the author of the original content.
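The interplay of adaptation rights and related identifiers described above can be illustrated with a minimal Python sketch. It is not the MPEG-21 schema or the reference software: the `Item` class, its field names, the `adapt` and `aggregate` helpers, and the `urn:example:` identifiers are all hypothetical, a single boolean stands in for the two adaptation rights of the OAC profile, and carrying the license over to the adapted item is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a Digital Item's metadata (not the MPEG-21 schema)."""
    identifier: str            # globally unique URI of the item
    author: str
    license_uri: str           # e.g. a Creative Commons license URI
    may_adapt: bool = False    # simplified stand-in for the OAC adaptation rights
    is_adaptation_of: list = field(default_factory=list)  # related identifiers
    has_adaptations: list = field(default_factory=list)   # related identifiers


def adapt(original: Item, new_identifier: str, new_author: str) -> Item:
    """Create an adapted item if the original permits it, and link both items."""
    if not original.may_adapt:
        raise PermissionError(f"{original.identifier} does not permit adaptation")
    adapted = Item(
        identifier=new_identifier,
        author=new_author,
        license_uri=original.license_uri,  # assumption: the license carries over
        may_adapt=original.may_adapt,
        is_adaptation_of=[original.identifier],
    )
    # Record the reverse link so the original can point to its adaptation.
    original.has_adaptations.append(new_identifier)
    return adapted


def aggregate(*items: Item) -> list:
    """Aggregation: place unmodified items side by side in one package."""
    return list(items)


# Example: one item is adapted, then aggregated with an untouched item.
red = Item("urn:example:star-red", "Alice",
           "http://creativecommons.org/licenses/by/3.0/", may_adapt=True)
green = Item("urn:example:star-green", "Bob",
             "http://creativecommons.org/licenses/by/3.0/")
red_v2 = adapt(red, "urn:example:star-red-v2", "Carol")
package = aggregate(green, red_v2)
print([item.identifier for item in package])
print(red_v2.is_adaptation_of, red.has_adaptations)
```

In this sketch the link is stored on both items, which mirrors the "has adaptation" and "is adaptation" labels of Figure 3; where the reverse link lives in a real deployment (for example, in a repository rather than in the already published file) is left open here.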
Figure 3. An example of adaptation and aggregation of different content. The green and orange stars are aggregated; the red star was adapted. After publication, the related identifiers create a link between the original and the adapted star. [Diagram labels: Adaptation and aggregation, Author, Copy, Adapt, Distribution, Has adaptation, Is adaptation, Related identifiers]

Feedback mechanism

The feedback mechanism is optional and allows the author to receive feedback about the published Open Access files. The mechanism uses ISO/IEC 21000-15 Event Reporting to define a request for sending a report on specific events. The report is sent back to the author and gives him or her feedback about the content's use. Figure 1 shows the feedback mechanism in the Open Access Application Format scenario.

Cryptographic signatures

The integrity and authenticity of distributed files is critical. Consumers like to know that the content they receive is really the same as what the author published. The Open Access Application Format includes cryptographic signatures conformant to the World Wide Web Consortium XML Signature Syntax and Processing standard to ensure content authenticity and integrity. The use of signatures is optional: an author can decide to sign the content and metadata so that the consumer can verify the signature.

Reference software

We developed reference software for the Open Access Application Format in Java. The software, which is published as open source at http://sourceforge.net/projects/openaccessaf, works as a file editor that supports the creation and consumption of Open Access files conformant to the standard. The software implements all parts of the standard and demonstrates the use of the provided metadata.

Conclusion

The Open Access Application Format is an integrated exchange format that improves the management and exchange of content. It can enable Web sites, archives, or search engines to automatically process, index, and present content. The format also allows the interoperable exchange of content between providers or between users. For the scientific community, this capability would enhance the visibility and comparability of the content and thus improve the organization and exchange of scientific literature. This would be helpful not only for the scientific community, but also for other communities and organizations whose goal is to efficiently distribute and exchange open content.

References

1. Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities; http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html.
2. ISO/IEC 23000-7, Information Technology - Multimedia Application Format (MPEG-A) - Part 7: Open Access Application Format, MPEG Requirements Group, 2008.
3. K. Diepold, F. Pereira, and W. Chang, "MPEG-A: Multimedia Application Formats," IEEE MultiMedia, vol. 12, no. 4, 2005, pp. 34-41.
4. I. Burnett et al., "MPEG-21: Goals and Achievements," IEEE MultiMedia, vol. 10, no. 4, 2003, pp. 60-70.
5. ISO/IEC 21000-5/Amd3, Information Technology - Multimedia Framework (MPEG-21) - Part 5: Rights Expression Language, Amendment 3: OAC (Open Access Content) Profile, MPEG Requirements Group, 2008.
6. X. Wang et al., "The MPEG-21 Rights Expression Language," IEEE Trans. Multimedia, vol. 7, no. 3, 2005, pp. 408-417.
7. ISO/IEC 23000-5, Information Technology - Multimedia Application Format (MPEG-A) - Part 5: Media Streaming Application Format, MPEG Requirements Group, 2008.
8. E. Rodriguez and J. Delgado, "Towards the Interoperability Between MPEG-21 REL and Creative Commons Licenses," Proc. 2nd Int'l Conf. Automated Production of Cross Media Content for Multi-Channel Distribution (Axmedis), IEEE CS Press, 2006, pp. 45-52.

Contact author Florian Schreiner at schreiner@tum.de.
Contact editor John R. Smith at jsmith@us.ibm.com.