Metadata. Considerations for digital libraries. Tefko Saracevic, Ph.D. Tefko Saracevic

Similar documents
Metadata Workshop 3 March 2006 Part 1

RDF and Digital Libraries

Metadata Standards and Applications. 4. Metadata Syntaxes and Containers

Semantic Web Systems Ontology Matching. Jacques Fleuriot School of Informatics

Metadata and Encoding Standards for Digital Initiatives: An Introduction

Metadata: The Theory Behind the Practice

7.3. In t r o d u c t i o n to m e t a d a t a

Information retrieval concepts Search and browsing on unstructured data sources Digital libraries applications

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

Guidelines for Developing Digital Cultural Collections

Digital Library Curriculum Development Module 4-b: Metadata Draft: 6 May 2008

Building Consensus: An Overview of Metadata Standards Development

Alphabet Soup: Choosing Among DC, QDC, MARC, MARCXML, and MODS. Jenn Riley IU Metadata Librarian DLP Brown Bag Series February 25, 2005

Creating descriptive metadata for patron browsing and selection on the Bryant & Stratton College Virtual Library

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

User Interaction: XML and JSON

Metadata Cataloging. regarding items. For the assignment, I chose to outline some fields from three different

Taskgroup A progress report

Digitisation Standards

For those of you who may not have heard of the BHL let me give you some background. The Biodiversity Heritage Library (BHL) is a consortium of

Labelling & Classification using emerging protocols

METADATA AT DIGITAL-INFORMATIVE ERA

Anne J. Gilliland-Swetland in the document Introduction to Metadata, Pathways to Digital Information, setting the Stage States:

A tool for Entering Structural Metadata in Digital Libraries

Metadata Overview: digital repositories

Semantic Web: vision and reality

Creating and Maintaining Metadata Vocabularies for Networkbased

Metadata for the Web A Necessary Evil? CS 431 March 2, 2005 Carl Lagoze Cornell University

The Semantic Web Revisited. Nigel Shadbolt Tim Berners-Lee Wendy Hall

Semantic Web Lecture Part 1. Prof. Do van Thanh

User Interaction: XML and JSON

Comparison and mapping of VRA Core and IMS Meta-data

DCMI Abstract Model - DRAFT Update

Chinese-European Workshop on Digital Preservation. Beijing (China), July 14 16, 2004

XETA: extensible metadata System

Introduction to XML. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University

Proposal for Implementing Linked Open Data on Libraries Catalogue

Contribution of OCLC, LC and IFLA

Role of Social Media and Semantic WEB in Libraries

User Interaction: XML and JSON

Introduction to XML 3/14/12. Introduction to XML

IoT Standards Ecosystem, What s new?

Content structure breakdown, re-use of components and the importance of metadata Carl Mann

Data formats for exchanging classifications UNSD

Structured documents

Semantic Interoperability for e-research in the Sciences, Arts and Humanities. Imperial College March 30 th Data Webs:

METAINFORMATION INCORPORATION IN LIBRARY DIGITISATION PROJECTS

The Dublin Core Metadata Element Set

Metadata. Week 4 LBSC 671 Creating Information Infrastructures

ISO INTERNATIONAL STANDARD. Information and documentation Managing metadata for records Part 2: Conceptual and implementation issues

Device Independent Principles for Adapted Content Delivery

From Strings to Things

GETTING STARTED WITH DIGITAL COMMONWEALTH

ISO/IEC INTERNATIONAL STANDARD. Information technology Software asset management Part 2: Software identification tag

The Semantic Web. Ovidiu Chira. IDIMS Report 21 st of February 2003

From Open Data to Data- Intensive Science through CERIF

METADATA POLICIES FOR THE DESCRIPTION OF DIGITAL FOLKLORE COLLECTIONS

Integration of resources on the World Wide Web using XML

1-9 August 2003, Berlin

Data Partnerships to Improve Health Frequently Asked Questions. Glossary...9

3. Technical and administrative metadata standards. Metadata Standards and Applications

Metadata Standards and Applications. 6. Vocabularies: Attributes and Values

Metadata for Electronic Information Resources

Draft for discussion, by Karen Coyle, Diane Hillmann, Jonathan Rochkind, Paul Weiss

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia framework (MPEG-21) Part 21: Media Contract Ontology

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction

A Gentle Introduction to Metadata

The Power of Metadata Is Propelling Digital Imaging Beyond the Limitations of Conventional Photography

MARC Futures. International Workshop: MARC 21 Experiences, Challenges, and Visions May Sally H. McCallum Library of Congress

XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI

WebGUI & the Semantic Web. William McKee WebGUI Users Conference 2009

UNIMARC and XML at the BN. Nuno Freire José Borbinha Hugo Manguinhas INESC-ID

Towards the Semantic Web

Slide 1 & 2 Technical issues Slide 3 Technical expertise (continued...)

Metadata harmonization for fun and profit

NEW TECHNOLOGIES AND MARC

Metadata for Digital Collections: A How-to-Do-It Manual

A Dublin Core Application Profile in the Agricultural Domain

MARKUP LANGUAGES. A brief history of Markup languages

Data Exchange and Conversion Utilities and Tools (DExT)

Semantic Web Mining and its application in Human Resource Management

MarcOnt - Integration Ontology for Bibliographic Description Formats

XF Rendering Server 2008

TEXAS A&M AGRILIFE EXTENSION SERVICE PROCEDURES

Full file at

Multi-agent Semantic Web Systems: Data & Metadata

10 Steps to Building an Architecture for Space Surveillance Projects. Eric A. Barnhart, M.S.

Digital Libraries at Virginia Tech

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 5: Multimedia description schemes

Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector

Alphabet Soup: A Metadata Overview Melanie Schlosser Metadata Librarian

XML Update. Royal Society of the Arts London, December 8, Jon Bosak Sun Microsystems

Joining the BRICKS Network - A Piece of Cake

Where is the Semantics on the Semantic Web?

Exploring the Concept of Temporal Interoperability as a Framework for Digital Preservation*

Information Technology Document Schema Definition Languages (DSDL) Part 1: Overview

This document is a preview generated by EVS

Digital Imaging and Communications in Medicine (DICOM) Part 1: Introduction and Overview

CommonLook Office GlobalAccess Quick Start Guide using Microsoft PowerPoint

This document is a preview generated by EVS

Transcription:

Metadata Considerations for digital libraries, Ph.D. 1

ToC Definitions Context & problems: digital resources libraries & the Web Precursors Types of metadata Examples Issues in applications 2

Definitions Data about data glib, content-free catchphrase, but very popular Structured data about information resources describing their elements and functions Value added information which enables information objects to be: identified represented managed accessed & searched preserved 3

What? Machine understandable information for digital resources (particularly on the Web) - emphasis on machine description of what a resource* or part is all about e.g. labeling title, author, source, subjects a simple description what is metadata? *subsumes: textual documents, pictures, illustrations, movies, simulations, art objects, artifacts, software anything that has an identity 4

Force for metadata developments: the Web Fastest growing technology in history Explosive growth of WWW provided ubiquity of information and access but also information chaos & anarchy growing difficulty in identifying, searching & retrieving lost in an ocean metaphors 5

Problem To organize & search the Web needed: knowledge about the structure of data but Web data & databases fuzzy structures vary widely; no consistency constantly evolve over time lack of agreement about meaning of even simple terms & concepts in structure 6

Solution Some standardized description or language to increase functionality a mechanism for a more precise description of things on the Web Future: Going from machine-readable to machine-understandable semantic Web missing in original Web architecture METADATA! 7

metadata 8

Where? In volatile digital environments metadata describe electronic resources, texts & multimedia metadata exist or have meaning only in relation to the referenced document or object provide information about the object 9

Why? To standardize description of electronic resources in collection(s) in order to: aid in identification, organization, & location of objects (documents) enable effective search of variety of objects distributed all over sometimes also to provide controls (e.g. validation, rights, provenance, ratings...) 10

Importance Standard metadata descriptions are a prerequisite to common use effective searching intelligent roaming by agents validation, ratings, 11

Precursor: markup languages SGML Standard General Markup Language granddaddy (standard in 1986) marks elements within documents derived from old markups for typesetting adapted by communities producing electronic documents machine independent main reason for success transportable from one hardware & software to another; substitutes strings many extensions & specific applications 12

Principles ALL markup language must specify what markup means what markup is allowed what markup is required how markup is distinguished from text All markup languages & applications follow these principles underlying concepts are fairly simple but they get very confusing real fast. 13

Followed by extensions HTML (technical specification) HTML (Wikipedia) Most famous & successful allows for metatags in the Head but these are not used much, even discouraged by some in the body could be indirect XML (technical) XML (Wikipedia) Extensible Markup Language data format for structured document interchange & interoperability on WWW increases functionality of SGML & combines with ease of use of HTML 14

Standards for metadata Many standards developed or proposed Depends on need of a domain & purpose in application Conflicts between need for specialized standards domain or community specific, and generic standards enabling resource sharing/use/discovery across domains 15

Formal groups Who specifies metadata standards? national & international standards organizations - ISO, ANSI, NISO Informal groups WWW Consortium (W3C) Dublin Core Metadata Initiative Standards at the Library of Congress 16

Proliferation Currently: proliferation of metadata standards activities -many domains a lot of confusion & incompatibility in document description & libraries coordination through liaisons & a number of projects in the U.S & internationally strength: domain experts involvement weakness: limited perspective; re-invention 17

Sample of metadata projects Encoded Archival Description (EAD) Text Encoding Initiative (TEI) - international consortium for standards for digital texts Geospacial data - Federal Geographic Data Committee Z39.50 standards information retrieval Understanding metadata NISO publication exactly aimed at what the title says includes simple description of various standards with examples also listing of metadata sites 18

Libraries In libraries metadata have a rich tradition long preceding the Web (but not called metadata) cataloging rules, standards widely applied MARC (Machine Readable Cataloging) a computer-readable format that is used for bibliographic records enabled worldwide exchange of cataloging records but long standing problems with searching 19

Dublin* Core Metadata Initiative making it easier to find information international initiative to define a core set of metadata for description of digital resources Web oriented wide interest & a lot of work, but not widely applied on the Web * named for a 1995 conference in Dublin, Ohio, seat of OCLC, not Molly Malone s fair city set of 15 elements: Title Creator Subject Description Publisher Contributor Date Type Format Identifier Source Language Relation Coverage Rights 20

How does it look like? A Dublin Core record for a short poem, encoded as part of a Web e page using the <META> tag in HTML (from: Univ of Queensland Introduction to Metadata) Dublin Core metadata <HTML>!4.0! <HEAD> <TITLE>Song of the Open Road</TITLE> <META NAME="DC.Title" CONTENT="Song of the Open Road"> <META NAME="DC.Creator" CONTENT="Nash, Ogden"> <META NAME="DC.Type" CONTENT="text"> <META NAME="DC.Date" CONTENT="1939"> <META NAME="DC.Format" CONTENT="text/html"> <META NAME="DC.Identifier" CONTENT="http://www.poetry.com/nash/open.html"> </HEAD> <BODY><PRE> I think that I shall never see A billboard lovely as a tree. Indeed, unless the billboards fall I'll never see a tree at all. </PRE></BODY> </HTML> 21

Crosswalks: tables showing similarities & differences between metadata schemes coping with different metadata standards Examples: Comparing schemes MARC to Dublin Core Crosswalk by LoC Metadata Standards Crosswalk by Getty A Repository of Metadata Crosswalks D-Lib Magazine 22

Semantic Web a hope for a future Web Effort by W3C (World Wide Web Consortium) led by Tim Berners-Lee, developer of the Web The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. Based on Resource Description Framework (RDF) So far more a vision of extension of the Web & creation of various tools than real viable application and a bit nebulous as well Not clear how may be applied widely, even if viable 23

Library interoperability Library catalogs mostly bound by proprietary software Middleware needed e.g. protocols (based on Z39.50) provide for interaction of clients with many servers (catalogs) Problems remain with semantic interoperability 24

Metadata & digitization Metadata assignment (cataloging) a key component in digitization of resources and electronic publishing Choices: a spectrum of possibilities to select & apply metadata Search for automation in assigning metatags to speed up the process and make it economical as yet progress incremental connection with cataloging, indexing 25

Decisions, decision How & what to plan for metadata creation in conjunction with digital libraries? Target audience? Scope and depth? What to adopt? plug-in a scheme? How to integrate metadata projects? Needed skills? training? staffing? 26

Issue: $$$$ Costs of metadata: HUGE operations, making decisions are complex & involved large effort - time, personnel learning many new things included Cooperative activities essential Libraries pushed out of libraries 27

Criticisms of metadata Too complicated Subjective & depends on context There is no end to metadata Other methods e.g. automatic by search engines, accomplish search & discovery effectively & efficiently so who needs metadata? 28

In conclusion Effective access to digital resources depends on metadata Today, there are efforts to derive metadata automatically, using Natural Language Processing (NLP) methods Maybe automation of assigning metadata is the future? 29

Dedicated to: Jorge Luis Borges 1899-1986 The Library of Babel explore the idea 30

One of delightful Jorge Louis Borges quotes These ambiguities, redundancies, and deficiencies recall those attributed by Dr. Franz Kuhn to a certain Chinese encyclopedia entitled Celestial Emporium of Benevolent Knowledge. On those remote pages it is written that animals are divided into (a) those that belong to the Emperor, (b) embalmed ones, (c) those that are trained, (d) suckling pigs, (e) mermaids, (f) fabulous ones, (g) stray dogs, (h) those that are included in this classification, (i) those that tremble as if they were mad, (j) innumerable ones, (k) those drawn with a very fine camel's hair brush, (l) others, (m) those that have just broken a flower vase, (n) those that resemble flies from a distance. -- Essay: "The Analytical Language of John Wilkins" 31

32