Metadata Catalogue Issues. Daan Broeder Max-Planck Institute for Psycholinguistics

Similar documents
Building metadata components

Phase 1 RDRDS Metadata

Presentation to Canadian Metadata Forum September 20, 2003

Expressing language resource metadata as Linked Data: A potential agenda for the Open Language Archives Community

Metadata and DCR. <CMD_Component /> Dieter Van Uytvanck. Max Planck Institute for Psycholinguistics


First metadata-enabled service in Croatian Webspace

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

Some challenges ahead for the Open Language Archives Community

The Biblioteca de Catalunya and Europeana

Metadata Proposals for Corpora and Lexica

Metadata Infrastructure for Language Resources and Technology

Getting Started with the Digital Commonwealth. Robin L. Dale Director of Digital & Preservation Services LYRASIS

Wittenburg, Peter; Gulrajani, Greg; Broeder, Daan; Uneson, Marcus

Metadata Workshop 3 March 2006 Part 1

RDDS: Metadata Development

A Repository of Metadata Crosswalks. Jean Godby, Devon Smith, Eric Childress, Jeffrey A. Young OCLC Online Computer Library Center Office of Research

RDF and Digital Libraries

EXTENDING OAI-PMH PROTOCOL WITH DYNAMIC SETS DEFINITIONS USING CQL LANGUAGE

Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector

Common presentation of data from archives, libraries and museums in Denmark Leif Andresen Danish Library Agency October 2007

EUDAT-B2FIND A FAIR and Interdisciplinary Discovery Portal for Research Data

The Value of Metadata

Developing Shareable Metadata for DPLA

Developing an Automatic Metadata Harvesting and Generation System for a Continuing Education Repository: A Pilot Study

Increasing access to OA material through metadata aggregation

From community-specific XML markup to Linked Data and an abstract application profile: A possible path for the future of OLAC

B2FIND: EUDAT Metadata Service. Daan Broeder, et al. EUDAT Metadata Task Force

ISLE Metadata Initiative (IMDI) PART 1 B. Metadata Elements for Catalogue Descriptions

Metadata on the Web. Miroslav Milinovic SRCE Zagreb, Croatia 4th CARNet Users Conference CUC 2002, Zagreb, September, 2002

EUDAT. A European Collaborative Data Infrastructure. Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT

Metadata Tools Supporting Controlled Vocabulary Services

Metadata Standards and Applications

The Metadata Challenge:

How to contribute information to AGRIS

METALIS, an OAI Service Provider

Creating a National Federation of Archives using OAI-PMH

Metadata aggregation for digital libraries

Just for the record, CMDI should be about semantic interoperability

INTRO INTO WORKING WITH MINT

Based on the functionality defined there are five required fields, out of which two are system generated. The other elements are optional.

1. General requirements

Design of The PORTA EUROPA Portal (PEP) Pilot Project

The Ohio State University's Knowledge Bank: An Institutional Repository in Practice

Design and Implementation of the Japanese Virtual Observatory (JVO) system Yuji SHIRASAKI National Astronomical Observatory of Japan

The Open Archives Initiative in Practice:

Research Data Repository Interoperability Primer

Using the WorldCat Digital Collection Gateway with CONTENTdm

OLAC: Accessing the World s Language Resources

EUDAT B2FIND A Cross-Discipline Metadata Service and Discovery Portal

Managing very large Multimedia Archives and their Integration into Federations

Networking European Digital Repositories

OAI-PMH. DRTC Indian Statistical Institute Bangalore

Metadata and Encoding Standards for Digital Initiatives: An Introduction

B2FIND and Metadata Quality

Interoperability for Digital Libraries

The Virtual Language Observatory!

Developing the ICSU World Data System (WDS)

7.3. In t r o d u c t i o n to m e t a d a t a

D-SPIN Report R2.2b: The German Resource Landscape and a Portal

Metadata for the Web A Necessary Evil? CS 431 March 2, 2005 Carl Lagoze Cornell University

An easy option? OAI static repositories as a method of exposing publishers metadata to the wider information environment Elpub 2006

The DART-Europe E-theses Portal

Open Archives Forum - Technical Validation -

An e-infrastructure for Language Documentation on the Web

Chinese-European Workshop on Digital Preservation. Beijing (China), July 14 16, 2004

The NCAR Community Data Portal

Networking European Digital Repositories

The Design of a DLS for the Management of Very Large Collections of Archival Objects

A Comparative Study of the Search and Retrieval Features of OAI Harvesting Services

Search Interoperability, OAI, and Metadata

Using the WorldCat Digital Collection Gateway

Using metadata for interoperability. CS 431 February 28, 2007 Carl Lagoze Cornell University

Integrating Access to Digital Content

SobekCM. Compiled for presentation to the Digital Library Working Group School of Oriental and African Studies

Adding OAI ORE Support to Repository Platforms

D-SPIN. D-SPIN Report 2.1: Formation of Centres

Problem: Solution: No Library contains all the documents in the world. Networking the Libraries

Data Exchange and Conversion Utilities and Tools (DExT)

Multi-agent Semantic Web Systems: Data & Metadata

Data is the new Oil (Ann Winblad)

Metadata Overview: digital repositories

Comparing Open Source Digital Library Software

Dryad Curation Manual, Summer 2009

Data Discovery Service for VI-SEEM

ISLE Metadata Initiative (IMDI) PART 3 A. Vocabulary Taxonomy and Structure

RVOT: A Tool For Making Collections OAI-PMH Compliant

GETTING STARTED WITH DIGITAL COMMONWEALTH

OAI Repository Cataloging Procedures and Guidelines

The GeoPortal Cookbook Tutorial

The Open Archives Initiative and the Sheet Music Consortium

Metadata DB (Catalog DB)

Research on the Interoperability Architecture of the Digital Library Grid

Single New Type for All Creative Works

Workshop B: Application Profiles Canadian Metadata Forum September 28, 2005

AGGREGATIVE DATA INFRASTRUCTURES FOR THE CULTURAL HERITAGE

The Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries

Centres Network Formation

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation

Presented by Dr Joanne Evans, Centre for Organisational and Social informatics Faculty of IT, Monash University Designing for interoperability

Transcription:

Metadata Catalogue Issues Daan Broeder Max-Planck Institute for Psycholinguistics

Introduction Methods of registering resources Metadata Making metadata interoperable Exposing metadata Facilitating resource discovery

Methods or Registring Resources By using content Google a.o. for text-type files By using (structured) metadata: Metadata catalogue Using either a very simple MD set like DublinCore (DC) 15 elements Or more elaborate ones like TEI, IMDI, OLAC, MPEG7,

Metadata set Dublin Core Content IPR Instance Title Creator Date Subject Publisher Type Description Contributor Format DC Set Language Rights Identifier Relation Coverage Source DC Example DC.Title = Building with Lego DC.Title /Alternative = Lego series part I DC.Creator = L. Smith DC.Subject /LCSH = Building DC.Description/Abstract =. DC.Language/ ISO639-2 = eng

Metadata set DC example <dc:identifier>oai:www.mpi.nl:1839/00-0000-0000-0000-0009-e</dc:identifier> <dc:title>si011020</dc:title> <dc:date>1972-03-26</dc:date> <dc:description> in der Kueche beim Fruehstueck, SIM moechte Kuchen haben </dc:description> <dc:coverage>germany</dc:coverage> <dc:publisher> Corpus Manager - Max-Planck Institute for Psycholinguistics </dc:publisher> <dc:description> in der Kueche beim Fruehstueck, SIM moechte Kuchen haben </dc:description> <dc:language>rfc1766:x-sil-ger</dc:language> <dc:contributor>annotator</dc:contributor> <dc:format>text/x-chat</dc:format>.

Metadata set IMDI example <?XML > <METATRANSCRIPT DATE= 2005-11-28" FormatId="IMDI 3.0".> <Session><Name>Hamadi siblings</name><title> Hamadi siblings DBD_ARY_02_21_03_012 excerpt</title> <MDGroup> <Location><Continent>Europe</Continent> <Country>Netherlands</Country> <Region>Goirle, Brabant</Region></Location>. <Content> <Genre>Discourse</Genre><SubGenre>Conversation</SubGenre>. </Content> <Actors> <Actor><Name>Abdelkrim Hamadi</Name>. <Language>Dutch<Language><Language>Arabic, Moroccan Spoken</Language></Actor> <Actor> <Actor></Actors> <Resources> </Resources>.

Metadata pidginization & creolization Conflicts between. Simplicity for interoperability Expressiveness for complete description modularity Linguistics History of Arts Musea extensibility interoperability DC

Metadata Interoperability Goals: Metadata search/browsing across repository boundaries as if there is one virtual domain. Appropriate terminology for users of all repositories when specifying a MD query or viewing MD records Possible solutions: All repositories use the same metadata set Not very practical. For sufficiently accurate description different (sub-)domains need different sets. There are many existing metadata repositories with own sets & procedures that are difficult to change. Already difficult to use one set only in a single repository Use a single set for exchange (->mapping by producer required) Use an interoperable metadata framework

Metadata Mapping Metadata crosswalk Effective if same discipline Many mapping rules OLAC IMDI Switching across, use pivot set. Gemini Information loss Limited number of mapping rules DC AVM TEI Astronomy Visualization Metadata (AVM) UK geo-spatial metadata (Gemini)

Metadata Framework: a component approach Adequate description of resources by using components from existing MD sets (or create new ones but only ) Dance Descr. Location Descr. Dance Descr. Good expressivity Possible problematic consistency Participant Descr. Participant. Descr Text Descr. Language Descr. Recording Descr. Country MotherTongue dcr:10001 dcr:10002 MD Component registry ISO standard DCR: concept registry

Metadata Exposure OAI - Open Archives Initiative IMDI Framework Repository Systems: Fedora, DSpace, (OAI + METS built-in) Data Grid solutions: SRB,

Open Archives Initiative OAI OAI-PMH (protocol for metadata harvesting) low-barrier interoperability framework for archives Originally dev. for e-print servers to expose MD for documents OAI data providers offer metadata for harvesting to OAI service providers Different metadata sets are allowed but DC MD should always be available Harvesting uses XML over HTTP protocol

OAI metadata distribution MD Search Service Provider MD CATALOG others Web-services REST interf. OAI repository OAI-PMH OAI Harvester OAI repository MD DB OAI repository Data Provider 1 Data Provider 3 Data Provider 2

OAI Easy to implement, but. Easy to expand on: exchange within own community, OLAC metadata Relying only on the required metadata part (DC) gives information loss. However communities may agree on their own MD set and see the DC only as a courtesy. Or mapping may take place at the SP

Metadata in XML files Relations represented by embedded URL links DBs only as Web helpers Server IMDI metadata framework C S S S IMDI Harvester HTTP MD CATALOG Web Server C C C S S S S S Language Experiment Actor Session } IMDI metadata repository C Web Server C M M M M T T T T }resources repository A C C S S S S S repository B Distribution by: Embedded URL links Webserver Low tech!!!

Accessing the Metadata Catalogue With OAI or IMDI we can expose the metadata to whoever requires it. How can we access it efficiently both for the service provider and the user?

Complete Catalog Every service provider harvests all data providers Offers a MD search interface working only on its own catalog high availability solution requires many resources SP can map from different offered sets into its own catalogue Or the SP requires the DP to deliver one specific MD set. DP SP SP DP DP DP

Federated Search Federated Search A portal has no metadata catalog but broadcasts a metadata query to all appropriate SPs Portal can transform MD queries to appropriate specific formats used by different SPs Present merged result set to the user. DP SP P DP SP DP/SP

The End