Using Metadata for the Interlinking of Digitized Mathematics

Thomas Fischer
State and University Library Göttingen, Germany

Overview
- The Problem: Search vs. Access
- The Situation: Players and Communication
- Suggested Solutions:
  - Metadata Standards
  - Resolving Services
  - Registries

Description of the Problem
Two kinds of searches:
- Trying to find the answer to some question: What are the different differential structures on a 4-manifold?
- Trying to find an article: Where is "Characteristic numbers of 3-manifolds" by Milnor and Thurston available?
The first kind requires well-formulated searches against appropriate databases. The second is in some sense trivial.
The second question is the one this talk deals with: if you know what you want, how do you get it?

Players: Sources of electronic literature
- Publishers
  - Publishers usually produce electronic versions of mathematical journals
  - Many publishers produce backfiles: retrodigitized versions of printed journals
- Preprint servers
  - Preprint servers have collected electronic versions of mathematical papers since the early 1990s
  - There are several of them, with no unified structure
- Digitization centres
  - There are several (mostly national) initiatives to produce retrodigitized versions of historical and recent mathematics

Sources of printed mathematics literature
- Local libraries
  - Provide more or less extensive collections of published journals and books
  - Provide additional services if the desired material is not available
- Authors
  - May provide copies of their papers to interested researchers
- Remote libraries
  - Provide articles and books through inter-library loan
  - Some provide remote scanning services, delivering the scanned object to the user's desktop electronically
- Publishers and bookstores
  - Provide options to buy books

Researchers' interests
- Immediate access, if possible
- Most authoritative version
- No cost, no additional cost, or the lowest possible cost
- High quality of presentation
(The order of importance may depend on institutional or personal preferences.)

Searching for Milnor/Thurston:
- Try Google
- Try Google Scholar
- Try SpringerLink or ScienceDirect
- Zentralblatt MATH: Enseign. Math., II. Sér. 23, 249-254 (1977) [ISSN 0013-8584]
- Google: Enseign. Math
  - http://www.unige.ch/math/ensmath/em_en/welcome.html
  - http://retro.seals.ch/cntmng?type=pdf&aid=c1:36817 (presented by SEALS, the Swiss Electronic Academic Library Service, with some problems)

Problems
Background idea: World Digital Mathematics Library
But now:
- No unified discovery or access scheme
- No uniform standards of reference
- No uniform quality standards

Not solved or not solvable?
Quality standards for retrodigitization seem to be hard to enforce:
- Different players: publishers, scientific communities, digitization centres
- Money involved in quality of scanning and administration of complex metadata
- Different scopes and long-term orientation

Available building blocks
- Review journals (Mathematical Reviews, Zentralblatt MATH)
- Metadata standards (Dublin Core)
- Communication protocol (OAI-PMH)
- Reference standard (OpenURL)
- Willingness to co-operate

Goal: unified access
- Look up the requested paper in Zentralblatt MATH or Mathematical Reviews (they might cover preprints and other gray literature for that purpose)
- Receive a link to a resolving service
- Obtain the appropriate copy (digital or printed)
Necessary: a communication network with sufficient precision

New developments in the world of metadata
- Dublin Core: Abstract Model
- Dublin Core Application Profile: the Eprints Application Profile
- minidml, a DC-based custom format
- Dublin Core Simple, enhanced by best practices

Dublin Core Abstract Model 1
- New: description sets contain several related descriptions
- Allows distinct descriptions, e.g. for Creator, Journal and Article
- With this, authority files become easier to build and manage
http://dublincore.org/documents/abstract-model/

Dublin Core Abstract Model 2
- A description set is a set of one or more descriptions about one or more resources.
- A description is made up of one or more statements (about one, and only one, resource) and zero or one resource URI (a URI reference that identifies the resource being described).
- Each statement instantiates a property/value pair and is made up of a property URI (a URI reference that identifies a property), zero or one value URI (a URI reference that identifies a value of the property), zero or one vocabulary encoding scheme URI (a URI reference that identifies the class of the value) and zero or more value representations of the value.
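The cardinalities in this definition can be written down as a small data model. The following is a minimal Python sketch, not part of the talk: the class and field names are assumptions chosen for illustration, and only the "one or more" and "zero or one" constraints quoted above are enforced.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    """One property/value pair, as worded in the definition above."""
    property_uri: str                                   # identifies the property
    value_uri: Optional[str] = None                     # zero or one value URI
    vocab_encoding_scheme_uri: Optional[str] = None     # zero or one scheme URI
    value_representations: List[str] = field(default_factory=list)  # zero or more


@dataclass
class Description:
    """One or more statements about exactly one resource."""
    statements: List[Statement]
    resource_uri: Optional[str] = None                  # zero or one resource URI

    def __post_init__(self):
        if not self.statements:
            raise ValueError("a description needs at least one statement")


@dataclass
class DescriptionSet:
    """One or more descriptions."""
    descriptions: List[Description]

    def __post_init__(self):
        if not self.descriptions:
            raise ValueError("a description set needs at least one description")


# Illustrative instance: the title statement for the arXiv paper used in the talk.
article = Description(
    resource_uri="http://arxiv.org/abs/math/0612096",
    statements=[
        Statement(
            property_uri="http://purl.org/dc/elements/1.1/title",
            value_representations=["Constructing Smooth Loop Spaces"],
        )
    ],
)
record = DescriptionSet([article])
```

An empty description or an empty description set is rejected with a ValueError, which mirrors the "one or more" wording.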

<descriptionset>
  <description resourceuri="http://arxiv.org/abs/math/0612096">
    <title>Constructing Smooth Loop Spaces</title>
    <creator descriptionref="theauthor"/>
    <date>2006-12-04</date>
    <subject>Differential Geometry; Algebraic Topology</subject>
  </description>
  <description descriptionid="theauthor">
    <firstname>Andrew</firstname>
    <lastname>Stacey</lastname>
    <affiliation>University of Sheffield</affiliation>
  </description>
</descriptionset>
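To show how a consumer might follow the link from the article description to the related author description, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes the simplified, namespace-free element and attribute names of the example record (with the title element properly closed); the function name resolve_creators is hypothetical.

```python
import xml.etree.ElementTree as ET

# The example record from the slide, in its simplified namespace-free form.
RECORD = """
<descriptionset>
  <description resourceuri="http://arxiv.org/abs/math/0612096">
    <title>Constructing Smooth Loop Spaces</title>
    <creator descriptionref="theauthor"/>
    <date>2006-12-04</date>
    <subject>Differential Geometry; Algebraic Topology</subject>
  </description>
  <description descriptionid="theauthor">
    <firstname>Andrew</firstname>
    <lastname>Stacey</lastname>
    <affiliation>University of Sheffield</affiliation>
  </description>
</descriptionset>
"""


def resolve_creators(xml_text):
    """Return one dict per creator, filled from the description it refers to."""
    root = ET.fromstring(xml_text)
    # Index every description that carries an id so references can be looked up.
    by_id = {
        d.get("descriptionid"): d
        for d in root.findall("description")
        if d.get("descriptionid")
    }
    creators = []
    for desc in root.findall("description"):
        for creator in desc.findall("creator"):
            target = by_id.get(creator.get("descriptionref"))
            if target is not None:
                creators.append({child.tag: child.text for child in target})
    return creators


creators = resolve_creators(RECORD)
```

Because the author lives in a description of its own, the same author description could be referenced from several articles, which is what makes authority files easier to manage.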

Eprints Application Profile 1
- Eprint: a scientific or scholarly research text (as defined by the Budapest Open Access Initiative)
- A DC Application Profile for describing an eprint
- Each description set describes only one eprint (i.e. one ScholarlyWork entity)
- Extends the Dublin Core set by numerous additional fields related to scholarly work
- Uses the concept of related descriptions for entities that allow a distinct, separate description, e.g. Creator, Funder, Affiliated Institution
- Incorporates parts of the Functional Requirements for Bibliographic Records (FRBR), in particular the distinction between Work, Expression, Manifestation and Item
http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile

Why FRBR?
FRBR disentangles some problems with bibliographic references:
- The Work is the abstraction of the original product, the ideas.
- An Expression is a version of this, e.g. the original one, a translation, the third revision.
- A Manifestation is a preprint or a printed and/or digital version in a journal or book.
- An Item (copy) is the actual physical object: the article in the journal on the shelf, the specific digital copy on a particular server.
Different properties refer to different levels; think e.g. of a title and the title of a translation, page numbers, publisher, URL.
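The four FRBR levels can be pictured as nested records, each holding only the properties that belong to its level. The following Python sketch is an illustration under assumptions: the class and field names are hypothetical, and the data is the Milnor/Thurston article and the SEALS copy mentioned earlier in the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    location: str                       # a concrete copy: shelf mark or URL


@dataclass
class Manifestation:
    citation: str                       # journal, volume, pages, year
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    label: str                          # e.g. original, translation, revision
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    creators: List[str]
    expressions: List[Expression] = field(default_factory=list)


def all_locations(work):
    """Collect the locations of every Item reachable from a Work."""
    return [
        item.location
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


work = Work(
    title="Characteristic numbers of 3-manifolds",
    creators=["Milnor", "Thurston"],
    expressions=[
        Expression(
            label="original",
            manifestations=[
                Manifestation(
                    citation="Enseign. Math., II. Sér. 23, 249-254 (1977)",
                    items=[Item("http://retro.seals.ch/cntmng?type=pdf&aid=c1:36817")],
                )
            ],
        )
    ],
)
locations = all_locations(work)
```

The title sits on the Work, the page numbers on the Manifestation and the URL on the Item, which is the separation of levels the slide describes.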

Eprints Application Profile 2
- Provides as comprehensive a format for describing a scholarly work as desired
- Is well adapted to electronic sources, not based on printed matter
- Can be mapped to Dublin Core simple (with some losses)
- Probably the optimal data format available, but not easy to implement

The minidml format
- A DC-based metadata format enriched by the separation of some elements
- Beyond DC simple, minidml provides:
  - references: different identifier schemes, e.g. <identifier scheme="oai">
  - <citation>Ann. Inst. Fourier 1, 1-4 (1949)</citation>, plus separate subfields for the citation
  - <reviewid>, giving MR and Zbl numbers
- With 20+ elements plus some schemes, the most relevant information on a scholarly article can be captured
http://minidml.mathdoc.fr/

DC Simple enriched by Best Practices 1
"Recommended Practice for Creating Unqualified Dublin Core Records, for Mathematical Literature" (unwieldy title!), in preparation by:
- Thierry Bouche, NUMDAM (Numérisation de documents anciens mathématiques)
- Thomas Fischer, Staats- und Universitätsbibliothek (SUB), Göttingen
- Claude Goutorbe, Cellule MathDoc, Grenoble
- David Ruddy, Project Euclid, Cornell University Library
Based on experience with and analysis of different OAI metadata schemes used by digitization centres
Goal: provide rules for DC simple that make the OAI metadata most useful

DC Simple enriched by Best Practices 2
Some suggestions:
- Use UTF-8 for special characters outside of mathematical formulas, not the TeX encoding
- Use prefixes to clarify the meaning of data, e.g. isbn:, msc:, bibliographiccitation:
- Use the "last name first" rule for names; use only one version of the name for each author
- Link to reference journals in the <relation> element, using an appropriate prefix, e.g. mr:, zbl:, jfm:
- Give full bibliographic citation information in the identifier field
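As a rough illustration of how these suggestions could be applied when generating a DC simple record, here is a small Python sketch. It is not the recommended practice itself: the function names, the dictionary layout, the combination of the bibliographiccitation: prefix with the identifier field, and the review number PLACEHOLDER are all assumptions made for the example.

```python
# Prefixes named on the slide for links to the reference journals.
ALLOWED_RELATION_PREFIXES = ("mr:", "zbl:", "jfm:")


def last_name_first(given, family):
    """Apply the 'last name first' rule to one author name."""
    return f"{family}, {given}"


def with_prefix(prefix, value):
    """Mark the meaning of a value with a scheme prefix such as 'zbl:'."""
    return f"{prefix}:{value}"


def build_record(title, authors, citation, reviews):
    """Build a DC simple record as a dict of element name -> list of values.

    authors: list of (given, family) pairs
    reviews: list of (prefix, number) pairs for the <relation> element
    """
    relations = []
    for prefix, number in reviews:
        tagged = with_prefix(prefix, number)
        if not tagged.startswith(ALLOWED_RELATION_PREFIXES):
            raise ValueError(f"unexpected review prefix: {prefix}")
        relations.append(tagged)
    return {
        "title": [title],
        "creator": [last_name_first(g, f) for g, f in authors],
        # Assumption: the full citation goes into identifier, with a prefix.
        "identifier": [with_prefix("bibliographiccitation", citation)],
        "relation": relations,
    }


record = build_record(
    title="Characteristic numbers of 3-manifolds",
    authors=[("John", "Milnor"), ("William", "Thurston")],
    citation="Enseign. Math., II. Sér. 23, 249-254 (1977)",
    reviews=[("zbl", "PLACEHOLDER")],  # placeholder, not a real Zbl number
)
```

Python strings are Unicode, so "Sér." is stored as a real character rather than in TeX encoding; writing the record out as UTF-8 is then a matter of the serializer.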

Bibliographic Citation and OpenURL
OpenURL: NISO standard Z39.88-2004
Basically of the form (without the line break):
  ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&
  <key1>=<value1>&<key2>=<value2>&...&<keyn>=<valuen>
Essentially a (search) request with standardized fields
Used for:
- Encoding Bibliographic Citation Information in Dublin Core Metadata (http://dublincore.org/documents/dc-citation-guidelines/)
- ContextObject in SPAN (COinS): Embedding Citation Metadata in HTML (http://ocoins.info/)
- SFX: context-sensitive link server from Ex Libris (http://www.exlibrisgroup.com/sfx_openurl.htm)
- CrossRef: an infrastructure for linking citations across publishers (http://www.crossref.org/03libraries/16openurl.html)
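A request of this form can be assembled with the Python standard library. The sketch below is an illustration only: the resolver address resolver.example.org and the function name journal_openurl are hypothetical, and the rft.* keys are the usual journal-format keys of the key/encoded-value (KEV) serialization rather than anything listed in the talk. The citation data is the Milnor/Thurston article from the earlier slide.

```python
from urllib.parse import urlencode, parse_qs


def journal_openurl(resolver_base, **rft):
    """Build a KEV OpenURL for a journal article.

    Keyword arguments become rft.<key>=<value> pairs describing the referent.
    """
    params = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    params += [(f"rft.{key}", str(value)) for key, value in rft.items()]
    # urlencode percent-encodes values as UTF-8, so "Sér." survives the trip.
    return resolver_base + "?" + urlencode(params)


url = journal_openurl(
    "http://resolver.example.org/openurl",   # hypothetical resolver
    atitle="Characteristic numbers of 3-manifolds",
    stitle="Enseign. Math., II. Sér.",
    volume=23,
    spage=249,
    epage=254,
    date=1977,
    issn="0013-8584",
)

# Decoding the query string again recovers the citation fields.
fields = parse_qs(url.split("?", 1)[1])
```

Since the query string carries only the citation, the same link can be pointed at any resolver by changing the base address.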

Build a registry!
- OAI Data providers: Provide data with sufficient information, using a full format like the eprint application profile or an enriched format like minidml, and modify the required DC Simple format according to the recommendations
- OAI Service providers: Collect data from the available sources and provide an OpenURL service to retrieve the documents
- Review journals: Enrich the review data with an appropriate OpenURL, directed to a resolver at one of the registries
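On the service provider side, collecting data starts with OAI-PMH ListRecords requests. The following Python sketch only builds the request URLs and performs no network access; the repository address repository.example.org, the function name and the token value are hypothetical. It follows the protocol rule that a resumptionToken is sent as the only argument besides the verb.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc is the DC simple format every OAI-PMH data provider must support.
    """
    if resumption_token is not None:
        # Follow-up request: the token replaces all other arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
        if from_date:
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


BASE = "http://repository.example.org/oai"   # hypothetical data provider

first_page = list_records_url(BASE, from_date="2006-01-01")
next_page = list_records_url(BASE, resumption_token="token-123")

first_query = parse_qs(urlsplit(first_page).query)
next_query = parse_qs(urlsplit(next_page).query)
```

A harvester would fetch first_page, read the resumptionToken from the response if one is present, and repeat with next_page style requests until no token is returned.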

A possible communication scheme
(Diagram; the recoverable labels are: Publisher, Preprint server, Digitization centre, Library; OAI-PMH; User; Registry A, Registry B; OpenURL; Mathematical Reviews, Zentralblatt MATH.)

Some basic tasks ahead:
- Digitization centres:
  - Get the pages straight and the page numbers right
  - Get full and correct data
- OAI Service providers:
  - Collect data, then analyze and organize it appropriately; in particular, match different versions of the same article
- Review journals:
  - Unify the references to journals
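Matching different versions of the same article can be approached in many ways; the talk does not prescribe one. As one simple possibility, the Python sketch below groups records by a normalized key built from the title and the authors' last names. The key design, the function names and the source labels registry-a, registry-b and arxiv are assumptions for illustration, and a real service would need more robust matching.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Strip accents, case and punctuation so trivial variants compare equal."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", text.casefold()).strip()


def match_key(title, author_last_names):
    """Heuristic key: normalized title plus sorted normalized last names."""
    return (normalize(title), tuple(sorted(normalize(n) for n in author_last_names)))


def group_versions(records):
    """Group record sources that appear to describe the same article."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec["title"], rec["authors"])].append(rec["source"])
    return dict(groups)


records = [
    {"title": "Characteristic numbers of 3-manifolds",
     "authors": ["Milnor", "Thurston"], "source": "registry-a"},
    {"title": "Characteristic Numbers of 3-Manifolds.",
     "authors": ["Thurston", "Milnor"], "source": "registry-b"},
    {"title": "Constructing Smooth Loop Spaces",
     "authors": ["Stacey"], "source": "arxiv"},
]
groups = group_versions(records)
```

Differences in capitalization, trailing punctuation and author order no longer separate the two Milnor/Thurston records, while the unrelated paper stays in its own group.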

Some references
- Andy Powell, Mikael Nilsson, Ambjörn Naeve, Pete Johnston: Dublin Core Abstract Model (http://dublincore.org/documents/abstract-model)
- Pete Johnston, Andy Powell: Expressing Dublin Core metadata using XML (30.5.2006) (http://dublincore.org/documents/2006/05/29/dc-xml)
- Digital Library Federation and the National Science Digital Library: Best Practices for Shareable Metadata (August 2005) (http://comm.nsdl.org/download.php/653/shareablemetadatabestpractices.doc)
- JISC, UKOLN, CETIS: Eprints Application Profile (September 2006) (http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile)
- ViFa Math / VLib Math: http://www.vifamath.de/ (English version in preparation)
- Digitization Registry: http://digreg.mathguide.de/ (recommendations for OAI data to appear here)

Thank you for your attention!
Thomas Fischer
State and University Library Göttingen, Germany
fischer@sub.uni-goettingen.de