Taking D2D Services to the Users with OpenURL, RSS, and OAI-PMH Chuck Koscher Technology Director, CrossRef ckoscher@crossref.org
Scholarly Publishing Trends
- Everything is online: if it's not online, it doesn't exist
- Everything is interlinked: if it's not linked, it doesn't exist
- Barriers between academic and consumer behavior are breaking down: user expectations are set by Google, eBay, etc.
- Journal brand is strong, but publishing is moving to an article economy
- Economic models are changing: Open Access
- Technical reports and other grey literature are now findable
- Books are going online
"Find-ability precedes usability; you cannot use what you cannot find."
STM-TMR 2006, Amanda Spiteri, Marketing Director, Elsevier
Getting noticed requires a store window
One window: your Web page, but there are billions of web pages
- Users must know the URL
- Users might have brand affinity
- Users must read their RSS feeds
- Content may be indexed by a search engine
There are lots of windows among others
Metadata distribution via standardized methods is the bridge to these windows for your content

RSS
  Strength: Wide adoption, great support, browser integration, mass-user appeal.
  Complexity: Simple to create and distribute. Just create an XML file and put it on your web server.
  Targeted use: Distribution of newsy data, most often for human consumption.

OpenURL
  Strength: All-inclusive specification, well positioned for advanced or diverse applications.
  Complexity: Syntax ranges from simple to complex; only the more basic examples are human readable. Software implementation can be complex, with lots of decision paths.
  Targeted use: Distribution of metadata or content of individual items, most likely implemented as part of a linking system.

OAI-PMH
  Strength: Robust, well-thought-out transaction model. Very extensible and adaptable. Widespread adoption within the industry.
  Complexity: Implementation is moderate to complex. Good frameworks (OCLC) are available. Requires substantial resources (compute and human) for any non-trivial repository.
  Targeted use: Distribution of large volumes of metadata, most likely to automated harvesters.
OpenURL is packaging
- OpenURL is a transport syntax (a box), a way to send metadata
- The Context Object is an internal wrapper (a box within a box)
- Complexity stems from the number of ways you can accomplish the same task: send metadata to a service (a resolver)
[Diagram: nested boxes showing the packaging options. An OpenURL can carry the metadata directly, carry a Context Object that holds the metadata, carry a referent reference that points to the metadata, or carry a context reference that points to a Context Object held elsewhere.]
OpenURL basic example
http://www.crossref.org/openurl?url_ver=Z39.88-2004&rft_id=info:doi/10.1361/15477020418786&noredirect=true
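A by-identifier request like the one above can be assembled programmatically. The following is a minimal sketch, assuming the resolver endpoint and the three parameters shown on the slide; the function name doi_openurl is hypothetical and not part of any CrossRef library.

```python
from urllib.parse import urlencode

# Resolver endpoint as shown on the slide.
RESOLVER = "http://www.crossref.org/openurl"


def doi_openurl(doi: str, noredirect: bool = True) -> str:
    """Build a by-identifier OpenURL that passes a DOI as the referent id."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_id", f"info:doi/{doi}"),
    ]
    if noredirect:
        # Parameter taken from the slide example.
        params.append(("noredirect", "true"))
    # safe=":/" leaves the info:doi/ identifier unescaped, matching the slide.
    return f"{RESOLVER}?{urlencode(params, safe=':/')}"


print(doi_openurl("10.1361/15477020418786"))
```

Building the query from a list of pairs keeps the parameter order stable, which makes the generated links easy to compare against hand-written ones.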
OpenURL: In-Line context object example
http://www.crossref.org/openurl?
url_ver=Z39.88-2004
&url_tim=2004-01-09
&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx
&ctx_ver=Z39.88-2004
&ctx_enc=info:ofi/enc:UTF-8
&ctx_id=345871
&ctx_tim=2002-03-20T08:55:12Z
&rft_val_fmt=info:ofi/fmt:kev:mtx:journal
&rft.atitle=isolation+of+a+common+receptor+for+coxsackie+b
&rft.jtitle=science
&rft.aulast=bergelson
&rft.auinit=j
&rft.date=1997
&rft.volume=275
&rft.spage=1320
&rft.epage=1323
&rfe_val_fmt=info:ofi/fmt:kev:mtx:journal
&rfe.atitle=p27-p16+chimera:+a+superior+antiproliferative
&rfe.jtitle=molecular+therapy
&rfe.aulast=mcarthur
&rfe.aufirst=james
&rfe.date=2001
&rfe.volume=3
&rfe.issue=1
&rfe.spage=8
&rfe.epage=13
&req_ref_fmt=http://lib.caltech.edu/fmt/ldap-mtx.html
&req_ref=http://ldap.caltech.edu/janed/record.txt
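The key prefixes in an inline context object indicate which entity each value describes (rft for the referent, rfe for the referring entity, req for the requester, and so on). As a rough sketch of how a receiver might sort the keys, the snippet below groups a shortened version of the query by prefix. The function name group_kev and the group labels are illustrative assumptions, and a real resolver would do far more validation.

```python
from collections import defaultdict
from urllib.parse import parse_qsl

# Three-letter key prefixes used by the key/encoded-value format.
# The labels on the right are illustrative names chosen for this sketch.
ENTITY_PREFIXES = {
    "rft": "referent",
    "rfe": "referring_entity",
    "req": "requester",
    "rfr": "referrer",
    "svc": "service_type",
    "res": "resolver",
    "ctx": "context_object",
    "url": "transport",
}


def group_kev(query: str) -> dict:
    """Group key/value pairs of an inline OpenURL query by entity prefix."""
    groups = defaultdict(dict)
    for key, value in parse_qsl(query, keep_blank_values=True):
        # Keys look like "rft.atitle" (metadata) or "rft_val_fmt" (administrative).
        entity = ENTITY_PREFIXES.get(key[:3], "other")
        groups[entity][key] = value
    return dict(groups)


# A shortened version of the slide's query string.
query = (
    "url_ver=Z39.88-2004"
    "&rft_val_fmt=info:ofi/fmt:kev:mtx:journal"
    "&rft.jtitle=science&rft.aulast=bergelson&rft.date=1997"
    "&rfe.jtitle=molecular+therapy&rfe.aulast=mcarthur"
)
groups = group_kev(query)
print(groups["referent"])
print(groups["referring_entity"])
```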
http://www.crossref.org/openurl
- NISO Z39.88-2004 OpenURL is a very comprehensive framework!
- CrossRef implemented the San Antonio Profile #1
- The basic inline by-value model might address a high percentage of actual needs
- By consolidating metadata in one place (CrossRef), publishers have created an ideal circumstance for a single resolver to reach a large amount of content
- An OpenURL solution is not embodied in a single place. It is a community of contributors using a common language. OpenURL is the Esperanto of linking.
- No CrossRef account needed; available free to the public
- Number of resolutions in 2006: 608,756
OAI-PMH is a set of commands used to pull metadata from a compliant repository

Identify
  Use: Ask a repository to tell you about itself.
  Example: oai.crossref.org/oaihandler?verb=Identify

ListMetadataFormats
  Use: Ask a repository which formats (XML schemas) its data is available in. Compliant repositories support Dublin Core.
  Example: oai.crossref.org/oaihandler?verb=ListMetadataFormats

ListSets
  Use: Ask a repository to list the hierarchical structure it uses to organize itself.
  Example: oai.crossref.org/oaihandler?verb=ListSets

ListIdentifiers
  Use: Ask a repository to list the identifiers in the whole repository or in a particular set.
  Example: oai.crossref.org/oaihandler?verb=ListIdentifiers

ListRecords
  Use: Ask a repository to return the metadata for all records in the repository or for those in a given set.
  Example: oai.crossref.org/oaihandler?verb=ListRecords&set=10.1002:300:1999

GetRecord
  Use: Ask the repository for the metadata of a given identifier.
  Example: oai.crossref.org/oaihandler?verb=GetRecord&metadataPrefix=cr_unixml&identifier=info:doi/10.1002/jnr.490010101
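The six verbs map naturally onto a small request builder. The sketch below is one possible shape, assuming an http:// scheme in front of the host and path shown on the slide; the function name oai_request is hypothetical. The required arguments follow the OAI-PMH 2.0 specification, under which a resumptionToken replaces all other arguments.

```python
from urllib.parse import urlencode

# Host and path as shown on the slide; the http:// scheme is an assumption.
BASE_URL = "http://oai.crossref.org/oaihandler"

# Arguments each verb requires under OAI-PMH 2.0.
REQUIRED_ARGS = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"metadataPrefix", "identifier"},
}


def oai_request(verb: str, **args: str) -> str:
    """Return a request URL for one OAI-PMH verb, checking required arguments."""
    if verb not in REQUIRED_ARGS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    # A resumptionToken stands in for every other argument.
    if "resumptionToken" not in args:
        missing = REQUIRED_ARGS[verb] - set(args)
        if missing:
            raise ValueError(f"{verb} requires {sorted(missing)}")
    return f"{BASE_URL}?{urlencode({'verb': verb, **args}, safe=':/')}"


print(oai_request("Identify"))
print(oai_request(
    "GetRecord",
    metadataPrefix="cr_unixml",
    identifier="info:doi/10.1002/jnr.490010101",
))
```

Verb and argument names are case sensitive in the protocol, so keeping them in one table avoids typos such as "listrecords" or "metadataprefix".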
OAI-PMH sample responses - Identify
OAI-PMH sample responses - ListSets
verb=ListSets
verb=ListSets&resumptionToken=1160597811347!698!205002
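Large lists come back in pages: each partial response carries a resumptionToken, and the harvester repeats the verb with that token until no token is returned. The sketch below shows one way to pull the set specs and the token out of a response. The XML is a hand-written illustration in the OAI-PMH 2.0 namespace, not an actual CrossRef response, and parse_list_sets is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written illustration of a partial ListSets response (not real data).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>10.1002:297:2004</setSpec><setName>Example set</setName></set>
    <set><setSpec>10.1002:300:1999</setSpec><setName>Another set</setName></set>
    <resumptionToken>1160597811347!698!205002</resumptionToken>
  </ListSets>
</OAI-PMH>
"""


def parse_list_sets(xml_text: str):
    """Return (set specs, resumption token or None) from a ListSets response."""
    root = ET.fromstring(xml_text.strip())
    specs = [el.text for el in root.iterfind(".//oai:set/oai:setSpec", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty token element means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return specs, token


specs, token = parse_list_sets(SAMPLE_RESPONSE)
print(specs)
print(token)
```

A harvester would loop: request, parse, and if the token is not None, issue the same verb again with only the resumptionToken argument.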
OAI-PMH sample response - GetRecord
verb=GetRecord&metadataPrefix=cr_unixml&identifier=info:doi/10.1002/jnr.490010101
OAI-PMH sample response - ListIdentifiers
verb=ListIdentifiers&metadataPrefix=cr_unixml&set=10.1002:297:2004
CrossRef's OAI-PMH Mission
- December 2005: CrossRef announced a Web Services initiative
- Provide a central point for the distribution of metadata from hundreds of publishers, for millions of identifiers
- Utilize common, existing distribution protocols and technology
- Targeted at consumers of mass quantities of metadata
  - Active: MS Academic Live and Scirus (search engines)
  - Looking: EBSCO, European Bioinformatics Institute, others
- Is not open (i.e., it is not free); uses IP authentication for access control
  - Recipient identified by 2 IP address ranges
  - Content can be selectively mapped to a recipient (opt-in/opt-out) at the publisher or title level
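The access model described above (a recipient identified by IP address ranges, with content mapped per publisher) could be sketched roughly as follows. Everything here is an assumption made for illustration: the recipient name, the address ranges (taken from the reserved documentation blocks), the use of DOI prefixes to stand for publishers, and the function names. It is not a description of CrossRef's actual implementation.

```python
import ipaddress

# Hypothetical recipient table: two IP ranges per recipient, plus the
# publishers (represented here by DOI prefix) that have opted out.
RECIPIENTS = {
    "search-engine-a": {
        "ranges": [
            ipaddress.ip_network("192.0.2.0/24"),
            ipaddress.ip_network("198.51.100.0/25"),
        ],
        "opted_out_prefixes": {"10.5555"},
    },
}


def recipient_for(ip: str):
    """Return the recipient whose ranges contain this address, or None."""
    addr = ipaddress.ip_address(ip)
    for name, cfg in RECIPIENTS.items():
        if any(addr in net for net in cfg["ranges"]):
            return name
    return None


def may_receive(ip: str, doi_prefix: str) -> bool:
    """True if the caller is a known recipient and the publisher has not opted out."""
    name = recipient_for(ip)
    if name is None:
        return False
    return doi_prefix not in RECIPIENTS[name]["opted_out_prefixes"]


print(may_receive("192.0.2.17", "10.1002"))   # known recipient, publisher opted in
print(may_receive("192.0.2.17", "10.5555"))   # known recipient, publisher opted out
print(may_receive("203.0.113.9", "10.1002"))  # unknown address
```

A title-level mapping would work the same way with a finer-grained key, such as the set spec used in the ListRecords examples.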
RSS
- CrossRef is not currently operating any RSS feeds (we have blogs, which are kinda sorta the same thing)
- Members view RSS feeds as a way to reach out and touch end users and bring them to the member's site
- For end users:
  - OpenURL is like plumbing ("Intel inside"); they really don't care
  - OAI-PMH is a what?
  - RSS they've probably heard of (blogs) and may even know how to use
- CrossRef members have recognized the need to establish guidelines on content composition by feed type, e.g. a TOC feed should be organized the same way from one publisher to the next in order to avoid end-user confusion. (A NISO initiative?)
RSS syndication
Of course, RSS is used for syndication as well.
Example: Syndication feed
"Google accepts RSS (Real Simple Syndication) 2.0 and Atom 0.3 feeds. Generally, you would use this format only if your site already has a syndication feed. Note that this method may not let Google know about all the URLs in your site, since the feed may only provide information on recent URLs."
http://www.google.com/support/webmasters/bin/answer.py?answer=34656&ctx=sibling
- Google uses the <link> field in your feed to gather URLs from your site and uses the modified date field (the <pubDate> field for RSS feeds and the <modified> date for Atom feeds) to learn when each URL was last modified
- Make sure that the feed is located in the highest-level directory you want search engines to crawl
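Because crawlers read the <link> and <pubDate> fields, a feed meant for syndication should populate both for every item. The following is a minimal sketch of generating such an RSS 2.0 feed with the Python standard library. The journal name, URLs, and article data are made up for illustration, and build_rss is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title: str, site_url: str, items: list) -> str:
    """Serialize a minimal RSS 2.0 feed with <link> and <pubDate> per item."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = f"Recent articles from {title}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        # RSS 2.0 dates use the RFC 822 style that format_datetime produces.
        ET.SubElement(node, "pubDate").text = format_datetime(item["modified"])
    return ET.tostring(rss, encoding="unicode")


# Made-up example data.
feed = build_rss(
    "Example Journal",
    "http://journal.example.org/",
    [{
        "title": "An example article",
        "url": "http://journal.example.org/articles/1",
        "modified": datetime(2006, 10, 11, 12, 0, tzinfo=timezone.utc),
    }],
)
print(feed)
```

Writing the result to a file in the top-level directory of the site matches the placement advice quoted above.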
Conclusion
- Bringing users to content requires metadata distribution
  - Be complete (article title, all authors, citations)
  - Be accurate (author = given name + surname, not the entire byline)
  - Use a widely accepted (and expressive) format: NLM, DC, CrossRef
- Position metadata for discovery
  - Aggregated distribution like CrossRef's PMH service
  - Register as a PMH data provider (http://www.openarchives.org/data/registerasprovider.html)
  - Find syndication channels (syndication.iop.org, Feedzilla, MedicineNet)