Digital repositories as research infrastructure: a UK perspective
  Dr Liz Lyon, Director, UKOLN
  This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
The scholarly knowledge cycle. Liz Lyon, Ariadne, July 2003. Liz Lyon (UKOLN, University of Bath), 2005
  [Diagram; labels as they appear on the slide]
  - Research & e-science workflows
    - Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
    - Data analysis, transformation, mining, modelling
    - Deposit / self-archiving
  - Learning & Teaching workflows
    - Learning object creation, re-use
    - Deposit / self-archiving
  - Repositories: institutional, e-prints, subject, data, learning objects
  - Aggregator services: national, commercial
  - Presentation services: subject, media-specific, data, commercial portals
  - Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
  - Publication
  - Validation
    - Peer-reviewed publications: journals, conference proceedings
    - Quality assurance bodies
  - Connecting labels: Harvesting metadata; Searching, harvesting, embedding; Resource discovery, linking, embedding
JISC Vision: a global landscape of federated repositories
  - Multi-disciplinary, cross-sectoral
  - National, institutional
  - Different platforms
  - Many format types: data, eprints, images, geospatial
  - e-framework and Information Environment context
  - Define common + domain-specific + repository services
  - Interoperability based on open standards, software tools
  [Diagram, from Andy Powell: http:///distributed-systems/jiscie/arch/presentations/jiie-jcs-2005/]
  - Repositories: heterogeneous metadata formats, content formats, identifiers, packaging standards
  - Fusion layer: repository federator
  - Portals: homogeneous metadata formats, content formats, identifiers, packaging standards
JISC Information Environment architecture. Andy Powell (UKOLN, University of Bath), 2005
  [Diagram; labels as they appear on the slide]
  - Provision: JISC-funded content providers, institutional content providers, external content providers
  - Fusion: brokers, aggregators, catalogues, indexes
  - Presentation: subject portals, media-specific portals, institutional portals, learning management systems, end-user desktop/browser
  - Shared infrastructure: authentication/authorisation (Athens), service registries, metadata schema registries, identifier services, institutional profiling services, terminology services
  - OpenURL link servers
Update on JISC DR activity 1
  - Commissioned reports: Review (Feb 2005), Roadmap (April 2006), Linking UK Repositories (June 2006)
  - £4M DR Programme 2005
  - 21 projects: some working with data, VERSIONS (of eprints)
  - DR support at UKOLN: wiki http:///repositories/digirep/index/jisc_digital_repository_wiki
  - Advocacy Package (autumn 2006)
  - Project synthesis, collecting user scenarios, developing use cases, scoping/evaluating reference models: OAIS?
  - Standards (and harmonisation)
    - eprints Dublin Core Application Profile Working Group
    - Remote deposit API Working Group (Mellon New York meeting)
  - UK IR cross search service (eprints)
e-research: understanding business process
  - Project StORe: Source-to-Output Repositories (Edinburgh)
    - Primary data : research publications
    - Survey questionnaire
  - RepoMMan: Repository Metadata and Management (Hull)
    - Survey questionnaire and interviews
    - Activity diagram
  - R4L: Repository for the Laboratory (Southampton)
    - Crystallography workflow analysis, automated data capture, user deposit scenarios
  - Raw data, derived data, results data
ebank UK Project
  http:///projects/ebank-uk/
  - Promote open access crystallography data
  - Aggregator service harvests OAI metadata from institutional data repository (e-crystals archive)
  - Service linking from data to derived research publication
  - Embedding ebank service in learning workflows: pedagogy
  - Future federation plans for crystallography data repositories
  - UKOLN (lead), University of Southampton, University of Manchester
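The harvesting step on this slide can be illustrated with a minimal sketch of how an aggregator might read simple Dublin Core records out of an OAI-PMH ListRecords response. The sample XML, the identifier `oai:example.org:145`, and the function name `parse_records` are illustrative assumptions, not the ebank implementation; a real harvester would fetch the response over HTTP and follow resumption tokens.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response body; every value in it is made up for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:145</identifier>
        <datestamp>2005-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical crystal structure</dc:title>
          <dc:creator>Example, A.</dc:creator>
          <dc:creator>Sample, B.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per harvested record: identifier, titles, creators."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        # Deleted records carry a header but no metadata element.
        titles = [e.text for e in dc.findall("dc:title", NS)] if dc is not None else []
        creators = [e.text for e in dc.findall("dc:creator", NS)] if dc is not None else []
        records.append(
            {"identifier": identifier, "titles": titles, "creators": creators}
        )
    return records


if __name__ == "__main__":
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

Keeping the parsing separate from any network code means the same function could be reused for each page of a harvest.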
ebank Metadata Publication
  - Using simple Dublin Core
    - Crystal structure
    - Title (Systematic IUPAC Name)
    - Authors
    - Affiliation
    - Creation Date
  - Additional chemical information through Qualified Dublin Core
    - Empirical formula
    - International Chemical Identifier (InChI)
    - Compound Class & Keywords
  - Specifies which datasets are present in an entry
  - Application Profile: http:///projects/ebank-uk/schemas/
  - DOIs, data citation: http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
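As a rough sketch of the two-tier description on this slide, the following builds a record that carries simple Dublin Core elements alongside additional chemical terms. The Dublin Core namespace is the standard one; the namespace `http://example.org/hypothetical-crystal-terms/`, the element names `empiricalFormula` and `inchi`, the function `build_record`, and all values are assumptions made for illustration and are not taken from the ebank application profile.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Assumption: a placeholder namespace standing in for the domain-specific terms.
EX_NS = "http://example.org/hypothetical-crystal-terms/"

# Register prefixes so the serialised XML uses readable names.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("ex", EX_NS)


def build_record(simple, chemical):
    """Build a record element from simple DC fields and extra chemical fields.

    simple   maps a DC element name to a list of values (repeatable).
    chemical maps a hypothetical domain term to a single value.
    """
    root = ET.Element("record")
    for name, values in simple.items():
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    for name, value in chemical.items():
        ET.SubElement(root, f"{{{EX_NS}}}{name}").text = value
    return root


# Illustrative values only; none describe a real ebank entry.
example = build_record(
    simple={
        "type": ["Crystal structure"],
        "title": ["Hypothetical systematic name"],
        "creator": ["Example, A.", "Sample, B."],
        "date": ["2005-01-01"],
    },
    chemical={
        "empiricalFormula": "C2H6O",
        "inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    },
)

if __name__ == "__main__":
    print(ET.tostring(example, encoding="unicode"))
```

Keeping the generic fields in plain Dublin Core means a harvester that knows nothing about chemistry can still index title, creator and date, while domain services read the extra terms.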
Discovering data
  - Domain identifier: International Chemical Identifier (InChI) code
  - Google molecule using InChI
  - Slide from Simon Coles
  - Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10), 1832-1834. DOI: 10.1039/b502828k
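Because an InChI is a plain text string, it can serve as a search key. The sketch below splits an InChI into its slash-separated layers and looks a molecule up in a small in-memory index by exact string match. The function names `split_inchi` and `find_by_inchi`, the index contents, and the record label are illustrative assumptions; the parsing is simplified and assumes the second segment is the formula layer, which does not hold for every InChI.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula and prefixed layers.

    Simplified: treats the second segment as the formula layer and groups
    the remaining segments by their one-letter prefix (c, h, q, ...).
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    version = parts[0]
    formula = parts[1] if len(parts) > 1 else ""
    layers = {}
    for part in parts[2:]:
        if part:
            # A prefix can recur in later sections, so keep a list per prefix.
            layers.setdefault(part[0], []).append(part[1:])
    return {"version": version, "formula": formula, "layers": layers}


def find_by_inchi(index, inchi):
    """Exact-match lookup, the same idea as searching the web for the string."""
    return index.get(inchi, [])


# Hypothetical index mapping an InChI to record labels; values are made up.
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
INDEX = {ETHANOL: ["hypothetical-record-1"]}

if __name__ == "__main__":
    print(split_inchi(ETHANOL))
    print(find_by_inchi(INDEX, ETHANOL))
```

Exact matching works here because the identifier is derived from the structure itself, so two repositories describing the same molecule should produce the same string.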
Data descriptions
  - Validation, publication & discovery of data models & schema
  - Metadata packaging standards
    - METS
    - MPEG 21 DIDL
    - Complex object model?
  - Semantic descriptions
    - Formal controlled vocabularies
    - High-level and domain ontologies
    - Inter-disciplinary discovery
  - Informal social network approaches: folksonomies
Adding value: repository services
  - Tools: for deposit, normalisation, manipulation, transformation...
  - Linking, annotation, visualisation
  - Aggregators: generic, (sub-)disciplinary
  - Knowledge extraction
    - Mining (data, text, structures): National Centre for Text Mining (NaCTeM)
    - Modelling (economic, climate, mathematical, biological)
    - Analysis (statistical, lexical, gene.)
JISC DR update 2
  - OpenDOAR Directory of Open Access Repositories: Universities of Nottingham and Lund
  - Interim Repository
  - Access management systems integration: Shibboleth
  - New funding 2006: Capital Programme
  - Roadmap, Repositories & Preservation Programme: £14M over 3 years, but current Call:
    - Repositories Support Project
    - Tools & Innovation Strand
    - Discovery to Delivery Strand
    - Data Curation and Preservation
Digital repositories, OA & preservation
  - Long-term access: trust, responsibility, policy
  - Trusted DR Audit Checklist for Certification (draft): Research Libraries Group-NARA Taskforce
  - Defined criteria under 4 categories
    - Organisation
    - Functions, processes & procedures
    - Designated community & usability
    - Technologies & technical infrastructure
  - UK Digital Curation Centre: advice, tools & services, RepInfo Registry http://www.dcc.ac.uk/
  - CASPAR Preservation Framework
Political, cultural, socio-legal, IPR
  - Funding bodies' position on OA: Research Councils RCUK statement, Research Assessment Exercise (RAE), IRRA
  - Institutional OA position: business drivers? University of Southampton Self-Archiving Policy and a mandate (not a recommendation)
  - Legal responsibilities as publisher, IPR, TrustDR, licences, automated Digital Rights Management (DRM)
  - Culture & human factors: sharing culture?
  - Multidisciplinary teams: computer scientists, domain scientists, digital library experts, statisticians/modellers, e.g. ebank project
  - Lessons learnt: e-science Human Factors Audit Report (to be published 2006), Roy Kalawsky, Loughborough