Riding the Wave: Move Beyond Text. TIB's strategy in the context of non-textual materials. Uwe Rosemann, Irina Sens. IATUL Conference, Singapore

Outline: TIB role and functions; requirements (politicians and funders, users); examples of solutions (DataCite, AV-Portal, chemocr, visual search, long-term preservation).

TIB Hannover, some facts: TIB is the German National Library of Science and Technology. Its subjects are engineering, architecture, chemistry, information technology, mathematics and physics. Founded in 1959. Financed by the Federal Government and all Federal States.

Main Building

Marstall Building

Marstall Building, a former horse stable

Castle

Main Stacks

TIB Hannover, some facts: 11 million annual acquisition budget; 18,500 journal subscriptions; 7 million items; staff of about 175 FTE.

Global Network TechLib

Customers: Germany 71%; USA 14%; Europe 10%; World 5%.

Main services: provision of scientific content (full texts, document delivery, interlibrary loan); the scientific retrieval portal GetInfo; long-term preservation; DOI services for research data; research and development.

Changes in the scientific process. Jim Gray, eScience Group, Microsoft Research

A gap: There is a widening gap in the scientific record between research published in a text document and the data that underlies it. As a result, datasets are difficult to discover and difficult to access, and scientific information gets lost.

Requirements, politics: "Knowledge is power; Europe must manage the digital assets its researchers generate."

Riding the wave: How Europe can gain from the rising tide of scientific data. Final report of the High Level Expert Group on Scientific Data.

Strategy, move beyond text: scientific films; 3D objects; software; simulation; research data; text.

Move beyond text, consequences for TIB: Research communities produce many types of scientific and technical information. Each type has its own unique characteristics and life cycle. TIB must become capable of accepting and managing new media formats.

GetInfo, portal for science and technology: 45 million metadata records in the index; 150 million metadata records in external sources; 1.8 million documents; AV media; GetInfo mobile.

Move beyond text, consequences for TIB: We have to open our portal to this non-textual information.

Joint Science Conference statement: An increasingly important user need addressed by TIB is the systematic collection, registration, archiving, indexing, and optimized provision of audio-visual materials using the latest technical possibilities. TIB is the appropriate institution to build up expertise in the area of non-textual materials. Systematic acquisition of scientific objects; object-specific search and presentation; long-term archiving; development of standards (in collaboration); applied research (visual search, automatic content analysis, etc.).

How have we been preparing? Infrastructure for research data (DataCite); visual search tools for AV media and 3D objects (architecture); chemocr; visual access to research data.

Collaboration on research data: In 2005, TIB became a non-commercial DOI registration agency for research data. In 2010, TIB became a co-founder of the international DataCite consortium, which aims to establish easier access to scientific research data on the Internet. Mission: citability of research data; high visibility of the data; easy re-use and verification of the data sets; increasing quality of published papers.
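
To illustrate what citability means in practice: a DOI consists of a prefix beginning with "10." and a suffix, separated by a slash, and it can be turned into a resolvable link. The following is a minimal Python sketch and not part of the original slides. The DOI 10.1234/example-dataset is a made-up placeholder, the function names are hypothetical, and https://doi.org/ is assumed as the public resolver address.

```python
from urllib.parse import quote


def split_doi(doi):
    """Split a DOI into (prefix, suffix).

    Raises ValueError unless the string looks like 10.<registrant>/<suffix>.
    """
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not suffix or not prefix.startswith("10.") or len(prefix) <= 3:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi):
    """Build a link to the DOI resolver, percent-encoding unsafe characters."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{quote(suffix, safe='/')}"


# Placeholder DOI for illustration only; it does not identify a real data set.
print(resolver_url("10.1234/example-dataset"))
```

The point of the indirection is that a citation carries only the DOI, while the location of the data set behind it can change without breaking the citation.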

DataCite members: Australia, Canada, Denmark, France, Germany, Italy, The Netherlands, Sweden, Switzerland, UK, USA, Korea (affiliated).

DataCite structure (organization chart): International DOI Foundation; member: DataCite; managing agent: TIB; member institutions; associate stakeholders; data centres. Connecting label in the chart: "works with".

Example: EHEC virus

AV media, analysis: The analysis has visual, structural and auditive layers. Methods: object detection and clustering; genre analysis; Intelligent Character Recognition (ICR); scene/shot detection; speaker detection; Automatic Speech Recognition (ASR). Result: semantic and content-based indexing on extracted features and on extracted text. Source of the example video: Scorupka, Sascha, Experiment der Woche, 2011.

Indexing: Keyframes are annotated by machine learning using visual features. Indexed sources: textual metadata; ICR-derived text; audio transcript. Genre classes: Graphical: Animation; Graphical: Drawing; Graphical: Diagram; Real: Outdoor; Real: Indoor; Real: Lecture/Conference; Real: Interview; Real: Buildings; ... Named Entity Recognition (e.g. person xy, location xy, subject xy, domain xy, ...). Mapping to ontologies, taxonomies and thesauri (e.g. bibliographic, geographical, encyclopedic data).

AV media, display: faceted search; info; ASR; explorative/semantic search; keyframe navigation; navigation on the audio text. Sample audio text (translated from German): "The technology of the laser has become almost indispensable in everyday life today. ... Variable beam expansion systems in Berlin Adlershof, the city of science, business and ..."
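
Navigation on the audio text presupposes that the speech recognition output is stored together with time codes, so that a hit in the text can be turned into a jump point in the video. The following Python sketch shows the idea only. The Segment structure, the function name and the time codes are hypothetical, and the segment texts are English renderings of the sample audio text on the display slide; none of this describes the portal's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_seconds: float  # hypothetical time code, not taken from the slides
    text: str             # speech recognition output for this stretch of audio


def find_jump_points(segments, term):
    """Return the start times of all segments whose text contains the term."""
    needle = term.casefold()
    return [s.start_seconds for s in segments if needle in s.text.casefold()]


# English rendering of the sample audio text; time codes are invented.
transcript = [
    Segment(0.0, "The technology of the laser has become almost indispensable in everyday life."),
    Segment(12.5, "Variable beam expansion systems in Berlin Adlershof"),
]

print(find_jump_points(transcript, "laser"))
```

A player can then seek to each returned start time, which is what makes a long recording searchable below the level of the whole file.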

3D objects: an excursion into architecture

Visual search tools: visual search; content-based indexing

Content-based indexing: segmentation with form primitives; extraction of room connectivity graphs

Visual search: attributed graph; 3D sketch; result visualization
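
A room connectivity graph records which rooms of a building model are directly connected, and an attributed graph adds properties such as the room type to each node. The Python sketch below illustrates how a sketched query graph could be compared with indexed building graphs. It is an assumption-laden toy: the data layout, the function names and the very coarse similarity measure (shared room-type and connection-count pairs) are invented for illustration and are not the matching method used by TIB.

```python
from collections import Counter


def make_graph(rooms, doors):
    """rooms: {name: room_type}; doors: iterable of (name_a, name_b) links."""
    adjacency = {name: set() for name in rooms}
    for a, b in doors:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return {"rooms": dict(rooms), "adjacency": adjacency}


def signature(graph):
    """Multiset of (room_type, number_of_connected_rooms) pairs."""
    return Counter(
        (graph["rooms"][name], len(neighbours))
        for name, neighbours in graph["adjacency"].items()
    )


def similarity(query, candidate):
    """Share of the query's signature entries found in the candidate (0.0 to 1.0)."""
    q, c = signature(query), signature(candidate)
    total = sum(q.values())
    return sum((q & c).values()) / total if total else 0.0


# Hypothetical query sketch: a hall that opens onto two offices and a kitchen.
query = make_graph(
    {"h": "hall", "o1": "office", "o2": "office", "k": "kitchen"},
    [("h", "o1"), ("h", "o2"), ("h", "k")],
)
# Two hypothetical indexed buildings.
building_a = make_graph(
    {"hall": "hall", "a": "office", "b": "office", "kitchen": "kitchen"},
    [("hall", "a"), ("hall", "b"), ("hall", "kitchen")],
)
building_b = make_graph(
    {"hall": "hall", "office": "office", "kitchen": "kitchen"},
    [("hall", "office"), ("office", "kitchen")],
)

print(similarity(query, building_a), similarity(query, building_b))
```

A real system would need proper graph matching, since two different layouts can share the same signature, but the sketch shows why a graph is a convenient index for a drawn query.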

Information retrieval in chemistry: How can chemical structures be searched? Chemists are used to drawing.

Textual and non-textual chemical information (annotated example): table with reaction scheme; chemical names; 2a-i: derivatives from the reaction; linked entities from the table; chemical structure; reaction scheme.

Non-textual data processing with chemocr: image data is converted into chemical structure data. Tools shown: CLiDE, chemocr.

Information retrieval in chemistry: text AND formulas

Numerical data

Time [h]   T [°C]
 1         12
 2         13
 3         12
 4         12
 5         13
 6         35
 7         17
 8         11
 9         10
10         12
11         13
12         13
13         12
14         12
15         12
16         11
17         11
18         10
19         10
20         11
21         11
22         10
23         12
24         12

Visual access to research data
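
The hourly temperature readings on the numerical data slide show why a visual view helps: in a column of numbers the spike at hour 6 is easy to overlook, while in even the crudest chart it stands out at once. The following Python sketch renders the 24 readings as a text bar chart and reports the peak. The function names are hypothetical, and the sketch is only an illustration, not the visualization tool presented in the talk.

```python
# Hourly temperature readings (degrees Celsius) from the numerical data slide.
temperatures_c = [12, 13, 12, 12, 13, 35, 17, 11, 10, 12, 13, 13,
                  12, 12, 12, 11, 11, 10, 10, 11, 11, 10, 12, 12]


def text_chart(values):
    """Return one line per reading: hour, value, and a bar of '#' characters."""
    return [f"{hour:2d} h {value:3d} C {'#' * value}"
            for hour, value in enumerate(values, start=1)]


def peak(values):
    """Return (hour, value) of the highest reading; hours are counted from 1."""
    index = max(range(len(values)), key=lambda i: values[i])
    return index + 1, values[index]


print("\n".join(text_chart(temperatures_c)))
print("peak:", peak(temperatures_c))
```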

Last but not least, long-term preservation: digital texts, AV media, 3D objects, etc.

Conclusion: Dissemination of scientific and technical information has been a foundational mission. The methods have completely changed, but the mission remains the same.

Conclusion: The ultimate goal is interlinking and search across all types of digital assets.

Questions?