DALA Project: Digital Archive System for Long Term Access
2010 International Conference on Distributed Framework for Multimedia Applications (DFmA)

Mardhani Riasetiawan 1,2, Ahmad Kamil Mahmood 2
1 Master of Information Technology Program, Universitas Gadjah Mada, Jalan Grafika No 2, Kompleks UGM, Yogyakarta, Indonesia
1 mardhani@mti.ugm.ac.id
2 Computer and Information Science Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar 31750, Perak, Malaysia
2 kamilmh@petronas.com.my

Abstract

The scientific environment faces a changing data landscape. The volume, types, and formats of scientific data are growing explosively. Industry standards require long-term access to and use of digital data for many reasons, such as proof of scientific experiments and evidence of findings. Such data needs to be interactive, reproducible, collaborative, and dynamic, and to carry reputation and influence for collaborative research. In this research, we introduce the DALA (Digital Archive System for Long Term Access) project as a scientific collaboration framework. The project was built to manage and preserve scientific data based on the GForge system, and it implements several data processes. The framework is capable of handling data collection, data processing, and data archiving. The research aims to establish a new framework for scientific collaboration work.

Keywords: Scientific, Framework, DALA, GForge System.

I. INTRODUCTION

The world of science is very dynamic. An innovation or discovery in one area often leads to many discoveries in other areas. Innovations occur every minute and every second. Innovation and scientific work involve data from many sources. Interdisciplinary and multidisciplinary research depends on data collection, for example on the sharing and use of data. The need for long-term access to and use of digital scientific data has become a major interest in every organization.
The data also becomes very sensitive when it needs to be shared with and accessed by another organization. The data must also change in response to changes in the data landscape, in technology, and in users' need for dynamic data. Digital scientific data faces both existing and future challenges. The data produced by research work must answer the demand for data collaboration. Research needs to work with other research, whether similar or multidisciplinary. This is a must, because the world now needs problem-solving, high-quality research.

Lee Dirks [2] gives research reporting as an example of the future needs of scientific digital data. Research reports now have several characteristics. They must offer multiple perspectives and be dynamically customizable for each user. A report can also be accessed and used to see or follow the research workflows and the outputs of the lab experiments. The report can be exported into an electronic lab workbench in order to reconstruct the same or a similar experiment. It also lets researchers work with multiple reports and mash up data and workflows across experiments. Researchers can implement new analysis methods and visualizations and perform new experiments.

Digital scientific data is a proof of evidence. We need it as evidence of research experiments. We use digital data as proof of findings that present high-quality results. The data represents the work of the research, opens opportunities for other research, and makes it possible to build better research on previous research in the future.

In this paper, we explain the development of the DALA (Digital Archive System for Long Term Access) project for digital scientific framework collaboration.
The research goal is to establish a framework for collaborative research by developing the framework technology for collecting, processing, analysing, visualizing, and archiving scientific data. The framework was developed using the GForge system, a collaborative development environment. We also developed a preservation system and combined it with the framework to provide dynamic digital data in storage and access systems.

The paper is organized in several sections. In Section II, on Science 3.0 and scientific digital data, we explain the main concepts of scientific collaboration and the need for a preservation system. In Section III, we explain the development features and the system architecture of the framework. The conclusions and future directions are presented in the last section.

II. A COLLABORATIVE SCIENTIFIC WORK

The term Science 3.0 follows the terms Web 2.0 and Web 3.0. It represents collaborative work across many scientific fields. Lee Dirks [2] gives an example of the need for high-quality reporting: live reporting that offers multiple perspectives and can be dynamically customized for each user. The report provides the capability to access and use research workflows and experiment materials. We can analyze the similarity of one piece of research to another. Researchers can enhance current research to solve future problems.
Several projects on infrastructure for scientific collaboration have already been implemented. CERN [3], the European Organization for Nuclear Research, develops new models of data collaboration supported by high-technology infrastructure. The San Diego Supercomputer Center (SDSC) [4] also provides a model of how high-performance computing infrastructure supports the analysis of high volumes of data to obtain information on advanced issues such as earthquake prediction, bio-fuel processes, and molecule extraction. Open source software is a good example of how collaboration can run from the early stages of development through to release. It also shows how software is kept alive by contributions from many other researchers. Examples can be found on the SourceForge website [5], which offers many software resources to support researchers' interests.

III. DESIGN ARCHITECTURES

This section explains the system architecture of the framework, the software architecture, and the implementation. Figure 1 shows the system architecture design for the collaboration framework for scientific work. The architecture is adapted from the Science-Forge research, which introduced a collaborative scientific framework design [9]. The system architecture consists of the following components: application, framework application, PC-based cluster, and storage. Storage is the component that handles the data archiving process. The storage works under the framework to manage and execute dynamic data presentation. The main component is the framework application. It handles data collection and data processing, and directs data archiving to storage. The access interface manages data administration for users accessing the data. It is functionally important for ensuring that the data is accessible through secure methods.
The access node is the component that verifies the data and presents it to users as needed. We use a PC-based cluster in the implementation to run the system design architecture. We use MySQL Cluster [7] as storage and the GForge system as the framework application. The PC-based cluster uses an MPI configuration to distribute the workload of the processes and allocate activities. The preservation system is built as a framework component. We developed the preservation components in GForge. Their function is to ensure that the data can be accessed and used as in its original formats.

A. Software Architecture

The software architecture consists of three layers, shown in Figure 3. Storage is the base layer. We installed MySQL Cluster on one machine in the same environment as the PC-based cluster.

Figure 1. System Architectures (adapted from Science-Forge Framework) [9]
[Figure labels: Application Layer, Framework Layer, Storage Layer, Data Collections, Data Processing, PC-based Cluster, Framework Application, Cluster-based Storage]

The framework application layer is the major component in the design. The GForge framework is used because it is capable of data handling, data processing, and data archiving. The GForge framework features are shown in Figure 2. They include collaboration, authentication, access, control, output, and source code. We developed the preservation system for the data handling and management processes. The preservation system manages data collection, data processing, data analysis, data visualization, and data archiving. The preservation system was built on the Open Archival Information System (OAIS) [8] to ensure standards acceptance in the scientific digital data field.

Figure 2. GForge Framework [6]

The preservation system consists of several parts: input, preservation planning, data management, ingest, and archival storage.
Figure 3. The Software Architecture

The implementation of the framework has several flows, shown in Figure 4.

Data collection: using the input component of the preservation system, the data is collected based on its original form. The information recorded covers the data version, the application that produced the data, the operating system, and any other information related to preserving the data.

Data processing: data processing consists of two steps, data analysis and data visualization. In data analysis, we analyse the data based on its context, provenance, appearance, structure, and behaviour. This is very important for understanding the preservation needs of each piece of data. Based on these analyses, we can derive the preservation actions and the visualization actions for each piece of data. The information is also important for end users, who learn which technological environment is needed to run the data.

Data archiving: the data is stored and kept in the data management system, which also serves as digital repository, digital library, and preservation system. The data is managed for long-term use and access by other researchers in the future. The data handling implementation is illustrated in Figure 4.

IV. USE CASE

The research was developed by following the research flow and the data produced at each step of the research. In the first step, we developed the research flow based on common research activities. The research consists of problem identification, problem formulation and statements, data gathering, data analysis, and reports and publications. These steps are based on the data perspective in scientific work.

Figure 4. The Implementation Process [9]

Every step of research work produces data. The problem identification step produces literature data such as downloaded papers, journals, and technical reports. The data can take the form of PDF and word processing files.
The problem statement step produces research question files, scopes, assumptions, and research proposals. The next step is data gathering. We usually use surveys, collect data from tools, run lab experiments, and run simulations in software. The analysis process produces a lot of data, e.g. images, logs, databases, configurations, and formulas. The reports and publications step produces research reports such as technical reports, lab reports, conference and journal publications, and presentation files. The research workflow is pictured in Figure 5.

In the second step of the implementation, we ran an implementation scenario for every piece of data used in the experiments. The implementation scenario consists of three major steps: data collection, data processing, and data archiving. Data collection enters each piece of data produced at every step of the research into the DALA system. During data collection, every piece of data is classified by document type and format type. The data classification model is shown in Figure 6.
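The classification by document type and format type can be illustrated with a minimal Python sketch. It is not the DALA implementation: the format type names follow the labels of Figure 6, while the file-extension mapping and the function names `classify_format` and `classify` are assumptions made only for illustration.

```python
from pathlib import PurePath

# Format types as named in Figure 6. The extension sets are an
# illustrative assumption, not taken from the paper.
FORMAT_TYPES = {
    "text": {".pdf", ".doc", ".txt"},
    "images": {".png", ".jpg", ".tif"},
    "numeric": {".csv", ".xls"},
    "video": {".mp4", ".avi"},
    "audio": {".mp3", ".wav"},
}


def classify_format(filename: str) -> str:
    """Return the format type for a file name, or 'unknown'."""
    suffix = PurePath(filename).suffix.lower()
    for format_type, extensions in FORMAT_TYPES.items():
        if suffix in extensions:
            return format_type
    return "unknown"


def classify(filename: str, document_type: str) -> dict:
    """Build a classification entry from a user-supplied document type
    and a format type derived from the file extension."""
    return {
        "file": filename,
        "document_type": document_type,
        "format_type": classify_format(filename),
    }
```

In this sketch the document type (for example e-journal or e-records) is supplied by the person entering the data, because it cannot be derived reliably from a file name.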
Figure 5. The Digital Data in Scientific Flows
[Figure labels: Problem Identification (digital literature files: papers, journals, and technical reports); Problem Statements (research questions, scope, assumptions, related research); Data Gathering (survey materials, raw data, experiment results, software); Data Analysis (logs, databases, configurations, formulas, simulations); Reports & Publications (technical reports, journals, papers, presentation files)]

The third step of the implementation is data processing. The DALA system processes data in two steps: data analysis and data visualization. Data analysis processes the data by analysing the metadata and the preservation information of each piece of data. The metadata captures complete information about the digital data: the author, the creation date, the change date, the description of the data, the file version, the application, and other related information. The preservation information records the appearance, content, context, structure, and behaviour of the data. The result of the analysis process is needed to determine the visualization process. Each piece of data carries its metadata and preservation information as its description in data management and archives. The data can be arranged, classified, and managed based on the metadata and preservation information, not only by category. The flow of data into the data analysis process is shown in Figure 7.

Figure 6. The Classification of Digital Data
[Figure labels: Document Type (scientific e-journal; theses and dissertations; scientific conference, meeting, and lecture; e-records); Format Type (text, images, numeric, video, audio)]

Figure 7. The Data Analysis Processes
[Figure labels: Digital Data, Data Analysis, Metadata, Preservation Info, Data Classifications, Data Management, Preservation, Output Model]

The data management and archiving processes use the GForge framework to manage every piece of data. The data archive consists of digital repositories, a digital library, and digital preservation.
Data stored in the system is handled as in a repository, meaning that the data is continuous. Every update of the data is stored and managed from the latest release back to the earliest release. This helps researchers understand the context and behaviour of the data. The changes to each piece of data are shown so that users can understand how the data is used. The digital library presents the digital data as in a library. Based on the metadata and preservation information, every piece of data can be searched and accessed easily. Relations between data are easier to detect, which lets researchers search flexibly and quickly for the data they want. Preservation keeps the data in its original formats and keeps the information embedded in the data. The data archive is a very important feature for ensuring that data entered into the DALA system can be accessed and shared now and in the future. Figure 8 shows the flow and the relation between the visualization processes and the data archive.
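The repository and digital library behaviour described above can be sketched in a few lines of Python. This is a hedged illustration only: the class names `Release` and `ArchivedItem`, the `search` function, and the integer version numbering are hypothetical and are not part of GForge or the DALA code.

```python
from dataclasses import dataclass, field


@dataclass
class Release:
    version: int   # hypothetical: releases are numbered 1, 2, 3, ...
    comment: str   # change note shown to other researchers


@dataclass
class ArchivedItem:
    name: str
    document_type: str
    format_type: str
    releases: list = field(default_factory=list)

    def add_release(self, comment: str) -> Release:
        """Store a new release; earlier releases are never overwritten."""
        release = Release(version=len(self.releases) + 1, comment=comment)
        self.releases.append(release)
        return release

    def history(self) -> list:
        """Return releases ordered from the latest to the earliest."""
        return sorted(self.releases, key=lambda r: r.version, reverse=True)


def search(items, document_type=None, format_type=None):
    """Digital-library style lookup by document type and/or format type."""
    return [
        item for item in items
        if (document_type is None or item.document_type == document_type)
        and (format_type is None or item.format_type == format_type)
    ]
```

Keeping every release rather than replacing the file is what lets a researcher follow how a piece of data changed over time, while the search function stands in for the library view over the same items.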
Figure 8. The Visualization and Archive Features
[Figure labels: Repositories (the original data, list of data releases, changes of data, comments and involvements, contributions, logs); Digital Library (search by data types, search by format types, related projects, user-based data); Preservation (appearance, content, context, behaviour, structure)]

V. RESULT

The DALA project has shown the design, components, and implementation of a digital data management system focused on long-term accessibility and usability. DALA uses a grid environment to manage and anticipate the large amount of digital data to be handled. The combination of a new preservation application, the GForge framework application, and MySQL Cluster is a good practice for making the most of the open source resources available in the scientific world. The research implemented the preservation system based on the OAIS standard and implemented new preservation architectures. We have also introduced methods in the preservation system for creating a multi-functional archive system. We have implemented several functions that follow the data handling flow. The process consists of data collection, data processing, data analysis based on several criteria, data visualization, and data archiving that combines digital repositories, a digital library, and digital preservation.

The other results concern data processing. The DALA system processes the metadata and preservation information needed for data management and data preservation, and in turn for data visualization and data archiving. The metadata captures information such as the author, the creation date, the change date, the description of the data, the file version, the application, and other related information. The preservation information records the appearance, content, context, structure, and behaviour of the data.
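As a rough illustration of the metadata and preservation information listed above, the record attached to each piece of data might look like the following Python sketch. The field names mirror the lists in the text; the class names, the use of plain strings for the preservation fields, and the `missing_fields` helper are assumptions and do not describe the actual DALA schema.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class Metadata:
    author: str
    created: date
    changed: date
    description: str
    file_version: str
    application: str


@dataclass
class PreservationInfo:
    appearance: str
    content: str
    context: str
    structure: str
    behaviour: str


@dataclass
class ArchiveRecord:
    metadata: Metadata
    preservation: PreservationInfo

    def missing_fields(self) -> list:
        """List fields left as empty strings, as 'section.field' names."""
        missing = []
        for section, values in asdict(self).items():
            for name, value in values.items():
                if value == "":
                    missing.append(f"{section}.{name}")
        return missing
```

A check such as `missing_fields` is one simple way to flag records whose description is too incomplete to drive the later visualization and archiving steps.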
The DALA system presents core repository features such as the original data, the list of data releases, changes of data, comments and involvements, contributions, and logs. The features supporting the digital library are search by data type, search by format type, related projects, and user-based data. Preservation is supported by several features: appearance, data content, data context, data behaviour, and data structure.

In the future, challenges will come from the different types of data to be handled. The exploding volume of data will also be an issue for infrastructure and storage facilities. The advanced issues will be very challenging to address collaboratively. Issues such as systems integration, multi-site access, secure access, and technological preservation are still under discussion.

REFERENCES

[1] IDC Report: The Exploding Digital Universe. John Gantz, Chief Research Officer, IDC. Available at
[2] L. Dirks. eResearch, Semantic Computing, and the Cloud. Available at
[3] CERN, The European Organization for Nuclear Research:
[4] SDSC. Available at
[5] SourceForge. Available at
[6] GForge. Available at
[7] MySQL Cluster. Available at
[8] OAIS. The Open Archival Information System. Available at
[9] M. Riasetiawan, A. K. Mahmood. Science-Forge: a Collaborative Scientific Framework Design. IEEE ISIEA 2010, Penang, Malaysia. Unpublished; to be presented 3-6 October 2010.
EUDAT Towards a pan-european Collaborative Data Infrastructure Damien Lecarpentier CSC-IT Center for Science, Finland CESSDA workshop Tampere, 5 October 2012 EUDAT Towards a pan-european Collaborative
More informationRequirements for data catalogues within facilities
Requirements for data catalogues within facilities Milan Prica 1, George Kourousias 1, Alistair Mills 2, Brian Matthews 2 1 Sincrotrone Trieste S.C.p.A, Trieste, Italy 2 Scientific Computing Department,
More informationCARED Safety Confirmation System Training Module. Prepared by: UGM-OU RESPECT Satellite Office Date: 22 October 2015
CARED Safety Confirmation System Training Module Prepared by: UGM-OU RESPECT Satellite Office Date: 22 October 2015 Table of Contents Introduction... 3 Who are we?... 3 Our Programs and Experience... 3
More informationNational Centre for Text Mining NaCTeM. e-science and data mining workshop
National Centre for Text Mining NaCTeM e-science and data mining workshop John Keane Co-Director, NaCTeM john.keane@manchester.ac.uk School of Informatics, University of Manchester What is text mining?
More informationImplementing Trusted Digital Repositories
Implementing Trusted Digital Repositories Reagan W. Moore, Arcot Rajasekar, Richard Marciano San Diego Supercomputer Center 9500 Gilman Drive, La Jolla, CA 92093-0505 {moore, sekar, marciano}@sdsc.edu
More informationISO INTERNATIONAL STANDARD. Information and documentation Records management Part 1: General
Provläsningsexemplar / Preview INTERNATIONAL STANDARD ISO 15489-1 First edition 2001-09-15 Information and documentation Records management Part 1: General Information et documentation «Records management»
More informationARKive-ERA Project Lessons and Thoughts
ARKive-ERA Project Lessons and Thoughts Semantic Web for Scientific and Cultural Organisations Convitto della Calza 17 th June 2003 Paul Shabajee (ILRT, University of Bristol) 1 Contents Context Digitisation
More informationDigital Commons Workshop for Depositors
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Digital Commons / Institutional Repository Information Digital Commons - Information and Tools 5-4-2006 Digital Commons
More informationBig Data, exploiter de grands volumes de données
Big Data, exploiter de grands volumes de données mardi 3 juillet 2012 Daniel Teruggi, Head of Research dteruggi@ina.fr Ina: Institut National de l Audiovisuel Institut national de l audiovisuel Missions:
More informationEnabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services
Enabling Open Science: Data Discoverability, Access and Use Jo McEntyre Head of Literature Services www.ebi.ac.uk About EMBL-EBI Part of the European Molecular Biology Laboratory International, non-profit
More informationViewpoint Review & Analytics
The Viewpoint all-in-one e-discovery platform enables law firms, corporations and service providers to manage every phase of the e-discovery lifecycle with the power of a single product. The Viewpoint
More informationDigital preservation activities at the German National Library nestor / kopal
Digital preservation activities at the German National Library nestor / kopal Entretiens de la BnF, Paris Dec. 8th, 2006 1 Our digital heritage consists of...... text information such as books, journals,
More informationCommon approaches to management. Presented at the annual conference of the Archives Association of British Columbia, Victoria, B.C.
Common approaches to email management Presented at the annual conference of the Archives Association of British Columbia, Victoria, B.C. Agenda 1 2 Introduction and Objectives Terms and Definitions 3 Typical
More informationSurveying the Digital Library Landscape
Surveying the Digital Library Landscape Trends and Observations Michael J. Giarlo An overview of the digital library landscape Courtesy of Wikimedia Commons (public domain) Fundamental Concepts Kahn/Wilensky
More informationElectronic Records Archives: Philadelphia Federal Executive Board
Electronic Records Archives: Philadelphia Federal Executive Board L. Reynolds Cahoon Assistant Archivist for HR and IT and Chief Information Officer 18 March 2004 Agenda (The Mission) Electronic Records
More informationVI-SEEM Data Repository. Presented by: Panayiotis Charalambous
SIMDAS AND VI-SEEM WORKSHOP ON DATA MANAGEMENT AND SEMANTIC STRUCTURES FOR CROSS-DISCIPLINARY RESEARCH IN THE SEEM REGION VRE for regional Interdisciplinary communities in Southeast Europe and the Eastern
More informationDIGITAL ARCHIVES & PRESERVATION SYSTEMS
DIGITAL ARCHIVES & PRESERVATION SYSTEMS Part 4 Archivematica (presented July 14, 2015) Kari R. Smith, MIT Institute Archives Session Overview 2 Digital archives and digital preservation systems. These
More informationForensic Analysis Approach Based on Metadata and Hash Values for Digital Objects in the Cloud
Forensic Analysis Approach Based on Metadata and Hash Values for Digital Objects in the Cloud Ezz El-Din Hemdan 1, Manjaiah D.H 2 Research Scholar, Department of Computer Science, Mangalore University,
More informationHigh Performance Computing on MapReduce Programming Framework
International Journal of Private Cloud Computing Environment and Management Vol. 2, No. 1, (2015), pp. 27-32 http://dx.doi.org/10.21742/ijpccem.2015.2.1.04 High Performance Computing on MapReduce Programming
More informationDL User Interfaces. Giuseppe Santucci Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza
DL User Interfaces Giuseppe Santucci Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza Delos work on DL interfaces Delos Cluster 4: User interfaces and visualization Cluster s goals:
More informationLong-term digital preservation of UNSWorks
Long-term digital preservation of UNSWorks UNSW Library Arif Shaon, Maude Frances CAUL Community Days 2014 UNSW Australia The University of New South Wales at a Glance: https://www.unsw.edu.au/sites/default/files/documents/unsw4009_miniguide_2012_aw2_v2.pdf
More informationISO INTERNATIONAL STANDARD. Information and documentation Records management processes Metadata for records Part 1: Principles
INTERNATIONAL STANDARD ISO 23081-1 First edition 2006-01-15 Information and documentation Records management processes Metadata for records Part 1: Principles Information et documentation Processus de
More informationData Discovery - Introduction
Data Discovery - Introduction Why (benefits of reusing data) How EUDAT's services help with this (in general) Adam Carter In days gone by: Design an experiment Getting Your Data Conduct the experiment
More informationUniversity of British Columbia Library. Persistent Digital Collections Implementation Plan. Final project report Summary version
University of British Columbia Library Persistent Digital Collections Implementation Plan Final project report Summary version May 16, 2012 Prepared by 1. Introduction In 2011 Artefactual Systems Inc.
More informationCreating synergy through private cloud
Creating synergy through private cloud The Wits cloud architecture Prof Derek W. Keats Deputy Vice Chancellor (Knowledge & Information Management) The University of the Witwatersrand, Johannesburg http://kim.wits.ac.za
More informationDIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM
OMB No. 3137 0071, Exp. Date: 09/30/2015 DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM Introduction: IMLS is committed to expanding public access to IMLS-funded research, data and other digital products:
More informationHow to make your data open
How to make your data open Marialaura Vignocchi Alma Digital Library Muntimedia Center University of Bologna The bigger picture outside academia Thursday 29th October 2015 There is a strong societal demand
More informationDeveloping Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy
Heriot-Watt University Heriot-Watt University Research Gateway Developing Seamless Discovery of Scholarly and Trade Journal Resources Via OAI and RSS Chumbe, Santiago Segundo; MacLeod, Roddy Publication
More informationDescription Cross Domain - Metadata Schema Registry Presentation to ISO Working Group Sydney, 2 November 2004
Description Cross Domain - Metadata Schema Registry Presentation to ISO 23081 Working Group Sydney, 2 November 2004 Outline InterPARES 2 Description Cross Domain Metadata Schema Registry Status of prototype
More informationISO INTERNATIONAL STANDARD. Health informatics Service architecture Part 3: Computational viewpoint
INTERNATIONAL STANDARD ISO 12967-3 First edition 2009-08-15 Health informatics Service architecture Part 3: Computational viewpoint Informatique de santé Architecture de service Partie 3: Point de vue
More informationGrid Computing Systems: A Survey and Taxonomy
Grid Computing Systems: A Survey and Taxonomy Material for this lecture from: A Survey and Taxonomy of Resource Management Systems for Grid Computing Systems, K. Krauter, R. Buyya, M. Maheswaran, CS Technical
More informationPreservation and Access of Digital Audiovisual Assets at the Guggenheim
Preservation and Access of Digital Audiovisual Assets at the Guggenheim Summary The Solomon R. Guggenheim Museum holds a variety of highly valuable born-digital and digitized audiovisual assets, including
More informationGEOSS Data Management Principles: Importance and Implementation
GEOSS Data Management Principles: Importance and Implementation Alex de Sherbinin / Associate Director / CIESIN, Columbia University Gregory Giuliani / Lecturer / University of Geneva Joan Maso / Researcher
More informationArchive to the Cloud: Hands on Experience with Enterprise Vault.cloud
Archive to the Cloud: Hands on Experience with Enterprise Vault.cloud Description See first-hand how Enterprise Vault.cloud, Symantec's hosted archiving service, can help address mailbox management, email
More informationScholarly Big Data: Leverage for Science
Scholarly Big Data: Leverage for Science C. Lee Giles The Pennsylvania State University University Park, PA, USA giles@ist.psu.edu http://clgiles.ist.psu.edu Funded in part by NSF, Allen Institute for
More informationEuropean Conference on Quality and Methodology in Official Statistics (Q2008), 8-11, July, 2008, Rome - Italy
European Conference on Quality and Methodology in Official Statistics (Q2008), 8-11, July, 2008, Rome - Italy Metadata Life Cycle Statistics Portugal Isabel Morgado Methodology and Information Systems
More informationDeveloping a Research Data Policy
Developing a Research Data Policy Core Elements of the Content of a Research Data Management Policy This document may be useful for defining research data, explaining what RDM is, illustrating workflows,
More informationA High-Level Distributed Execution Framework for Scientific Workflows
A High-Level Distributed Execution Framework for Scientific Workflows Jianwu Wang 1, Ilkay Altintas 1, Chad Berkley 2, Lucas Gilbert 1, Matthew B. Jones 2 1 San Diego Supercomputer Center, UCSD, U.S.A.
More informationLong-term preservation for INSPIRE: a metadata framework and geo-portal implementation
Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation INSPIRE 2010, KRAKOW Dr. Arif Shaon, Dr. Andrew Woolf (e-science, Science and Technology Facilities Council, UK) 3
More informationRISE SICS North Newsletter 2017:3
RISE SICS North Newsletter 2017:3 Visa webbversion RISE SICS North Newsletter 2017:3 News in short We have released our web http://ice.sics.se and our web shop for Openstack We have started our first H2020
More informationStorage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan
Storage Virtualization Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization In computer science, storage virtualization uses virtualization to enable better functionality
More informationMetadata Framework for Resource Discovery
Submitted by: Metadata Strategy Catalytic Initiative 2006-05-01 Page 1 Section 1 Metadata Framework for Resource Discovery Overview We must find new ways to organize and describe our extraordinary information
More informationInvesting in a Better Storage Environment:
Investing in a Better Storage Environment: Best Practices for the Public Sector Investing in a Better Storage Environment 2 EXECUTIVE SUMMARY The public sector faces numerous and known challenges that
More informationAn Approach to Software Preservation
An Approach to Software Preservation PV 2009, Madrid Arif Shaon, Brian Matthews, Juan Bicarregui, Catherine Jones (STFC), Jim Woodcock (Univ of York) 1 December, 2009 1 Science and Technology Facilities
More informationGiovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France
Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France ERF, Big data & Open data Brussels, 7-8 May 2014 EU-T0, Data
More informationThe Necessity of a New Culture of Electronic Publishing C A S L I N
Humboldt-University at Berlin Computer and Media Services The Necessity of a New Culture of Electronic Publishing C A S L I N 2004 Dr. Peter Schirmbacher Humboldt-University at Berlin Computer and Media
More informationMitigating Risk of Data Loss in Preservation Environments
Storage Resource Broker Mitigating Risk of Data Loss in Preservation Environments Reagan W. Moore San Diego Supercomputer Center Joseph JaJa University of Maryland Robert Chadduck National Archives and
More information