DALA Project: Digital Archive System for Long Term Access


2010 International Conference on Distributed Framework for Multimedia Applications (DFmA)

Mardhani Riasetiawan 1,2, Ahmad Kamil Mahmood 2
1 Master of Information Technology Program, Universitas Gadjah Mada, Jalan Grafika No 2, Kompleks UGM, Yogyakarta, Indonesia, mardhani@mti.ugm.ac.id
2 Computer and Information Science Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar 31750, Perak, Malaysia, kamilmh@petronas.com.my

Abstract
The scientific environment faces a changing data landscape: the volume, types, and formats of scientific data are exploding. Industry standards require long-term access to and use of digital data for reasons such as proof of scientific experiments and evidence of findings. Such data needs to be interactive, reproducible, collaborative, and dynamic, and to carry reputation and influence in collaborative research. In this research, we introduce the DALA (Digital Archive System for Long Term Access) project as a scientific collaboration framework. The project was built to manage and preserve scientific data on top of the GForge system, and it implements several data processes. The framework handles data collection, data processing, and data archiving. The aim of the research is a new framework for scientific collaboration.

Keywords: Scientific, Framework, DALA, GForge System.

I. INTRODUCTION
The world of science is very dynamic. An innovation or discovery in one area often leads to discoveries in other areas, and innovation happens every minute and every second. Innovation and scientific work involve a great deal of data from many sources. Interdisciplinary and multidisciplinary research depends on data collection, for example on sharing and using data. The need for long-term access to and use of digital scientific data has become a major interest in every organization.
Data also becomes very sensitive when it must be shared with and accessed by another organization. Data must change as well, because of changes in the data landscape, in technology, and in users' need for dynamic data. Digital scientific data therefore faces both existing and future challenges. The data produced by research work needs to answer the demand for data collaboration. Research needs to work with other research, whether similar or multidisciplinary. This is a must, because the world now needs problem-solving, high-quality research. Lee Dirks [2] gives research reporting as an example of the future needs of scientific digital data. Research reports now have particular characteristics. They must offer multiple perspectives and be dynamically customizable to each user. A report can also be accessed and used to view or follow the research workflows and the outputs of lab experiments. A report can be exported into an electronic lab workbench in order to reconstruct the same or a similar experiment. Researchers can also work with multiple reports and mash up data and workflows across experiments. They have facilities to implement new analysis methods and visualizations and to perform new experiments.

Digital scientific data is evidence. We need it as evidence of research experiments, and we use it as proof of findings that present high-quality results. The data represents the work of the research, opens opportunities for other research, and makes it possible to build better research on previous work. In this paper, we explain the development of the DALA (Digital Archive System for Long Term Access) project for digital scientific framework collaboration.
The research goal is to establish a framework for collaborative research by developing the technology to collect, process, analyse, visualize, and archive scientific data. The framework is developed using the GForge system, a collaborative development environment. We also developed a preservation system and combined it with the framework to provide dynamic digital data in storage and access systems.

The paper is presented in several sections. In Section II, on Science 3.0 and digital scientific data, we explain the main concepts of scientific collaboration and the need for a preservation system. In Section III, we explain the development features and system architecture of the framework. The conclusions and future directions are presented in the last section.

II. A COLLABORATIVE SCIENTIFIC WORK
The term Science 3.0 follows the terms Web 2.0 and Web 3.0. It represents collaborative work across many scientific fields. Lee Dirks [2] illustrates the need for high-quality reporting with the example of live reporting, whose characteristics include multiple perspectives and dynamic customization to each user. Such a report gives the capability to access and use research workflows and experiment materials. We can analyse the similarity of one piece of research to another. Researchers can enhance current research to solve future problems.

Several projects on infrastructure for scientific collaboration have already been implemented. CERN [3], the European Organization for Nuclear Research, is developing new models of data collaboration supported by high-technology infrastructure. The San Diego Supercomputer Center (SDSC) [4] also provides models of how high-performance computing infrastructure supports the analysis of high volumes of data to obtain information on advanced issues such as earthquake prediction, bio-fuel processes, and molecule extraction. Open source software is a good example of how collaboration can run from the early stages of development through to release. It also shows how software is kept alive by many contributions from other researchers. This can be seen on the SourceForge website [5], which offers many software resources to support researchers' interests.

III. DESIGN ARCHITECTURES
This section explains the system architecture of the framework, the software architecture, and the implementation. Figure 1 shows the system architecture design for the collaboration framework for scientific work. The architecture is adapted from the Science-Forge research, which introduced a collaborative scientific framework design [9]. The system architecture consists of the application, the framework application, the PC-based cluster, and storage. Storage is the component that handles the data archiving process. It works under the framework to manage and execute the dynamic data presentation. The main component is the framework application. It handles data collection and data processing and directs data archiving to storage. The access interface manages data administration for users who access the data. It is functionally important for making sure the data is accessible through secure methods.
The access node is the component that verifies the data and presents it to users as needed. We use a PC-based cluster in the implementation to run the system design architecture. We use MySQL Cluster [7] as storage and the GForge system as the framework application. The PC-based cluster uses an MPI configuration to distribute the workload of the processes and allocate activities. The preservation system is built as a framework component. We developed the preservation components in GForge. Their function is to ensure that data can be accessed and used as in its original format.

A. Software Architecture
The software architecture consists of three layers, shown in Figure 3. Storage is the base layer. We installed MySQL Cluster on one machine (PC) in the same environment as the PC-based cluster.

Figure 1. System Architectures (adapted from Science-Forge Framework) [9]. [Figure labels: Application Layer; Framework Layer; Storage Layer; Data Collections; Data Processing; Framework Application; PC-based Cluster; Cluster-based Storage]

The framework application layer is the major component in the design. The GForge framework is used because it has capabilities for data handling, data processing, and data archiving. The GForge framework features are shown in Figure 2. They include collaboration, authentication, access, control, output, and source code. We developed the preservation system for the data handling and management process. The preservation system manages data collection, data processing, data analysis, data visualization, and data archiving. The preservation system is built on the Open Archival Information System (OAIS) [8] to ensure standards acceptance for scientific digital data.

Figure 2. GForge Framework [6]

The preservation system consists of several parts: input, preservation planning, data management, ingest, and archival storage.

Figure 3. The Software Architecture

The implementation of the framework has several flows, shown in Figure 4.

Data collection: using the input component of the preservation system, data is collected together with information about its original state. This information covers the data version, the application that produced the data, the operating system, and any other information relevant to preserving the data.

Data processing: data processing consists of two steps, data analysis and data visualization. In data analysis, we analyse the data based on its context, provenance, appearance, structure, and behaviour. This is very important for understanding the preservation needs of each piece of data. Based on these analyses, we can derive the preservation actions and visualization actions for each piece of data. The information is also important for end users, who learn which technological environment is needed to run the data.

Data archiving: the data is stored and kept in the data management system, which also serves as digital repository, digital library, and preservation system. The data is managed for long-term use and access by other researchers in the future.

The data handling implementation is illustrated in Figure 4.

IV. USE CASE
The research was developed by following the research flow and the data produced in each step of the research. In the first step, we developed the research flow based on common research activities. The research consists of problem identification, problem formulation and statements, data gathering, data analysis, and reports and publications. These steps are based on the data perspective of scientific work.

Figure 4. The Implementation Process [9]

Every step of research work produces data. The problem identification step produces literature data such as downloaded papers, journals, and technical reports. The data can take the form of PDF and word processing files.
The problem statement step produces the research question files, scopes, assumptions, and research proposals. The next step is data gathering. We usually use surveys, collect data from tools, run lab experiments, and make simulations in software. The analysis process produces a lot of data, e.g. images, logs, databases, configurations, and formulas. The reports and publications step produces research reports such as technical reports, lab reports, conference and journal publications, and presentation files. The overall picture of the research work can be seen in Figure 5.

In the second step of the implementation, we define an implementation scenario for every piece of data used in the experiments. The implementation scenario consists of three major steps: data collection, data processing, and data archiving. Data collection enters the data produced in every step of the research into the DALA system. During data collection, each piece of data is classified by document type and format type. The model of data classification is shown in Figure 6.
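The two-axis classification described above (document type and format type) can be sketched as follows. This is a minimal illustration, not the DALA implementation: the document and format type names are taken from the legible labels of Figure 6, while the file-extension mapping, the `classify` function, and the `Classification` class are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from pathlib import Path

# Format types named in Figure 6. The extension lists are assumptions
# made for illustration; the paper does not specify them.
FORMAT_TYPES = {
    "text": {".pdf", ".doc", ".docx", ".txt"},
    "images": {".png", ".jpg", ".jpeg", ".tif"},
    "numeric": {".csv", ".xls", ".xlsx"},
    "video_audio": {".mp4", ".avi", ".mp3", ".wav"},
}

# Document types legible in Figure 6 (truncated labels are left out).
DOCUMENT_TYPES = frozenset({
    "e-journal",
    "theses_dissertation",
    "conference_meeting_lecture",
    "e-records",
})


@dataclass(frozen=True)
class Classification:
    """Result of classifying one piece of data on both axes."""
    document_type: str
    format_type: str


def classify(filename: str, document_type: str) -> Classification:
    """Classify a file by a user-supplied document type and an inferred format type."""
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document type: {document_type!r}")
    suffix = Path(filename).suffix.lower()
    for format_type, extensions in FORMAT_TYPES.items():
        if suffix in extensions:
            return Classification(document_type, format_type)
    # Unrecognized extensions are kept, but flagged for manual review.
    return Classification(document_type, "unknown")
```

In this sketch the document type is supplied by the person entering the data, while the format type is inferred from the file name; a real system would more likely inspect file contents than trust extensions.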

Figure 5. The Digital Data in Scientific Flows. [Figure: Problem Identification (digital literature files: papers, journals, and technical reports); Problem Statements (research questions, scope, assumptions, related research); Data Gathering (survey materials, raw data, experiment results, software); Data Analysis (logs, databases, configurations, formulas, simulations); Reports & Publications (technical reports, journals, papers, presentation files)]

The third step of the implementation is data processing. The DALA system processes data in two stages: data analysis and data visualization. Data analysis processes the data by analysing the metadata and preservation information of each piece of data. The metadata captures complete information about the digital data: the author, the creation date, the change date, the description of the data, the file version, the application, and other related information. The preservation information collects information about the appearance, content, context, structure, and behaviour of the data. The result of the analysis process is needed to determine the visualization process. The data carries its metadata and preservation information as a description in data management and archives. The data can be arranged, classified, and managed based on the metadata and preservation information, not only on categories. The flow of data into the data analysis process is shown in Figure 7.

Figure 6. The Classification of Digital Data. [Figure labels: Document Type (E-Journal; Theses & Dissertation; Scientific; Conference, Meeting, Lecture; E-Records); Format Type (Text; Images; Numeric; Video, Audio)]

Figure 7. The Analysis Processes. [Figure labels: Digital Data; Data Analysis; Metadata; Preservation Info; Classifications; Management; Preservation; Output Model]

Data management and archiving use the GForge framework to manage every piece of data. The data archives consist of digital repositories, a digital library, and digital preservation.
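The metadata and preservation information captured during data analysis, as listed above, could be represented by a record like the one below. This is only a sketch under stated assumptions: the field names follow the paper's lists, but the class names (`Metadata`, `PreservationInfo`, `ArchiveRecord`) and the completeness check are hypothetical and are not part of GForge or of the DALA code.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class Metadata:
    """Descriptive fields the paper lists for each piece of digital data."""
    author: str
    creation_date: date
    change_date: date
    description: str
    file_version: str
    application: str


@dataclass
class PreservationInfo:
    """Preservation aspects the paper lists; empty strings mean 'not yet analysed'."""
    appearance: str = ""
    content: str = ""
    context: str = ""
    structure: str = ""
    behaviour: str = ""


@dataclass
class ArchiveRecord:
    """One archived item with its metadata and preservation information."""
    name: str
    metadata: Metadata
    preservation: PreservationInfo

    def missing_preservation_fields(self) -> list[str]:
        """Return the preservation aspects that still need analysis."""
        return [key for key, value in asdict(self.preservation).items() if not value]
```

A check such as `missing_preservation_fields` is one way the analysis result could feed the later visualization step: an item is only passed on once every preservation aspect has been described.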
Data stored in the system is handled as a repository, meaning that it becomes continuous data. Every update of the data is stored and managed from the latest release back to the earliest release. This helps researchers understand the context and behaviour of the data. Every change to the data is shown, so that users understand how the data can be used. The digital library presents the digital data in the manner of a library. Based on the metadata and preservation information, every piece of data can be searched and accessed easily. Relations between data are easier to detect, which lets researchers search flexibly and quickly for the data they want. Preservation keeps the data in its original format and keeps the information embedded in the data. The data archive is a very important feature for ensuring that data entered into the DALA system can be accessed and shared now and in the future. Figure 8 shows the flow and the relation between the visualization processes and the data archive.
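The repository and digital library behaviour described above, where releases are kept from latest to earliest and data can be searched by type, can be illustrated with a small in-memory model. It is a hedged sketch, not the GForge data model: `Release`, `Item`, `add_release`, `history`, and `search` are hypothetical names, and persistence in MySQL Cluster is deliberately left out.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Release:
    """One stored update of a piece of data."""
    version: int
    released: date
    comment: str


@dataclass
class Item:
    """A piece of data in the repository with its release history."""
    name: str
    document_type: str
    format_type: str
    releases: List[Release] = field(default_factory=list)

    def add_release(self, released: date, comment: str) -> Release:
        """Store a new update; earlier releases are never overwritten."""
        release = Release(len(self.releases) + 1, released, comment)
        self.releases.append(release)
        return release

    def history(self) -> List[Release]:
        """Return releases ordered from the latest to the earliest."""
        return sorted(self.releases, key=lambda r: r.version, reverse=True)


def search(
    items: List[Item],
    document_type: Optional[str] = None,
    format_type: Optional[str] = None,
) -> List[Item]:
    """Digital-library style search by data (document) type and/or format type."""
    return [
        item
        for item in items
        if (document_type is None or item.document_type == document_type)
        and (format_type is None or item.format_type == format_type)
    ]
```

Keeping every release and only changing the order in which they are presented matches the paper's statement that the original data and each change remain visible to later users.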

Figure 8. The Visualization and Archive Features. [Figure: Repositories (the original data, list of data releases, change of data, comments and involvements, contributions, logs); Digital Library (search by data types, search by format types, related projects, user-based data); Preservation (appearances, content, context, behaviour, structure)]

V. RESULT
The DALA project has shown the design, components, and implementation of a digital data management system focused on long-term accessibility and usability. DALA uses a grid environment to manage and anticipate the large amount of digital data to be handled. The combination of a new preservation application, the GForge framework application, and MySQL Cluster is a good practice for making the most of the open source resources available in the scientific world. The research implemented a preservation system based on the OAIS standard and implemented a new preservation architecture. We have also introduced methods for creating a multifunctional archive system within the preservation system. We have implemented several functions that follow the flow of the data handling process. The process consists of data collection, data processing, data analysis based on several criteria, data visualization, and data archiving that combines digital repositories, a digital library, and digital preservation.

The other results concern data processing. The DALA system processes the metadata and preservation information needed for data management and data preservation, which in turn are needed for data visualization and data archiving. The metadata captures information such as the author, the creation date, the change date, the description of the data, the file version, the application, and other related information. The preservation information collects information about the appearance, content, context, structure, and behaviour of the data.
The DALA system presents core repository features such as the original data, the list of data releases, changes to data, comments and involvements, contributions, and logs. The features supporting the digital library are search by data type, search by format type, related projects, and user-based data. Preservation is supported by several features: appearance, data content, data context, data behaviour, and data structure.

In the future, challenges will come from the different types of data to be handled. The exploding volume of data will also be an issue for infrastructure and storage facilities. The advanced issues will be very challenging to address collaboratively. Issues such as system integration, multi-site access, secure access, and technological preservation are currently still under discussion.

REFERENCES
[1] J. Gantz, Chief Research Officer, IDC. IDC Report: The Exploding Digital Universe.
[2] L. Dirks. eResearch, Semantic Computing, and the Cloud.
[3] CERN, The European Organization for Nuclear Research.
[4] SDSC.
[5] SourceForge.
[6] GForge.
[7] MySQL Cluster.
[8] OAIS. The Open Archival Information System.
[9] M. Riasetiawan, A. K. Mahmood. Science-Forge: a Collaborative Scientific Framework Design. IEEE ISIEA 2010, Penang, Malaysia. Unpublished; to be presented 3-6 October 2010.


More information

LASDA: an archiving system for managing and sharing large scientific data

LASDA: an archiving system for managing and sharing large scientific data LASDA: an archiving system for managing and sharing large scientific data JEONGHOON LEE Korea Institute of Science and Technology Information Scientific Data Strategy Lab. 245 Daehak-ro, Yuseong-gu, Daejeon

More information

Susan Thomas, Project Manager. An overview of the project. Wellcome Library, 10 October

Susan Thomas, Project Manager. An overview of the project. Wellcome Library, 10 October Susan Thomas, Project Manager An overview of the project Wellcome Library, 10 October 2006 Outline What is Paradigm? Lessons so far Some future challenges Next steps What is Paradigm? Funded for 2 years

More information

Robin Dale RLG

Robin Dale RLG Robin Dale RLG Robin.Dale@notes.rlg.org Diversity of applications (commercial, home-grown, operational, etc.) in the organization, structure and encoding of documents and data Complexity varies greatly

More information

ACET s e-research Activities

ACET s e-research Activities 18 June 2008 1 Computing Resources 2 Computing Resources Scientific discovery and advancement of science through advanced computing Main Research Areas Computational Science Middleware Technologies for

More information

Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector

Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector Relais Culture Europe mfoulonneau@relais-culture-europe.org Community report A community report on

More information

Kepler and Grid Systems -- Early Efforts --

Kepler and Grid Systems -- Early Efforts -- Distributed Computing in Kepler Lead, Scientific Workflow Automation Technologies Laboratory San Diego Supercomputer Center, (Joint work with Matthew Jones) 6th Biennial Ptolemy Miniconference Berkeley,

More information
