Global Data Sharing
The Research Data Alliance

Dr. Francine Berman
Co-Chair, RDA Council
Chair, RDA/US
Hamilton Distinguished Professor of Computer Science, Rensselaer Polytechnic Institute
25/02/2016

How can we facilitate / accelerate data-driven solutions to scientific and societal challenges?
  Who is most at risk of contracting asthma?
  How can we increase wheat yields?
  How can we respond to large-scale earthquakes?
  How accurate is the Standard Model of Physics?
Image: Lucas Taylor

Making the data available is not enough
Infrastructure needed to make data useful
  Data is not an asset if you don't know what it means.
  Data is not useful if you can't find it.
  Data needs to be in the right form for analysis.
  Data needs to be preserved for results to be reproducible.

Accelerating the development of infrastructure worldwide: the Research Data Alliance
Research Data Alliance (RDA): Global community-driven organization whose mission is to build the social and technical bridges (infrastructure) that enable data sharing.
  Launched: March 2013
  Membership: 3700+ from 110 countries, all sectors, and a broad spectrum of domains

The RDA Community: 3700+ members from 110 countries (as of February 2016)

Members by region
  Europe          49%
  North America   34%
  Asia             9%
  Australasia      4%
  Africa           3%
  South America    1%

Membership growth over time (chart data labels): 392, 991, 1274, 1656, 2048, 2404, 2636, 2881, 3126, 3434, 3724
125 Japanese members; Japan is the 7th most represented country

Members by Organizational Type (Feb 2016)
  Academia/Research               2447
  Government/Public Services       583
  Small and Medium Enterprise      212
  Other                            198
  IT Consultancy/Development       119
  Large Enterprise                  85
  Policy/Funding Agency             58
  Press & Media                     22
  TOTAL                           3724

[Figure: Member Professions]

How does RDA Advance Data Sharing?
Questions:
  Who is at risk for asthma?
  How do we increase agricultural productivity?
  How accurate is the Standard Model of Physics?
  What will happen in an earthquake?
Infrastructure needed to answer them:
  Interoperability Frameworks
  Digital Object Identifiers
  Data Sharing Policy
  Common Metadata Standards
  Data Discovery Tools
  Sustainable Economics
  Domain and Institutional Repositories
  Data Analytics Algorithms
  Data Access and Distribution Policy
  Curation Practice and Policy
  Data Citation Standards
  Auditing, Certification and Reporting Practice
Fran Berman

Focus of RDA Interest Groups: What kind of infrastructure is needed to solve problems?
(Groups are clustered along two axes. SOLUTION: Technical to Social. BENEFICIARY: Data Provider to Data Consumer.)

Social/organizational solution aimed at data provider
Themes: governance, certification, metrics/evaluation, cost recovery, citation, legal
  Archiving multimedia interactive/dynamic data and projects
  RDA/CODATA legal interoperability
  RDA/WDS Publishing Data Cost Recovery for Centers
  Research Data Provenance
  Certification of digital repositories

Technical solution aimed at data provider
Themes: e-infrastructure, repository, fabric, data analytics, infrastructure, data management, dissemination, data publication
  Federated identity management
  Domain repositories
  Preservation e-infrastructure
  Repository platforms for research data
  Big data analytics
  Data Fabric

Social/organizational solution aimed at data consumer
Themes: data literacy, education, bridging, community, research practices, values/ethics
  Community Capability Model
  Development of Cloud computing in the Developing World
  Long tail of research data
  Quality of urban life
  Libraries for Research Data
  Research data needs of Photon and Neutron Science communities
  Geospatial data

Technical solution aimed at data consumer
Themes: interoperability, harmonization, integration, metadata, knowledge organization
  Biodiversity data integration
  Digital practices in History and Ethnography
  Marine Data harmonization
  RDA/CODATA Materials data, infrastructure and interoperability
  Structural biology

TAB Clustering slides adapted from Beth Plale

Focus of RDA Working Groups: Build infrastructure and support its adoption by communities of use
(Same axes as above. SOLUTION: Technical to Social. BENEFICIARY: Data Provider to Data Consumer.)
  Data foundation and terminology
  RDA/WDS publishing data cost recovery for data centers
  Brokering Governance
  Repository Audit and Certification DSA-WDS Partnership
  Standardization of Data Categories and codes
  Data type registries
  PID information types
  Practical policy
  RDA/WDS Publishing Data workflows
  RDA/CODATA summer schools in data science and cloud computing in the developing world
  RDA/WDS Publishing Data Bibliometrics
  Metadata standards directory
  Dynamic Data Citation
  Data Citation
  Biosharing Registry
  Metadata standards catalogue
  Data description registry interoperability
  Wheat data interoperability
  RDA/WDS Publishing Data Services

TAB Clustering slides adapted from Beth Plale

Selected RDA Working Group Recommendations / Outputs

Working Group: Dynamic Data Citation Working Group (data consumer, social solution)
  Outputs: Dynamic data citation methodology that supports efficient processing of data and linking from publications
  Impact: Researchers can reference precise subsets of changing data
  Adopters: NERC, ESIP, CLARIN, Virtual Atomic and Molecular Data Centre

Working Group: Data Type Registries Working Group (data provider, technical solution)
  Outputs: Data type model and prototype registry
  Impact: Provides machine-readable and researcher-accessible registries of data types that support the accurate use of data
  Adopters: CNRI, International DOI Foundation, Materials Genome Initiative, Deep Carbon Observatory, EUDAT

Working Group: Wheat Data Interoperability Working Group (data consumer, technical solution)
  Outputs: Common framework for Wheat Data Terminology to enable interoperability between distinct data collections
  Impact: Semantically linked terms describing wheat data so researchers can share harvest and related information between data sets and communities
  Adopters: Wheat Initiative Information System, FAO AIMS, INRA

Who benefits from RDA?
RDA has benefits for many groups and is an important organization for addressing data-focused challenges
  Individuals and Researchers
  Enterprise and Business
  Policy Makers and Funders

Individuals and Researchers
Contribute to the accelerated development of useful data infrastructure
  More infrastructure to support data sharing
  More coordination and interoperability
Expand professional horizons and improve specific efforts
  Broader network of collaborators from around the world
  Broader spectrum of disciplines, perspectives, practices, ideas
Help create a broad, synergistic, and effective data community
  Engage with collaborators from multiple disciplines, multiple sectors, multiple countries, students to senior researchers
  Develop solutions to common data challenges around the world
Improve one's competitive advantage professionally and position oneself for leadership within the broader research community

Enterprise and Business
RDA builds cyber-infrastructure needed to access and use data worldwide
RDA builds key components needed for successful businesses
  Workforce ("human infrastructure"): community with expert skills in managing, using, mining, preserving data
  Technical infrastructure: code, frameworks, tools, models that support data sharing and data interoperability
  Social infrastructure: common vocabularies, metadata categories, etc. that support the development of tools and services
RDA Recommendations/outputs lay the groundwork for the creation of new opportunities for innovation in business and research

Policy Makers
RDA reduces the cost of sharing data, and increases the supply of sharable research data
RDA serves as a vehicle for
  Accelerating the development of national and global infrastructure needed to accelerate discovery
  Strengthening international, interdisciplinary and inter-sector collaborations
  Promoting leadership and competitiveness of national research communities within a global environment
RDA engagement can help increase national competitiveness

RDA Regional and Global Synergy
PUSH: RDA/Regions -> RDA
  Bring regional data efforts and issues to the international stage
  Provide greater visibility and opportunities to national researchers
PULL: RDA -> RDA/Regions
  Deploy RDA infrastructure in regional/national communities to accelerate the development of data sharing and data infrastructure
  Bring new ideas and collaborations to regional/national researchers to enhance innovation and competitiveness

RDA Regional Efforts Around the World

RDA/Europe (EC funding)
  Training & Events with scientists, industry, policy makers
  Practitioner engagement
  Collaboration projects
  Coordination workshops with RDA/US
  Plenary hosting
  Information dissemination and communication
  RDA global support

RDA/Australia (funding from Australian Govt. through ANDS)
  Data infrastructure coordination and development
  Information dissemination and communication
  Plenary hosting
  RDA global support

RDA/US (NSF organizational funding; project funding from NSF, NIST, Sloan, MacArthur)
  Student and Early Career Professional Program
  Targeted outreach workshops with data-enabled communities and organizations
  Adoption Amplification projects for RDA deliverables
  Plenary hosting
  Coordination Workshops with RDA/EU
  Information dissemination and communication
  RDA global support

Thank you to our hosts!

6 RDA Recommendations/outputs to be presented:
  Repository Audit and Certification DSA-WDS
  RDA/WDS Publishing Data Bibliometrics
  RDA/WDS Publishing Data Services
  RDA/WDS Publishing Data Workflows
  Wheat Data Interoperability Recommendations
  RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World Interim Recommendations
+ 10 adoption presentations

  30 international speakers over 5 plenary sessions
  6 outputs & 10 adoption cases
  8 Working Group meetings
  25 Interest Group meetings
  10 Birds of a Feather
  9 Joint meetings
  2 Organisational Member meetings
  RDA for Newcomers Meeting
https:///plenary meetings/rda seventh plenary meeting.html

Thank You
Research Data Alliance
  Email: enquiries@
  Web: www.
  Twitter: @resdatall
  LinkedIn: www.linkedin.com/in/researchdataalliance
  Slideshare: http://www.slideshare.net/researchdataalliance
  Facebook: https://www.facebook.com/pages/ Research Data Alliance/459608890798924