Volume 20, Number 1/2
Building Global Infrastructure for Data Sharing and Exchange Through the Research Data Alliance
Rensselaer Polytechnic Institute
Australian National Data Service
The Association of Commonwealth Universities
Research Data Alliance: Vision and Mission
Today's technologies are enabling the collection and analysis of previously unimaginable quantities of data. This data is transforming all sectors (private, public, and academic) through the development of new approaches, applications, and services. The ubiquity of today's data is not just transforming what is; it is transforming what will be, laying the groundwork to drive new innovation. In this sense, perhaps the greatest potential, and most compelling need, for leveraging the capabilities of digital data can be found in the research domain.
Are you at higher risk for asthma in Mexico City or Los Angeles? What are the best predictors of wheat productivity? Which neighborhoods are most likely to sustain damage in an earthquake? Today, such questions are increasingly addressed by combining data from disparate domains using complex models and new approaches to data analysis. The ability of researchers to share and combine key data sets is the foundation from which new approaches to problem solving can be developed.
Data sharing and exchange allow us to uncover connectedness in what was previously unconnected. Combining health, environmental, population and other data to address asthma risk in large urban areas requires infrastructure that supports the access, use, re-use, management, coordination, and stewardship of relevant data sets. Simply making the data available is insufficient for the coherent sharing and interpretation of that data. To make things more challenging, different research communities have disparate data standards, policies, and practices. Consequently, sufficient enabling data infrastructure, both technical and social, is required to integrate data sets from distinct communities and enable collaboration across those communities, just as new technical infrastructure and common agreements were required to connect together the computer networks that form today's Internet.
To address the growing global need for data infrastructure, the Research Data Alliance (RDA, rd-alliance.org) was planned and launched in 2013 as an international, community-powered organization. RDA's vision is of researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society. RDA's mission is to build the social and technical bridges that enable data sharing. These are accomplished through the creation, adoption and use of the social, organizational, and technical infrastructure needed to reduce barriers to data sharing and exchange.
In practice, RDA members focus on both the technical infrastructure needed for data sharing and exchange (including its underlying structures and components: persistent digital identifiers, shared metadata frameworks, etc.) and the social infrastructure needed for community collaboration (common policy and organizational practice, harmonization of standards, common approaches to data access and preservation, etc.). Short-term (12-18 month) Working Groups come together to:
- Create concrete pieces of infrastructure that accelerate data sharing and exchange for a specific but substantive target community.
- Adopt the infrastructure within the target community.
- Use the infrastructure to accelerate data-driven innovation.
The focus of the RDA Working Group deliverables is impact and implementation. In addition, longer-lived RDA Interest Groups provide discussion forums in topical areas that spawn Working Groups as specific pieces of needed infrastructure are identified.
RDA in Brief
RDA is an emerging, rapidly growing international organization of researchers, data scientists, and organizations. It is community-driven, and RDA membership is open at no cost to any individual who subscribes to the RDA principles of openness, consensus-based decision making, balanced representation, technology neutrality, and harmonization across communities and technologies. Organizations may also join as Organizational Members (with voting privileges within the Organizational Assembly) or as collaborating Organizational Affiliates.
RDA is guided by an elected Council of nine senior states-people. Council works closely with RDA membership, an elected Technical Advisory Board, and Organizational Members and Affiliates (the Organizational Assembly) to encourage and endorse focused Working Groups and broader Interest Groups.
Working and Interest groups are the heart of RDA. Working Groups conduct short-lived, 12-18 month efforts that implement specific tools, code, best practices, standards, etc., at multiple institutions. Interest Groups have a broader scope and longer life. They work to define common issues and interests that ultimately lead to the creation of more focused Working Groups. In the Fall of 2013, there were roughly three dozen RDA Interest and Working Groups formed to discuss a broad range of topics, from persistent identifier information types to agricultural data interoperability to toxico-genomics data. The number of Working and Interest Groups continues to grow rapidly.
The RDA organization and its operation have been guided by an international steering committee of government agencies from the U.S., EU and Australia (the RDA Colloquium, RDAC), which is not formally a part of the RDA. U.S. participation (RDA/US) has been sponsored by RDAC member the National Science Foundation, EU participation (RDA/EU, formerly iCORDI) has been funded by the European Commission, and Australian participation has been funded by the Australian Government through the Australian National Data Service (ANDS). Other groups including Chalmers University, the US National Institute of Standards and Technology, and Microsoft Research have provided additional support for the RDA Plenaries. The organizational structure for RDA is shown in the figure below. More information can be found on the RDA website at rd-alliance.org.
RDA Emergence and Rapid Growth
RDA came about as data communities and international agencies sought to accelerate research innovation and the development of enabling data infrastructure at a time of unprecedented growth of digital research data (the RDA community defines "research data" broadly as any digital data used in the conduct of research). In 2011 and 2012, the need for more effective infrastructure to accelerate research data sharing and exchange worldwide drove discussions between the U.S. National Science Foundation and National Institute of Standards and Technology, the European Commission, the Australian Government, and others. These discussions, and the increasing need to develop and coordinate global research data infrastructure, were being explored through the Data Access and Interoperability Task Force (DAITF) and summarized in a White Paper describing a "Data Web Forum". Recognition from the community that a broader effort was needed to accelerate the development and adoption of effective infrastructure prompted organizers in the US, EU and AU to propose an international effort to develop the Research Data Alliance, which could actualize and expand on the ideas of DAITF and the Data Web Forum concept, driving the development, adoption, and use of infrastructure to accelerate global sharing and exchange of open access research data. The original RDA Organizing Committee (Fran Berman and Beth Plale from the US; John Wood, Leif Laaksonen, Juan Bicarregui and Peter Wittenburg from the EU; and Ross Wilkinson and Andrew Treloar from Australia) came together in August 2012 to plan and organize the nascent alliance.
Over 2012 and 2013, community interest in the RDA, and its growth, far exceeded expectations. RDA held its first Plenary and Launch in March 2013 in Gothenburg, Sweden, attended by 240 participants from 31 countries. The second RDA Plenary in September 2013 was held in Washington, DC and attended by nearly 400 RDA members. As of September 2013, the RDA Forum included roughly 1,300 members from 53 countries and all sectors.
In the wake of this rapid growth, RDA is now working to develop an effective organization to meet its mission as well as a support model to cover at least the first 5 years, a critical period for organizations in which proof of impact is essential for success. Metrics of success for the organization during this period focus on:
- The development of a continuing and expanding pipeline of data infrastructure, adopted and used by the community to accelerate data sharing and exchange
- Increasing usefulness of the RDA as a "neutral space" for coordinating organizational and individual efforts throughout the data community that have the potential of increasing the prevalence and impact of data infrastructure
- Development of an agile, lean and effective organization that can support expansion and increasing cohesiveness among the data community world-wide and regionally
In this Special Issue of D-Lib Magazine, we highlight a few of the RDA groups and activities in more detail. We provide a lens on Working Groups and Interest Groups through a general description by Technical Advisory Board co-Chair Beth Plale, and specific descriptions of the Data Type Registries Working Group (Daan Broeder and Larry Lannom), Agricultural Interoperability Interest Group (Johannes Keizer) and Language Codes and Categories Working Group (Simon Musgrave). Mark Parsons from the RDA Secretariat provides an overview of the second (most recent) RDA Plenary, held September 16-18, 2013 in Washington, D.C. At that Plenary, a number of data organizations, including DataCite, CODATA, WDS, and FORCE11, came together to discuss forming a common agenda for Data Citation. Jan Brase and Adam Farquhar from DataCite describe the RDA Data Citation Summit herein.
Finally, an organization is only as effective as its participants. We encourage interested community members, data stakeholders, and others to visit rd-alliance.org, to join the RDA, increase its impact and its usefulness to the community, and help maximize the potential of a world in which innovation is increasingly driven by the access, sharing and exchange of digital research data.
About the Guest Editors
Fran Berman is Chair, Research Data Alliance/US, and co-Chair, RDA Council. She is the Edward P. Hamilton Distinguished Professor in Computer Science, and Director of the Center for a Digital Society, at Rensselaer Polytechnic Institute. Dr. Berman is a Fellow of the ACM and the IEEE. In 2009, she was the inaugural recipient of the ACM/IEEE-CS Ken Kennedy Award for "influential leadership in the design, development, and deployment of national-scale cyberinfrastructure." Prior to joining Rensselaer, Dr. Berman was High Performance Computing Endowed Chair at UC San Diego. From 2001 to 2009, she served as Director of the San Diego Supercomputer Center. From 2007-2010, she served as co-Chair of the US-UK Blue Ribbon Task Force for Sustainable Digital Preservation and Access. From 2009 to 2012, she served as Vice President for Research at Rensselaer, stepping down in 2012 to lead U.S. participation in the Research Data Alliance. Dr. Berman has been recognized by the Library of Congress as a "Digital Preservation Pioneer", as one of the top women in technology by BusinessWeek and Newsweek, and as one of the top technologists by IEEE Spectrum.
Ross Wilkinson is the executive director of the Australian National Data Service, dedicated to enabling more researchers to re-use data more often. His research career commenced with his Ph.D. in mathematics at Monash University before researching in computer science at La Trobe University, R.M.I.T. and at CSIRO. Some of his areas of research have been document retrieval effectiveness, structured document retrieval, and most recently technologies that support people in interacting with their information environments. Dr. Wilkinson has published over 90 research papers, has served on many program committees and was a program co-chair for both SIGIR'96 and SIGIR'98. Dr. Wilkinson is a member of the RDA Council. He is now leading the Australian National Data Service, creating tools, information, frameworks and the skills to enable Australia's researchers to more effectively use and re-use research data, wherever it comes from.
John Wood, CBE, FREng is the Secretary-General of the Association of Commonwealth Universities and was chair of both the High Level Expert Group on Scientific Data Information and the European Research Area Board. He graduated from Sheffield University in metallurgy and went to Cambridge University for his Ph.D., where he subsequently stayed on as Goldsmith's Research Fellow at Churchill College. In 1994 he was awarded a higher doctorate from Sheffield and has an honorary doctorate from the University of Cluj-Napoca in Romania, where he is also a "citizen of honour". He has held academic posts at the Open University followed by Nottingham University, where he was Dean of Engineering. From 2001-2007 he was seconded from Nottingham to the Council for the Central Laboratory of the Research Councils as Chief Executive, where he was responsible for the Rutherford-Appleton and Daresbury Laboratories in addition to shareholdings in ESRF, ILL and the Diamond Light Source. During this period he was a visiting professor at Oxford University and still remains a fellow of Wolfson College, Oxford. He then joined Imperial College, first as Principal of the Faculty of Engineering and subsequently as Senior International Advisor. He is still a visiting professor at Imperial College and University College London. He is a non-executive director of a number of companies including Bio-Nano Consulting and sits on the advisory board of the British Library. Previously he was on the board of the Joint Information Services Committee responsible for the UK academic computing network and chaired their Support for Research Committee. He is also involved with a number of charities including acting as chair of the International Network for Accessing Scientific Publications and Research Information Network. He was a founder member of the European Strategy Forum for Research Infrastructures and became chair in 2004, where he was responsible for producing the first European Roadmap.
He became the first chair of the European Research Area Board in 2008 responsible for high level advice to the European Commission and in 2009 produced a long term strategic vision entitled "Preparing Europe for a New Renaissance". He was elected as a fellow of the Royal Academy of Engineering in 1999 and is currently a member of their Council and International Committee. He was made a Commander of the British Empire in 2007 for "services to science," and in 2010 was made an "Officer of the Order of Merit of the Federal Republic of Germany". Dr. Wood is co-Chair of the RDA Council.