Building Global Infrastructure for Data Sharing and Exchange Through the Research Data Alliance
Research Data Alliance: Vision and Mission
Today's technologies are enabling the collection and analysis of previously unimaginable quantities of data. This data is transforming all sectors (private, public, and academic) through the development of new approaches, applications, and services. The ubiquity of today's data is not just transforming what is, it is transforming what will be, laying the groundwork to drive new innovation. In this sense, perhaps the greatest potential, and most compelling need, for leveraging the capabilities of digital data can be found in the research domain.
Are you at higher risk for asthma in Mexico City or Los Angeles? What are the best predictors of wheat productivity? Which neighborhoods are most likely to sustain damage in an earthquake? Today, such questions are increasingly addressed by combining data from disparate domains using complex models and new approaches to data analysis. The ability of researchers to share and combine key data sets is the foundation from which new approaches to problem solving can be developed.
Data sharing and exchange allow us to uncover connectedness in what was previously unconnected. Combining health, environmental, population and other data to address asthma risk in large urban areas requires infrastructure that supports the access, use, re-use, management, coordination, and stewardship of relevant data sets. Simply making the data available is insufficient for the coherent sharing and interpretation of that data. To make things more challenging, different research communities have disparate data standards, policies, and practices. Consequently, sufficient enabling data infrastructure, both technical and social, is required to integrate data sets from distinct communities and enable collaboration across those communities, just as new technical infrastructure and common agreements were required to connect together the computer networks that form today's Internet.
To address the growing global need for data infrastructure, the Research Data Alliance (RDA, rd-alliance.org) was planned and launched in 2013 as an international, community-powered organization. RDA's vision is of researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society. RDA's mission is to build the social and technical bridges that enable data sharing. These are accomplished through the creation, adoption and use of the social, organizational, and technical infrastructure needed to reduce barriers to data sharing and exchange.
In practice, RDA members focus on both the technical infrastructure needed for data sharing and exchange (including its underlying structures and components, such as persistent digital identifiers and shared metadata frameworks) and the social infrastructure needed for community collaboration (common policy and organizational practice, harmonization of standards, common approaches to data access and preservation, etc.). Short-term (12-18 month) Working Groups come together to:
The focus of the RDA Working Group deliverables is impact and implementation. In addition, longer-lived RDA Interest Groups provide discussion forums in topical areas that spawn Working Groups as specific pieces of needed infrastructure are identified.
RDA in Brief
RDA is an emerging, rapidly growing international organization of researchers, data scientists, and organizations. It is community-driven, and RDA membership is open at no cost to any individual who subscribes to the RDA principles of openness, consensus-based decision making, balanced representation, technology neutrality, and harmonization across communities and technologies. Organizations may also join as Organizational Members (with voting privileges within the Organizational Assembly) or as collaborating Organizational Affiliates.
RDA is guided by an elected Council of nine senior states-people. Council works closely with RDA membership, an elected Technical Advisory Board, and Organizational Members and Affiliates (the Organizational Assembly) to encourage and endorse focused Working Groups and broader Interest Groups.
Working and Interest Groups are the heart of RDA. Working Groups conduct short-lived, 12-18 month efforts that implement specific tools, code, best practices, standards, etc., at multiple institutions. Interest Groups have a broader scope and longer life. They work to define common issues and interests that ultimately lead to the creation of more focused Working Groups. In the fall of 2013, roughly three dozen RDA Interest and Working Groups had formed to discuss a broad range of topics, from persistent identifier information types to agricultural data interoperability to toxico-genomics data. The number of Working and Interest Groups continues to grow rapidly.
The RDA organization and operation have been guided by an international steering committee of government agencies from the U.S., EU and Australia (the RDA Colloquium, RDAC) that is not formally a part of the RDA. U.S. participation (RDA/US) has been sponsored by RDAC member the National Science Foundation, EU participation (RDA/EU, formerly iCORDI) has been funded by the European Commission, and Australian participation has been funded by the Australian Government through the Australian National Data Service (ANDS). Other groups, including Chalmers University, the US National Institute of Standards and Technology, and Microsoft Research, have provided additional support for the RDA Plenaries. The organizational structure for RDA is shown in the figure below. More information can be found on the RDA website at rd-alliance.org.
Organizational Structure of the RDA
RDA Emergence and Rapid Growth
RDA came about as data communities and international agencies sought to accelerate research innovation and the development of enabling data infrastructure at a time of unprecedented growth of digital research data (the RDA community defines "research data" broadly as any digital data used in the conduct of research). In 2011 and 2012, the need for more effective infrastructure to accelerate research data sharing and exchange worldwide drove discussions between the U.S. National Science Foundation and National Institute of Standards and Technology, the European Commission, the Australian Government, and others. These discussions, and the increasing need to develop and coordinate global research data infrastructure, were being explored through the Data Access and Interoperability Task Force (DAITF) and summarized in a White Paper describing a "Data Web Forum". Recognition from the community that a broader effort was needed to accelerate the development and adoption of effective infrastructure prompted organizers in the US, EU and AU to propose an international effort to develop the Research Data Alliance, which could actualize and expand on the ideas of DAITF and the Data Web Forum concept, driving the development, adoption, and use of infrastructure to accelerate global sharing and exchange of open access research data. The original RDA Organizing Committee (Fran Berman and Beth Plale from the US; John Wood, Leif Laaksonen, Juan Bicarregui and Peter Wittenburg from the EU; and Ross Wilkinson and Andrew Treloar from Australia) came together in August 2012 to plan and organize the nascent alliance.
Over 2012 and 2013, the tremendous community interest in, and growth of, the RDA was much greater than anticipated. RDA held its first Plenary and Launch in March 2013 in Gothenburg, Sweden, attended by 240 participants from 31 countries. The second RDA Plenary in September 2013 was held in Washington, DC and attended by nearly 400 RDA members. As of September 2013, the RDA Forum included roughly 1,300 members from 53 countries and all sectors.
In the wake of this rapid growth, RDA is now working to develop an effective organization to meet its mission as well as a support model to cover at least the first 5 years, a critical period for organizations in which proof of impact is essential for success. Metrics of success during this period for the organization focus on:
In this Special Issue of D-Lib Magazine, we highlight a few of the RDA groups and activities in more detail. We provide a lens on Working Groups and Interest Groups through a general description by Technical Advisory Board co-Chair Beth Plale, and specific descriptions of the Data Type Registries Working Group (Daan Broeder and Larry Lannom), Agricultural Interoperability Interest Group (Johannes Keizer) and Language Codes and Categories Working Group (Simon Musgrave). Mark Parsons from the RDA Secretariat provides an overview of the second (most recent) RDA Plenary, held September 16-18, 2013 in Washington, D.C. At that Plenary, a number of data organizations, including DataCite, CODATA, WDS, and FORCE11, came together to discuss forming a common agenda for Data Citation. Jan Brase and Adam Farquhar from DataCite describe the RDA Data Citation Summit herein.
Finally, an organization is only as effective as its participants. We encourage interested community members, data stakeholders, and others to visit rd-alliance.org, to join the RDA, increase its impact and its usefulness to the community, and help maximize the potential of a world in which innovation is increasingly driven by the access, sharing and exchange of digital research data.
About the Guest Editors