Volume 19, Number 11/12
Building Global Partnerships: Second Plenary Meeting of the Research Data Alliance
Mark A. Parsons
Rensselaer Polytechnic Institute
The Research Data Alliance held its second biannual Plenary at the National Academy of Sciences in Washington, DC, 16-18 September 2013. It was an exciting meeting with enthusiastic attendees focused on getting things done. Participants came from many sectors and nations but primarily the US and Europe. US, European, and Australian governments continue to show strong commitment to RDA, and individual and organizational membership continues to grow. Most of the work of RDA occurs in Working and Interest Groups and much of the meeting was devoted to breakout sessions. Coordinating the growing number of groups and Plenary breakout sessions will be an ongoing challenge, but is a good challenge heartily embraced in planning for the next Plenary in Dublin, Ireland in March 2014.
The Research Data Alliance (RDA) is a new, rapidly growing organization working to build the social and technical bridges that enable open sharing of data. It is a community-driven member organization composed of focused, tactical Working Groups and broader, more exploratory Interest Groups. All of these Groups are working to implement an evolving global data infrastructure.
As a community organization, RDA conducts regular biannual plenary meetings. The first Plenary and launch of RDA was held in Sweden in March 2013. The second Plenary was recently held in September at the National Academy of Sciences in Washington, DC. This second meeting was an exciting event highlighting the rapid growth and achievement of the RDA, the enthusiasm of the community, the commitment of government agencies, and some of the challenges of rapid, organic growth.
Program and attendance
The program of the meeting reflected the theme of "Building Global Partnerships" while also highlighting US commitment to a global data infrastructure. Plenary talks and panels emphasized the need to collaborate across governments, organizations, and individuals to achieve a common vision, while more than half of the meeting was devoted to discussion and working sessions necessary to realize the vision.
The meeting had an air of enthusiasm focused on getting things done. Inspirational keynote talks helped grow and channel that enthusiasm. John Wilbanks emphasized the power of open and machine-readable data and how community action can increase the generative value of data. Carole Palmer provided a perspective on how modern data practice needs to build on the existing "evidential" cultures of research.
Conference attendance reflected community interest. Attendance grew more than 50% from the first plenary to 368 participants representing academia, data centers, libraries, government, international organizations, and industry. Participants came from 23 countries, but the majority were from North America with most of the rest coming from Europe. A theme that emerged from the conference, and a motivation for RDA going forward, is to expand representation by including more early-career data scientists, more disciplines, and more people and organizations from Asia, Africa, and South America.
The regions currently participating in RDA (Europe, Australia, and the US) are very committed to seeing RDA succeed. Farnam Jahanian, NSF Assistant Director for Computer and Information Science and Engineering, opened the meeting with excitement and introduced Tom Kalil, Deputy Director for Technology and Innovation at the White House Office of Science and Technology Policy. Dr. Kalil emphasized the importance of increasingly stringent, open-data policies in the US and called for an "international research data commons". A later panel of senior government officials from the European Commission, Australia, and the US echoed this theme of international collaboration around open data as necessary to global development. Another panel of organizations related to RDA made a further call for close collaboration. This panel included representatives from CODATA, W3C, DataCite, and the Federation of Earth Science Information Partners. It is notable, however, that there was a strong emphasis on taking a bottom-up approach.
Working and Interest Groups
In the bottom-up spirit, much of the meeting was devoted to working breakout sessions. This is where the real activity of RDA occurs. Several new Working Groups (WGs) have emerged since the first plenary to explore fundamental issues including metadata, machine-actionable rules or policies, and formal data categories and codes. Many new Interest Groups (IGs) have also emerged and ad hoc meetings at the Plenary spawned even more groups. All told, RDA currently has six Working Groups developing tangible outputs within a year and more than fifteen Interest Groups exploring issues ranging from semantic interoperability of wheat data to data provenance to the exploration of different legal regimes around data.
The Interest Groups and especially the Working Groups are the heart of RDA. Several of the breakout sessions were packed with more than 50 people. This reflects the zeal of the community, but it also presented a challenge. Working Groups are expected to deliver tangible outcomes in 12-18 months. A few have been working for months already, but they needed to take a step back at the meeting to bring new members up to speed and to consider new perspectives. Future plenaries will increasingly focus on Working and Interest Group activity and will need to be designed and organized accordingly. The most successful groups that met in Washington were those with organized chairpersons who provided quick overviews of past activities and future goals and then had creative approaches to moving forward. Some broke into smaller groups. Some had focused exercises to collect input. Some were active in engaging external participants despite a less-than-ideal Internet connection and audio-video equipment.
Overall, most of the groups are focused on getting broad input to ensure they have good community buy-in and diverse views. Many of the established groups are currently conducting surveys or have open calls out for contributions. For example, the Practical Policy WG is seeking input on machine-actionable rules used to manage data; the Metadata WG is assembling existing metadata standards; the Legal Interoperability IG is seeking several case studies that demonstrate the legal problems and solutions around data sharing. Indeed, most of the groups are taking a use-case-based approach to developing requirements and are working with very diverse communities. There is also an effort to develop various test beds, especially within the Practical Policy WG. More information about the Working and Interest Groups can be found on the RDA web site.
Organizational members and affiliates
It is important to recognize that while RDA is currently comprised of individual members, it also seeks to engage commercial and non-profit organizations. The keys to RDA's success will be the researchers and engineers who formulate and resolve the problems RDA needs to address, but RDA needs organizations to adopt the output. Getting agreements in Working Groups does not mean RDA outputs will automatically get used. So RDA needs organizations to help with implementation. RDA seeks to create a flourishing landscape in which organizational members have advantages as paying members and contribute to the strategic direction of RDA. Just prior to the Plenary, an organizational task force made an open call for organizational members, and several organizations at the meeting applied to become members. Potential organizational members include university libraries, supercomputing centers, data centers, data projects, and small and large companies. See the current list of interested organizations.
RDA also seeks to partner with like-minded international organizations such as CODATA, DataCite, and the World Data System. Collaborative activities with these groups are already underway, and RDA seeks to formalize these relationships in the near future. The Plenary provided the first opportunity for potential organizational members and other partners to come together in an Organizational Assembly. This effort will continue, and ultimately, these entities will form an Organizational Advisory Board to advise the Council and to enhance adoption of RDA outputs. In general, RDA is viewed as a neutral place where organizations can convene and harmonize their efforts. An early example of this harmonization activity is in the area of data citation.
Data citation "summit"
At the first RDA Plenary in March, people realized that there were many groups working to define a robust data citation framework for research. These groups were not communicating as well as they could and were even sometimes working at cross-purposes. It was agreed to use the occasion of the second RDA Plenary to bring the different groups together. RDA and DataCite purposely arranged to have their meetings consecutively at the same location. Then on the last day of the RDA Plenary and as a prelude to the DataCite meeting, representatives from some two-dozen organizations came together to develop a shared "Declaration of Data Citation Principles." This activity continues and has brought a much-needed unified community voice promoting sound data citation. More information and the latest principles can be found at the Future of Research Communications and e-Scholarship (FORCE11).
The second RDA Plenary was a dynamic, enthusiastic event involving people from many disciplines. Many commented on the buzz of excitement and can-do spirit of the meeting, both during the meeting and in the post-meeting survey. Moving forward, it is clear the community wants to focus even more on breakout sessions for WGs and IGs so more work can be done. This plenary devoted half the time to breakouts and had as many as eight parallel sessions running. Future plenaries will need to devote more time to breakouts and will have to grapple with the challenge of scheduling thirty or more groups and minimizing conflicts. Some training of session leaders will also be needed, as well as additional background information. This will continue to be a challenge if RDA continues to grow at the same pace. It is quite possible that more than 500 people may attend the next Plenary.
These are the challenges of a healthy, growing organization, and it appears the community is ready to address them. The last plenary session of the meeting was a facilitated community discussion. During the course of the meeting, participants were encouraged to identify and record issues facing the whole of RDA. Then on the next-to-last day, a participant voting exercise prioritized the issues for community discussion. Sixteen issues were identified, but four were deemed to be of high priority for discussion. The highest priority topic was for RDA to clearly define its formal outputs. During the discussion it became clear that there was little interest in RDA becoming a formal standards body, but that it could facilitate the development and implementation of standards. Council has since established an RDA task force to better define the formal outputs of RDA and the appropriate intellectual property rights for those outputs.
Other major issues centered on how RDA will coordinate with other organizations and how groups within RDA can minimize overlap and work together. Broader issues included more engagement of early-career practitioners and working towards a change in the culture of science to better recognize data-related outputs. These will be ongoing issues, but RDA is developing the structures to help address them. At the close of the meeting, Ross Wilkinson from the Council and the Australian National Data Service announced an upcoming recruiting call for an RDA Secretary General. Also, candidates for the RDA Technical Advisory Board (TAB) had an opportunity to speak during the meeting. A subsequent member election led to the formal establishment of the TAB. This group will be central in laying out the future technical direction of RDA and ensuring coordination within RDA. The emerging Organizational Assembly will also help with organizational processes while better connecting RDA to external entities.
We now look forward to the next Plenary in Dublin, Ireland, 26-28 March 2014. At that point, RDA will be a year old and beginning to mature as an organization, yet zeal and commitment to action will remain. Don't hesitate to join RDA today and to get more involved!
This short article only skims the surface of all that went on at the RDA Second Plenary. For additional perspectives and more details, see some of the blogs written about the meeting. Slides and video recordings of the plenary sessions are also linked from the program on the RDA web site.
See you next year in Dublin!
About the Author
Mark A. Parsons is the Managing Director of the US Component of the Research Data Alliance and the Rensselaer Center for the Digital Society. He focusses on stewarding research data and making them more accessible and useful across different ways of knowing. He has been leading major data stewardship efforts for more than 20 years, and received the American Geophysical Union Charles S. Falkenberg Award as an advocate of robust data stewardship as a vital component of Earth system science and as an important profession in its own right. Prior to joining Rensselaer, Mr. Parsons was a Senior Associate Scientist and the Lead Project Manager at the National Snow and Ice Data Center (NSIDC). While at NSIDC, he defined and implemented their overall data management process and led the data management effort for the ICSU/WMO International Polar Year 2007-2008. He is currently active in several international committees while helping lead the Research Data Alliance in its goal of accelerating innovation through data exchange. His research interests include the role of scientific social interaction in the success, development, and extension of data sharing networks.