T A B L E   O F   C O N T E N T S
J A N U A R Y / F E B R U A R Y   2 0 1 1
Volume 17, Number 1/2

ISSN: 1082-9873




Research Data
by Laurence Lannom, Corporation for National Research Initiatives



Access to Research Data
Introduction by guest editors Jan Brase, German National Library of Science and Technology, and Adam Farquhar, British Library

Abstract: Scientists around the world are addressing the need to increase access to research data. Science is international, and global cooperation is imperative. DataCite, launched in December 2009, is a growing association of more than a dozen members from 10 countries. It enables researchers to locate, identify, and cite research datasets with confidence, and it plays a global leadership role in promoting the use of persistent identifiers for datasets. In June 2010, the first DataCite summer meeting took place in Hannover, Germany, and provided a forum for 25 speakers and nearly 100 participants from Europe, North America and Australia to exchange information on handling research data. This special issue of D-Lib Magazine includes eight articles derived from talks given at the summer meeting and one additional article on the quality of research data. Together, these articles provide a snapshot of the state of the art on these topics.

The Dataverse Network®: An Open-Source Application for Sharing, Discovering and Preserving Data
Article by Mercè Crosas, Institute for Quantitative Social Science, Harvard University, Harvard

Abstract: The Dataverse Network is an open-source application for publishing, referencing, extracting and analyzing research data. The main goal of the Dataverse Network is to solve the problems of data sharing by building technologies that enable institutions to reduce the burden on researchers and data publishers, and to incentivize them to share their data. By installing Dataverse Network software, an institution is able to host multiple individual virtual archives, called "dataverses", for scholars, research groups, or journals, providing a data publication framework that supports author recognition, persistent citation, data discovery and preservation. Dataverses require no hardware or software costs, nor maintenance or backups by the data owner, but still enable all web visibility and credit to devolve to the data owner.

DataONE: Data Observation Network for Earth — Preserving Data and Enabling Innovation in the Biological and Environmental Sciences
Article by William Michener, University Libraries, University of New Mexico; Dave Vieglais, Biodiversity Research Institute, University of Kansas; Todd Vision, University of North Carolina; John Kunze and Patricia Cruse, University of California Curation Center; and Greg Janée, University of California Curation Center and University of California at Santa Barbara

Abstract: This paper describes DataONE, a federated data network that is being built to improve access to, and preserve data about, life on Earth and the environment that sustains it. DataONE supports science by: (1) engaging the relevant science, library, data, and policy communities; (2) facilitating easy, secure, and persistent storage of data; and (3) disseminating integrated and user-friendly tools for data discovery, analysis, visualization, and decision-making. The paper provides an overview of the DataONE architecture and community engagement activities. The role of identifiers in DataONE and the policies and procedures involved in data submission, curation, and citation are discussed for one of the affiliated data centers. Finally, the paper highlights EZID, a service that enables digital object producers to easily obtain and manage long-term identifiers for their digital content.

Quality of Research Data, an Operational Approach
Article by Leo Waaijers; Maurits van der Graaf, Pleiade Management and Consultancy

Abstract: This article reports on a study, commissioned by SURFfoundation, investigating the operational aspects of the concept of quality for the various phases in the life cycle of research data: production, management, and use/re-use. Potential recommendations for quality improvement were derived from interviews and a study of the literature. These recommendations were tested via a national academic survey of three disciplinary domains as designated by the European Science Foundation: Physical Sciences and Engineering, Social Sciences and Humanities, and Life Sciences. The "popularity" of each recommendation was determined by comparing its perceived importance against the objections to it. On this basis, it was possible to draw up generic and discipline-specific recommendations for both the dos and the don'ts.

Acquiring High Quality Research Data
Article by Andreas Hense and Florian Quadt, Bonn-Rhine-Sieg University

Abstract: At present, data publication is one of the most dynamic topics in e-Research. While the fundamental problems of electronic text publication have been solved in the past decade, standards for the external and internal organisation of data repositories are advanced in some research disciplines but underdeveloped in others. We discuss the differences between an electronic text publication and a data publication and the challenges that result from these differences for the data publication process. We place the data publication process in the context of the human knowledge spiral and discuss key factors for the successful acquisition of research data from the point of view of a data repository. For the relevant activities of the publication process, we list some of the measures and best practices of successful data repositories.

Criteria for the Trustworthiness of Data Centres
Article by Jens Klump, Helmholtz Centre Potsdam German Research Centre for Geosciences

Abstract: The use of persistent identifiers to identify data sets as part of the record of science implies that the data objects are persistent themselves. Scientific findings, historical documents and cultural achievements are to a rapidly increasing extent being presented in electronic form, in many cases exclusively so. However, besides the invaluable advantages offered by this form, it also carries serious disadvantages. The rapid obsolescence of the technology required to read the information, combined with the frequently imperceptible physical decay of the media themselves, represents a serious threat to preservation of the information content. Since research projects only run for a relatively short period of time, it is advisable to shift the burden of responsibility for long-term data curation from the individual researcher to a trusted data repository or archive. But what makes a data repository trustworthy? The trustworthiness of a digital repository can be tested and assessed on the basis of a criteria catalogue. These catalogues can also be used as a basis to develop a procedure for auditing and certifying the trustworthiness of digital repositories.

Abelard and Héloise: Why Data and Publications Belong Together
Article by Eefke Smit, International Association of STM Publishers

Abstract: This article explores the present state of integration between data and publications. The statistical findings are based on the project PARSE.Insight, which was carried out with the help of EU funding in 2008-2010. The main conclusion drawn from these findings is that currently very few conventions and best practices exist among researchers and publishers on how to handle data. There is a strong preference among researchers and publishers alike for data and publications to be linked in a persistent way. To achieve that, we advocate good collaboration across the whole information chain of authors, research institutes, data centers, libraries and publishers. DataCite is an excellent example of how this might work.

Supporting Science through the Interoperability of Data and Articles
Article by IJsbrand Jan Aalbersberg and Ove Kähler, Elsevier, S&T Journals, Content Innovation, The Netherlands

Abstract: Whereas it is established practice to publish relevant findings of a research project in a scientific article, there are no standards yet as to whether and how to make the underlying research data publicly accessible. According to the recent PARSE.Insight study of the EU, over 84% of scientists think it is useful to link underlying digital research data to peer-reviewed literature. This trend is reinforced by funding bodies, which increasingly require grantees to deposit their raw datasets at freely accessible repositories. The publishing industry also believes that raw datasets should be made freely accessible. This article presents an overview of how Elsevier, as a scientific publisher with over 2,000 journals, gives context to articles that are available on its full-text platform SciVerse ScienceDirect, by linking out to externally hosted data at the article level, at the entity level, and in a deeply integrated way. With this overview, Elsevier invites dataset repositories to collaborate with publishers to create optimal interoperability between the formal scientific literature and the associated research data, improving the scientific workflow and ultimately supporting science.

isCitedBy: A Metadata Scheme for DataCite
Article by Joan Starr, California Digital Library; Angela Gastl, ETH Zürich Library

Abstract: The DataCite Metadata Scheme is being designed to support dataset citation and discovery. It features a small set of mandatory properties, and an additional set of optional properties for more detailed description. Among these is a powerful mechanism for describing relationships between the registered dataset and other objects. The scheme is supported organizationally and will allow for community input on an ongoing basis.

"Earth System Science Data" (ESSD) — A Peer Reviewed Journal for Publication of Data
Article by Hans Pfeiffenberger, Alfred Wegener Institut; David Carlson, UNAVCO, USA

Abstract: In 2008, ESSD was established to provide a venue for publishing highly important research data, with two main aims: to provide reward for data "authors" through fully qualified citation of research data, classically aligned with the certification of quality of a peer-reviewed journal. A major step towards this goal was the definition and rationale of article structure and review criteria for articles about datasets.


F E A T U R E D   D I G I T A L

Digital Dialects
[Image courtesy of Digital Dialects. Used with permission.]



Digital Dialects features web-based games for learning sixty languages, ranging from Afrikaans to Zazaki. Launched in January 2007, the site originated as a by-product of a dissertation that reviewed web resources for language learning. The site was conceived as an educational tool for learning languages, and as a guide to online resources.

The site's animation and webpage design are by Craig Gibson, who has worked for a variety of higher education and government institutions in the field of educational and electronic resource management. The animated activities are intended to combine the interactivity of computer-aided language learning software with the web-design principle of simplicity in use and access. In essence, the games are intended to provide a relaxed way of acquiring basic language skills.

All materials are free to use. For more information on usage and copyright issues, please refer to the Terms and Conditions section. The animations require the use of Macromedia Flash Player.

Most games are directed at students at a beginner's level, although some language sections include materials at an advanced level. Games include those for learning phrases, numbers, vocabulary, alphabets, spelling and verb conjugation. Audio files have been incorporated into games for several languages, and more audio materials are planned. The site has become popular in schools.

Digital Dialects continues to be a work in progress, and it is intended that the site will represent the world’s major languages, with a cross-section of languages from diverse regions. The development of particular language sections depends on assistance from translators and native speakers.

Please send questions or feedback to dd@digitaldialects.com.


