D-Lib Magazine
The Magazine of Digital Library Research

T A B L E   O F   C O N T E N T S
J U L Y / A U G U S T   2 0 1 7
Volume 23, Number 7/8
ISSN: 1082-9873





The End of an Era
Editorial by Laurence Lannom, Corporation for National Research Initiatives



RARD: The Related-Article Recommendation Dataset
Article by Joeran Beel, Trinity College Dublin, Department of Computer Science, ADAPT Centre, Ireland; Zeljko Carevic and Johann Schaible, GESIS — Leibniz Institute for the Social Sciences, Germany; Gabor Neusch, Corvinus University of Budapest, Department of Information Systems, Hungary

Abstract: Recommender-system datasets are used for recommender-system offline evaluations, training machine-learning algorithms, and exploring user behavior. While there are many datasets for recommender systems in the domains of movies, books, and music, there are rather few datasets from research-paper recommender systems. In this paper, we introduce RARD, the Related-Article Recommendation Dataset, from the digital library Sowiport and the recommendation-as-a-service provider Mr. DLib. The dataset contains information about 57.4 million recommendations that were displayed to the users of Sowiport. Information includes details on which recommendation approaches were used (e.g., content-based filtering, stereotype, most popular), what types of features were used in content-based filtering (simple terms vs. keyphrases), where the features were extracted from (title or abstract), and the time when recommendations were delivered and clicked. In addition, the dataset contains an implicit item-item rating matrix that was created based on the recommendation click logs. RARD enables researchers to train machine-learning algorithms for research-paper recommendations, perform offline evaluations, and do research on data from Mr. DLib's recommender system, without implementing a recommender system themselves. In the field of scientific recommender systems, our dataset is unique. To the best of our knowledge, no available dataset has more (implicit) ratings or as many variations of recommendation algorithms. The dataset is available at http://data.mr-dlib.org, and published under the "Creative Commons Attribution 3.0 Unported (CC-BY)" license.

Ensuring and Improving Information Quality for Earth Science Data and Products
Article by Hampapuram Ramapriyan, Science Systems and Applications, Inc. and NASA Goddard Space Flight Center; Ge Peng, Cooperative Institute for Climate and Satellites-North Carolina, North Carolina State University and NOAA's National Centers for Environmental Information; David Moroni, Jet Propulsion Laboratory, California Institute of Technology; Chung-Lin Shie, NASA Goddard Space Flight Center and University of Maryland, Baltimore County

Abstract: Information about quality is always of concern to users, whether they are buying a car or some other consumer goods, or using scientific data for research or an application. To facilitate consistent quality evaluation and description of quality information on data products for the Earth Science community, we formally introduce and define four constituents of information quality — scientific, product, stewardship and service. As requirements to ensure and improve information quality increase across government, industry and academia, there have been considerable efforts toward improving information quality during the last decade. Given this background, the Information Quality Cluster (IQC) of the Federation of Earth Science Information Partners (ESIP) has been active with membership from multiple organizations, participating voluntarily on a "best-effort" basis. This paper summarizes existing efforts on information quality with emphasis on Earth science data and outlines the current development and evaluation of relevant use cases. The IQC, with its open membership policy, is well positioned to bring together people from various disciplines and iteratively address the relevant challenges and needs of the Earth science data community. Moving forward, the IQC pledges to continue facilitating the development and implementation of data quality standards and best practices for the international Earth science community.

Trends in Digital Preservation Capacity and Practice: Results from the 2nd Bi-annual National Digital Stewardship Alliance Storage Survey
Article by Michelle Gallinger, Gallinger Consulting; Jefferson Bailey, Internet Archive; Karen Cariani, WGBH Media Library and Archives; Trevor Owens, Institute of Museum and Library Services; Micah Altman, MIT Libraries

Abstract: Research and practice in digital preservation require a solid foundation of evidence of what is being protected and what practices are being used. The National Digital Stewardship Alliance (NDSA) storage survey provides a rare opportunity to examine the practices of most major US memory institutions. The repeated, longitudinal design of the NDSA storage surveys offers a rare opportunity to detect trends within and among preservation institutions more reliably than typical surveys of digital preservation, which are based on one-time measures and convenience (Internet-based) samples. The survey was conducted in 2011 and in 2013. The results from these surveys have revealed notable trends, including continuity of practice within organizations over time, growth rates of content exceeding predictions, shifts in content availability requirements, and limited adoption of best practices for interval fixity checking and the Trusted Digital Repositories (TDR) checklist. Responses from new memory organizations increased the variety of preservation practice reflected in the survey responses.

Explorations of a Very-large-screen Digital Library Interface
Article by Alex Dolski, Independent Consultant; Cory Lampert and Kee Choi, University of Nevada, Las Vegas Libraries

Abstract: While digital libraries accommodate remote access via the web and mobile devices, their physical presence tends to be minuscule. A locally-developed prototype digital library application called DLib Wall, connected to MultiTaction display hardware from MultiTouch Ltd., is an attempt to create a rich and engaging onsite presence for the digital collections of the University of Nevada, Las Vegas Libraries. DLib Wall is one of the first applications of its kind, and demonstrates several relatively unexplored interaction modalities. In late 2014, it was deployed in the Goldfield Room — a conference room in the Lied Library — on a wall-mounted array of six 42-inch touch displays. DLib Wall represents both application development and the work of a collaborative team charged with creating a large, interactive, and content-rich media wall experience on a limited resource budget. The group's work included: evaluating and recommending technologies, managing the custom development for the DLib Wall application, and creating an initial content plan for the public rollout. This article highlights the technical aspects of the application while sharing key decision points that the team encountered in the process of bringing the project from conception to completion.

The Best Tool for the Job: Revisiting an Open Source Library Project
Article by David J. Williams and Kushal Hada, Queens College Libraries, CUNY, Queens, New York

Abstract: Sometimes the best tool for the job isn't the newest. It can be an existing tool that only requires a bit of polishing. This case study describes how a new digital services librarian, using a systematic approach, worked with colleagues to inaugurate a technology program within an academic library. In the process a previously-established open source web application was revived, extending its utility to contemporary development platforms. At the Queens College Libraries, an open source room reservation and scheduling system provided an answer to two important questions, and a means for building a new program.

Massive Newspaper Migration — Moving 22 Million Records from CONTENTdm to Solphal
Article by Alan Witkowski, Anna Neatrour, Jeremy Myntti and Brian McBride, J. Willard Marriott Library, University of Utah

Abstract: Utah Digital Newspapers is a pioneering digital newspapers program at the University of Utah J. Willard Marriott Library. Recently, a small project team completed a successful migration away from CONTENTdm onto a home-grown system called Solphal, built using open-source applications. The migration process is detailed along with examples of scripts used to prepare and enhance metadata. Transitioning away from a limiting vendor-based solution to a home-grown system has enabled the Utah Digital Newspapers program to be more responsive to user requests and to realize greater efficiencies in hardware and software. The platform has opened up new possibilities for the future as the collection continues to grow.


D - L I B   E D I T O R I A L   S T A F F

Laurence Lannom, Editor-in-Chief
Allison Powell, Associate Editor
Catherine Rey, Managing Editor
