Volume 20, Number 9/10
Connecting Systems for Better Services Around Special Collections
Saskia van Bergen
Leiden University Library, the Netherlands
Over the last few years, several projects to improve physical and digital access to special collections have been undertaken by Leiden University Libraries in the Netherlands. These heritage collections include manuscripts, printed books, archives, maps, atlases, prints, drawings and photographs, from the Western and non-Western worlds. They are of both national and international importance. The projects were undertaken to meet two key requirements: providing better and faster service for customers when using the collections, and creating a more efficient workflow for the library staff. Their interdependencies, with regard to creating new formats for the description of graphic materials and providing digital access, led to a merger of the projects with a combined set of goals for conversion, cataloging and digitization-on-demand. This article describes the infrastructure behind these projects, and the impact of the projects on users and staff to date.
Founded in 1575 by William I, Prince of Orange, Leiden University owns a large number of heritage collections of national and international importance. These special collections include manuscripts, printed books, archives, maps, atlases, prints, drawings and photographs from both the Western and non-Western worlds.1 Recently, the university library acquired the library collections of the Royal Tropical Institute in Amsterdam and the Royal Netherlands Institute of Southeast Asian and Caribbean Studies in Leiden. Because of this international orientation, researchers and students from all over the world come to Leiden to visit the library.
The special collections are part of our national and international heritage. They are important to reconstructing the development and dissemination of science in the Dutch Republic. For instance, how did 17th century researchers like Christiaan Huygens and Antonie van Leeuwenhoek obtain their knowledge? The aim of the library is to facilitate the use of our special collections in research and education by students, teachers, and researchers, but also by culturally interested people in the general public. To reach these objectives, the library continuously invests in digital and in physical services, such as the facilities in the reading room and (virtual) exhibitions. The foundation of the Scaliger Institute, a research center that aims to stimulate and support the use of the special collections by means of lectures, symposia, master classes and the provision of scholarships, also serves to further the library's objectives.
The special collections are increasingly made available through the library's catalogue. Since the late 1960s, Dutch libraries have described their library materials in a union catalogue, which is currently hosted by OCLC.2 Apart from this, Leiden makes use of three Ex Libris products: Aleph for our library services, DigiTool for our digitized special collections and Primo for discovery and delivery (see Figure 1). The OCLC-GGC union catalogue is used for bibliographic information. It facilitates interlibrary loans within the Netherlands, and it is also used to ensure worldwide availability via WorldCat. Metadata records from the national OCLC-GGC database are fetched by the local Aleph database, in which descriptive information about the individual copies is added. Materials can only be lent out or made available in the reading room when the shelfmark is included in the metadata. Primo, finally, is used for the discovery of materials in both Aleph and DigiTool. But as you can see in Figure 1, these last two databases were treated as separate silos. This means that the information was not synchronized, causing inconsistencies in Primo.
Figure 1: Graphic overview of the main systems used by Leiden University Library for the cataloguing of the special collections. Old situation.
Another problem we were dealing with is that the union catalogue was originally designed for books, periodicals and other textual sources. As described above, our library owns many non-textual collections, and the union catalogue did not contain the right formats to describe these. The result was that the curators of our library started looking for their own personal solutions. They were cataloguing in their own Access or Excel databases, without using a metadata standard. As a consequence, the metadata were locked in these databases. DigiTool was 'misused' for cataloguing purposes as well, both for digitized and non-digitized materials. Because DigiTool and Aleph are not connected, no services could be delivered for these collections in Primo. Visitors could see the metadata, but viewing the images online or placing a request for the physical items was impossible.
2. Project Goals
The main aim of the project we started with OCLC was to create new formats for the description of graphic materials, such as prints, drawings and photographs, and to convert all of our special collections metadata to the standard used in the union catalogue. We had already started two other projects concerning the digital access to our special collections, but as it soon became clear that these had many interdependencies, it was decided to merge these into a program focusing on three main goals:
- Converting all special collections metadata to OCLC's union catalogue, which would thereafter be used for cataloguing all of our special collections (including archival materials, prints, photographs and objects).
- Making all of our special collections available through our library catalogue. This means that clients can place view requests 24/7 from anywhere in the world and do not need to come to the library anymore just to fill in a paper call slip (and wait until the materials are available in the reading room). This is especially an advantage for our many foreign researchers. When researchers can select and request materials in advance, they can plan their travels much more efficiently.
- Creating a new service for digitization-on-demand of our special collections, built on the library catalogue. A visitor survey held earlier had revealed that clients considered the current application procedure for reproductions to be too slow and the costs of the scans to be too high. By integrating the new service into Primo, the administrative process could be simplified, both for the clients and the staff.
3. Project Stages
Four main stages were identified within the project. The first step concerned the transfer of the metadata to the union catalogue, carried out in close collaboration with OCLC. To accomplish this, we had to take several factors into account. First, we had to make separate records for the physical and digital objects, both with their own set of metadata. This process was recommended by the consortium of Dutch university libraries and the National Library (UKB). Although referring to the same object, digital and physical records describe different material types, both with their own services. Selections in WorldCat, for example, are based on material types. If you search for e-only, these materials can only be selected when they are catalogued separately as a digital record. To be able to make this distinction, we had to create different procedures for each of the following situations:
- Materials that had been catalogued in DigiTool, but had not been digitized and for which the system had been misused. For these materials only one new record was created in the union catalogue, representing the physical object. The records in DigiTool were deleted after conversion.
- Materials that were catalogued in the union catalogue, but were also digitized and therefore had a record in DigiTool as well. In this case, only one extra record was made for the digital object, with a link to the scans in DigiTool.
- Digitized materials that were only catalogued in DigiTool. For these materials two records were created, one for the physical and one for the digital object.
- Scans of books and manuscripts in DigiTool that were not scanned completely into a digital facsimile, but for which only one or a few specimen scans were made. For these records, a link to these scans was added to the record for the physical object.
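The four situations above amount to a small decision rule. The following is a minimal sketch of that rule in Python, not the procedure actually used by OCLC or the library: the `Material` fields, the `plan_records` function and the action strings are all hypothetical names introduced for illustration. It also assumes that a physical record is created whenever a material is not yet in the union catalogue, which the text states explicitly only for the first and third situations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """Hypothetical description of one material before conversion."""
    in_union_catalogue: bool  # already described in the union catalogue
    in_digitool: bool         # has a record in DigiTool
    scan_coverage: str        # "none", "specimen" or "complete"


def plan_records(m: Material) -> list[str]:
    """Return the conversion actions for one material."""
    actions = []
    # Every material needs a record for the physical object.
    if not m.in_union_catalogue:
        actions.append("create physical record")
    if m.in_digitool:
        if m.scan_coverage == "complete":
            # Fully digitized: separate record for the digital object.
            actions.append("create digital record linking to DigiTool scans")
        elif m.scan_coverage == "specimen":
            # Only a few specimen scans: no separate digital record.
            actions.append("add scan link to physical record")
        else:
            # DigiTool was used for cataloguing only.
            actions.append("delete DigiTool record after conversion")
    return actions


# Digitized material that was catalogued in DigiTool only: two records.
print(plan_records(Material(False, True, "complete")))
```

Keeping the rule in one function makes it easy to check each situation separately before running a conversion in bulk.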
A second concern was that the union and Aleph catalogues describe editions, or, according to the FRBR terminology, manifestations, whereas DigiTool describes copies or items.3 Fortunately, this distinction didn't affect our procedures that much, because our records in most cases describe unique materials. This means that, in practice, edition and copy are identical. The printed books were already catalogued in the union catalogue, so this didn't cause any conversion problems either. Strictly speaking, the print collections do not contain unique materials. However, since the metadata do not contain any information about state or edition, it was decided that these would be considered as unique materials nonetheless.4
The second stage of the project concentrated on the development of a digitization-on-demand service. All of our scans can already be viewed for free in DigiTool as high-resolution JP2 images. The service had to be set up for people who preferred the original TIFF scans or a PDF, or who wanted to order reproductions of non-digitized materials. The first thing we did was to add order buttons to all special collections records in the catalogue and to connect them with an order form (see Figure 2). Clients can now order scans of all our materials, catalogued or non-catalogued, digitized or non-digitized. They can place their orders regardless of time and location and pay by credit card, PayPal or bank transfer. Scans are delivered to the client after payment. Previously, ordered TIFF scans and PDF files were sent by email, WeSendit, or other external services. When a large order was placed, the scans were placed on a DVD and shipped, which was not very practical considering that our clients are from all over the world. Part of the project was therefore to implement an FTP server, which could be used to deliver the ordered scans quickly and safely. At present, clients receive a link by email, which they can use for one month to download the scans as often as they want, on various devices.
Figure 2: All special collections records in the catalogue are provided with an order button.
To organize the digitization-on-demand workflow, we use the open-source software application Goobi. This software allows you to model, manage and supervise all production processes involved in creating a digital library. These include importing data from library catalogues, scanning and content-based indexing, and the digital presentation and delivery of results in standardized formats. The software was developed by a consortium of German libraries and commercial companies. It is used in libraries and archives in Germany, England, Spain and Austria, and we are the first library to implement the software in the Netherlands.5 Scan requests placed in Primo are sent to Goobi automatically and connected instantaneously to the bibliographic metadata imported from the catalogue (see Figure 3). The software is also used to make METS files, add structural metadata, and deliver scans to the client. Scans of completely digitized materials are imported into DigiTool as well.
Figure 3: The Goobi software is connected to various other applications: Scan requests placed in Primo are sent to Goobi automatically (1), together with client information taken from the order form (2). Goobi imports the bibliographic data from the catalogue (3), scans are sent to the client with an FTP-service (4), and when a complete object is digitized, the scans are exported to DigiTool as well (5).
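The five numbered connections in Figure 3 can be read as one pass over a scan order. The sketch below models that pass in plain Python to make the sequence explicit. It does not use Goobi's actual API: `ScanOrder`, `process_order`, the in-memory `catalogue` dictionary and the log messages are hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class ScanOrder:
    """Hypothetical scan order as it might arrive from the order form."""
    record_id: str
    client_email: str
    complete_object: bool  # True when the whole object is digitized
    metadata: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def process_order(order: ScanOrder, catalogue: dict) -> ScanOrder:
    """Walk one order through the five connections of Figure 3."""
    # (1) and (2): the request arrives together with the client details.
    order.log.append("received request with order-form details")
    # (3): bibliographic data is imported from the catalogue.
    order.metadata = dict(catalogue[order.record_id])
    order.log.append("imported bibliographic metadata")
    # (4): the scans are delivered through the FTP service.
    order.log.append(f"sent download link to {order.client_email}")
    # (5): only completely digitized objects are exported to DigiTool.
    if order.complete_object:
        order.log.append("exported scans to DigiTool")
    return order


catalogue = {"rec-1": {"title": "Example print"}}
done = process_order(ScanOrder("rec-1", "client@example.org", True), catalogue)
print(done.log)
```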
Step three was to synchronize the metadata between all systems in an automated process (see Figure 4). Eventually all metadata in DigiTool will be substituted by the records for the digital objects in the union catalogue. This is a great advantage, because DigiTool does not validate the use of MARC 21, so the standard was not always applied correctly. During the conversion the metadata were enriched and corrected where possible, and the result is that we now have cleaned-up metadata in all systems. Our aim is to close the connection between DigiTool and Primo as soon as possible. Because the images from DigiTool will be made available in Primo through a link in Aleph, the connection is no longer necessary.
Figure 4: Overview of the connections between the systems in use by Leiden University Library.
Scans that are made in projects by external vendors are uploaded in batches into DigiTool with a locally developed tool called MEGI (which stands for Mets ingester and uploader). With this tool it is possible to prepare the structure of a METS file before upload, and to create metadata on both collection and item level. For uploads of single items, DigiTool's own ingest service Meditor is still used. In both cases, the identifier from the union catalogue is added to the records during ingest, to make it possible for DigiTool to import the right metadata.
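The role of the union catalogue identifier can be illustrated with a small matching routine. This is a minimal sketch under stated assumptions, not the synchronization process running at the library: the record layout (`pid`, `union_id`, `metadata`) and the function `sync_metadata` are hypothetical, and both systems are represented by plain Python dictionaries.

```python
def sync_metadata(local_records: list[dict], union_catalogue: dict) -> list[str]:
    """Replace local metadata with the union catalogue version.

    Records are matched on the identifier stored during ingest.
    Returns the local ids of records that could not be matched.
    """
    unmatched = []
    for record in local_records:
        source = union_catalogue.get(record.get("union_id"))
        if source is None:
            # No identifier, or identifier unknown: leave the record alone.
            unmatched.append(record["pid"])
            continue
        record["metadata"] = dict(source)
    return unmatched


union = {"ppn-001": {"title": "Corrected title"}}
local = [
    {"pid": "dt-1", "union_id": "ppn-001", "metadata": {"title": "old"}},
    {"pid": "dt-2", "union_id": None, "metadata": {"title": "orphan"}},
]
print(sync_metadata(local, union))
```

Returning the unmatched ids rather than raising an error lets records without an identifier be reviewed by hand.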
The last step was to add holding and item information to the new records in Aleph, to allow for view requests and loans. Because our collections are highly diverse, different request procedures and restrictions had to be taken into account. Materials that are kept in our stacks are collected throughout the day and are available an hour after the request. Photographs, however, have to acclimate for 24 hours before they can be made available in the reading room. In addition, a part of our collection is kept in the Bibliotheca Thysiana, the only Dutch book collection from the seventeenth century that is still housed in its original purpose-built building.6 These materials can be consulted in the special collections reading room of the University Library only, and are collected once a week.
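These request procedures translate into different lead times per material type. The sketch below computes an earliest availability from the figures given above (one hour for stack materials, 24 hours for photographs, a weekly collection for the Bibliotheca Thysiana). The function name, the category labels and the collection weekday are assumptions, since the text does not say on which day the weekly collection takes place, and opening hours are ignored.

```python
from datetime import datetime, timedelta

# Lead times taken from the text; the dictionary keys are hypothetical labels.
LEAD_TIMES = {
    "stacks": timedelta(hours=1),
    "photograph": timedelta(hours=24),
}


def earliest_available(requested_at: datetime, kind: str,
                       collection_weekday: int = 0) -> datetime:
    """Return the earliest moment a requested item can be consulted.

    collection_weekday (0 = Monday) is an assumed value for the weekly
    Bibliotheca Thysiana collection.
    """
    if kind in LEAD_TIMES:
        return requested_at + LEAD_TIMES[kind]
    if kind == "thysiana":
        days_ahead = (collection_weekday - requested_at.weekday()) % 7
        if days_ahead == 0:
            # Assume a request on the collection day waits a full week.
            days_ahead = 7
        pickup = requested_at + timedelta(days=days_ahead)
        return pickup.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unknown material kind: {kind}")


print(earliest_available(datetime(2014, 9, 3, 10, 0), "photograph"))
```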
When we started the project, we knew it would take some time before all materials received an item description, especially because of the conversion project. For this reason, we decided to use the existing scan request button for view requests as well. The buttons only appear with materials that don't have an item yet, so they will gradually become obsolete. An additional problem is that a considerable part of the special collections is not available in online search systems. Some materials are described only in a non-digital form, like printed catalogues and inventories. In other cases, researchers find references to uncatalogued materials in scholarly publications. For these materials we placed the buttons for scan orders and view requests prominently on the special collections tab of Primo (see Figure 5). This way, all of our special collections, catalogued and non-catalogued, digitized and non-digitized, can be requested online.
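The resulting button logic can be summarized in a few lines. This sketch reflects one reading of the paragraph above, namely that the temporary view-request button is shown only for catalogued materials without an item, and that uncatalogued materials are served by the general buttons on the special collections tab. The function name and the option labels are hypothetical.

```python
def request_options(catalogued: bool, has_item: bool) -> list[str]:
    """Return the request options shown for one material."""
    if not catalogued:
        # Not findable as a record: use the general buttons on the tab.
        return ["scan order via special collections tab",
                "view request via special collections tab"]
    options = ["scan order button"]
    if has_item:
        # Regular request on the item in the library system.
        options.append("view request on item")
    else:
        # Temporary button, obsolete once the item has been added.
        options.append("view request button (temporary)")
    return options


print(request_options(catalogued=True, has_item=False))
```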
Figure 5: For uncatalogued materials, special order buttons are made available through the catalogue.
By connecting the existing systems and developing local additions, we can offer researchers and students easier and faster ways to view materials in the reading room, and to order, pay for and receive reproductions of our heritage collections. The project has made an important contribution to the visibility of these collections. All of our materials, not just the textual sources but also the various image collections (prints, drawings and photographs), are now available in WorldCat, thus improving the discoverability of our records. We are also experiencing a significant increase in orders. By placing the buttons prominently in the catalogue, ordering scans has apparently become so much simpler that the scanner we bought for our in-house scanning activities is now in use almost full-time for digitization-on-demand.
As explained above, OCLC not only converted the metadata to the standard used in the union catalogue, but also created new formats for various material types. Because we are the first library in the Netherlands to use the union catalogue for these materials on this scale, our project was conceived as a pilot. While our opinions clearly carried much weight, the main goal was to create new formats for all Dutch libraries.
An important secondary goal of the program was to make the current systems more manageable by using them correctly, and also by automating the workflow where possible. The cleanup consisted of more than technical solutions. During the project it became apparent that it was even more important than before to make proper arrangements for the use of systems, to identify deficiencies, and when necessary to create work-arounds to avoid future problems in our systems. Although it took our staff some time to get used to the new way of working, they now clearly see the advantages as well. For example, colleagues involved in cataloguing considered the stricter procedures inflexible at first. But working in a consistent manner improves the quality of the content considerably, thus increasing the possibility of building new services on the content as well.
Altogether, the most important result of the project is that, at present, we are much better prepared for the future of library cataloguing. In the near future, we plan to implement an update of the infrastructure for our digital collections, and the results of this project will make replacement with any possible future system much easier.
1 Christiane Berkvens-Stevelinck. Magna commoditas - Leiden University's great asset. 425 years library collections and services. Amsterdam, Leiden University Press, 2012.
2 Information on the Dutch union catalogue GGC (which stands for Gemeenschappelijk Geautomatiseerd Catalogiseersysteem, or shared automated cataloguing system) can be found here.
3 IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. München 1998. PDF and HTML files of the report can be found here.
4 Rare Books and Manuscripts Section of the Association of College and Research Libraries. Descriptive Cataloging of Rare Materials (Graphics). Chicago 2013. Especially Appendix E. Variations requiring a new record.
5 For Goobi case studies see http://www.goobi.org/en/ and http://slideshare.net/goobi_org.
6 For English information about the Bibliotheca Thysiana, with further literature, see here.
About the Author
Saskia van Bergen works as a senior project manager for the Innovations and Projects Department of Leiden University Library. She is responsible for projects focusing on digital access to the special collections, and deals with the management of digitization, cataloguing and digital collections. She has also participated in several national projects, such as Early Dutch Books Online (now Delpher) and the Dutch portal for academic heritage, Academische Collecties.