Allison B. Zhang
Historical scrapbooks are precious primary resources for researchers. Digitizing historical scrapbooks to provide access to them on the Web is desirable, and many scrapbook digitization projects are underway. However, due to limitations of technology and budget, creating a user-friendly interface for online historical scrapbooks has been very difficult. This case study describes a new approach for presenting online scrapbooks with "page flip" animation and clickable items, as well as other user-friendly features, without major funding.
Many libraries hold historical scrapbooks in their special collections. The scrapbooks were created in the past by people who wanted to preserve their photographs, letters, newspaper clippings, and many other objects that represent the history of their personal lives or family tree. The scrapbooks are valuable primary resources for researchers studying the history of typography and printing, advertising, architecture, art and design, as well as lifestyle and cultural trends (Gartrell). For example, the discovery of Thomas Jefferson's personal scrapbooks at Alderman Library in 1999 was considered a major discovery, because the scrapbooks reveal the sentimental side of Jefferson's complex personality and help us better understand the man and his world (Turner, 1999).
Unfortunately, historical scrapbooks are not always made available to library users, because everyday use would further damage these already deteriorated items. In recent years, as digital technology has developed, libraries have started to digitize scrapbooks and to make them available online for researchers and the public to use. Digitizing historical scrapbooks has not only provided broader access to these hidden treasures but has also protected the fragile scrapbooks from further deterioration.
However, due to the limitations of technology and budget, creating user-friendly online presentations of the scrapbooks has not been an easy task. Most online scrapbooks are broken down into individual items. Users can view individual items or view each page separately.1 Some online scrapbooks can be viewed only page by page, and users cannot view the details of each item.2 These online scrapbooks do not have a mechanism to show users how the original physical scrapbooks are organized and presented, which may be a very important factor in studying historical scrapbooks. Without the "look and feel" of the overall structure of a scrapbook, viewing an online scrapbook is less interesting for users than flipping through a physical scrapbook would be.
To address this situation, the Digital Collections Production Center (DCPC) of the Washington Research Library Consortium (WRLC) created an online historical scrapbook that can be flipped through very much like a real scrapbook, and each item in the scrapbook can be clicked to view details of organization and presentation. This article describes how the DCPC created such a user-friendly interface for a digital historical scrapbook within an affordable budget.
The Scrapbook Projects
The DCPC is a centralized facility that provides digital conversion services in a production environment for WRLC member libraries. In early 2007, we received two requests for digitizing historical scrapbooks. One request was from the American University Washington College of Law Library, which wanted to digitize a collection of over 20 scrapbooks compiled between the end of the 19th and beginning of the 20th centuries. Another request came from The Catholic University of America Library (CUA). CUA wanted to add to an existing digital collection a scrapbook created from 1916 to 1920.
The scrapbooks contain photographs, newspaper clippings, cards, 3-D objects, brochures, small booklets, and other objects. Most of them are fragile, and some are in very bad condition. The staff from both institutions provided a list of features they wanted for viewing the online scrapbooks. These included being able to:
The DCPC began by selecting the software needed to create the digital versions of the scrapbooks.
All of the DCPC's existing digital collections were created using Greenstone Digital Library Software.3 Greenstone provides powerful search and flexible browse functions. We can configure Greenstone to display individual items in the digital versions of the scrapbooks. The individual items can be browsed through a title list displayed in thumbnail images with brief descriptions or through a Table of Contents. This is similar to our other digital collections.4 Page by page viewing can also be done using Greenstone, but viewing page by page and viewing item by item on each page would be separated, just like the other online scrapbooks mentioned earlier. In addition, desired page flipping and item clicking features cannot be achieved using the Greenstone software.
As the two DCPC scrapbook projects together contain over 20 scrapbooks with about 6,000 individual images, we thought it worthwhile to conduct research and experiment with another technology. If we succeeded in finding software that enabled us to cost-effectively provide the features that the American University and Catholic University libraries desired, it would encourage our other WRLC member libraries to digitize more historical scrapbooks and similar materials.
As we conducted our research for software, we were referred to the British Library's "Turning the Pages" project,5 and we were impressed by its features. However, scrapbooks are different from books. Pages in a book are flat. "Turning the Pages" can create a real experience of reading a paper book by flipping the pages using a mouse. The "zoom-in" feature also works well for looking at the details of the contents on a flat page. However, our scrapbooks contain objects other than just flat photographs and papers. Some newspaper clippings and letters are folded (see Figure 1) or are glued under another piece, which produces layers of clippings. Also, some of the most interesting objects in the physical scrapbooks are dance books and booklets that consist of up to 15 pages each (see Figure 2). A simple zooming feature cannot reveal what is under the folded papers or what is inside the booklets. Beyond these limitations, the "Turning the Pages" software is very expensive and exceeded our limited budget.
Although we decided against using the "Turning the Pages" software for the reasons stated above, we were motivated by what the British Library project had accomplished with it. We decided to look for another tool that could not only create the effect of flipping pages but would also allow clicking on each individual item to display a larger version of the image, or display another book representing one of the booklets that then could be flipped through page by page. Was such a tool available? As we searched the Web, a commercial site caught our attention. The site displays a weekly advertisement. Using a mouse, a user can flip from page to page and click on any item on a particular page to open a new page with a larger image and descriptions of the selected item. This was exactly what we imagined for our online scrapbooks! Analyzing the web site, we learned that the advertisement display was created using Flash software from Macromedia. Following this clue, we studied sample products of Flash software and conducted more comprehensive searches on the Web. As a result, we found a Flash component called Premium Page Flip.6
Premium Page Flip is a component for Macromedia Flash MX, Flash MX 2004 and Flash 8. It is an out-of-the-box application used with Flash to create online or offline digital books with a realistic page-flipping effect. The component is easy to set up and configure. Advanced features can be controlled using ActionScript. The configuration can be done within the component or in an external XML file. The "pages" of the digital book can be composed of JPEG images and Flash files, which can contain clickable items. The pages and image files can be added easily to the file list within the component or in an external XML file. The pages of a book created with the component load very fast on the Web. This has been a very important feature for our projects, as the largest scrapbook we have contains over 200 pages. Without a fast-loading application, our users would be frustrated waiting for the digital scrapbook pages to load. Although the final online presentation requires users to download and install a Flash plugin, this has not been a major concern for us, since the Flash plugin is very popular and is supported by the major browsers. Many users may already have the plugin on their computers. Most importantly, the Premium Page Flip component costs only $25 and Flash 8 software costs $699. This is surely an affordable approach for us.
Creating the online scrapbook
We decided to experiment first with the scrapbook from CUA. This scrapbook was compiled by James Michael Carroll while he was a student at CUA from 1916 to 1920. He assembled a collection of photographs (many of which he took himself) and ephemera with a particular emphasis on the CUA baseball team (of which he was a member) and other athletic events at CUA. In addition to the photographs of CUA, the scrapbook contains examples of exams, dance cards and books, and tickets for different events and attractions at CUA and throughout Washington, DC.7 There are 78 pages in this scrapbook.
Scan and process images
Before we started to scan the scrapbook, we carefully outlined the desired features based on the wish list from the staff in both institutions that asked us to digitize scrapbooks, and we determined our scanning techniques according to that list. The first desired feature is the "page flip" effect, which will allow users to view the entire book page by page. This requires full-page images. Highlighting and clicking is the second feature, which will allow users to click on a highlighted item and view a larger version of the image. This requires a large version image of each individual item. Opening the small booklets to view them page by page is the third feature desired. This demands images of all the pages contained in a booklet. Finally, as this scrapbook was to be added to an existing digital collection, which displays a title list using thumbnail images for each item, we needed to create thumbnails for each item image.
Therefore, to obtain the desired features for our online scrapbook, we created three sets of images. First, we scanned the scrapbook page by page. While we do not have an overhead book scanner, the scrapbook we were using for this project had fallen apart and the spine had crumbled, so we were able to scan the scrapbook on a flatbed scanner. The most important technique of this part of the process is to scan all pages in the same size, because the pages will be re-sized using the Premium Page Flip program. Scanning the pages in the same size will help guarantee that all pages are re-sized proportionally without distortion. Second, we scanned the items that have multiple pages, such as brochures, booklets, folded papers and layered newspaper clippings, and so forth. The "scanning pages in same size" technique is also required for each booklet. Finally, using Photoshop we cropped individual items from the page images we had scanned in the first step and saved those items individually. We scanned the pages and the multi-page items at 500 dpi, 24-bit color and saved them as uncompressed TIFF images. These TIFF images comprise our master files.
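The benefit of scanning every page at the same size can be illustrated with a little arithmetic. The following is a minimal sketch and not part of the DCPC workflow; the function name `fit_within` and all pixel dimensions are assumptions chosen for illustration. It scales an image to fit a fixed display box while preserving its aspect ratio.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside a box, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)


# Hypothetical display box for one page of the flip-book, in pixels.
BOX = (500, 650)

# Pages scanned at the same pixel size all resize to the same display size,
# so facing pages line up in the flip-book.
uniform_page = fit_within(5000, 6500, *BOX)

# A page scanned at a different size cannot fill the same box without
# distortion; keeping its proportions leaves it narrower than the others.
odd_page = fit_within(5000, 7000, *BOX)

print(uniform_page, odd_page)
```

If the odd page were instead stretched to fill the box, its contents would be distorted, which is the problem that uniform scanning avoids.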
Following the procedure we use for all of the DCPC digital collections, we added some technical information to each of the TIFF images using Photoshop. Then we converted the TIFF images to JPEG format for the large and thumbnail images that are displayed on the Web. All image processing was completed using the batch processing tool in Photoshop.
Design and create metadata
For all our digital collections, including this scrapbook project, we use qualified Dublin Core metadata for descriptive metadata and MPEG-21 DIDL (Bormans and Hill, 2005) for representing the internal structure of digital objects.
Naming images is a crucial aspect of structural metadata design. "Since meaningful structural metadata can be embedded in file and directory names, consideration of where and how structural metadata is recorded should be done upfront." (NARA, 2004). Before we started to scan the images, we designed the image filenames according to the following criteria. Firstly, the filename is the identifier of a digital image and it must be unique. Secondly, the filename should reflect the relationship of this file to other related files. Thirdly, the filename should act as a locator for finding the original images. Finally, a thoughtfully designed file naming convention will allow us to create metadata automatically using locally created scripts.
Our final naming convention for the CUA scrapbook is as follows.
This file naming structure illustrates page number, location of the item on each page, and nature of the item, i.e., whether it is a multi-page item or a single-page item. The naming structure also provides an easy way to refer back to the original scrapbook when needed. In addition, this file naming structure contains encoded structural metadata, such as sequence and hierarchy, that assists in creating the flip-book and the descriptive metadata, such as DC.Relation, using a locally created script.
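To make the idea of structural metadata encoded in filenames concrete, here is a minimal sketch of how a script might decode such names. The pattern used here (`<book>-p<page>-i<item>-s<subpage>`) is purely hypothetical and is not the DCPC's actual convention; the names `NAME_PATTERN`, `ImageName` and `parse_image_name` are likewise invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# HYPOTHETICAL convention, for illustration only:
#   carroll-p012           full-page image of page 12
#   carroll-p012-i03       item 3 on page 12 (single-page item)
#   carroll-p012-i03-s02   page 2 of multi-page item 3 on page 12
NAME_PATTERN = re.compile(
    r"^(?P<book>[a-z]+)-p(?P<page>\d{3})"
    r"(?:-i(?P<item>\d{2})(?:-s(?P<sub>\d{2}))?)?$"
)


@dataclass(frozen=True)
class ImageName:
    book: str
    page: int
    item: Optional[int] = None
    subpage: Optional[int] = None

    @property
    def is_multipage(self) -> bool:
        """True when the file is one page of a booklet or folded item."""
        return self.subpage is not None

    @property
    def parent_id(self) -> str:
        """Identifier of the object one level up in the hierarchy."""
        if self.subpage is not None:
            return f"{self.book}-p{self.page:03d}-i{self.item:02d}"
        if self.item is not None:
            return f"{self.book}-p{self.page:03d}"
        return self.book


def parse_image_name(filename: str) -> ImageName:
    """Decode sequence and hierarchy from a filename, ignoring its extension."""
    stem = filename.rsplit(".", 1)[0]
    match = NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {filename}")
    item, sub = match.group("item"), match.group("sub")
    return ImageName(
        book=match.group("book"),
        page=int(match.group("page")),
        item=int(item) if item is not None else None,
        subpage=int(sub) if sub is not None else None,
    )


name = parse_image_name("carroll-p012-i03-s02.tif")
print(name.page, name.item, name.is_multipage, name.parent_id)
```

Because the page number, item position, and multi-page status can all be recovered from the name alone, a script of this kind needs no separate lookup table to order pages or to relate an item to its page.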
File extensions also play an important role in our automated process and display configuration. One item record may be associated with up to four image files: a master TIFF file, a display JPEG file, a display Flash file (swf file) and a thumbnail file. All four image files have the same filename but different file extensions.
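As a rough sketch of how shared filenames with different extensions can drive an automated process, the following groups a flat list of files into one entry per item. The extension-to-role mapping is an assumption: in particular, the source does not say which extension the thumbnails use, so `.gif` is a placeholder, and the sample filenames are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePath

# ASSUMED mapping of extension to role. Only the existence of a master TIFF,
# a display JPEG, a Flash swf and a thumbnail is stated in the text; the
# thumbnail extension below is a placeholder.
ROLE_BY_EXTENSION = {
    ".tif": "master",
    ".jpg": "display",
    ".swf": "flash",
    ".gif": "thumbnail",
}


def group_by_item(filenames):
    """Collect the files that share a filename stem into one item entry."""
    items = defaultdict(dict)
    for name in filenames:
        path = PurePath(name)
        role = ROLE_BY_EXTENSION.get(path.suffix.lower())
        if role is not None:  # silently skip files with unknown extensions
            items[path.stem][role] = name
    return dict(items)


files = [
    "carroll-p012-i03.tif",
    "carroll-p012-i03.jpg",
    "carroll-p012-i03.gif",
    "carroll-p012.tif",
    "carroll-p012.swf",
]
for stem, roles in group_by_item(files).items():
    print(stem, sorted(roles))
```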
To accommodate easy navigation of the online scrapbook, we decided to create two sets of metadata records, a record for each item and a record for the entire book. The CUA staff provided descriptions for each item in the scrapbook in an Excel file. We mapped the elements in the Excel file to Dublin Core and DIDL metadata and created item records in batch mode using a script. In each item record, we included a qualified Dublin Core element "DC.Relation.ispartofseries" to link all objects in the scrapbook:
In the book record, we used a qualified Dublin Core element "DC.Relation.haspart" to include all pages in the scrapbook. The page identifiers were automatically added to this element using a script checking through the file naming structure.
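The two kinds of records described above might be assembled along the following lines. This is only a sketch under stated assumptions: the identifier pattern, the record layout (a plain dictionary of element name to values), the sample titles, and the function names are all hypothetical, and the DIDL wrapping used in the actual project is omitted.

```python
import re

# HYPOTHETICAL rule: an identifier with a page part but no item part
# (e.g. "carroll-p012") names a full page of the book.
PAGE_ID = re.compile(r"^[a-z]+-p\d{3}$")


def item_record(identifier, title, series):
    """Item-level record that points back to the scrapbook as a series."""
    return {
        "DC.Identifier": [identifier],
        "DC.Title": [title],
        "DC.Relation.ispartofseries": [series],
    }


def book_record(book_id, title, identifiers):
    """Book-level record listing every page identifier in page order."""
    pages = sorted(i for i in identifiers if PAGE_ID.match(i))
    return {
        "DC.Identifier": [book_id],
        "DC.Title": [title],
        "DC.Relation.haspart": pages,
    }


identifiers = ["carroll-p002", "carroll-p001-i01", "carroll-p001"]
book = book_record("carroll", "James Carroll Scrapbook", identifiers)
item = item_record("carroll-p001-i01", "Sample item", "James Carroll Scrapbook")
print(book["DC.Relation.haspart"])
print(item["DC.Relation.ispartofseries"])
```

Zero-padded page numbers in the identifiers let a plain string sort produce the correct page order.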
Set up workflow
Generally speaking, the workflow of creating a book with page flipping capability using the Premium Page Flip component is rather simple. The display image files can be stored in a file system, on another server, or in a repository, where they can be assigned a URI. The workflow can be broken down into the following steps:
Figure 4 shows a simple workflow for creating the book with the page-by-page flipping feature.
However, our current repository system makes this workflow more complex. We use the Digital Object Catalog (DOC), created using DSpace,8 as our digital object repository. The DOC provides a mechanism to store both metadata and associated files in the system, and preserves the digital objects. After a digital object is stored in DOC, a handle9 is automatically assigned to the object, which can be used to link the object to other locations and for different purposes. Because all of the digital objects in the scrapbook are stored in DOC, we need to get the handle for different image and Flash swf files at different stages. For example, in order to create a book with flipping capability for a small booklet, we need the DOC handle for each JPEG image in the booklet, so the JPEG images must be uploaded to DOC first. To compile the final scrapbook, we need the DOC handle for each page, which is either a Flash swf file or a JPEG file that must be stored in DOC first. The workflow must be thought through carefully, and setup done step by step, before creating the Flash swf files for each page and for the whole scrapbook.
Put the pages together
Creating highlighted items on each page and linking them to images in our DOC repository is a time-consuming task. This process uses Flash techniques that require several steps. Creating each highlighted item repeats these steps. The external URIs are added using ActionScript, which is provided in Flash. Below is an example of the ActionScript for two items.
For the CUA experimental scrapbook, we created each highlighted item manually; however, in the future this may be done in a batch mode or using an automatic script.
Three templates were created for creating books with flipping capability in different sizes. One template is for small booklets that are in landscape orientation, and another template is for those that have portrait orientation. The third template is for the book as a whole. Although the Premium Page Flip component offers a simple way to configure the dimensions of a book, using the three templates sped up the process. Setting up screen and book background colors and sizes is also easy. The out-of-the-box application provides navigation tools such as the "Previous" and "Next" buttons, the "Go to" page box, page numbers, printing, zooming, and so on. More advanced navigation can be added using ActionScript. We removed the printing feature and added the total number of pages. We also added usage instructions on the opening page of the scrapbook.
After all pages and items have been created, putting them into the final book is a very easy process. The pages can be added on a file list or in an external XML file. The external XML file option provides an opportunity to accomplish this process automatically using a Perl script.
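The automation mentioned above could look roughly like the following, shown here in Python rather than Perl. This is a sketch only: the XML element and attribute names (`book`, `page`, `number`, `src`) are placeholders and are not the Premium Page Flip component's actual schema, which should be taken from the component's own documentation. The page URIs are likewise invented examples.

```python
import xml.etree.ElementTree as ET


def build_page_list(page_uris):
    """Return an XML string listing the pages in reading order.

    The <book>/<page> layout is a PLACEHOLDER schema for illustration,
    not the format the Premium Page Flip component expects.
    """
    root = ET.Element("book")
    for number, uri in enumerate(page_uris, start=1):
        ET.SubElement(root, "page", {"number": str(number), "src": uri})
    return ET.tostring(root, encoding="unicode")


# Hypothetical URIs standing in for repository handles of the page files.
pages = [
    "http://example.org/handle/0000/1.swf",
    "http://example.org/handle/0000/2.jpg",
]
print(build_page_list(pages))
```

Generating the list from the ordered page identifiers keeps the page sequence in one place, so adding or reordering pages means regenerating the file rather than editing the component by hand.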
Present the Scrapbook in Greenstone
The final online scrapbook provides a user-friendly interface. Users can flip the pages by clicking or dragging the mouse, typing the page number, or clicking the "Next" and "Previous" buttons on the book navigation bar. The user can zoom in on each page to see details. Each highlighted item can be clicked so that a larger version of the object is displayed in another window with its title description. The booklets can also be viewed using the page-flipping feature. The online scrapbook displays page numbers as well as the total number of pages in the book. Figure 5 shows a screen shot of the final online scrapbook. The red arrow shows that when a booklet is clicked, a flippable booklet is opened in another window. The blue arrow points to a highlighted item as a user selects it using a mouse. The green arrow indicates that when a highlighted item has been clicked, a larger version of the image is displayed in another window with title information.
This online scrapbook can be used as a standalone Flash application. However, the standalone version does not link the images to full descriptions of each image and cannot be searched. To make the images searchable, we need to present this scrapbook in our presentation system Greenstone. Since this scrapbook is part of an existing collection, additional configuration of Greenstone is needed.
The existing collection "ACUA Photograph Collection: Selected Images and Prints of The Catholic University of America" already contained two photo albums, which can be viewed in two ways. The individual photographs can be searched and browsed by title, subject, personal names and date. The entire albums can be viewed through a "series" link in each record and on the navigation bar, through a Table of Contents, or through the image viewer in Greenstone. The online James Carroll Scrapbook resulting from our project has been added as the third album and series in the "ACUA Photograph Collection". All individual items in the scrapbook can be searched and browsed like the two photo albums, but in addition, the James Carroll Scrapbook has page flipping capability. This scrapbook can be accessed at <http://www.aladin.wrlc.org/dl/collection/hdr?cuphoto> through the "James Carroll Scrapbook" link, or by browsing by title, subject, series, and so on.
Our experimental online scrapbook presents a new and user-friendly way to view the content-rich historical scrapbook. Users can enjoy viewing the complete layout of the scrapbook, looking into details of each item, and flipping the pages as if viewing the physical version.
The total cost of the software needed to create the online scrapbook, Flash 8 plus the Premium Page Flip component, was less than $750. While the process of online scrapbook creation requires the use of human resources, especially when creating clickable items, the process itself is simple and repetitive. With step-by-step instructions, a student worker could easily complete the process. In addition, it should be possible to automate the process by creating a Perl script or other similar program to use for that purpose. We hope our work will inspire other libraries wishing to create more user-friendly online scrapbooks at a reasonable cost.
Many thanks to Don Gourley, Director of Information Technology of WRLC, for his helpful suggestions and comments.
1. See "The Lewis Carroll Scrapbook" at the Library of Congress website. <http://memory.loc.gov/intldl/carrollhtml/lchome.html>.
2. See "Emergence of Advertising in America: Scrapbooks" website. <http://scriptorium.lib.duke.edu/dynaweb/eaa/databases/scrapbooks/@Generic__BookView>.
7. About this collection. ACUA Photograph Collection: Selected images and Prints of the Catholic University of America website. <http://test.aladin.wrlc.org/gsdl/collect/cuphoto/cuphoto.shtml>.
Bormans, Jan, and Keith Hill, eds. (2005). MPEG-21 overview v.5. <http://www.chiariglione.org/mpeg/standards/mpeg-21/mpeg-21.htm>.
Gartrell, Ellen. Scrapbooks. <http://scriptorium.lib.duke.edu/eaa/scrapbooks.html>.
Turner, Emily. (1999). Scrapbooks shed light on Jefferson. The Cavalier Daily, September 30, 1999. <http://www.cavalierdaily.com/CVArticle.asp?ID=898&pid=473>.
U.S. National Archives and Records Administration (NARA). (2004). Technical Guidelines for Digitizing Archival Materials for Electronic Access: Creation of Production Master Files Raster Images. June 2004. <http://www.archives.gov/research/arc/digitizing-archival-materials.pdf>.
Copyright © 2007 Allison B. Zhang