Clare L. Birdsey
Abstract
As technology improves, the desire to replace analogue material with digital reproductions grows. The disadvantages, both financial and practical, of using new analogue material to extend the life of old analogue material are well known. However, the infrastructure needed to use digital media, as an alternative to the use of analogue surrogates, is often unachievable by many institutions. Unlike self-visible media, digital files require a matrix of hardware and software to facilitate their retrieval. Further investment is required to preserve the digital storage media and keep them accessible in the future. Digital access initiatives, therefore, can only be undertaken after careful planning. The development and implementation of one digital access programme are the subject of this article. A pilot study to digitise fragile and degrading material from the advent of photography was completed in August 1999. A digital strategy, a methodology for digitisation, and a plan for the dissemination of the material to a wider audience were developed based on the ability of the organisation in possession of the collection to maintain a programme of digitisation.
The use of information technology has been steadily growing over the last twenty years, and for the past decade, this increasingly has involved the digitisation and display of high quality digital images (Besser, 1994, Cringley, 1996). Within organisations such as libraries and archives, information technology (IT) has proven to be a revolutionary tool for information management and information retrieval. Text databases and search engines have enabled textual information on collections containing multiple media types to be retrieved locally and globally (Khoshafian & Baker, 1996).
Yet at the same time, increasing reliance on information technology has presented new challenges. Most organisations and individuals, for example, have many different types of computers, storage devices and software applications. As systems become obsolete, it may not be possible to transfer data to new systems (TFADI, 1996). When the information contained within a computer system is not a matter of kilobytes but hundreds of megabytes and gigabytes, the effect of information loss, and the cost of retrieval and transference, is huge. Hence there has been great interest in the development of standards to assist the preservation and transference of digital files to different hardware and software (Getty, 1999, Kenney, 1993). The digitisation of collections has also caused curators and archivists to become concerned about digital copyright, intellectual property rights and resource management (Cornish, 1996). One of the most significant changes experienced by many organisations has been the increase in requests for information once the general public becomes aware that a specialised collection exists. Dealing with this increase in requests requires a system of organisation (Canale & Wills, 1995).
A successful digitisation system needs to be developed from the ground up. It should take into account requirements for image quality during the digitisation process, indexing requirements, and the overall database. The chain of events from selection to digitisation, projection, awareness, and retrieval was investigated by the Digital Image Archive (DIA) project. The project concentrated on the digitisation of mainly inaccessible photographic material from a collection at the Royal Photographic Society (RPS) in Bath, UK. The DIA project attempted to cover each stage of the digitisation process and draw conclusions about the effectiveness of the methods used at each stage.
The project began with a survey of photographic collections in the United Kingdom (UK). The pilot study revealed that many collections had material in a poor or degrading state. Furthermore, they were described in a very inconsistent fashion. Efforts to digitise photographic material were underway in a high percentage of the organisations surveyed (Birdsey et al., 1999). However, in many cases the digitisation efforts appeared to be directly related to the National Lottery requirements for proposals to improve archival operations. Many collections were attempting to use the possibility of digitising their material as the basis for reorganising it, rather than reorganising first and then digitising, which would be the logical course of action. Nevertheless, the actual or expected incorporation of technology in the management of photographic collections appeared to boost work efforts within many organisations and provided an optimistic outlook about the future of the information concerned. The possibility of new technology thus served as a spark for reorganisation and increased efforts to conserve the photographs. Furthermore, survey participants saw digitisation as a positive method for preserving images: by providing access to digital reproductions, conservationists could conserve the original material.
The survey results supported the strategy underway at the RPS. The need for reorganisation of the collection was highly apparent, but without additional funds, no effort could be made to commence change. Submitting a successful proposal for digitisation would enable the Society to begin a process of digitisation but, more importantly, to investigate the contents of the collection.
The Royal Photographic Society was formed in 1853 with Queen Victoria and Prince Albert as patrons. It was established to promote the art and science of photography amongst photographers. The mission continues today through the work of specialist groups in various photographic disciplines. Major photographers represented in the collection include Nicéphore Niépce, William Henry Fox Talbot, Julia Margaret Cameron, David Octavius Hill and Robert Adamson, Edward Steichen, Roger Fenton, and Alvin Langdon Coburn (Royal Photographic Society, 1994). Public access is available to most areas of the society. Access to the collection of the society, however, is restricted to bona fide researchers. The central reasons for this are: limited numbers of staff; the need for security; inadequate search and retrieval systems; and restrictive handling policies necessary because of the harmful effect of light on some materials (Reilly, 1986). Serious researchers are allowed to use the resources under supervision because funds for creating duplicates for use are very limited.
The DIA project selected the William Henry Fox Talbot (WHFT) collection for digitization. The collection contains 600 photographic images from the advent of photography, including salt prints and Talbotypes. In addition, the collection contains 20 pieces of photographic equipment and hundreds of handwritten letters and other documentation. The textual documents were not included in the DIA project, but may be incorporated in the future into a multimedia CD-ROM or Internet site for scholars of photographic processing and the history of photography. Much of the collection has been damaged by past handling and exposure to light, and therefore many items are inaccessible. Talbot experimented with different photographic processes, and it is often difficult for conservationists to determine how much damage use of the original items may cause. Early salt prints, for example, were fixed with a sodium chloride solution that did not remove the silver chloride, but merely inactivated it. This process, and variations on this process, can now be unstable and damaged by exposure to light for only a few hours. It was determined from literature research into the effect of light on images created using certain photographic processes that many of the WHFT originals would be damaged by scanning (Reilly, 1986, Ware, 1994). Therefore, reproductions of the photographic material and the equipment that had been made on 35mm Ektachrome slide film in 1996 were used for digitisation.
Developing a methodology
The choice of the WHFT collection for the pilot study determined the choice of hardware selected for digitisation, the time allocated for scanning, the level of reorganisation necessary, and whether an indexing system needed to be designed (Fitzgerald, 1995). Literature research into previous similar projects enabled the design of a flow chart of factors to be considered or followed during the course of the pilot study (Besser & Trant, 1995, Philips et al., 1994). Some subsequent changes were made during the course of the project.
An important first step was determining how the information would be made available to researchers and the public. Early on in the DIA project it was decided that two methods for retrieval would have to operate. The first would consist of an archive of high-resolution digital image files for high quality retrieval. The second retrieval mechanism would provide a front end at screen resolution to the stored archive. The database of archive quality images is stored on CD-ROMs, whereas the front-end displays screen resolution images on-line. The content of the front-end was to be searchable, and therefore the collection had to be catalogued and indexed. Low-resolution images and documentation are called to the screen from a local area network (LAN) server. The high-resolution images are not provided on-line as there is currently no CD-ROM jukebox or large storage medium available for the storage of the larger images. Consequently, the high-resolution images are provided to curators and in-house researchers upon request, and are not available to the general public.
Designing a system for scanning and organisation
The selection of hardware and software for the WHFT DIA was based on the facilities already installed at the RPS and the availability of funds to purchase new equipment. During selection of the hardware and software for scanning and organising the digital information, there was great concern that the products selected should conform to international standards or practices (Blackaby & Sandore, 1997, Fitzgerald, 1995). Image quality and methods to retrieve data were central concerns of the project.
In an ideal world, archival material would be scanned at the highest resolution currently available and stored within a storage medium that has a long archival life and is not greatly affected by external oxidants. Analysis of previous projects revealed that great emphasis should be placed on the need to retrieve the digital information in the future (Hopkin, 1996, May & Barnard, 1996, Mohlhenrich, 1993), and that the highest resolution that the capture system can produce should be obtained. Many of the earlier projects digitised material for a single use at the time of capture (Musalem, 1995). The history of computing shows that storage capabilities and retrieval speeds are never constant. Approximately every eighteen months, computing power advances (Cringley, 1996, Khoshafian & Baker, 1996). It is therefore desirable to capture and store as much information about an image at the time of capture as is possible. These files can then be down-sampled for screen or Internet output and kept at a high resolution for archival storage and printing.
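As a rough illustration of deriving a screen-resolution copy from an archival master, the sketch below computes down-sampled pixel dimensions that fit inside a target box while preserving the aspect ratio. The 2482 x 3764 master size is the scan size reported in this article; the 640 x 480 screen box and the function name fit_within are assumptions made for the example, not values or tools used by the project.

```python
def fit_within(width, height, max_width, max_height):
    """Return (w, h) scaled to fit inside the box, keeping the aspect ratio.

    The image is never enlarged: a master that already fits inside the box
    is returned at its original size.
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


master = (2482, 3764)                      # archival scan size from the article
screen_copy = fit_within(*master, 640, 480)  # hypothetical screen box
print(screen_copy)
```

Only the dimensions are computed here; the resampling itself would be done by whatever imaging software is in use, always starting from the untouched master so that each derivative is made from the full captured information.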
The 35mm transparencies were scanned at the highest resolution of a Nikon Coolscan bulk transparency slide scanner to produce 24-bit (16.7 million colours) digital image files measuring 2482 x 3764 pixels and 26.7 MB (megabytes) in size. The Kodak colour test target Q-60 on Ektachrome 35mm slide was used to grey balance the Nikon Coolscan. Scanning the Q-60 target on the same film as the RPS slides, and grey balancing the scanner for this film, involved several levels of calibration. The red, green and blue output signals were aligned so that R=G=B was obtained. This was achieved within the Nikon software by "correcting" the red, green and blue curves that represent RGB output signals for the 22-step greyscale on the Q-60 test target. The Talbot images were quite thin due to the effects of ageing on the originals and the ambiguity of the development processes. Therefore, the gamma of the scanner had to be tested and adjusted to determine the tonal response of the scanner. This culminated in the capture of a greater tonal range.
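The grey-balancing step can be pictured as building one correction curve per channel so that every neutral patch of the test target reads R=G=B after correction. The sketch below is an illustration under stated assumptions, not the Nikon software's method: it uses piecewise-linear interpolation between patch readings, takes the mean of the three channel readings as the neutral target for each patch, and uses three invented patch readings rather than the 22 steps of the Q-60 greyscale.

```python
from bisect import bisect_right


def build_curve(measured, target):
    """Return a function that maps a channel value through a piecewise-linear
    curve defined by (measured, target) pairs, one pair per grey patch."""
    points = sorted(zip(measured, target))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def correct(value):
        # Values outside the measured range are clamped to the end points.
        if value <= xs[0]:
            return ys[0]
        if value >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, value) - 1
        t = (value - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return correct


def grey_balance_curves(patches):
    """Build one correction curve per channel from (R, G, B) patch readings.

    Assumption: the neutral target for a patch is the mean of its readings.
    """
    targets = [sum(p) / 3 for p in patches]
    return [build_curve([p[c] for p in patches], targets) for c in range(3)]


# Invented readings for three grey patches (dark, mid, light).
patches = [(20, 24, 30), (120, 128, 140), (230, 236, 245)]
curves = grey_balance_curves(patches)
balanced = [tuple(curves[c](p[c]) for c in range(3)) for p in patches]
```

After correction each patch in balanced has equal red, green and blue values, which is the R=G=B condition described above; pixels between patches are corrected by interpolation along the same curves.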
All files were processed and stored as sRGB (Standard Red Green Blue) TIFF (Tagged Image File Format) files on ISO 9660 format CD-ROMs. All processing was completed using Matlab image processing software. The RPS requested, and the project agreed, that the digital files would not be edited or manipulated in commercial imaging software. Investigations carried out on previous projects, both archival and multimedia, revealed a concern about the difference between the appearance of original material and digital reproductions (Kenney, 1993).
The entire collection is contained on 25 CD-ROMs. These files can be retrieved via any computer able to support the ISO 9660 CD-ROM standard and can be opened using any imaging software capable of reading TIFF files. International standards were used to enable the digital files and storage media to be successfully migrated to future digital formats.
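The storage figures quoted in this article can be checked with simple arithmetic: an uncompressed 24-bit image occupies width x height x 3 bytes. The sketch below works through that calculation and estimates a disc count. The 650 MB disc capacity, the decision to ignore TIFF header overhead and the slides of equipment, and the rule that a file is never split across discs are all assumptions made for the example; the article does not state how files were packed onto the discs.

```python
import math

WIDTH, HEIGHT = 2482, 3764   # pixel dimensions reported in the article
BYTES_PER_PIXEL = 3          # 24-bit colour: one byte each for R, G and B
IMAGE_COUNT = 600            # photographic images in the WHFT collection
DISC_CAPACITY_MB = 650       # assumed CD-ROM capacity (not from the article)


def image_size_mb(width, height, bytes_per_pixel):
    """Uncompressed image size in megabytes (1 MB = 2**20 bytes)."""
    return width * height * bytes_per_pixel / 2**20


def discs_needed(image_count, size_mb, capacity_mb):
    """Discs required when whole files only are written to each disc."""
    per_disc = int(capacity_mb // size_mb)
    return math.ceil(image_count / per_disc)


size_mb = image_size_mb(WIDTH, HEIGHT, BYTES_PER_PIXEL)
discs = discs_needed(IMAGE_COUNT, size_mb, DISC_CAPACITY_MB)
print(f"{size_mb:.1f} MB per image, {discs} discs")
```

Under these assumptions the estimate comes to roughly 26.7 MB per image and 25 discs, in line with the figures reported here, although that agreement does not confirm how the project actually distributed its files.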
Digitisation of the slides and organisation of the digital files were completed by two research assistants. The first came from an image science background and was researching image quality for the digitisation of photographic material and its subsequent display. The second member of the team was investigating the methodology behind construction of a digital image archive and the methods of disseminating the material to a wider audience.
As part of the DIA project, a review of available retrieval software was conducted. The review concluded that museum database management software, based on international cataloguing standards, was the best tool for organising the digital information. It was felt that this type of software would store information about the collection within a structure that allowed the collection to be efficiently reorganised. Further, this kind of software could serve as a basis for future digitisation projects that could incorporate the same indexing system and metadata documentation methods. Unfortunately, budget resources for this project did not allow for the purchase and implementation of museum database management software. An alternate method had to be designed.
Attempts to organise the WHFT digital image files within a sophisticated cataloguing system that had not been produced by a third party proved to be extremely difficult. The greatest challenge was not the actual process of creating a database and making information available via the Internet or CD-ROM. Instead, it was in designing a system that would enable the migration of information into future programmes. Literature research into successful and unsuccessful projects changed our priority from organisation to retrieval. The dynamic link, within archives, between digital and paper records shows that there must be a common cataloguing system for all information. Most of the archives covered in a survey of cataloguing techniques indicated that they had multiple collections in various digital and analogue formats. Their greatest concern was the inability of previous and current digitisation projects to integrate these records. Hindsight has shown that all-encompassing digitisation projects are not realistic and are rarely completed (Hopkin, 1996, UKOLN, 1999). We visualised the DIA as being part of a jigsaw where the face of each piece differed but the physical structure was constant, thus enabling any piece to fit together with the others regardless of when it was produced and in what cataloguing software.
To assist in the creation of such a system, a survey of museums, archives and libraries in the UK was conducted to determine what method of indexing was being used by the highest percentage of organisations. The survey concentrated heavily on cataloguing systems and thesauri (Birdsey et al., 1999). The effectiveness of the methods used was also analysed and conclusions were drawn on awareness, within these organisations, of compatibility and transferability.
The indexing survey revealed that the majority of organisations were cataloguing their collections in accordance with the Museum Documentation Association (MDA) standard SPECTRUM. The MDA aids museums and archives with documentation methods. Although this standard is not international, it has been designed in correspondence with international documentation practices (MDA, 2000). Furthermore, the fact that many other institutions used SPECTRUM meant that the RPS, a small non-profit organisation, would not have to maintain a system of documentation on its own in the future. The work of the Museum and Galleries Commission (MGC) and the MDA is helping to standardise documentation practices throughout the museum community in the UK.
The survey of cataloguing and indexing practices also assessed the use of thesauri within image collections, especially to control the use of keywords assigned to images to facilitate retrieval from text and image databases. Initial results indicated that there are highly idiosyncratic practices throughout the archival community. However, further analysis of the data revealed that many independent thesauri and cataloguing systems were designed using a combination of standards and practices (ANSI-AIIM, 1995, BS6529, 1984, ISO2788, 1986). The MDA also supports thesaurus construction. It runs workshops and conferences on thesaurus design and use, and publishes independent guides on thesaurus design for specific collections.
The DIA analysis of indexing options led to the selection of the Library of Congress' Thesaurus for Graphic Materials I & II (TGM) for the project. The TGM matched the content of the WHFT collection; it conforms to international standards; and it could be amended whilst maintaining its overall structure.
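A thesaurus used for indexing typically distinguishes preferred terms from non-preferred entry terms that point to them, the USE relationship described in thesaurus standards such as ISO 2788. The sketch below shows one way keywords assigned to an image might be normalised against such a controlled list. The terms, the two lookup tables and the function name are invented placeholders for illustration; they are not entries copied from the TGM and do not represent the project's software.

```python
# Hypothetical controlled vocabulary (placeholder terms, not taken from TGM).
PREFERRED = {"Portraits", "Landscapes", "Botanical specimens"}

# Hypothetical non-preferred entry terms and the preferred term to USE instead.
USE = {"Likenesses": "Portraits", "Scenery": "Landscapes"}


def normalise_keywords(keywords):
    """Split keywords into accepted preferred terms and rejected terms.

    Non-preferred terms are replaced by their preferred equivalents, and
    duplicates are dropped while the original order is kept.
    """
    accepted, rejected = [], []
    for raw in keywords:
        term = raw.strip()
        term = USE.get(term, term)
        if term in PREFERRED:
            if term not in accepted:
                accepted.append(term)
        else:
            rejected.append(raw)
    return accepted, rejected


accepted, rejected = normalise_keywords(
    ["Likenesses", "Portraits", " Scenery ", "Trains"]
)
```

Rejected terms are returned rather than silently discarded, so that a cataloguer can decide whether each one is an error or a candidate amendment to the thesaurus.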
The WHFT collection had been catalogued in 1996 by Larry J. Schaaf, a renowned author on Talbot. The cataloguing was stored as a text database for use by scholars. That cataloguing system did not conform to any standards or published cataloguing methods (ANSI-AIIM, 1995, Piggot, 1990). Schaaf's data was exported to tab-delimited text, imported into a new WHFT database, and reformatted. The WHFT DIA operates from within a FileMaker Pro database and runs from a Windows NT server. Access to the material will be made available via a LAN intranet server on any computer platform that supports a browser using the Hypertext Markup Language (HTML). The new cataloguing system conforms to the MDA standard SPECTRUM, and the images are indexed using the Library of Congress' Thesaurus for Graphic Materials I & II.
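The export-and-reimport step described above can be sketched as reading tab-delimited text and reshaping each row into a record with new field names. The column headings, the target field names and the sample rows below are all hypothetical: the schema of the original catalogue is not given in this article, and the field names are not actual SPECTRUM units of information.

```python
import csv
import io

# Hypothetical mapping from export column headings to new field names.
FIELD_MAP = {"No": "object_number", "Title": "title", "Process": "process"}


def import_records(tab_text):
    """Parse tab-delimited text into a list of dicts keyed by new field names.

    Values are stripped of surrounding whitespace, and rows without an
    object number are skipped because they cannot be identified.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    records = []
    for row in reader:
        record = {
            new: (row.get(old) or "").strip() for old, new in FIELD_MAP.items()
        }
        if record["object_number"]:
            records.append(record)
    return records


# Invented sample export: a header row, one valid row, one row with no number.
sample = "No\tTitle\tProcess\n1\tExample title \tsalt print\n\tmissing number\t\n"
records = import_records(sample)
```

Keeping the mapping in one table makes the reformatting explicit and repeatable, which matters if the same data later has to be migrated into another cataloguing programme.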
This pilot study has provided a good understanding of the process of digitising material and organising it within a database or collections management programme. For small organisations the task of digitisation is often impossible without a firm structure to organize limited resources. This project concentrated on creating a thorough methodology to govern every stage of production. If the funds had been available, collections management software would have saved a great deal of time. If the selected programme conforms to international standards for documentation and metadata, then the possibility of future integration is much higher. Scanning with documented calibration techniques and not manipulating to "taste" created raw output. The raw image output can then be "tweaked" for a particular application, such as an Internet site. Future uses of the image are not governed by decisions on colour and luminance made by the scanner operator today.
Overall, the information gathered during this study should be beneficial to the museum, archive, and library community. Publications on previous projects were used to great effect for this research. The RPS has recently been successful in their bid for National Lottery funding. They are now planning to continue their strategy of reorganisation and also make more images from the collection available to the public via the Internet. It is hoped that this will test the effectiveness of our jigsaw which should save the new team a lot of time and resources.
A list of publications regarding the WHFT DIA and related research is available on the Internet at <http://www.wmin.ac.uk/ITRG/>. More information on the Royal Photographic Society and their digitisation project is also available on the Internet at <http://www.rps.org>
Many thanks to Sophie Triantaphillidou for her work on digitisation and overall image quality. Also to Ralph Jacobson and Andy Golding for their supervision. Thanks to the RPS for their co-operation and collaboration in this project.
ANSI-AIIM TR40:1995. Suggested Index Fields for Documents in Electronic Image Management (EIM) Environments. American National Standards Institute and the Association for Image Management International, 1995.
Besser, H. The Changing Role of Photographic Collections with the Advent of Digitisation. The Working Group for Digital Image in Curatorial Practice. Eastman Kodak Ltd, USA, 1994.
Besser, H. and Trant, J. Introduction to Imaging: Issues in Constructing an Image Database. The Getty Art History Information Program, USA, 1995.
Birdsey, C., Golding, A., and Jacobson, R. The Effect of Digital Technology on the Control of and Access to a Photographic Collection, in Cultural Heritage Informatics: Selected papers from ICHIM99: Washington DC. Archives and Museum Informatics, pp. 210-213, 1999.
Blackaby, J. and Sandore, B. Building Integrated Museum Information Retrieval Systems: Practical Approaches to Data Organization and Access. Archives and Museum Informatics, 11, no. 2, pp. 117-146, 1997.
BS6529:1984. Examining Documents, Determining their Subjects and Selecting Index Terms. British Standards Institution, 1984.
Canale, R. and Wills, S. Producing Professional Interactive Multimedia: Project Management Issues. British Journal of Educational Technology, 26, no. 2, pp. 84-93, 1995.
Cornish, G.P. Copyright: Interpreting the Law for Library, Archive and Information. Library Association, GB, 1996.
Cringley, R.X. Accidental Empires. Penguin Books Ltd, UK, 1996.
Fitzgerald, S. Archives Cataloguing on Computer at the Royal Botanic Gardens, Kew: Using MARC, International Standards and UNICORN, Journal of the Society of Archivists, 16, no. 2, pp. 179-191, 1995.
Getty Information Institute and the International Committee for Documentation of the International Council of Museums (ICOM-CIDOC) Developments in Museum and Cultural Heritage Information Standards, USA, 1996. Internet publication at: <http://www.cidoc.icom.org/stand1.htm>. Last updated 29th July 1996. Site consulted 1st May, 1999.
Hollier, A. Computerised Finding Aids at the British Petroleum Archive, Journal of the Society of Archivists, 13, no. 2, pp.124-125, 1992.
Hopkin, D. Shifting the Focus: Digital Imaging and Photographic Collections Management at the National Railway Museum. Records Management Bulletin, 76, pp. 3-8, 1997.
ISO2788:1986. Documentation: Guidelines for the Establishment and Development of Monolingual Thesauri. International Organisation for Standardization, 1986.
Kenney, A.R. Preserving Archival Material Through Digital Technology. New York State Program for the Conservation and Preservation of Library Research Materials, USA, 1993.
Khoshafian, S. and Baker, A.B. Multimedia and Imaging Databases. Morgan Kaufmann Publishers, USA, 1996.
May, J. and Barnard, P.J. A Modest Experiment in the Usefulness of Electronic Archives. Behaviour and Information Technology, 15, no. 3, 1996.
Mohlhenrich, J. M. (Ed). Preservation of Electronic Formats and Electronic Formats for Preservation. Highsmith Press, USA, 1993.
Musalem, A.M. A Multimedia Database System. Managing a Virtual Collection of Art and Architectural Works, in Multimedia Computing and Museums. ICHIM 1995: San Diego, USA. Archives and Museum Informatics, pp. 39-56, 1995.
Museum Documentation Association. <http://www.mda.org.uk>. Site last consulted January 2000.
Philips, G., Crookes, D., and Juhasz, Z. QUIMaS (Queen's University Image Management System): A Museum Photographic Database, Journal of Information Science, 20, no. 3, pp. 161-174, 1994.
Piggot, M. The Cataloguers Way Through AACR2. From Document Receipt to Document Retrieval. The Library Association, UK, 1990.
Reilly, J.M. Care and Identification of 19th Century Photographic Prints. Eastman Kodak Ltd, USA, 1986.
Royal Photographic Society, The Royal Photographic Society Collection, GB, 1994.
Task Force on Archiving Digital Information. Preserving Digital Information. Commissioned by The Commission on Preservation and Access and The Research Libraries Group, USA, 1996.
UKOLN: the UK Office for Library and Information Networking and the National Council on Archives. Full Disclosure: Releasing the Value of Library and Archive Collections. University of Bath, UK, 1999.
Ware, M. Mechanisms of Image Deterioration in Early Photographs. Science Museum and the National Museum of Photography, Film and Television, UK, 1994.
Copyright © Clare L. Birdsey
D-Lib Magazine Access Terms and Conditions