Safeguarding Digital Library Contents and Users

Digital Images of Treasured Antiquities

Henry M. Gladney, Fred Mintzer, and Fabio Schiattarella
IBM Research Division and IBM Italy
San Jose, California, Yorktown Heights, New York, and Rome, Italy

D-Lib Magazine, July/August 1997

ISSN 1082-9873

Abstract: An IBM collaboration with the Biblioteca Apostolica Vaticana is an early experiment to explore the technical, financial, and practical challenges of making illustrated mediaeval manuscripts accessible via the Internet. The Vatican collection is important to literary scholars, historians, and art historians because it contains not only seminal texts but also magnificent illustrations. We convey what the Vatican Library project and several other IBM joint studies--studies with el Archivo General de Indias de Sevilla (Spain), the lifetime collection of Andrew Wyeth's paintings, a collection held by the Hebrew Union College, and the Yale Beinecke Library--teach about administering intellectual property rights. The question is how to maintain both the intellectual integrity of disseminated images and the reputations of the institutions making their collections digitally accessible.

First-rate management of source material and other intellectual property is more difficult and more controversial than anybody understood 5 years ago. Making images and text available via the World Wide Web and on CD-ROMs is easy; doing so in a way that preserves and enhances critical values is not, particularly when libraries need to make the new services pay for themselves rather than rely on subsidies.

Safeguards are nearly always imperfect. The most realistic objective is to make misuse economically unattractive. We proceed from specific aspects of protecting Vatican Library values to compare the concerns of and actions for several other antiquities collections, and convey not only the specifics of watermarks as a technical mitigation of risks but also how this fits into IBM's multifaceted response to a wide range of protection and quality concerns.


In IBM Research and Development, we have been investing in digital library inquiries and pilots for more than a decade. For the front-end components -- capture, presentation, and printing -- this work is manifested in sometimes adventuresome pilot activities such as joint studies with Andrew Wyeth, the Spanish Archive of the Indies, and the Vatican Library. For the back-end components -- storage management and security services -- we started with massive collections of administrative paperwork, such as tax return forms and insurance records, and gradually progressed to engineering records, always keeping a clear eye on the most challenging target: management of the recorded intellectual heritage of the world.

A recurrent and ever-strengthening theme in these studies is the care needed to protect the values now known as intellectual property rights. What we now understand about requirements was not anticipated when our work began in about 1985.

In retrospect, it is obvious that information protection would be crucial. Digitization projects are done for preservation and for improved access -- to make valuable materials more readily accessible to more people with fewer administrative, physical, and accidental barriers to each user. Progress towards improved accessibility may be expected to open pathways to deliberate and inadvertent abuse of privileges granted. We must, therefore, balance efforts towards improved accessibility with efforts to help the curators of rare materials limit usage to what authors, copyright holders, information owners, or other authorities are willing to grant.

The best practices for digitization and protection depend on the nature of the materials to be converted and possibly on the kinds of usage projected; what is needed is different for music recordings, for scientific journals, and for works of fine art. Below, we focus on manuscripts of the 9th through 19th centuries. The source materials include some of the most valuable and most beautiful manuscripts ever created. They are rendered with diverse base materials (e.g., paper, parchment, deerskin), coloring materials, sizes, and shapes. Some are bound into volumes. Some are quite fragile. Capturing and preserving their content and beauty is a challenge to our scanning, image processing, and display technologies.

We give the most attention to the content of the Vatican Library Project (VLP). Other kinds of collection also need protection, but what needs to be emphasized varies not only because the source media are different, but also because the contents have different social values and evoke different abuses. For example, motion pictures have copyright protection and copies may have significant retail market value--properties that are not shared by 16th century commercial records. It was in VLP that we first encountered forcefully stated intellectual property issues. We are still working to improve the tools needed for it and other antiquity collections.

The Vatican Library Project (VLP)

A core VLP goal was to provide access to some of the Library's most valuable manuscripts, printed books, and other sources to a scholarly community around the world. The Vatican Library is an extraordinary repository of rare books and manuscripts. Among its 150,000 manuscripts are early copies of works by Aristotle, Dante, Euclid, Homer, and Virgil. However, because of the time and cost required to travel to Rome, only some 2000 scholars can afford to visit the Library each year. The VLP investigated the practicality of an Internet digital library service for scholars. The ambitious goals of this experiment included:

These goals were to be pursued through digitizing a significant set of manuscripts, making them available on-line using the Internet, and soliciting and recording the responses of participating scholars. Early in the life of the project, a Scholar's Advisory Committee was formed to advise the project's technical team, to select other participating scholars, and to select the manuscripts to be scanned. In addition, a number of these scholars were polled to determine what the system should provide to the user community. This led to requirements that the digital library system should provide:

To safeguard the Vatican Library materials, the system should:

From the beginning of the project, it has been clear that the manuscript images must have sufficient spatial resolution for easy text legibility and for detailed examination of their illuminations. In addition, faint detail such as handwritten pencil lines must be readily visible; this requires good tonal reproduction.

Figure 1: Watermarked image of a 16th century Flemish manuscript (Chig. C VIII 234, fols. 19v, 20r, music for a mass). These manuscript pages are blemished by Spanish coats-of-arms superposed by 17th-century owners.

Color illuminations (Figure 1) are an aesthetically-important part of many manuscript pages. Good color reproduction is therefore required. Just how valuable this is was drawn to our attention vividly by a 1995 Vatican Museum exposition of about 60 manuscripts. As usual, each manuscript was shown opened to a dramatically beautiful page. However, for the first time as far as we know, each displayed manuscript was represented by many more scanned pages and accompanying explanatory text on an IBM workstation mounted on a pedestal close to the manuscript display case. This VLP contribution extended the exhibition in dimensions never before feasible, and created an excitement that captured the enthusiasm even of the sophisticated special guests at an opening ceremony.

Starting with the Wyeth project, we designed and built image scanners (Figure 2) with the spatial resolution and color fidelity needed to record fine arts and manuscripts with at most imperceptible mechanical or radiation damage. The VLP images were scanned at 2600 by 3000 pixels, with 24 bits-per-pixel for color pictures and 8 bits-per-pixel for monochrome images. Almost all of the images were scanned directly from the original manuscripts (rather than from photographs of the manuscripts), except that very large pieces were scanned from transparencies. (An early experiment compared scans of original materials and of photographs of the materials. The scans of original sources had much more faithful color reproduction than those from photographs. We also found that the quality of available microfilms was often too low for good rendition.)

Figure 2: a Pro/3000 scanner designed for quality capture of manuscript pictures and a manuscript easel designed to avoid damage to bindings.

Some collection creation anecdotes are instructive. Early on, a scholar was shown a digital enlargement of an illumination that revealed a faint scene of some horsemen and a sleeping figure. He was amazed to discover this scene in the image. Another scholar was surprised to see that we had captured the texture of a parchment accurately enough so that she could tell which side of the parchment was being shown. Yet another scholar requested that whole pages be captured, including all margins.

Some scholars' comments later in the project reflected a belief that the image quality produced was higher than needed. Nevertheless, we believe that some aspects of images, although not strictly needed for study, convey the look and feel of the original and are important visual cues for convincing a viewer that an authentic replica is being seen; these aspects include texture, shadows, page curvature, and page boundaries (see Figure 3), as well as a high level of detail and good tonal and color reproduction. Since long term technology evolution is hard to forecast, and since scanning is the most expensive part of the process, requiring extensive skilled workmanship and posing risks for the originals, we decided to scan at the highest resolution that could be achieved while still meeting the project's scanning throughput requirement, even though technological constraints force distribution of only much less detailed derivative images. This course has subsequently become conventional wisdom.

The storage space needed for a high resolution image often exceeds 20 megabytes. This is 100 times too large for delivery across the Internet. To prepare images for the Internet server, they were both reduced in size, to 1000 by 1000 pixels, and compressed for on-line access; this resulted in typical data sizes of 150 kilobytes to 250 kilobytes. These reduced-data derivatives were found to be adequate for most pages. However, for physically large originals, such as maps, they were inadequate; in this case, the scanned images were only compressed, and not reduced, before being stored on the Internet server.
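The arithmetic behind these figures can be checked with a short sketch. It assumes uncompressed storage at 3 bytes per pixel for 24-bit color, takes the 2600 by 3000 pixel scan size stated earlier, and treats a kilobyte as 1000 bytes; the function name is ours, introduced only for illustration.

```python
def raw_image_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of a raster image."""
    return width_px * height_px * bits_per_pixel // 8


# Full-resolution color scan, as described for the VLP scanner.
full_scan = raw_image_bytes(2600, 3000, 24)

# Derivative reduced to 1000 by 1000 pixels, before compression.
reduced = raw_image_bytes(1000, 1000, 24)

# Typical delivered sizes quoted in the text (assuming 1 KB = 1000 bytes).
delivered_low, delivered_high = 150_000, 250_000

print(f"full scan: {full_scan / 1e6:.1f} MB")
print(f"reduced, uncompressed: {reduced / 1e6:.1f} MB")
print(f"implied compression: {reduced / delivered_high:.0f}:1 "
      f"to {reduced / delivered_low:.0f}:1")
print(f"full scan vs delivered: {full_scan / delivered_high:.0f}x "
      f"to {full_scan / delivered_low:.0f}x")
```

Under these assumptions a full color scan is about 23 megabytes, and the delivered derivatives are smaller by a factor on the order of 100, which is consistent with the figures quoted above.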

The Vatican manuscripts are priceless. Consequently, the team developed a manuscript scanning environment that minimized handling and avoided ultraviolet light damage. This scanning machinery was installed near the Vatican Library manuscript vault.

The images produced in VLP are valuable property of the Vatican Library. Protection was an especially keen concern because the images were to be made available via the Internet, which is not inherently secure. We were asked to provide protection which went beyond the usual means based on limiting physical access to the materials of interest.

A common protection for digital images is to supply only low resolution versions on-line; this has never been challenged by VLP users, because transmitting better images takes too long for interactive service. Low resolution images are of little use to would-be pirates. In some cases this is a satisfactory solution; but in VLP, it was not enough because on-line dissemination was intended to support research in history, art history, and related disciplines. As an example, several of the thumbnail images shown in this Web page are backed up by a medium resolution version accessible by clicking on the thumbnail. Such medium resolution images (1000 by 1000 pixels) are sufficiently good to invite commercial applications; they would be attractive on commercial web sites, would make excellent printed images at column width, would be adequate for use in CD-ROM multimedia presentations, and would be more than adequate for printing on T-shirts, coffee mugs, or other mementos. It is easy to imagine other uses which might embarrass the Library, or deprive it of otherwise likely revenue.

Figure 3: Genesis excerpt from a 15th century French volume.

At the request of Fr. Boyle, then Prefect of the Vatican Library, we developed and refined a visible watermarking technology, whose application is illustrated in Figures 1 and 3. The objective is an indelible ownership mark on each sensitive image, without hiding any visual information needed to support scholarship. Our watermarks have three core features:

IBM Joint Work with Other Cultural Collections

The following subsections appear in approximately the order in which the studies were started. Further studies, at Tsinghua University and Fudan University in China, at the Japanese Museum of Ethnology, and at the Hermitage in St. Petersburg, are omitted because of time and space constraints.

The Lifetime Work of Andrew Wyeth

Andrew Wyeth's staff gathers, organizes, and maintains information on his paintings, which they use to produce or support publications or showings of his art. They also use this information to answer any questions about his art, such as authenticity, that come to them. The IBM/Andrew Wyeth project was begun in 1987 to satisfy this staff's desire to use digital images to improve their daily operation and to satisfy IBM's desire to explore computer applications of high-quality digital imaging.

A core objective was an image-assisted digital catalogue raisonné of Wyeth's works, which then numbered in excess of 10,000 pieces. Because of the collection size, a database was required to manage the textual information, which included information on provenance, on exhibits, and specifically about the paintings. A text-based search facility was also provided. In retrospect, what was developed had many similarities to museum collection management systems.

For the catalogue raisonné, the imaging requirements were modest; the images needed to be of sufficient quality so that a painting could be recognized from the representation. Other imaging requirements were more challenging, as it was also required that the system be able to produce digital images of sufficient quality for multimedia presentations or for printing coffee-table-quality books on Wyeth's work. The image quality required for printing is, of course, quite high. It entails not only high spatial resolution, but also good tonal reproduction, good color reproduction, good contrast, and good sharpness. But, indeed, the printing requirement for this project transcended what one would normally think would be required of printing, as the images were also required to satisfy the artist's vision of what his painting should convey.

As already mentioned, the scanning system for VLP and other projects originated in the Wyeth project. In addition to factors touched on elsewhere, software was developed to undo color fading that had degraded some of the older elements of the set of transparencies that were the source of digital copies. Pictures were scanned at a spatial resolution of 2500 by 3000 pixels and a color depth of 24 bits-per-pixel, and were color calibrated. By 1987 standards, the quality of these images is outstanding; by today's standards, the quality is not exceptional. With modest reprocessing, these images could be used today for some glossy publication or to produce a dazzling web site.

Wyeth's staff understood the marketplace value of the images and the potential for misuse. In particular, they were concerned with poor quality reproduction; that is part of the reason we have not included any samples in this Web site. Still today, this on-line collection is protected against unauthorized re-use by tight enough physical security so that other measures we touch on have not been employed.

El Archivo General de Indias de Sevilla (AGI)

The work towards a digital library for the Indies Archive of Seville (AGI), Spain is among the earliest practical digital libraries. The Indies Archive consists mostly of Spanish commercial records related to the Americas between the fifteenth and nineteenth centuries. AGI still today boasts one of the largest collections of digital images of cultural works. (Digital collections of commercial records and of instrumental measurements, such as space probe data, are much larger than digital collections depicting artistic and historical antiquities.)

A key rationale for the project was preservation, because handling of the often fragile Archive contents is threatening their longevity. In contrast to VLP, AGI did not seek network distribution of images and still today does not extend to this. Nevertheless, improved information access for scholars is dramatic, enabling significant user productivity gains.

The AGI project began in 1985, almost exactly 200 years after King Charles III of Spain requested that documents scattered among Simancas, Cadiz, Seville, and Madrid be collected into a single archive. To coincide with the 500th anniversary of Columbus' landfall in the West Indies, the AGI project was to capture 10% of the collection estimated to consist of 86,000,000 pages. By 1992, it had indeed collected about 9,000,000 digital image pages onto optical disks, together with a set of finding aids. A beautiful sample of the collection is available on a CD-ROM which includes more technical detail than we can reflect.

Prior to this project, almost all archive computerization projects were directed at finding aids (guidebooks, inventory completions, and the like), with the objective being publication of hard copy. In contrast, the AGI project had a much more ambitious objective: to develop the first computerized system integrating all the functions of a historical archive. This included, of course, providing automated access to collected descriptions of holdings, which the project team regarded as the most important task, though not the most spectacular or innovative. The proposed objective was the creation of an information system capable of gathering and administering all the descriptive information in the Archive (not only the newly-created information but also that which already existed). For 1985-92, this was an ambitious objective, but it has been achieved.

From the point of view of the Spanish Ministry of Culture, this project was a "pilot" for later work in a score of other Spanish collections. When the project agreements were signed, the Ministry postponed most of its other historical archive computerization. Some detail of the digitization process and how the digital library improves access for many scholars is available on the CD-ROM.

Between 30% and 40% of the documents pose legibility problems, due mainly to their great age and rough handling. The damage includes faded ink, stains, and seepage of ink from the reverse side of documents with writing on both sides ("bleed-through"). Sometimes such damage makes it extremely difficult for a scholar to read the document.

The project team investigated procedures to remedy these problems, and built into software the best options they found. Of course, the scanned images are stored at full capture resolution. Capture itself necessarily involves information filtering, both because it is a spatial sampling process and because wavelength filtering always occurs. For uncolorized document pages this provided the opportunity of selecting spectral bands which minimized illegibility. Researchers are offered two types of control to improve document legibility on the screen and for printing:

Figure 4: Examples of ink bleed-through reduction and of stain removal.

Of course the system offers well-known zooming and selective display and printing services, and all the advantages of rapid switching between search services and image viewing services. Together these procedures provide dramatic productivity improvements to historical researchers.

Archivos y Bibliotecas technology has evolved since the completion of the project described. Nevertheless, because of computing performance constraints, the Spanish team still thinks that the approach taken at Seville -- 100 dpi and 16-level gray scale -- is sufficient for screen images of high quality for reading and for applying legibility enhancement algorithms.
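To give a feel for what 100 dpi and 16 gray levels imply, the following sketch estimates the uncompressed size of one page image. The page dimensions (8.5 by 11 inches) are purely an assumption for illustration, since the Archive's documents vary in size, and the function name is ours.

```python
def page_bytes(width_in: float, height_in: float, dpi: int, gray_levels: int) -> int:
    """Uncompressed size in bytes of one gray-scale page image."""
    # Bits needed to distinguish the gray levels: 16 levels -> 4 bits.
    bits_per_pixel = (gray_levels - 1).bit_length()
    pixels = round(width_in * dpi) * round(height_in * dpi)
    # Round up to a whole number of bytes.
    return (pixels * bits_per_pixel + 7) // 8


# Hypothetical 8.5 x 11 inch page at the Seville settings.
size = page_bytes(8.5, 11, dpi=100, gray_levels=16)
print(f"{size} bytes uncompressed")
```

For the assumed page size this comes to a little under half a megabyte before compression, far less than a full-color high-resolution scan.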

Lutherhalle in Wittenberg, Germany

Lutherhalle is dedicated to the history of the Protestant Reformation (1517-21). It contains materials read and written by Martin Luther, as well as items written about him, both in his own time and later. It has rooms where Luther lived as a monk, in which he taught, and in which he lived later in life with his wife. Some of its most treasured possessions are Reformation-era pamphlets that aimed to win the hearts and minds of the believers.

The intellectual property and image quality circumstances at Lutherhalle are sufficiently similar to those at the Vatican Library that further comment is not needed. However, we include three Lutherhalle images below to suggest what the collection contains, and also because we are enthusiastic about their beauty.

Figure 5: Bookplate crest of the Scheurl family. Christoph Scheurl, Rector of Wittenberg University, was a friend of Martin Luther. He used this woodcut in all his books.

Figure 6: First edition of a song book printed by Joseph Klug in 1529. It included "A Mighty Fortress Is Our God," the most popular hymn composed by Martin Luther. Luther translated Latin hymns into German and was a composer himself. Since 1828, all Luther's books had been considered lost. In 1932, Lutherhalle bought this book of hymns for 80 Reichsmark. Later it was identified as an original edition.

Figure 7: Holy Bible printed in 1582 by Hans Lufft, Luther's preferred printer. Quite possibly it was collaboration with Lufft that enabled Luther to popularize his case. The full picture shows Luther and the Elector praying at the cross. The sky is decorated by St. Matthew as an angel, St. Mark as a lion, St. Luke as a bull, and St. John as an eagle.

Klau Library of the Hebrew Union College

In a later experience with Hebrew Union College, the visible watermark was used to mark on-line images. Here, the visible watermark is deemed useful in another way. It serves as a reference to the library which supplied the materials.

All of the images were scanned with the same type of scanner illustrated in Figure 2. All of the images available have been reduced and compressed for access on the Internet. The attitude of the curatorial staff at Hebrew Union differs from that of the other institutions described above; they do not object to reappearance of their material in unexpected surroundings.

A few manuscript images from Hebrew Union College are present in an IBM Research departmental home page. They include images from The First Cincinnati Haggadah, Ms. 444, a magnificently illuminated work on parchment produced in Germany in the late 15th century by Meir Jaffe ha-sofer, a copyist, illuminator and renowned leather tooler. We include two pages below.

Figure 8. This page focuses on "YaKNeHaZ", an acronym composed of the initial letters of Yayin (wine), Kiddush (sanctification), Ner (light), Havdalah (separation), and Zeman (time). It indicates the correct sequence of blessings when the eve of Passover coincides with the conclusion of the Sabbath. That the abbreviation sounds similar to the German "jag den Has" (hunt the hare) is motivation for the hunting scene.

Figure 9. Or le-Arba'ah 'asar (On the morning of the fourteenth). Using a feather, a man cleans up crumbs of leaven from a citron yellow cupboard. These crumbs will later be burned, accompanied by a statement repudiating any leaven remaining in the household. Birds native to Germany have gathered, perhaps hoping to snatch crumbs the householder might drop.

Yale University Beinecke Rare Book and Manuscript Library (BRBL)

IBM work for and with the Yale Beinecke (BRBL) collection began about a year ago. BRBL intends to use digital library technology to extend its support for scholarly and classroom use by making it easier and less expensive for patrons to obtain reproductions. BRBL intends world-wide access eventually, but is limiting use to the Yale campus during its first year of digital services, partly to give itself time to understand intellectual property issues.

The initial focus for digitization has been about 15,000 black and white photonegatives, 80 papyri (in color), and a handful of color transparencies. The photonegatives represent the entire range of BRBL material--pre-1600s illuminated manuscripts, Western American photographs, modern manuscripts, and papyrus. 2000 photonegatives are of Western America. The rest represent a working collection created as patrons requested copies of material over the course of the last 15 years or so. Each patron received a print and BRBL kept the negative.

The intellectual property rights management concerns are similar to those of the Vatican, e.g., preventing the re-use of images in popular art, with additional consequences of the fact that part of the collection is subject to copyrights held by individuals outside Yale. That is, BRBL is concerned to maintain and enhance its good name and to maintain the possibility of cost recovery by fee services. The only protections (beyond the usual physical ones) BRBL is using are watermarking and, for the first year of service, limiting Internet access to on-campus workstations identified solely by IP address checking.
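As an illustration of what access limitation by IP address checking amounts to, the following minimal sketch tests whether a client address falls inside a set of campus networks. The network ranges are reserved documentation addresses, not Yale's, and the names `CAMPUS_NETWORKS` and `is_on_campus` are hypothetical; the article does not describe BRBL's actual implementation.

```python
import ipaddress

# Hypothetical campus networks. These are reserved documentation ranges,
# used here only as placeholders for a real institution's address blocks.
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_on_campus(client_ip: str) -> bool:
    """Return True if client_ip parses and lies in a campus network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are refused rather than trusted.
        return False
    return any(address in network for network in CAMPUS_NETWORKS)


print(is_on_campus("192.0.2.17"))
print(is_on_campus("203.0.113.5"))
```

A check of this kind identifies a workstation only by its network address and says nothing about who is using it, which is one reason such a restriction is best seen as a temporary measure.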

BRBL has not yet decided whether or not watermark usage will be continued indefinitely. It sensed that it would be easier to start by watermarking all Internet-accessible copies and possibly later providing unmarked copies than to start with clear copies and possibly later impose marking. The BRBL team feels that marking does not reduce the usefulness of text images, but that whether it interferes with some applications of some photographs is yet to be determined.

Intellectual Property Rights -- Environment and Challenges

For several years we have been collecting customers' statements of the intellectual property protection they need and analyzing the tools deployed and the techniques available, and we are devising a modular framework within which each tool can be implemented just in time for the applications it enables. We have a technical plan for offerings which will impede all likely forms of deliberate and inadvertent abuse. Through pilot implementations (as in an IBM/Case Western Reserve University joint study implementing a prototype permissions manager), we have been exploring the intricacies of applicable practice and are developing prototype components for certain service and data types.

Which of about a dozen constraints against information misuse is helpful depends on the kind of data and environment, the kinds of acceptable use of the data, the anticipated threats, and the policies and objectives of the data owners. Above, we have indicated that watermarks are helpful for Internet-accessible manuscript collections because they discourage unauthorized reproduction. Some curators also value them because they signal image origin; while this would be trustworthy in a benign world, it is in fact easy to copy a watermark onto contents from other sources, just as unauthorized manufactures are sold with Gucci handbag labels.

For curators who wish to prove authenticity and provenance to their clients, we have already described Cryptolope© machinery to guarantee content authenticity and provenance. We are preparing another D-Lib article to describe in more depth Cryptolope technology which IBM is preparing to offer later this year. The basic idea is to encrypt each set of related files (e.g., an HTML text and image files it links into itself) in a package which includes supporting administrative information and cryptographic keys which only authorized clearance centers can unlock. This mechanism will also support limiting usage to organizations who have paid subscriptions, and future versions will support various forms of pay-per-use access and other copyright privileges.
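The general idea of binding a set of related files to administrative information so that tampering can be detected can be sketched as follows. This is not the Cryptolope format, which this article does not specify. It is a generic tamper-evident manifest built from SHA-256 digests and an HMAC, it omits encryption entirely, and every name in it (`build_manifest`, `verify_package`, the field names, the sample key) is an assumption made for illustration.

```python
import hashlib
import hmac
import json
from typing import Dict


def _payload(body: dict) -> bytes:
    # Canonical byte form of the manifest body, so the tag is reproducible.
    return json.dumps(body, sort_keys=True).encode("utf-8")


def build_manifest(files: Dict[str, bytes], admin_info: dict, key: bytes) -> dict:
    """Record a digest per file plus admin info, sealed with an HMAC tag."""
    digests = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    body = {"admin": admin_info, "digests": digests}
    tag = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_package(files: Dict[str, bytes], manifest: dict, key: bytes) -> bool:
    """Check the tag first, then that every file matches its recorded digest."""
    body = manifest["body"]
    expected = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["tag"]):
        return False
    if set(files) != set(body["digests"]):
        return False
    return all(
        hashlib.sha256(data).hexdigest() == body["digests"][name]
        for name, data in files.items()
    )


# Hypothetical package: an HTML page and the image it links to.
key = b"demo-key-held-by-a-clearance-center"
package = {"page.html": b"<html>...</html>", "folio.jpg": b"\xff\xd8 image bytes"}
manifest = build_manifest(package, {"source": "Example Library"}, key)

print(verify_package(package, manifest, key))
tampered = dict(package, **{"folio.jpg": b"altered bytes"})
print(verify_package(tampered, manifest, key))
```

In this sketch the shared key stands in for whatever credential an authorized clearance center would hold; a production design would also encrypt the contents and would more likely use public-key signatures than a shared secret.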

Readers' privacy is another value which may be wanted for certain uses of manuscript data. Colleagues have described one way to provide such privacy by adding to the computing network isolating way-houses we call "campus servers". IBM is also working on other protection mechanisms, but these are not discussed in this article because we do not think them of current interest to manuscript scholars.

Some Comments on Document Images and Marking

Markings for Different Media and Applications

Marking can be obvious or hidden (not easily seen without instruments), as on U.S. currency, or secret (not apparent or interpretable without comparison with an unmarked original). Marking can be used to identify the source of a document (often called a watermark), or to identify to whom the library delivered a document (often called a fingerprint). Marking can either overstrike the most important information or be relegated to a margin. Methods of marking must be different for different kinds of data; what works for a photograph is different from what works for a radio performance. Moreover, choice of document marking might depend on anticipating the user's intent; visible marks are unacceptable for a photograph to be reproduced in a magazine.

The current article has illustrated only translucent visible watermarks on raster images, because this technology is the one best suited to Internet distribution of images for which piracy should be hindered. We have illustrated that such marks can be made unobtrusive, both by allowing their prominence to be adjustable as they are applied by curators and by choosing their patterns to be thematically related to the collection. For example, the Lutherhalle mark was patterned after the "Luther rose," the historical symbol for Martin Luther. The VL mark is patterned after the VL seal.
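The notion of a translucent mark with adjustable prominence can be illustrated by simple alpha blending. The sketch below is a generic technique, not the IBM watermarking method, whose details this article does not give. It assumes a gray-scale image held as rows of 0-255 values and a mark held as rows of coverage values between 0.0 and 1.0; the function and parameter names are ours.

```python
from typing import List


def apply_visible_watermark(
    image: List[List[int]],
    mark: List[List[float]],
    strength: float,
    tone: int = 255,
) -> List[List[int]]:
    """Blend each pixel toward `tone` in proportion to strength * coverage.

    strength = 0.0 leaves the image unchanged; larger values make the
    mark more prominent. Pixels where the mark has zero coverage are
    never altered.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    if len(image) != len(mark) or any(
        len(img_row) != len(mark_row) for img_row, mark_row in zip(image, mark)
    ):
        raise ValueError("image and mark must have the same shape")

    marked = []
    for img_row, mark_row in zip(image, mark):
        row = []
        for pixel, coverage in zip(img_row, mark_row):
            alpha = strength * coverage
            row.append(round((1.0 - alpha) * pixel + alpha * tone))
        marked.append(row)
    return marked


# A tiny 1 x 2 image: the mark covers only the second pixel.
page = [[100, 100]]
seal = [[0.0, 1.0]]
print(apply_visible_watermark(page, seal, strength=0.2))
```

Here the `strength` parameter plays the role of the prominence setting a curator would adjust, and the `mark` array would hold a pattern such as a library seal.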

Since this article focuses on European manuscripts, its examples are mostly raster images of manuscript pages. We emphasize that the IBM Digital Library product already supports many different document formats, including high-fidelity music, full-length videos, formatted text, and most digital text formats in common use.

A Controversy about Marking

Computing service organizations providing a new kind of service or extending an existing service to a new class of content often hear a criticism which is as perplexing as it is well known -- dissatisfaction with interactive response times measured in minutes or even seconds for information access that previously cost a cross-campus or even cross-Atlantic trip. The amount and quality of information provided on each screen or print presentation seem not to matter. Until the perceived response time is as fast as the time to flip a page when the volume is at hand, even sophisticated and relatively tolerant users want more speed.

Some viewers of watermarked images evince a similar reaction. We have received many comments about the visible watermark and its utility. While most people find it unobtrusive and some do not notice it at all, a few find it truly offensive. These viewers are dissatisfied with any perceived image quality lower than what they imagine the original source provides. The extreme ones are not placated by explanations that, without watermarks or some yet-to-be-invented means providing similar protection, the images simply will not be made available via the Internet. In the example of the VLP, the team members from IBM find that mentioning that marking was demanded by the Library provides little defense against what borders on personal attacks questioning professional ethics.

We know no way to avoid either the response time criticism or the image adulteration criticism, and find it best to smile weakly and let the incident pass. Marking does impair the artistic value of images; we find it better suited to some collections, such as manuscript pages, than others, such as art collections. For collections of uniquely artistic materials, an invisible watermark is almost always requested. (We are preparing to discuss invisible watermarks in a D-Lib article in the late autumn.)

The reader may be interested in a reminder that marking images to indicate ownership is hardly new. For instance, Figure 1 illustrates how 17th-century owners added Spanish coats-of-arms by over-painting 16th-century Flemish music. Modern collections sometimes value the same device, having their watermark remind readers where Internet pictures come from.

Blurring the Distinction between Protection and Aesthetic Quality

For archives, libraries, and museums creating digital libraries, the first motivations for high-quality digital rendering of their contents are conveying adequate legibility for scholarship and representing the artistic values of the contents. These objectives are closely followed by considerations of authenticity, of the good name of the holding libraries, and of ensuring that any revenues consequent on digitization flow to the libraries to fund further digitization and other efforts to preserve and make accessible further contents. We find that both the means and the objectives of marking get intertwined with aesthetic issues.

In this article, we have not yet included any links to pages made available at the full scanned resolution. One kind of data for which resolution reduction obscures essential information is geographic maps, so for some of these the Vatican Library has decided to make unreduced images available. The Vatican Library page from which the following thumbnails are taken, and the page sections they link to, illustrate the level of detail being captured in all the original images from which Web versions are derived.

Figure 10. 184K JPG stylized map of Venice

Figure 11. 200K JPG detail from map of Venice
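
Why resolution reduction hurts maps in particular can be seen in a small sketch. The article does not say how the Web versions were reduced, so the 2x2 averaging used here is an assumption chosen only because it is the simplest common method; the function name downsample_2x is hypothetical.

```python
def downsample_2x(image):
    """Halve resolution by averaging each 2x2 block of grayscale pixels.

    Assumed reduction method for illustration; the article does not
    describe the one actually used for the Web images.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("image dimensions must be even")
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A one-pixel-wide ink line (0) on light parchment (240),
# such as a coastline or a lettering stroke on a map.
page = [
    [240, 0, 240, 240],
    [240, 0, 240, 240],
    [240, 0, 240, 240],
    [240, 0, 240, 240],
]
reduced = downsample_2x(page)
```

In the full-resolution page the line differs from the parchment by 240 gray levels; after reduction the column containing it has the value 120, so the contrast is halved and the stroke is twice as wide. Fine strokes that carry the essential information on a map fade in just this way.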


This article attempts to convey the excitement and the lessons we have learned in the Vatican Library Project and in about 10 other IBM joint studies centered on antiquities around the world. A common thread is the importance of maintaining the intellectual integrity both of disseminated manuscript images and of the reputations of the institutions making their collections more accessible to scholars and to the public at large.

Protecting intellectual property is currently a "hot" topic for scholars, for university administrators, and for the media and publishing industries. The topic has drawn in lawyers, legislators, and computing service professionals. Unfortunately, it is often construed relatively narrowly -- as copyright privileges and whatever is needed to extract service revenues. More is needed to protect critical values sometimes taken for granted, such as the right of readers to privacy, the right of authors to faithful representation, and the right of enterprises to confidentiality. For fine arts and manuscript sources, quality of representation on the Internet and in other digital image distributions is as important to the curators as aspects more commonly considered intellectual property rights. This can be seen as an extension of an author's right not to be misrepresented. For this reason, and also because the methods for good reproduction are closely associated with some methods of protection, we see a blurring of the distinction between quality reproduction and intellectual property rights protection.

In related articles, both existing and yet-to-appear, we have identified techniques for mitigating most risks that librarians and other users have brought to our attention. Some of these are addressed in current IBM offerings. For other risks, we have devised solutions which are not yet implemented simply because customers do not yet need them. Still other risks are the topics of active investigation. IBM intends to provide the technical components of complete and convenient protection systems.


This material is drawn from the work of many colleagues, including most prominently J. Barker, J. Bescos, L. Boyle, F. Giordano, M. Kelmanson, L. Pavani, M. Treu, S. Weaver, A. Wyeth, and their associates.

IBM and the individual members of the IBM team have been delighted to work with the Vatican materials not only because we expected the technical challenges would be as great as those for any other collection, but also because the special role of the Vatican collection in Western civilization permits us to contribute social value beyond what might be possible with any other collection.


Alonso92. V. Cortes Alonso, The Archive of the Indies, Manuscripts XXI(1), 2-10, (1969).

Bescos89. Julian Bescos, Image processing algorithms for readability enhancement of old manuscripts, Intl. Electronic Imaging Exposition and Conf., Pasadena, 1989. Electronic Imaging '89, Vol. 1, 392-397, (1989).

Bescos90. Julian Bescos, Juan P. Seçilla, and Juan Navarro, Filtering and compression of old manuscripts by adaptive processing techniques, Intl. Symp. Society for Information Display, Digest of Technical Papers, 21, 384-387, (Las Vegas, 1990).

Braudaway93. Gordon Braudaway, Restoration of Faded Photographic Transparencies by Digital Image Processing, Proc. IS&T's 46th Annual Conference, 287-289, (May 1993).

Braudaway96. Gordon W. Braudaway, Protecting publicly-available images with a visible image watermark, SPIE/IS&T Intl. Symp. on Electronic Imaging Science and Technology, Feb. 1996, Proc. Optical Security and Counterfeit Deterrence Techniques, 126-133, (1996).

Choy96. D.M. Choy, J.B. Lotspiech, L.C. Anderson, S.K. Boyer, R. Dievendorff, C. Dwork, T.D. Griffin, B.A. Hoenig, M.K. Jackson, W. Kaka, J.M. McCrossin, A.M. Miller, R.J.T. Morris, and N.J. Pass, A Digital Library System for Periodicals Distribution, Proc. ADL96, (May 1996).

GladLots97. H.M. Gladney and J.B. Lotspiech, Safeguarding Digital Library Contents and Users: Assuring Convenient Security and Data Quality, D-Lib Magazine, (May 1997).

Gladney97. H.M. Gladney, Safeguarding Digital Library Contents and Users: Access Control, D-Lib Magazine, (June 1997).

Lotspiech97. J. Lotspiech, U. Kohl, and M.A. Kaplan, Cryptographic Envelopes and the Digital Library, IBM Research Report RJ 10069, (1997). To be presented at Verlässliche Informationssysteme, German Computer Society (GI), Freiburg, Germany, (Sept. 1997). Safeguarding Digital Library Contents and Users: Cryptographic Envelopes, D-Lib Magazine, (being prepared for September 1997).

Mintzer91. F. Mintzer and J.D. McFall, Organization of a System for Managing the Text and Images that Describe an Art Collection, SPIE/SPSE Intl. Symp. Electronic Imaging Science and Technology, (Feb. 1991); Proc. Image Handling and Reproduction Systems Integration Conf., 38-49, (1991).

Mintzer92. F. Mintzer, Y.L. Yao, and J.D. McFall, A Computer System for Scanning and Cataloging the Art of Andrew Wyeth, Spectra, the Magazine of the Museum Network, 9-14, (Summer 1992).

Mintzer96. F.C. Mintzer, L.E. Boyle, A.N. Cazes, B.S. Christian, S.C. Cox, F.P. Giordano, H.M. Gladney, J.C. Lee, M.L. Kelmanson, A.C. Lirani, K.A. Magerlein, A.M.B. Pavani, and F. Schiattarella, Towards On-Line Worldwide Access to Vatican Library Materials, IBM J. Research and Development 40(2), 139-162, (March 1996).

Seçilla95. J.P. Seçilla et al., The Documents of the New World, CD-ROM available from Archivos y Bibliotecas, AIE, Apdo. de Correos 179, 28080 Madrid, Spain; ISBN 84-605-0430-1.

Yao91. Y.L. Yao, F.P. Giordano, H.S. Wong, W. Kang, and J.C. Lee, Design Considerations for a High Quality Camera Type Scanner, Proc. IS&T 7th Intl. Congress on Adv. Non-Impact Printing Technology, Portland, Oregon, 441-450, (October 1991).

© Copyright IBM Corp. 1997. All Rights Reserved.

Copies may be printed and distributed, provided that no changes are made to the content, that the entire document including the attribution header and this copyright notice is printed or distributed, and that this is done free of charge.

We have written for the usual reasons of scholarly communication. This report does allude to technologies in early phases of definition and development, including IBM property partially implemented in products. However, the information it provides is strictly on an as-is basis, without express or implied warranty of any kind, and without express or implied commitment to implement anything described or alluded to or provide any product or service. IBM reserves the right to change its plans, designs, and defined interfaces at any time. Therefore, use of the information in this report is at the reader's own risk.

Intellectual property management is fraught with policy, legal, and economic issues. Nothing in this report should be construed as an adoption by IBM of any policy position or recommendation.
