D-Lib Magazine
May 2003

Volume 9 Number 5

ISSN 1082-9873

Keepers of the Crumbling Culture

What Digital Preservation Can Learn from Library History


Deanna Marcum
Council on Library and Information Resources

Amy Friedlander
Council on Library and Information Resources




Technological advances come equipped with both affordances and unintended consequences. Preservation has its own well-known examples of technological advances leading to problems for librarians to solve:

  • "Brittle books," the product of nineteenth-century technological innovations that had produced inexpensive paper;
  • Bulk microfilming techniques employing scanning equipment that required the spines of books to be cut off;
  • Reformatting — or "re-purposing" as our colleagues in the music industry call it — of analog tapes to digital, only to discover that the digital media are more fragile than we realize.

Preservation of new media is even more problematic than preservation of earlier ones. By one estimate, as much as half of the global motion picture library may become inaccessible in 10 to 15 years because the storage media have degraded [1]. To understand how preservation will be addressed in the 21st century, we need to look at how librarians have responded to preservation needs in the past and ask what we can learn from that experience to enable our culture to save, in usable form, its proliferation of electronic information. One major change is the shift from preservation of the medium or the physical artifact (that is, the book or the reel of tape) as a means of preserving content, to preservation of content that may be platform-independent together with meaningful access to that content.

Who Is Responsible?

Consider this lengthy quotation, dated 1900, from Ainsworth Rand Spofford, who had served as Librarian of Congress until 1897:

No one who does not know how to use the odd moment is qualified for the duties of a librarian. I have seen, in country libraries, the librarian and his lady assistant absorbed in reading newspapers, with no other readers in the room. This is a use of valuable time never given to be indulged in during library hours. If they had given those moments to proper care of the books under their charge, their shelves would not have been found filled with neglected volumes, many of which had been plainly badly treated and injured, but not beyond reclamation by timely and provident care [2].

Anachronistic images in this century-old text notwithstanding, Mr. Spofford's basic insight rings true today: preserving collections is an integral component of managing libraries. Spofford may have been a bit off in calling (as he did in the work cited above) for keeping the library's temperature at seventy degrees Fahrenheit. But even back in 1900, he recognized the need for regulating the internal environment, for good housekeeping, and for pest controls. He also warned of the damage that collections could suffer from inappropriate use, carelessness, and even malice. He waxed particularly eloquent about vandalism and theft, citing a "custodian" of a public library in Albany who had reported that all the plates were missing from certain books and that poetry and illustrations had been cut out of magazines left on tables. "Strange to say that many of these depredations were committed by women," Spofford added. But men, too, came in for a share of his criticism, particularly for thievery. On that score, Spofford cited a librarian in Massachusetts who said that "it was common experience that clergymen and professional men gave the most trouble" [3].

Coming from a longtime Librarian of Congress, Spofford's message reflected the importance that leaders in the library profession had come to attach to preservation. But his remark also demonstrated that many libraries made preservation less than a core activity, something that could be neglected by librarians more interested in reading newspapers than in safeguarding them. Today, preservation suffers more from inadequate funding than from negligent librarians, and today's libraries have much more material than in Spofford's time, both in quantity and in kind, that needs preservation.

Part of the problem reflects our uncertainty in this country about who is responsible for preservation of library resources. In other developed countries, a national library typically has responsibility for acquiring and preserving a country's published output. Our Library of Congress serves as a national library in some respects but has no universal preservation charge for the nation — although it has recently been assigned the responsibility for forging an infrastructure to support long-term preservation of digital content through the National Digital Information Infrastructure and Preservation Program (NDIIPP), which was reported in this magazine a year ago.

Historically, the Library of Congress has been first and foremost the Congress's library. The personal library of Thomas Jefferson — the core from which the Library of Congress grew — was a broad collection but certainly not an assemblage of all books published in the United States. While the Library of Congress attempts to collect widely, and even comprehensively in many subjects, following a collection policy like every other American research library, it does not collect in such fields as medicine and agriculture (the responsibilities for which are handled by the National Library of Medicine and the National Agricultural Library, respectively).

What the Library of Congress particularly preserves tends to be its special collections — those unique maps, manuscripts, photographs, films, radio broadcasts, and materials in other formats held only by the Library of Congress. There is not now, nor has there ever been, a national library in this country that takes responsibility for the preservation of American publications overall. So what gets preserved is a result of individual decisions by widely distributed research libraries throughout the country. What can be said about the results?

Books and Paper: Permanence and Durability

Through the twentieth century, preservation methods focused initially on achieving permanence and durability for two core kinds of materials in library collections: the book and the newspaper. Technological innovations in the mid-nineteenth century had produced a less expensive kind of paper made from wood pulp, which made possible an explosion in printed materials that contributed to public literacy and to the organization of public libraries with print collections. The unintended consequence of the new paper technology was the later "brittle book."

The chemistry of paper and of paper processing was not fully understood at the time. Indeed the chemistry of paper was the subject of fairly substantial research in the early twentieth century. The new, cheaper books, which were a boon to literacy and libraries, turned out to deteriorate fairly quickly. Spofford and his contemporaries already had seen books begin to decay. In 1897, his successor as the Librarian of Congress, John Russell Young, took note of the "grave questions arising out of the modern conditions in the manufacture of paper" [4]. Young presented a plan to Congress calling for materials presented for copyright protection to be printed on good-quality paper. He wrote:

While this [cheap paper] might be dismissed as one of the necessary developments of business and a result of modern invention, there is no reason why libraries should not be protected. The expense of printing a few copies of any publication — the matter in type and on the press — would be a trifle. A remedy for the anticipated evil could be found in an amendment that no certificate of copyright should issue until the articles copyrighted were deposited and, at the same time, printed on paper not below a fixed grade. There would be no hardship in this — a small advance upon the cost of a few sheets of paper and a moments delay in the pressroom. It seems feasible and could be in no sense a grievance when we consider the value of the protection accorded by the copyright. Such a provision would assure the permanence of so much of the collection of this and other great libraries, that it is earnestly commended to the attention of Congress [5].

The plan was not implemented, and what we came to call "the brittle books problem" grew.

Late-nineteenth-century librarians like Mr. Spofford could see that parts of their collections were crumbling but did not fully understand the scientific reasons. By 1910, however, federal scientists had begun to explore the chemistry of paper and paper products, and these efforts continued. Some in the professional library community tried to stay abreast of the research. Nonetheless, for most librarians in Spofford's professional lifetime, binding and repair were the primary preservation activities, along with managing library buildings protectively, training staff in the proper handling of materials, and trying to persuade patrons not to mark, abuse, or steal the books. Little systematic effort occurred in professional librarianship programs or in libraries themselves to address the source of the problem until the late 1950s and the 1960s [6].

In that period, the Council on Library Resources (CLR) made a number of grants to support preservation efforts. CLR sponsored research by W. J. Barrow into the causes of paper deterioration and the possibilities for development of a "permanent/durable" text paper. Barrow horrified the library world when he reported research indications that only three percent of the books published between 1900 and 1949 would last more than fifty years. In 1960, the Association of Research Libraries (ARL) appointed a Standing Committee on the Preservation of Research Library Materials, which recommended establishing a central federal agency to preserve a physical copy of every "significant written record" and to provide copies to other libraries as needed. Microfilming, in use in a few libraries as early as the 1930s, was endorsed as a method to preserve rare and endangered texts and to reduce use of fragile originals [7].

Televised images of the great floods in Florence in 1966 further increased awareness of preservation among research libraries and collections in the U.S., provoking research into preservation, conservation, and disaster recovery methods for books and other artifacts. Many preservation specialists went to Florence to help save materials and learned much about preservation techniques in the process.

The 1960s also saw the inauguration by Chicago's Newberry Library of the first local conservation program, the organization of professional associations for conservation, the introduction of formal training programs for conservators and museum professionals, and more publications dealing with preservation and conservation in libraries and museums [8]. In 1972, the Library of Congress established a research laboratory, headed by John C. Williams, which spearheaded work on diethyl zinc de-acidification, work that continues to be important at the library. More recently, the Library of Congress awarded contracts to Pittsburgh-based Preservation Technologies, L.P. (PTLP) for de-acidification. Other preservation methods have been explored at the Barrow Laboratory and the Public Archives of Canada. Institutions of higher education such as Carnegie-Mellon University and the Rochester Institute of Technology set up centers for studying the chemistry and technology of paper, binding, paints, inks, and varnishes [9].

Realizing that the cycle of deterioration would not end unless alkaline paper was adopted in book manufacture, the library community worked with other concerned communities and standards-setting organizations to promote guidelines for and use of durable paper. The Council on Library Resources initiated work on production guidelines for book longevity in 1979. A committee, headed by Herbert Bailey of the Princeton University Press, made important studies of "permanent" paper and binding. And in 1981, the Z39 Committee of the American National Standards Institute drafted a standard for permanent paper for printed library materials.

In the 1970s, major university libraries began to undertake large, cooperative preservation projects. An important by-product was that preservation departments became commonplace in research libraries. Unfortunately, institutional preservation programs were not generally funded as a regular part of the research universities' budgets. Instead, many preservation efforts were made possible by external funding from the National Endowment for the Humanities (NEH) and private foundations, especially The Andrew W. Mellon Foundation.

In the mid-1980s, both the book-preservation problem and the effort to address it took on new dimensions. CLR spun off the Commission on Preservation and Access to give brittle books special attention. The commission estimated that at least 3.3 million and possibly more than ten million books remained at risk in the U.S., and that de-acidification would cost between $250 million and $1 billion [10]. NEH expanded grant-making for preservation, setting a goal of helping research libraries microfilm at least three million of the most at-risk volumes. NEH also promoted standards and guidelines for microfilming so that individual library preservation activities would produce a decentralized but nationally accessible collection of works on durable microfilm. Book deterioration may be slower than was projected, but microfilming seemed essential at the time to ensure preservation of the content of millions of valuable texts that librarians feared would crumble on research library shelves. Librarians also recognized the need to conserve unique paper artifacts in special collections.

The biggest debate during the brittle books era focused on centralization. Patricia Battin, president of the Commission on Preservation and Access, believed that centralization would finally link preservation and access. She argued for the creation of a central repository for microfilm masters that would serve as an archives as well as a distribution system. Any library microfilming a book would add a permanent master copy and a service master to the central repository. Any library wishing to add a preservation copy to its own collections would be able to acquire, at cost, a copy of the service master. Though NEH funding encouraged collaboration among libraries, centralization lost out because libraries aspiring to research status established their own individual preservation offices, and preservation became a recognized career track within individual institutions.

Federal and philanthropic funding continues to support preservation within individual libraries and archives rather than to promote centralization of preservation in a single institution or handful of institutions. Thus, NEH has supported microfilming nineteenth-century newspapers state by state, and the National Agricultural Library has supported microfilming projects in state libraries to protect endangered materials documenting agricultural history [11]. Not surprisingly, this decentralized and local approach coincided with the local history movement in the 1980s and increased attention to conservation and preservation of local heritage resources of all sorts. NEH supports brittle-book microfilming at a number of university libraries that avoid redundant effort by sharing information about materials scheduled for filming [12].

Two years into the twenty-first century, then, what can we say about preservation in libraries? First, preservation has become part of the management of a modern library, embracing not only the reformatting of texts but also the improvement of library buildings and their internal environments, the conservation and repair of items in library collections, the adoption of commercial standards for books and paper, the education of readers and staff about handling and caring for fragile materials, the training of librarians in preservation, and the adoption of disaster planning and recovery strategies [13]. Second, we have learned a lot from scientific study about library materials and how to manage them, and awareness of such research must become part of professional librarians' training and library management. Third, effective preservation programs require organizational coordination to ensure an appropriate and consistent standard, to reduce unnecessary redundancy, and to enable future users to find the materials. Fourth, preservation increasingly requires value judgments about whether to retain an original book, journal, or other artifact or just its intellectual content. That question is growing as libraries confront a huge additional set of preservation concerns and challenges associated with digital content.

Why Digital is Different

Digital technology causes us to look at content in new ways. Our brief synopsis of the history of preservation in libraries centered on the book, arguing that the degradation of books and newsprint, which was an unanticipated adverse consequence of technological innovation, provoked the modern, scientific study of preservation requirements. One solution has been re-formatting, first to microfilm and subsequently to digital media. High-quality film is a relatively stable medium. Microfilm readers are simple, if not always pleasant, to use, and conscientiously microfilmed images of printed artifacts — including the binding, the pages, the illustrations, and even surviving handwritten marginalia — convey information that these artifacts frequently embody beyond their texts. More recently, the rise of electronic publishing, first in creating scholarly e-journals and then in digitizing rare texts, decoupled content from the artifact, so that information became separated from its original vehicle and expressed in a new form — as text on a screen that might look different depending on which computer program presented it.

For at least some forms of communication, such as scholarly journals, electronic delivery of information in digital form has been a godsend. Researchers quickly recognized the convenience of having electronic information delivered to their desktops and being able to search it. Librarians grew concerned, however, over questions about the stability of electronic materials. Concern about long-term preservation represented an early barrier to the acceptance of electronic journals as a form of publication equivalent to print, and it has led many publishers to treat electronic versions as add-ons they provide in parallel with print formats. JSTOR, Project Muse, and related programs arose to provide electronic journals consisting of digitized page images with so-called "dirty OCR" behind them to support searching. Other programs provide straight text delivered from aggregated databases or make material available in one of the mark-up languages. Thus the same journal article may be delivered in multiple formats, which raises preservation questions. Is any one electronic version more authentic than the others, and which, if not all, should be preserved?

The research literature is fundamental to the research endeavor, which is deeply embedded in higher education and in industry. But journal publication has become increasingly expensive since the 1970s as prices of printed journals have risen. Options for electronic publishing have engaged the attention of both the library and the publishing communities. The National Library of Medicine and The Andrew W. Mellon Foundation have supported electronic-journal projects that teach us a lot about the challenges of digitization, delivery, and archiving. Possibly most important is the recognition that archiving must be considered at the time the material is created rather than at the end of the distribution chain.

But for the long term, who is preserving electronic content? And who should? If libraries become gatekeepers, aggregators, editors, and consultants to researchers, leasing rather than purchasing the resources researchers need, who will ensure ongoing access to those resources? Who will preserve them as part of the human cultural record? Let us be clear: to select, advise, and educate are time-honored rather than newly arising roles for librarians. But the tacit assumption that future journal volumes — future forms of electronic communication and scholarship — will be as locally available a hundred years from now as libraries' runs are today of, say, the nineteenth-century The Gentleman's Magazine and Fortnightly Review, is not necessarily justified by the leasing mode of access.

One solution is to keep publishing material in print and reformatting it to microfilm and digital formats. But that seems expensive and unrealistic. Moreover, we are already seeing all kinds of information created in digital form only — scientific databases, visualization tools, geographic information systems, e-journals, and a wealth of cultural history contained in images, sound, television, radio broadcasts, and cinema. Librarians are concerned about preserving these resources, too.

Electronic storage media degrade, just as paper does, only perhaps more quickly. Signals stored on electronic media also degrade, and not at a consistent rate, and hardware and software become obsolete. Data must therefore be transferred to new media or migrated to newer platforms, operating systems, and program applications. An alternate strategy is to emulate the original; that is, to provide a way through software to mimic the hardware on which a given system ran. Either way, each item in a digital archive requires active management. Discs, tapes, and other electronic media, like print, must be maintained in controlled environments, but may take more labor than print to preserve. Finally, metadata is vital for information management but is labor-intensive and hence expensive to create.
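The transfer step described above can be sketched in a few lines of code. The following Python example is a minimal illustration, not a method the authors describe: it assumes that comparing SHA-256 checksums before and after copying is an acceptable way to confirm that a file survived the move to new media intact, and the function names (sha256_of, migrate) and the fields of the returned record are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source: Path, target_dir: Path) -> dict:
    """Copy source into target_dir and confirm the copy is bit-identical.

    Returns a small, hypothetical metadata record for the migrated file.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    before = sha256_of(source)       # checksum on the old medium
    shutil.copy2(source, target)     # copy contents and timestamps
    after = sha256_of(target)        # checksum on the new medium
    if before != after:
        raise IOError(f"copy of {source.name} does not match the original")
    return {"name": source.name, "sha256": after, "bytes": target.stat().st_size}
```

Calling migrate(Path("old_media/record.txt"), Path("new_media")) on an existing file would copy it and return its record. A check of this kind confirms only that the bits were carried over unchanged; it says nothing about whether future software will be able to read the format, which is the problem that migration to newer formats and emulation are meant to address.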

Assuming, however, that we solve the problems of preserving electronic data, we then must deal with questions about who owns it and what can legally be done with it. Consider the World Wide Web itself. When I hold a book, I intuitively know what that "object" is. But what is the Web that I search? What are its boundaries? Who owns it? Is the Web what I see on my browser, or is it the cluster of files, utilities, tools, and executable code that might be required to make a seamless "page" appear on my screen? And if I download information to my cell phone or Palm Pilot, and it looks different from what I might see if I accessed the same information from my desktop, how many "objects" are there? What is an archivist obligated to save to preserve the "record"? For example, from a radio broadcaster's perspective in the 1930s, the "show" consisted of a series of elements: the script, the musicians, the actors, the announcer, the advertisers, sponsors, and so on. To the listener, it was seamless. Libraries hope to provide future scholars with both: the production elements and the performance. However, we do not yet know how to do that in the digital world.

Finally, today's information technologies have been justly celebrated as democratizing information production and access. On the Web, we can find an enormous range of information of potential value. Determining what out of all this will be saved, and by whom, will require innumerable local judgments. Determining how Web resources can be saved will require some large national and international decisions about best practices, standards, and organizational structuring.

We believe that many of the things we learned about managing preservation in print will stand us in good stead in managing digital preservation. Archiving digital works and records is a pressing need that engages attention at home and abroad with efforts under way at national libraries and archives, including both the Library of Congress and the National Archives and Records Administration here in the United States; research libraries and universities; and non-profit organizations (for example, the Research Libraries Group, OCLC). There is also evidence that the entertainment, publishing, and other content industries are coming to realize the potential commercial importance of future use of their content, which means attention to preservation now.

We know that many parties will want, and need, to participate, which means that organization is a critical issue. One approach is described in the Library of Congress's recently released plan, which proposes a distributed technical and organizational architecture rendered coherent by technical protocols, standards, and coordinated practices and agreements. Librarians inherit a tradition of local and global coordinated practice and procedures (namely, interlibrary loan, shared cataloging, and the development of directories of microform and manuscript collections), practices that are perpetuated through library school, professional training, and continuing education. Just as preservation of analog materials required libraries to expand their organizational functions and librarians to alter the professional curriculum, we can expect that learning to manage digital resources will require similar adjustments. Indeed, we have already begun to see professional courses, seminars, and workshops on digital librarianship and digital preservation.

As we work our way in the twenty-first century toward the library of the digital era, we need new tools that can help us organize expanding kinds of information, enable us to coordinate and share our resources, and bring us together in preserving them. Librarians also bring a history of distributed organizational practices to the table that will be as necessary as tools and technologies. Just as the danger of "brittle books" spurred work on print preservation, the threat of losing digital information is driving efforts to save electronic resources. It will require us to do things differently but our mission remains constant: to preserve the resources on which research, teaching, and learning so heavily depend [14].

Copyright © Deanna Marcum and Amy Friedlander

