Volume 5 Number 3
The California Digital Library
John Ober
Assistant Director for Education and Communication
California Digital Library
University of California Office of the President
- Strategic action and traditional planning
- Progress to date
- Core technologies
- Future enabling technologies
- End notes
- Hyperlinks to sites mentioned in this story
In late January 1999 -- less than two months ago as of this writing -- the California Digital Library (CDL) opened its "digital doors" to the public. Predictably, the doors in this case are represented by the CDL's web site, which serves as a gateway to new collections, services, and tools as well as to legacy digital resources hosted or produced by the University of California, such as the Melvyl® online union catalog and the California Periodicals database.
This article describes some of the assumptions and strategies that led to the formation of the CDL and that represent its underlying organizational motivation. It goes on to describe some of the digital resources as well as core and future technologies that underlie its digital presence, and that largely will determine its usefulness.
The news of public availability, announced in D-Lib Magazine, and widely elsewhere [note 1], downplays the fact that, like many other large-scale digital library efforts, the CDL is an organizational innovation as well as a collection of, and framework to support, digital resources and technological innovations. The University of California, with its nine campuses, 180,000+ students, and 140,000+ faculty and staff, sometimes lumbers in a way evocative of the mascots of two of its campuses -- the Bruins (UCLA) and the Bears (UC Berkeley). The two years of planning that led to the formation of the CDL and the scant year following that led to its public "release" [note 2] represent an explicit set of assumptions and goals, and an unusually rapid pace and sense of urgency for the University.
Some key characteristics of the CDL's current and proposed digital content and technologies are that:
- they are focused on resource sharing;
- they complement print holdings and, in many cases, enhance the ability to share print resources;
- they are connected to the core mission of the university and of the organizations, including libraries, that manage its information resources, rather than being relatively isolated experiments;
- they will link to (and may serve as a testbed for) digital library research (for example, to a selection of the NSF/DARPA/NASA Digital Libraries Initiative);
- they represent a "consumption of best practice" (for example, of standards or principles promulgated by the Digital Library Federation and others);
- in many cases they are available to the public at large.
Key characteristics of the CDL as an example of organizational innovation include:
- a "co-library" model which draws from and depends upon expertise, resources, and priorities across all of the UC campuses as well as strategic partners such as the State Library of California (this principle is well articulated in an interview with founding University Librarian Richard Lucier in a February 1998 D-Lib editorial);
- a recognition of the inter-relatedness of the library function with scholarly communication and with technological innovations;
- establishment of a framework in which further technological innovation can take place that is deeply tied to the core mission and programs of the university.
A caveat may be necessary here. While the CDL has goals and an agenda that carry well outside of the university boundaries, it is, to date, primarily a product of a large academic research university. That context necessarily affects not only its origins but also the way its story is told. While the following descriptions are couched largely in academic terms, the principals in the CDL have well in mind that the CDL, and digital libraries writ large, are about much more than academic libraries going digital. They are about organizational change; new forms of information; the dissemination of, and newly enabled interaction with, scholarship; collaboration; and the matching of technologies to information resource and management challenges, to name just a few of the contextual themes that appear below.
Strategic action in addition to traditional planning
In August 1996, University of California President Richard Atkinson announced the Library Planning and Action Initiative. The 18-month initiative was charged to create a framework for library development for a five to ten year horizon while also recommending immediate activities in order to supplement planning with lessons learned from strategic action.
Several critical conclusions were reached quickly and are named in the Initiative's task force report. They include:
- There is indeed a serious library crisis, multifactorial in scope, which threatens the ability of UC's libraries to support adequately the University's education, research, and public service missions.
- The crisis in scholarly and scientific communication is not confined to UC; its impacts are international.
- Current practices, including the building of [separate] comprehensive research collections [for the nine campuses of the University of California system], cannot be sustained.
- The [campus] libraries have been leaders in re-engineering processes for operational efficiencies, but further re-engineering to achieve additional cost savings, while potentially practical, does not address the fundamental crisis.
- Solutions to this crisis need involvement from all stakeholders: the libraries cannot solve this crisis in isolation as it has deeper roots in current policies and practices of both scholarly and scientific communication and academic advancement.
- Certain immediate strategic actions need to be taken as steps to building a foundation for a sustainable UC library system.
The task force went on to recommend seven strategies to address the planning conclusions and move toward a goal of comprehensive access to scholarly communication for the University community. The strategies include:
- UC should seek innovative and cost-effective means to strengthen Resource Sharing.
- UC should establish the California Digital Library.
- UC should sustain and develop mechanisms to support campus Print Collections.
- UC should seek mutually beneficial Collaboration with Libraries, Museums, other Universities and Industry.
- UC should develop an Information Infrastructure that supports the needs of faculty and students to disseminate and access scholarly and scientific information in a networked environment.
- UC should lead the national effort to transform the process of Scholarly and Scientific Communication.
- UC should organize an environment of Continuous Planning and Innovation.
While the second recommendation led to the creation of the CDL in October 1997 as a new organizational unit, the digital library has quickly become a primary focal point for all of the strategies.
Of course, just as is true throughout the world, many units in the University -- chief among them libraries -- were already taking strategic actions and building digital collections and services. The desire to use digital technologies to capture more economies of scale and scope, to leverage expertise and momentum, and to work collaboratively for the common good was reinforced by a longstanding, and increasingly urgent, desire to preserve local strengths while enhancing the sharing of resources. At UC this has been captured in the catch phrase "One University of California, One Library."
This has led to a creative tension between form and activity that underlies the "co-library" model of the CDL. While the CDL is being established as an independent library, with the hope that it will be recognizable and predictable as a set of collections, services, and tools, it is also a "framework for collaboration." If successful, it will enable many organizations, starting with the nine UC campuses and select strategic partners, to work together toward those economies of scale and toward synergies that produce innovations.
To put it another way, the CDL is both a coalesced set of digital technologies (hardware, lines of code, digital objects) and a set of softer organizational technologies (a focus of energies and resources, a condensing of experience, and a structure for experiments in collaboration and integration of digital library components).
Progress to date
Building, sharing, and preserving collections
The CDL's commitment to supporting the University's scholarship depends on the development and acquisition of high-quality digital content. The CDL has an aggressive program in licensing scholarly materials, including abstracting and indexing databases and full-content electronic journals and databases. It is creating digital access to unique and valuable special and archival collections of the University and of its California partners. Selection decisions are based on rigorous criteria of quality and value, foremost for the academic programs of the UC campuses. Intensive collaboration with faculty and librarians across the University led to priorities for the founding Science, Technology, and Industry Collection (STIC). Collaboration is rapidly expanding to establish priorities for Social Sciences, Humanities, and Government information.
The CDL provides access to the following categories of digital content and is exploring methods to ensure perpetual access to them. The first three -- the Online Archive of California, Melvyl Union Catalog, and California Periodicals database -- are freely available to the public. While most of these resources can be reached directly, the CDL's Directory of Collections and Services also serves as a browsable and searchable gateway for their discovery.
- The Online Archive of California (OAC) -- a union database of digital descriptions of archival and manuscript collections from all of the UC campuses and from around California. These archival finding aids use the standard for Encoded Archival Description (EAD). EAD is a document type definition (DTD) for the Standard Generalized Markup Language (SGML). Over 3,000 finding aids from more than 20 institutions describe collections that are located in California. In some cases, primary sources themselves have been digitized and are available. Work to select primary source content for digitization from the UC collections is ongoing.
- The Melvyl Union Catalog -- records for materials (books, archives, audio-visuals, computer files, videorecordings, dissertations, government documents, maps, music scores, and recordings) in the libraries of the nine UC campuses, the California State Library, the California Academy of Sciences, the California Historical Society, the Center for Research Libraries, and the Graduate Theological Union in Berkeley. There are currently over 9 million unique titles representing over 14 million holdings. This database has long captured widespread attention as a successful pioneering effort in "library automation" (developed by the former University of California Division of Library Automation).
- The California Periodicals database -- built in partnership with the California State Library, it represents journal holdings not only in the University of California system, but also in over 500 libraries statewide. Contributors include the 9 UC campuses, the 22 campuses of the California State University system, the Center for Research Libraries, the California Academy of Sciences, the California Historical Society, Stanford University, the University of Southern California, the Getty Center for the History of Art and the Humanities, and the Graduate Theological Union in Berkeley. Other contributors include community college libraries, public libraries, selected special or corporate libraries and California medical libraries in other universities and hospitals.
- Electronic journals and full content -- More than 2,000 electronic journals are now licensed from major scholarly publishers and information providers. The licensing program is identifying additional priority titles. Journals and reference texts, such as the Encyclopedia Britannica Online, can be found by browsing or searching the CDL's Directory of Collections and Services.
- Abstracting and indexing databases -- many of these are hosted locally by the CDL, and access is provided to authorized users via the same interface used to search the Melvyl Catalog. Others can be searched via the same interface, but access to the content is provided by a Z39.50 link to the providers' servers. Still others are licensed for access via the vendors' sites and interfaces.
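The Online Archive of California entry above notes that its finding aids are encoded with EAD. As a rough illustration of what that markup makes possible, the sketch below reads a tiny, invented EAD-style fragment with Python's standard library. The element names (`archdesc`, `did`, `unittitle`, `unitdate`, `repository`) come from EAD, but the collection described is hypothetical, and the fragment is written as well-formed XML rather than full SGML so that a stock XML parser can read it. It is not a description of how the OAC itself processes finding aids.

```python
import xml.etree.ElementTree as ET

# Hypothetical finding-aid fragment; the collection described is invented.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <unitdate>1880-1920</unitdate>
      <repository>Example Campus Library</repository>
    </did>
  </archdesc>
</ead>
"""


def summarize_finding_aid(ead_text):
    """Return the collection-level title, date, and repository as a dict."""
    root = ET.fromstring(ead_text.strip())
    did = root.find("./archdesc/did")
    if did is None:
        raise ValueError("no <archdesc>/<did> element found")
    fields = {}
    for name in ("unittitle", "unitdate", "repository"):
        element = did.find(name)
        # Missing or empty elements are reported as None rather than guessed.
        if element is not None and element.text:
            fields[name] = element.text.strip()
        else:
            fields[name] = None
    return fields


summary = summarize_finding_aid(SAMPLE_EAD)
print(summary)
```

Because every finding aid uses the same element names, a union database can pull the same fields out of descriptions contributed by many different institutions.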
Services and tools
The CDL is pursuing technological innovations that enhance services for discovering, sharing, accessing, manipulating, and integrating scholarly content in all forms. Already available are the following tools and services:
- Topical browsing of digital resources via the CDL's "Directory of Collections and Services".
- User or library-oriented views of/windows into digital resources. Links into the Directory of Collections and Services can be constructed to produce a search result with filters by topic, resource format (electronic journal vs. abstracting and indexing database, etc.), or local campus availability. A branch library might thus provide a link to "Health Science electronic journals" available from that library or campus.
- Update, a service that runs user-defined weekly searches to retrieve new items in selected databases.
- Request, a service that enables UC-authorized users to borrow books in the Melvyl Union Catalog from any campus in the UC system.
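The filtered links into the Directory described in the list above can be pictured as ordinary URLs carrying query parameters. The following is a minimal sketch under stated assumptions: the base URL, the parameter names (`topic`, `format`, `campus`), and the campus code are all hypothetical, since the article does not document the Directory's actual link syntax.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL and parameter names, for illustration only.
DIRECTORY_URL = "http://directory.example.org/search"


def build_directory_link(topic=None, resource_format=None, campus=None):
    """Build a link that opens the directory with the given filters applied."""
    params = {}
    if topic:
        params["topic"] = topic
    if resource_format:
        params["format"] = resource_format
    if campus:
        params["campus"] = campus
    if not params:
        return DIRECTORY_URL
    # urlencode handles escaping of spaces and other reserved characters.
    return DIRECTORY_URL + "?" + urlencode(params)


link = build_directory_link(
    topic="Health Science",
    resource_format="electronic journal",
    campus="UCSF",
)
filters = parse_qs(urlsplit(link).query)
print(link)
```

A branch library could embed such a link on its own page, so that its users land directly on the subset of resources relevant to them.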
Supporting innovations in scholarly communications
The development of the CDL is, in part, one of UC's responses to trends in scholarly communication (e.g., increased costs for traditional methods of communication, continuous and significant increase in the volume of scholarly information, and increasing demand for new technologies and new capabilities with little abatement in the demand for the traditional). The CDL's activities include:
- Creating, with its campus library partners, a database of University faculty members who are editors of prestigious scholarly journals, and using it to co-host forums for faculty discussion of the challenges and opportunities in scholarly communication.
- Under direction from the UC President, exploring alternative forms of scholarly publishing.
- Joining, as a founding member, the Scholarly Publishing Academic Resources Coalition (SPARC), an organization sponsored by the Association of American Universities (AAU) and the Association of Research Libraries (ARL), whose charge is to work with academic and publishing partners throughout the country and abroad to create alternatives in scholarly publishing.
Success of the CDL in achieving and maintaining its charge is dependent on collaboration with librarians and academics on all of the UC campuses as well as with partners across California and the US. Some recent highlights include:
- Experiments with other libraries, including the California State Library and its "Library of California" initiative, to develop new, sustainable methods and services for sharing resources among multitype libraries.
- Several major licenses for the full content of core scholarly journals, including those with the American Chemical Society and with JSTOR, include the flexibility to experiment with extending access to the California State University system, community college campuses, and public and school libraries.
- Collaborations on grant proposals to explore technological innovations in digital libraries. Our partners have included the Berkeley and Santa Barbara campuses, the San Diego Supercomputer Center, and Stanford University. Serving as a testbed for technology transfer in NSF/DARPA/NASA Digital Libraries Initiative -- Phase 2 proposals is an important aspect of this activity.
- Membership in the Digital Library Federation (DLF), the Scholarly Publishing Academic Resources Coalition (SPARC), and various consortia such as the International Coalition of Library Consortia (ICOLC) ensures cooperative progress in our mutually recognized goals.
Core technologies
There are several principles underlying the core technologies of the CDL. These include a devotion to standards, and thus interoperability; a belief that digital collections and services will continue to become highly distributed; the pursuit of "seamless integration" of resources and access to them as a worthwhile, if elusive, goal; and a goal of ubiquitous, location-independent access to the CDL and the resources it maintains.
Although the CDL is very young, it has inherited significant core technologies represented by the Melvyl Union Catalog and the telnet and web interfaces to that catalog and other CDL-hosted resources. The CDL encompasses the activities -- formerly carried out by UC's Division of Library Automation -- to maintain and enhance these key technologies.
More specifically, the CDL has among its core technologies the following list which is likely to be familiar to D-Lib readers:
- Bibliographic databases and the standard record formats (e.g. MARC), linking algorithms and associated protocols.
- HTML and web browser standards, including dynamic access to underlying databases.
- SGML and the EAD as current digital publishing standards.
- Interoperability protocols such as Z39.50.
Future enabling technologies
An equally important list, whose elements are frequently discussed in this magazine, is of the technologies that are needed in the short term to address known roadblocks and inefficiencies, and in the long term to continue to make progress toward our vision.
Although a representative list of these enabling technologies is presented below, the overall strategy to identify and prioritize technology development is threefold:
- To establish advisory and working groups that help us choose technologies to deploy and on which to focus for development. Two such groups -- the Technology Architecture and Standards Working Group and the Strategic Innovations Working Group -- have been established and charged. Further information on these and other CDL advisory and working groups is available on the CDL web site.
- To contribute resources and energies to emerging best practices such as those promulgated by the Digital Library Federation.
- To work with research partners in their development of technology innovations, and, where appropriate, serve as a testbed for technology transfer to a "production" environment.
The enabling technologies include:
- Metadata standards for digital objects and resources -- to further, among other things, the distributed architecture already emerging.
- Persistent naming of resources and objects -- to increase the stability and decrease the maintenance of pointers to resources.
- Better authentication and authorization -- to allow location-independent ubiquitous access and increased ease in defining authorized users and user groups.
- Digital object standards, such as for image quality -- for example, to distinguish archival/preservation level objects from those in regular use.
- New representation of search processes and results that can be absorbed and manipulated by users -- to better match discovery tools with desired functionality and ease of use.
- Viewer technologies for different data (e.g. multimedia, geospatial) -- to increase the ease and dimensions of use immediately available after discovery of a resource.
- Flexible "profiling" and user customization of environments -- to better match services and tools to particular needs and behaviors.
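The persistent naming item in the list above rests on a simple idea: links store a stable name, and a resolver maps that name to wherever the resource currently lives. The sketch below illustrates only that idea. The `NameResolver` class, the `example:` naming scheme, and the URLs are hypothetical and do not represent any naming system the CDL has chosen.

```python
class NameResolver:
    """Maps persistent names to whatever location currently holds the resource."""

    def __init__(self):
        self._locations = {}

    def register(self, name, url):
        """Record (or update) the current location for a persistent name."""
        self._locations[name] = url

    def resolve(self, name):
        """Return the current location, or raise KeyError for an unknown name."""
        return self._locations[name]


resolver = NameResolver()

# A catalog record cites the persistent name, never the server address.
citation = "example:finding-aid/0001"
resolver.register(citation, "http://old-server.example.org/aids/0001.sgm")

# The resource later moves; only the resolver's table is updated.
resolver.register(citation, "http://new-server.example.org/oac/0001.sgm")

current = resolver.resolve(citation)
print(current)
```

Every pointer that cites the name keeps working after the move, which is the stability and reduced maintenance the list item describes.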
The new California Digital Library is both a set of digital collections, services, and tools and an important organizational innovation for the University of California and beyond. It operates on principles of intensive collaboration and integration. Its success, and its usefulness to others as a model, depends not only upon its existing and future core technologies, but upon its ability to create and support innovations in sharing resources, in scholarly communication, and in meeting information needs of scholars and students.
Ober, John (1999). California Digital Library Website Opens. D-Lib Magazine, February. http://www.dlib.org/dlib/february99/02clips.html#CA_DL.
University of California (1998). Library Planning and Action Initiative Task Force Final Report. http://www.lpai.ucop.edu/outcomes/finalrpt/.
Friedlander, Amy (1998). A Conversation with Richard E. Lucier of the University of California. D-Lib Magazine, February. http://www.dlib.org/dlib/february98/02editorial.html.
1. See, for example, Lisa Guernsey (1999). University of California's Digital Library Opens Its On-Line Doors. Chronicle of Higher Education, January 29.
2. The ongoing challenges of labelling in this arena extend even to simple concepts such as "a beginning." Should one follow a physical metaphor -- the CDL's "opening;" a high-tech/software metaphor -- the CDL's "release;" or an explorer's metaphor -- the CDL's "launching?" In the end, making an exception to a general principle of consistent labelling, we have used all three.
Hyperlinks to sites mentioned in this story
The CDL web site features a browseable and searchable Directory of Collections and Services and descriptive information about the CDL. See http://www.cdlib.org/.
The University of California currently includes nine campuses at Berkeley, Davis, Irvine, Los Angeles, Riverside, San Diego, San Francisco, Santa Barbara, and Santa Cruz. A tenth campus is being planned for Merced, California. See http://www.ucop.edu.
For historical information about UC's Library Planning and Action Initiative, as well as information about continuing planning and advisory activities, see the Systemwide Library Planning web site at http://www.slp.ucop.edu/.
The EAD standard is maintained by the Library of Congress in partnership with the Society of American Archivists. See http://lcweb.loc.gov/ead/.
The State Library of California is a strategic partner of the CDL. See http://www.library.ca.gov/.
Scholarly Publishing Academic Resources Coalition (SPARC) is an alliance of libraries that fosters expanded competition in scholarly communication. See http://www.arl.org/sparc/.
The Digital Library Federation (DLF) was founded in 1995 to establish the conditions for creating, maintaining, expanding, and preserving a distributed collection of digital materials accessible to scholars, students, and a wider public. It is composed of participants who manage and operate digital libraries. See http://www.clir.org/diglib/dlfhomepage.htm.
Advisory working groups for the CDL are described as part of its planning structure. See http://www.cdlib.org/about/planning/advisorygroups.html.
Copyright © 1999 John Ober