The National Digital Library Project
Wei Dawei, Sun Yigang
The National Digital Library Project (NDLP), which was launched in 2005, has attracted wide attention across Chinese society. This paper introduces the project and covers its basic characteristics, objectives, and content. It also explains the basic concepts and the overall structure of the project and introduces the design of its basic platform, application platform, business management system, and standardization control system. As of the end of 2009, the National Library of China maintained over 320 TB of digital resources. Finally, the article presents the digital resources involved, including collection policies such as the collection of web information and new media.
Keywords: Digital Library, National Digital Library of China, National Digital Library Project
The National Library of China (NLC) is the largest collector of traditional Chinese resources. By the end of 2008, the collections of the NLC totaled 26,980,000 items. Its massive collections enable the NLC to provide information service to the central government and other governmental organizations, educational, scientific, and research institutions, and the general public.
Figure 1. Collections of the NLC
Since the 1980s the Internet has developed rapidly in China. When the China Internet Network Information Center (CNNIC) released the 1st Statistical Survey Report on the Internet Development in China in October 1997, the number of 'netizens' in China had reached 620 thousand, with 290 thousand computers connected to the Internet. According to the 24th Statistical Survey Report on the Internet Development in China, by the end of June 2009 the total number of netizens in China had reached 338 million, of whom 155 million obtained Internet access through their cell phones. Meanwhile, the shortage of Chinese digital material means that only 12% of Internet resources are in Chinese.
As an important online information provider, the NLC shoulders great responsibility for the collection of Chinese digital resources and subsequent service provision. In October 2005, the NLC launched the National Digital Library Project with the support of government funding.
1. Targets of the National Digital Library Project
The targets of the National Digital Library Project include the following:
2. Components of the National Digital Library Project
The components of the National Digital Library Project include hardware infrastructure, digital library application systems, digital library standards, and digital resources collections and services. Through the construction of the hardware and software systems, the following targets will be met:
The components of the National Digital Library of China are illustrated in Figure 2 below.
Figure 2. Components of National Digital Library of China
2.1 Hardware Infrastructure
The hardware infrastructure includes network, storage, and cluster systems.
The National Digital Library will be connected to the major networking systems of China, such as those of the China Telecommunications Corporation, China United Network Communications Corporation Limited, and State Administration of Radio Film and Television. In order to provide better service to government, educational and research institutions, the NLC systems will interconnect at high speed with the China Research and Educational Network and the network systems of the main government organizations. As NLC Phase II is completed, it will connect with the network systems of Phase I and branch libraries. Furthermore, wireless access to the network will be available to readers in NLC Phases I and II.
The storage policy of the NLC is to combine online, near-line, and offline storage systems. The NLC will use disk, tape and CD for online storage, FC-SATA disk for near-line storage, and CD and tape for offline storage. The capacity of online, near-line, and offline storage will grow to 150 TB, 150 TB, and 360 TB respectively.
The following are stored online:
Near-line storage is used for data with a low frequency of use, high-quality digital files intended for permanent preservation, and resources collected from the Internet. Offline storage is used for little-used digital resources, digital resources that need to be permanently preserved, and data backup. Cluster computing will also be employed, and a cluster management system will be built to manage the multi-cluster environment.
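The tiering policy described above can be illustrated with a minimal Python sketch. It is not the NLC's implementation: the `Usage` levels, the `Resource` fields, and the `tiers_for` function are hypothetical names introduced here, and the rule that frequently used material goes online is an assumption, since the text does not spell out the online criteria. Only the three planned capacities are taken from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class Tier(Enum):
    ONLINE = "online"
    NEAR_LINE = "near-line"
    OFFLINE = "offline"


class Usage(Enum):
    # Hypothetical usage levels; the text only distinguishes
    # "low frequency of use" and "little-used" material.
    FREQUENT = "frequent"
    LOW = "low"
    RARE = "rare"


# Planned capacities in TB, as stated in the text.
PLANNED_CAPACITY_TB = {Tier.ONLINE: 150, Tier.NEAR_LINE: 150, Tier.OFFLINE: 360}


@dataclass(frozen=True)
class Resource:
    name: str
    usage: Usage
    preservation_master: bool = False  # high-quality file kept permanently
    harvested_from_web: bool = False
    is_backup: bool = False


def tiers_for(resource: Resource) -> Set[Tier]:
    """Return the storage tiers a resource occupies under the simplified policy."""
    tiers: Set[Tier] = set()
    if resource.usage is Usage.FREQUENT:
        tiers.add(Tier.ONLINE)      # assumption: not stated explicitly in the text
    elif resource.usage is Usage.LOW:
        tiers.add(Tier.NEAR_LINE)   # data with a low frequency of use
    else:
        tiers.add(Tier.OFFLINE)     # little-used digital resources
    if resource.harvested_from_web:
        tiers.add(Tier.NEAR_LINE)   # resources collected from the Internet
    if resource.preservation_master:
        # The text lists permanent preservation under both tiers.
        tiers.update({Tier.NEAR_LINE, Tier.OFFLINE})
    if resource.is_backup:
        tiers.add(Tier.OFFLINE)     # data backup
    return tiers


def total_planned_capacity_tb() -> int:
    """Sum of the planned capacities of all three tiers."""
    return sum(PLANNED_CAPACITY_TB.values())
```

Under these assumptions the three tiers together provide 150 + 150 + 360 = 660 TB of planned capacity, and a preservation master occupies both the near-line and offline tiers.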
2.2 Digital Library Application Systems
The main target for the digital library application systems is the construction of a digital resource collection and acquisition system, a digital resources processing system, a digital resources organization and management system, and a digital resources distribution and service system, with management of the digital resource life cycle at their core.
The function of the digital resources collecting and acquiring system is to digitize print, audio and video materials; collect web resources with special topics and provide a channel for the educational, scientific and research institutions to deposit their doctoral dissertations and other e-resources. The system includes three subsystems:
The major missions of the digital resources processing system are to produce and combine metadata, produce databases, carry out knowledge discovery and organization, and share the metadata of printed documents and electronic resources among literature information agencies nationwide. The system includes four subsystems:
The function of the digital resources organization and management system is to manage the above-mentioned digital resources in an orderly way and preserve them so as to make them available decades later, or even several hundred years later. The digital resources organization and management system can also register and manage the copyright of digital resources. Thus, the digital resources distribution and service system can provide information service according to the copyright of the resources. The system includes three subsystems:
The major tasks of the digital resources distribution and service system are to package all types of digital resources, provide service according to user requirements, and manage the digital resources. The system can be divided into many subsystems, including but not limited to the metadata search subsystem, the virtual reference subsystem, the interlibrary loan and document delivery subsystems, the grass-roots level resource distribution subsystem, the on-demand subsystem, and the full-text search subsystem.
The National Digital Library workflow will be established on the basis of the four systems mentioned above.
First, the digital resources collection system and digital resources production system prepare the digital resources to meet the requirements of digital library management and service provision. This includes digitizing print, audio and video materials; processing harvested web information; and performing format conversion and metadata indexing for databases, e-books, and e-periodicals.
Next, the processed digital resources enter into the digital resources management system. The management system manages the digital objects, metadata and related digital copyright, homogenizes heterogeneous resources, creates a uniform retrieval portal, and distributes the processed digital resources.
Lastly, the digital resources distribution and service system will interact with users to provide convenient service. For example, users can customize special subject information through the "My Library" system to receive information pushed from the distribution and service system; interoperate online with professional reference librarians; perform cross-platform retrieval; and accept documents from the delivery system.
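The three workflow stages described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `DigitalObject` class, its fields, the three stage functions, and the in-memory `catalogue` dictionary are all hypothetical and do not describe the actual NLC systems.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DigitalObject:
    identifier: str
    source: str  # e.g. "digitized print" or "harvested web"
    metadata: Dict[str, str] = field(default_factory=dict)
    copyright_registered: bool = False
    history: List[str] = field(default_factory=list)  # stages passed so far


def collect_and_process(identifier: str, source: str, title: str) -> DigitalObject:
    """Stage 1: acquire a resource and attach descriptive metadata."""
    obj = DigitalObject(identifier=identifier, source=source)
    obj.metadata["title"] = title
    obj.history.append("collected_and_processed")
    return obj


def manage(obj: DigitalObject, catalogue: Dict[str, DigitalObject]) -> DigitalObject:
    """Stage 2: register copyright and file the object in a uniform catalogue."""
    obj.copyright_registered = True
    catalogue[obj.identifier] = obj
    obj.history.append("managed")
    return obj


def distribute(catalogue: Dict[str, DigitalObject], query: str) -> List[DigitalObject]:
    """Stage 3: answer a user query with a case-insensitive title match."""
    hits = [
        obj for obj in catalogue.values()
        if query.lower() in obj.metadata.get("title", "").lower()
    ]
    for obj in hits:
        obj.history.append("distributed")
    return hits
```

A typical run creates an object with `collect_and_process`, passes it to `manage` together with a shared catalogue, and then calls `distribute` with a search term. The `history` list records the order in which the object passed through the stages.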
2.3 Digital Library Standards
The NDL Project will formulate a series of standards regulating resource construction, description, organization, long-term preservation and service provision so as to make all the data normative and easy to deal with. The National Digital Library Standard System primarily includes standardized processing of Chinese characters, digital object identifiers, digital object management, general regulation of metadata, knowledge management, and digital resource statistics.
Figure 3. Standards of the China Digital Library Project
2.4 The Construction of Digital Resources and Service Provision
The digital resources of NLC come from:
As of 30th June 2009, the total volume of its digitized resources had already exceeded 250 TB.
2.4.1 Deposit of Digital Resources
According to the nation's regulations, the National Library receives the legal deposit of digital publications, which includes audio tapes, video tapes, laser disks, VCDs, DVDs, electronic newspapers and other electronic publications. The NLC currently has 1,620,000 pieces of all types of electronic publications.
2.4.2 Purchased Digital Resources
By the end of 2008, the number of purchased databases reached 136; 59 are Chinese and 77 are foreign language. The content of these databases includes periodicals in Chinese and foreign languages, newspapers, books, dissertations, conference papers, etc. According to the copyright of the databases, some of them are available on the Internet, while others are only available through the NLC's Intranet.
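The copyright-dependent access rule can be illustrated with a short sketch. The `Database` class, the `internet_allowed` flag, and the `can_access` function are hypothetical names, and the code assumes that any database open to the Internet is also reachable from the Intranet, which the text does not state explicitly. The two counts are taken from the text.

```python
from dataclasses import dataclass
from enum import Enum

# Counts stated in the text (end of 2008).
CHINESE_DATABASES = 59
FOREIGN_DATABASES = 77


class Network(Enum):
    INTERNET = "internet"
    INTRANET = "intranet"


@dataclass(frozen=True)
class Database:
    name: str
    internet_allowed: bool  # hypothetical flag derived from the licence terms


def can_access(db: Database, network: Network) -> bool:
    """Decide whether a reader on the given network may use the database."""
    if network is Network.INTRANET:
        return True  # assumption: every purchased database is usable on site
    return db.internet_allowed
```

The two language groups add up to the stated total, 59 + 77 = 136. Under this rule a database whose licence forbids remote use is reachable only from the NLC's Intranet.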
2.4.3 Digitized Special Collections
The NLC began the digitization of its collections in 2000. By the end of 2008, the total volume of its digitized special collections exceeded 180 TB, which includes electronic books, dissertations, Min Guo documents, on-line lectures, oracle bones, Dunhuang materials, rubbings, digital chronicles, New Year pictures, etc. Some of the NLC's special collections are described below.
2.5 Collection of Web Information
It is an important responsibility of NLC to collect the increasing amount of web-based information. NLC started the tentative work for web information collection and preservation in 2003. To date, the NLC has collected and preserved all of the information from more than 20,000 governmental websites, 245 e-newspapers, and special subject information such as Chinese studies and the Olympic Games.
3. New Media Services
New media services refer to the newly developed mobile media services aimed at smart phones, as well as digital library services based on digital television.
Mobile devices such as smart phones provide portability, real time response, interactivity, and other characteristics which reduce the limitations on time and place for public information usage, information collection, and reader interaction, thus enabling information sources to be more diverse, comprehensive and timely.
In recent years, the technology of digital television has been advancing rapidly. Digital television is not only a brand new broadcast technology, but also a new kind of life style. Its effect can spread into every part of life. It has changed television from one-way communication to interactive information dissemination, as well as greatly improving the quality of sound and image. The State Administration of Radio Film and Television has announced that digital video broadcasting would be launched in 2010 and that analog television broadcasting would cease. Digital television technology will be a key media platform for digital library services.
The NLC, having considered the impact of new media, has decided to energetically explore new methods and forms for digital library services, as illustrated in Table 1 below.
Table 1: Digital Library Service Based on New Media
The market-based development of digital television and mobile media has enabled new media services to become a growth point of the National Digital Library.
About the Authors