
D-Lib Magazine
June 1999

Volume 5 Number 6

ISSN 1082-9873

A Report on the PEAK Experiment

Context and Design

Maria S. Bonn
University of Michigan Library
Digital Library Initiative

Wendy P. Lougee
University of Michigan Library
Digital Library Initiative

Jeffrey K. MacKie-Mason
School of Information
and Department of Economics
University of Michigan

Juan F. Riveros
Department of Economics
University of Michigan

Introduction

For the past two years, researchers in Economics at the University of Michigan have worked in collaboration with the University of Michigan Library to design and run an experiment in Pricing Electronic Access to Knowledge (PEAK). PEAK is both a production service for electronic journal delivery, providing access to the 1,100+ journals published by Elsevier Science, and an opportunity for experimental pricing research. These journals include much of the leading research in the physical, life and social sciences. The project provides an opportunity for universities and other research institutions to have electronic access to a large number of journals, access that allows for fast, sophisticated searching, nearly instantaneous document delivery, and new possibilities for subscriptions.

The underlying economic impetus for the experiment is to learn how additional value can be extracted from existing content by means of innovative electronic product offerings and pricing schemes. The research team seeks to determine how users respond to different pricing schemes and to assess the additional value created by the different product offerings. The team is also analyzing the impact of the different pricing schemes on producer revenues. We aim to generalize our results to various business models, customer populations and information goods. Finally, we would like to contrast the empirical results with the current conclusions of the economic literature on bundling of information goods.

To create a meaningful context for these economic questions, the research team worked with the University Library to design an online system and to market this system to a variety of information clients. The group primarily targeted libraries, focusing on academic and corporate libraries. Contacts were made with institutions expressing interest and institutions already invested in digital library activity. Over thirty institutions were contacted as potential participants, twelve of which agreed to join the effort. Decisions not to participate were frequently driven by cost/budget or by the availability of pricing options of interest to the institution. The resulting mix of institutions provided diversity of size and information technology infrastructure, as well as organizational mission. PEAK participants now include: the University of Michigan, the University of Minnesota, Indiana University, Texas A & M, Lehigh University, Michigan Technological University, Vanderbilt, Drexel, Philadelphia College of Osteopathic Medicine, University of the Sciences in Philadelphia, Dow Chemical, and Warner-Lambert. Although the economic models were designed by the University of Michigan research team, the process was also overseen by a joint advisory board composed of two senior Elsevier staff members (Karen Hunter and Roland Dietz), the University of Michigan Library's Associate Director for Digital Library Initiatives (Wendy Lougee), and the head of the research team (Professor Jeffrey MacKie-Mason).

The PEAK system allows these participants full-text search and retrieval of the entire body of Elsevier content for the duration of the experiment, including some content from earlier years that Elsevier provided to help build a critical mass of content into the service. Users have several search and browse options available to them, including search mechanisms that limit searches to discipline-specific categories that were designed and assigned by librarians at the University of Michigan. Any authorized user may search the system, view abstracts, and have access to all free content (see below). Access to full-length articles (a designation assigned by Elsevier) depends on the user's institutional subscription package. Articles may be viewed on screen or printed.

PEAK in Context: Electronic journal publishing and the University of Michigan Library

The scholarly journal has a tradition of purpose and structure dating back several centuries, with little change. Despite the combined effects of price inflation and fluctuations of currency exchange that libraries weathered in the 1970's and 1980's, the basic construct of journals and subscriptions has remained stable and, in fact, the journal has continued to flourish in a world of scholarly publishing that is increasingly global and conglomerate. In contrast to this tradition-laden history, the rapid change stimulated by information technologies in the 1990's has been remarkable and unprecedented.

Early efforts to harness the potential of digital technology for journals focused primarily on distribution and access, with a far more gradual and separate process of re-engineering editorial review and production processes emerging in the background. Major publishers undertook an array of projects with heightened activity evident at the dawn of the Web. Efforts such as Springer-Verlag's Red Sage project and Elsevier Science's TULIP initiative broke ground in testing the limits of Internet distribution and catalyzing the development of more robust access systems. TULIP (The University LIcensing Program) involved nine institutions and addressed a broad set of issues, including both technical and behavioral concerns. The four-year project reflected significant progress, but failed to assess issues of economics and pricing for the new electronic media (see TULIP report).

In the aftermath of this early experimentation in e-journal publishing, a number of inter-related issues emerged that have stimulated interest in the economic questions surrounding journals and their electronic versions. Nearly every major publisher launched electronic publishing initiatives and, typically, tackled issues of price, product, and market in a manner that extrapolated from print practices. Early pricing models were tightly coupled with print subscriptions, generally reflecting 15% or more increases to the print charge. Almost simultaneously, the phenomenon of preprint services emerged. These factors -- plus a growing appetite for enhanced journal functionality -- have contributed to the heightened interest surrounding pricing and product models for scholarly journals.

The University of Michigan was one of the institutional participants in TULIP, with a joint project team drawing from Engineering, the School of Information and Library Studies (now the School of Information), the Information Technology Division, and the University Library. Michigan was the first site to implement the 43 journals in materials science offered through TULIP and was also the first to move the service to the Web environment. TULIP's outcomes included a far better understanding of the distribution and access issues associated with electronic journals, but also underscored the inadequacy of experimentation without a critical mass of journals.

The TULIP experience, coupled with an early history of SGML text development in the 1980's, provided a unique environment for digital library development and contributed to Michigan's selection as a technology service provider for the Mellon Foundation-funded JSTOR project [Guthrie, 1997]. The unique organizational collaboration begun with TULIP was expanded in 1993 and institutionalized in a campuswide digital library program that today encompasses a full production service and development capability [Lougee, 1998]. It was within this new program that the TULIP legacy was pursued with an eye toward a better understanding of value, price, and product for electronic journals.

In 1996, an agreement was reached with Elsevier Science to launch PEAK in an attempt to address issues left outstanding in the TULIP process. Through PEAK, Michigan hoped to gain a better understanding of large-scale management of electronic journals through the development of production systems and processes to accommodate the large body of content published by Elsevier Science. While this goal was important, PEAK also provides a large-scale testbed in which to explore issues of pricing and product design for electronic journals.

PEAK in Context: Economic and experimental design

Central to the PEAK experiment are the opportunities that electronic access creates for unbundling and rebundling scholarly literature. A print-on-paper journal is, in itself, a bundle of issues. Each issue, in turn, contains a bundle of articles [Note 1], each of which is again a bundle of bibliographic information, an abstract, references, text, figures and many other elements. In addition, the electronic environment makes possible new dimensions of product variation. For example, access can be granted for a limited period of time (e.g., a day, a month, or a year) and new services such as hyperlinks can be incorporated as part of the content. Permutations and combinations are almost limitless.

Choosing among different bundling alternatives in general is not an easy task. In the PEAK experiment [Note 2], we were constrained by the demands of the experiment and the demands of the customers. Given the limited number of participants, bundle alternatives had to be limited in order to obtain enough experimental variation for significant statistical analysis. Also, the products had to be familiar enough to potential users to generate participation and reduce the learning effects. After balancing the different alternatives and constraints, the project team selected three different bundle types as the products for the experiment:

Traditional subscription: A user or a library can purchase unlimited access to a set of articles designated as a journal by the publisher for $4/issue (if a print subscription is held). These electronic journals correspond to the Elsevier print-on-paper journal titles. Access to these continues at least until the end of the project. This is a seller-chooses bundle, in that the seller, through the editorial process, determines which articles are delivered to subscribed users.

Generalized subscription: An institution (typically with the library acting as agent) can pre-purchase unlimited access to a set of any 120 articles selected by users. Access is sold in bundles of 120 articles for $548, or about $4.57 per article. This is a user-chooses bundle. In this model the user selects which articles are accessed, from across all Elsevier titles, after the institution has subscribed; once purchased, the article is available to anyone in that user community. This bundling approach allows the participants to capture value from the entire corpus of articles, without having to subscribe to all of the journal titles. This opportunity is premised on the notion that there is low incremental cost to delivering additional articles once the server database is constructed.

Per article: A user can purchase unlimited access to a specific article for $7/article. This option is designed to closely mimic a traditional document delivery or interlibrary loan (ILL) product. With ILL the individual receives a printed copy of the article that can be retained indefinitely. This is different from the "per use" pricing model often applied to electronic data sources. The article is retained on the PEAK server, but the user can access a paid-for article as often as desired. This is a buyer-chooses scheme, in that the buyer selects the articles before paying for them.

In addition, all older materials (i.e., pre-1998) are freely available to all project participants, as are bibliographic and full-text searches, with no charges levied for viewing citations, abstracts or non-article material such as reviews and notices.
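The arithmetic behind the two article-based products can be made concrete with a short sketch. The prices ($548 per 120-article bundle, $7 per article) are taken from the descriptions above; the function names, and the simplifying assumption that an institution knows its article demand in advance, are purely illustrative and are not part of the PEAK system.

```python
from __future__ import annotations

import math

BUNDLE_PRICE = 548      # dollars per generalized-subscription bundle (from the text)
BUNDLE_SIZE = 120       # articles covered by one bundle (from the text)
PER_ARTICLE_PRICE = 7   # dollars per individually purchased article (from the text)


def mixed_cost(n_articles: int, n_bundles: int) -> int:
    """Cost of n_bundles bundles plus per-article purchases for any remainder."""
    covered = n_bundles * BUNDLE_SIZE
    overflow = max(0, n_articles - covered)
    return n_bundles * BUNDLE_PRICE + overflow * PER_ARTICLE_PRICE


def cheapest_mix(n_articles: int) -> tuple[int, int]:
    """Return (n_bundles, total_cost) minimizing cost for a known article demand."""
    max_bundles = math.ceil(n_articles / BUNDLE_SIZE)
    return min(
        ((k, mixed_cost(n_articles, k)) for k in range(max_bundles + 1)),
        key=lambda pair: pair[1],
    )


def break_even_articles() -> int:
    """Smallest demand at which one bundle costs less than per-article purchases."""
    return BUNDLE_PRICE // PER_ARTICLE_PRICE + 1
```

Under these prices a single bundle becomes the cheaper choice once an institution expects to need at least 79 priced articles. The traditional subscription is left out of the sketch because its cost depends on the number of issues per title, which is not given here.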

Participants in the experiment were assigned randomly to one of three different business models:

Red Group: could choose any combination of the three available products: Traditional and Generalized Subscriptions and Per-Article Purchases.

Green Group: could choose any combination of Generalized Subscriptions and Per-Article Purchases.

Blue Group: could choose any combination of Traditional Subscriptions and Per-Article Purchases.
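The three product menus amount to a small lookup table. The following is a minimal sketch of how they might be represented; the group and product names come from the text, while the identifiers `Product`, `GROUP_PRODUCTS` and `may_purchase` are hypothetical and do not describe the actual PEAK implementation.

```python
from enum import Enum


class Product(Enum):
    TRADITIONAL = "traditional subscription"
    GENERALIZED = "generalized subscription"
    PER_ARTICLE = "per-article purchase"


# Product menu offered to each experimental group, as listed in the text.
GROUP_PRODUCTS = {
    "red": frozenset({Product.TRADITIONAL, Product.GENERALIZED, Product.PER_ARTICLE}),
    "green": frozenset({Product.GENERALIZED, Product.PER_ARTICLE}),
    "blue": frozenset({Product.TRADITIONAL, Product.PER_ARTICLE}),
}


def may_purchase(group: str, product: Product) -> bool:
    """Return True if institutions in `group` are offered `product`."""
    return product in GROUP_PRODUCTS[group.lower()]
```

Per-article purchase is the one product common to all three groups, so differences in behavior across groups can be attributed to the presence or absence of the two subscription types.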

No matter which color group a user's institution belongs to, access is further determined by whether the user has logged in with a PEAK password. Use of a password allows access from any computer (rather than only those with authorized IP addresses) and, when appropriate, allows the user to "spend" a token, thus taking advantage of the generalized subscription model. (If a user has logged in with a password, token usage is invisible; if the user has not logged in with a password, token usage is prohibited.) The password is also required to purchase articles on a per-article basis and to view previously purchased articles.
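The access rules described so far can be summarized as a decision procedure. The sketch below is one possible reading of those rules, not the logic of the PEAK server: the `Request` fields, the returned strings and the order of the checks are all assumptions made for illustration. In particular, it assumes that an article already acquired by the institution, whether through a traditional subscription or an earlier token, is served like any other subscribed content.

```python
from dataclasses import dataclass


@dataclass
class Request:
    from_authorized_ip: bool         # request originates from a registered institutional address
    logged_in: bool                  # user supplied a PEAK password
    is_free_content: bool            # abstract, citation, pre-1998 or non-article material
    institution_owns_article: bool   # covered by a subscription or an earlier token (assumption)
    purchased_by_user: bool          # user bought this article earlier on a per-article basis
    tokens_available: int            # unspent generalized-subscription tokens (0 if not offered)


def decide(req: Request) -> str:
    """Return a hypothetical access decision for one article request."""
    if not (req.from_authorized_ip or req.logged_in):
        return "deny"
    if req.is_free_content or req.institution_owns_article:
        return "serve"
    # Everything below needs a password: tokens, purchases, and purchased articles.
    if not req.logged_in:
        return "password required"
    if req.purchased_by_user:
        return "serve"
    if req.tokens_available > 0:
        return "serve and spend token"
    return "offer per-article purchase"
```

For example, a user on a campus workstation who has not logged in would be served free and subscribed material, but would be asked for a password before any token could be spent on an unsubscribed article.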

The password mechanism has been valuable in allowing PEAK to provide an increased level of service, particularly remote access, and in collecting data for the research project. Users are categorized into two types: those who log on with their PEAK passwords and those who do not. For those who do not, the system tracks which IP addresses access which articles from which journal titles. When users use their PEAK passwords, the system records which specific users access which articles. These data can be aggregated back to the institution, creating a record of total accesses and repeat accesses. Uses are also classified according to which purchasing method is used to buy the article. Additionally, there is a record of those transactions in which a person clicked on an article that was not in the subscription base and then, facing a decision of whether or not to pay to view the article, chose to pay on a per-article basis.
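To illustrate the kind of aggregation described here, the following sketch rolls individual access records up to the institution level. The record layout, the sample data and the function name `summarize` are hypothetical; the actual PEAK logs and reports are not specified in this article.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (institution, user_or_ip, article_id, purchase_method)
LOG = [
    ("inst-a", "user-1", "art-10", "token"),
    ("inst-a", "user-1", "art-10", "token"),
    ("inst-a", "10.0.0.7", "art-11", "traditional"),
    ("inst-b", "user-9", "art-10", "per-article"),
]


def summarize(log):
    """Aggregate accesses per institution: totals, repeats, and counts by method."""
    per_article = defaultdict(Counter)   # institution -> article -> access count
    by_method = defaultdict(Counter)     # institution -> purchase method -> access count
    for institution, _who, article, method in log:
        per_article[institution][article] += 1
        by_method[institution][method] += 1

    summary = {}
    for institution, counts in per_article.items():
        total = sum(counts.values())
        summary[institution] = {
            "total_accesses": total,
            # Accesses beyond the first for each distinct article.
            "repeat_accesses": total - len(counts),
            "by_method": dict(by_method[institution]),
        }
    return summary
```

Here a repeat access is counted at the institution level, as any access to an article beyond the first; a per-user definition would need the user or IP field as part of the key.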

Another important experimental design consideration has been duration and learning effects. Given the novelty of the products offered and the lack of experience of most of the customer population regarding electronic access to scholarly journals, the research team expected significant learning effects. These effects make it more difficult to generalize results. To decrease the impact of learning effects, there has been an effort to actively educate users about the products and pricing. Data have also been collected for almost two years, thus helping to isolate some of these effects.

In designing the experiment, it was also a priority to work with a diverse group of customers. Scholarly journal users tend to have very specialized interests, and journal usage studies suggest that only a small fraction of the articles in a given title are read [Note 3]. In order to obtain sufficient participation and usage, the project team decided to include clients outside the University of Michigan. Having a larger community of users clearly improves the breadth of the data, but it also introduces new complications; certainly user behavior will be conditioned by the individual characteristics of the participating institutions.

System design and implementation

The delivery and management of such a large body of content (about 11,000,000 pages by May of 1999) and the support of the PEAK experiment have required a considerable commitment of both system and human resources. In addition to the actual delivery of content, project staff have been responsible for managing the authentication mechanisms, collecting and extracting statistics, and providing user support for, potentially, tens of thousands of users.

PEAK runs primarily on a Sun E3000 with four processors, and is stored on several different configurations of RAID (a combination of Sun fiber channel RAID and traditional SCSI RAID systems). For user authentication and subscription/purchase information, it communicates with a subsidiary Sun UltraSparc.

Searching is conducted primarily with the locally developed search engine called FTL. A faster bibliographic search uses the OpenText search engine. The authentication/authorization server runs the current version of Oracle to manage user and subscription information. Several other types of software come into play with use of the system. They include:

Middleware to bring together all of the tools discussed above is written by project staff at the University of Michigan Digital Library Production Service (DLPS).

Designing and maintaining the PEAK system, as well as providing user support and service for the participant institutions, also takes significant staff resources. Once the system was specified by the research staff, design and maintenance of the system were undertaken by a Senior Programmer working close to full time in collaboration with the DLPS Interface Specialist. In addition, DLPS has contributed work as needed by other programming staff and management by the Head of DLPS. A full time programmer provides PEAK database support, collecting statistics for the research team and the participants, as well as maintaining the database of authorized users and the transaction database. Two librarians provide about one FTE of user support; one is responsible for the remote sites, one for the University of Michigan community. In addition, other DLI staff put in considerable time in the setup phases of PEAK in marketing the service to potential participants (time that included a high degree of education about the methods and aims of the experiment) and in formalizing the licensing agreements with the participants.

In order to facilitate per-article purchases, PEAK also needed to have the capacity to accept and process credit card charges. In the early months of the service, this billing was handled by First Virtual, a third party electronic commerce company. This commercial provider also verified the legitimacy of users and issued virtual PINs that were to be used as passwords for the PEAK system. Less than half way through the PEAK experiment, First Virtual restructured and no longer offered its services. At that point, DLPS began its own processing of applications and also took on issuing passwords. Credit card transactions were handled through the University of Michigan Press using traditional merchant credit card mechanisms.

Because of the research model and because the system exists both to conduct research and to serve the information needs of a large and varied user community, there are a number of complexities and tensions that are inherent in PEAK. In designing PEAK, we have sought to balance conflicting demands and to adhere to some fundamental goals:

These tensions are sometimes exacerbated by the experiment's use of content from one large commercial publisher -- a vital element of the experiment. As John Price-Wilkin, Head of DLPS, has pointed out elsewhere [Price-Wilkin, 1999]:

The research model further complicates these methods for access, where all methods for access are not available to all institutions, and not all institutions choose to take advantage of all methods available to them. This creates a complex matrix of users and materials, a matrix that must be available and reliable for the system to function properly. In implementing PEAK, our production technologies and especially our production organization allowed us to extend the digital library more fully into the University's mission of research and teaching. Independence from Elsevier was critical in order for us to be able to test these models, and the body of Elsevier materials was equally important to ensure that users would have a valuable body of materials that would draw them into the research environment. The ultimate control and flexibility of the local production environment allowed the University of Michigan to perform research that would probably not have otherwise been possible, or could not have been performed in ways that the researcher stipulated.

Early Trends and Further Analysis

In the second part of this article, to be published at a later date, we will analyze in more detail the preliminary data generated in the experiment. There are, however, some clear trends in participant reactions to the service that are worth pointing out here. Notably, there has been a high degree of interest in generalized subscriptions, which have proved to be the most popular of the purchasing options made available by the PEAK experiment. Libraries see the generalized subscription as a way of increasing the flexibility of their journal budgets and of tying purchasing more closely to actual use. Another point of great interest has been the statistical reports on local use that we have generated monthly for the participants, drawing upon the tracking of article and title usage conducted in support of the experiment. Participants have been eager to work with these to help assess current subscription choices and to further understand user behavior.

As the PEAK experiment draws to a close -- the addition of new content will cease in June and the service finishes in August -- the work of analysis and reporting has just begun. The follow-up to this article will report on some of our preliminary findings, and we will discuss and compare the revenue raised under each business model and the determination of the pricing levels. Finally, we will comment on some interesting usage findings and economic behavior.

Notes

Note 1. One unresolved issue in using bundling models is the status of journal content that is not strictly articles. Many notices and reviews, as well as editorial content integral to a journal's identity, do not fall into the category of articles. How and when these items are indexed (and therefore searchable) as well as how they should be priced are still open questions in electronic journal delivery and pricing.

Note 2. For a more detailed description of the experimental and product design see [MacKie-Mason and Riveros, 1999].

Note 3. See [King and Griffiths, 1995].

Works Cited

Guthrie, Kevin M. "JSTOR: From Project to Independent Organization." D-Lib Magazine July/August 1997. http://www.dlib.org/dlib/july97/07guthrie.html.

King, D.W. and J.M. Griffiths. "Economic issues concerning electronic publishing and distribution of scholarly articles." Library Trends 43, no. 4 (1995), 713-740.

Lougee, Wendy P. "The University of Michigan Digital Library Program: A Retrospective on Collaboration within the Academy." Library Hi Tech 16:1 (1998), 51-59.

MacKie-Mason, Jeffrey K. and Juan F. Riveros. "Economics and Electronic Access to Scholarly Information." The Economics of Digital Information (tentative title), D. Hurley, B. Kahin and H. Varian, eds. (MIT Press, forthcoming 1999). http://www-personal.umich.edu/~jmm/papers/peak-harvard97/

Price-Wilkin, John. "Moving the Digital Library from 'Project' to 'Production.'" Presented at DLW99 in Tsukuba, Japan, March 1999. http://jpw.umdl.umich.edu/pubs/japan-1999.html

TULIP Final Report. NY, Elsevier Science, 1996. http://www.elsevier.nl:80/homepage/about/resproj/tulip.shtml

Copyright © 1999 Maria S. Bonn, Wendy P. Lougee, Jeffrey K. MacKie-Mason, and Juan F. Riveros

DOI: 10.1045/june99-bonn