As the deployment of institutional repositories (IRs) matures, more libraries will take advantage of consortial or regional ties to provide support, training, and expertise in IR development. This support structure is essential for organizations that otherwise would not have the staff, time, or infrastructure to create an IR. The Utah Digital Repository (http://harvester.lib.utah.edu/utah_ir/), an LSTA grant-funded project, serves as a model for the creation of a statewide repository. This case study explores the development and growth of institutional repositories in academic libraries in the state of Utah. Built on the existing framework of the Mountain West Digital Library, the Utah Digital Repository project provided a librarian's toolkit, training sessions, outreach, and technical assistance as pilot sites developed an IR. This framework of support ensures that an academic library of any size can launch an institutional repository. A single web site allows users to search the aggregated metadata of multiple institutions. This article includes statistics and survey results in addition to an overview of the development process that will aid other libraries considering the development of similar shared resources.
The number of institutional repositories in academic libraries has been growing substantially in recent years. In a 2005 survey, the Coalition for Networked Information (CNI) found that of its member institutions, 40% of the respondents had an institutional repository and 88% of those that did not were planning to create one or participate in one (Lynch and Lippincott 2005). While many academic libraries have launched institutional repositories, according to the Census of Institutional Repositories in the United States: MIRACLE Project Research Findings, the typical IR contains fewer than 1,000 items. In addition to problems of collection size, smaller academic institutions often do not have the resources to create and support the services or system for an institutional repository. Clifford Lynch predicted that future IRs would consist of consortial repositories or use "federating" for searching across repositories and involve public organizations that are not necessarily academic libraries (Lynch 2003). To date, very few consortial repositories have been created, and most repositories rely on Google for cross-repository searching. The majority of repositories remain housed at academic institutions. (A model example of a consortial repository is OhioLINK's theses and dissertation collection.[1])
The academic libraries in Utah have a long history of cooperation and consortial arrangements. The Utah Digital Newspapers[2] program and the Mountain West Digital Library (MWDL)[3] are established aggregated gateways sharing digital content in the region. The Mountain West Digital Library includes institutions from Utah, Idaho, and Nevada and hosts collections from academic libraries, museums, historical societies, public libraries, archives, and others. The MWDL is sponsored by the Utah Academic Library Consortium (UALC). Through its member institutions, it provides digitization and hosting centers to support other organizations across the state in creating and hosting digital collections. It employs a workflow in which a partner organization identifies the content for a collection and secures funding, while the digitization center provides training on metadata, digitization, and uploading, as well as digitization services if needed. The partner completes metadata and uploads items to the hosting server, where the metadata is harvested for aggregation in the collective library (Arlitsch and Jonsson 2005, p.224). With this existing model in place and a strong interest in institutional repositories, the creation of a consortial IR was a logical progression. The Utah Digital Repository Initiative takes advantage of the framework of the MWDL, where smaller institutions can partner with one of the regional digitization and hosting centers to create institutional repositories.
In 2005, UALC and the J. Willard Marriott Library and Spencer S. Eccles Health Sciences Library at the University of Utah[4] applied for a Library Services & Technology Act (LSTA) grant to create a model IR at the University of Utah and a framework for other institutions in Utah. The project focused on working with academic institutions but also recognized and sought relationships with other public institutions in the state that were existing partners in the MWDL. The project began in the summer of 2006 as a one-year project, providing funding for a program manager, travel funds, hardware, and software. The goals of the project were to:
There were two important questions in defining the content scope for the Utah Digital Repository. Does the content fit within the scope of an institutional repository? Does the content belong in the Utah Digital Repository or the more general Mountain West Digital Library? There was also a need to maintain local control for partner institutions, which had proved to be one of the major reasons for the success of the MWDL (Arlitsch and Jonsson 2005, p.223). How much did we want to control the scope and content selection for the partners?
As Bailey notes (Bailey 2005), institutional repositories are new enough that there is not a uniform definition for an IR, and definitions range from emphasizing long-term preservation, to a set of services, to goals of changing the scholarly communication landscape and journal pricing, to the necessary technologies implemented (such as OAI-compliant software). For the Utah Digital Repository, we did not initially restrict the definition of what constituted acceptable materials. We emphasized preservation, "openness," OAI-compliance, relevance to higher education in Utah, and discovery. We expected that the content for a small liberal arts college, community colleges in the state, and large research institutions would vary. Gibbons outlines possible types of content for institutions that are not research oriented, including historical documents, student work, and research data (Gibbons 2004, p.5). We wanted to keep options open for the other types of collections that both Lynch and Gibbons mention from non-academic institutions. In the end, the state-wide collection reflects a large range of content including the items mentioned above. The University of Utah Institutional Repository collection development guidelines[5] served as a guide. They break down an institutional repository into the following sub-categories:
Two defining criteria separate content between the Utah Digital Repository and the MWDL. If the content is either academic in nature and authored by someone at a Utah institution, or about an institution of higher education in Utah, then it belongs in the Utah Digital Repository.[6]
To define which formats we would allow and "guarantee" for preservation purposes, we looked at the policies of existing institutional repositories and measured them against our own capabilities and existing offerings. The main model we used was the University of Rochester's Institutional Repository, UR Research.[7]
A major component of the project was providing tools and training for libraries in the initial stages of planning an IR. The Utah Digital Repository Toolkit[8] is designed to provide background information, marketing templates, an overview of copyright issues, administrative models, introductory information about IR metadata, an overview of open access advocacy issues, training materials, project ideas, and additional reading. Although there are other excellent toolkits available,[9] the goal of this toolkit was to provide as practical a tool as possible and limit the amount of research that a librarian needed to do to get started. Since the University of Utah IR served as a model for the project, many of the documents were created for that IR and then adapted for more general use in the toolkit. Response to the toolkit from the pilot sites was positive, with librarians saying that it was easy to navigate and provided background information and marketing materials that could be adapted to their own institution.
When the project started, the University of Utah and Brigham Young University had fledgling institutional repositories. Utah State University had plans to develop an IR and was in the process of forming a campus advisory board. These libraries had the staff and existing digitization programs to provide initial support to a new project like an IR, so they were not chosen as pilot sites. Although initial institutions were identified by the grant project directors, the eventual pilot sites volunteered to take part in the project. Voluntary participation was key to the success of the project.
There were three pilot sites for the project. The two initial sites were Westminster College, a private liberal arts college, and Utah Valley State College. Both have existing digital collections in the Mountain West Digital Library, but an IR represented a new undertaking for each of them. Salt Lake Community College also signed on to the project in the late spring, with their IR effort representing their first digital project.
While institutions with a research focus will naturally concentrate on collecting faculty publications, institutions with a teaching emphasis are more interested in showcasing student work and preserving institutional history. Electronic Theses and Dissertation programs are an ideal first project for a library wanting to showcase undergraduate or graduate level work. Most of the questions brought up by the pilot sites centered on copyright, metadata, and staffing issues. Librarians at pilot sites wanted sample permission forms to use when recruiting content and staff training on IR metadata. Feedback from the pilot sites prompted the program manager to create additional toolkit materials and develop staff training sessions.
Another major concern of pilot institutions was how to create time to work on the IR and how to organize their staffing. Utilizing the framework of the Mountain West Digital Library and hosting institutions helps alleviate some but not all of these problems. Since the larger research institutions all have staffing models that differ both from each other and from those of the smaller institutions, it was hard to recommend a best practice. In 2006, a study conducted by Boock and Vondracek on how libraries organize for digitization found that a single widespread model did not exist (Boock and Vondracek 2006). The same will most likely be true for institutional repositories, and institutions will have to identify a staffing structure that works within their own organization.
In addition to the pilot sites, serendipity landed us access to the Utah Heavy Oil Repository. The Utah Heavy Oil Center, funded by the Energy Policy Act of 2005, developed an online repository on the DSpace platform containing materials related to heavy oil, tar sands, and oil shale. The University of Utah Marriott Library is currently digitizing historic theses and dissertations on similar topics, so an exchange of metadata between the two repositories will be mutually beneficial.
The Utah State Library had been gathering data from the Office of Public Health Assessment, Center for Health, and Utah Department of Health to create the Online Public Health Library. This collection had previously been aggregated into the larger MWDL. Because of its specialized research content and the benefit of linking it with research from the University of Utah School of Medicine and other health-related research, we chose to aggregate it in the Utah Digital Repository.
Technological Infrastructure and Aggregated Interface
Because they build on the infrastructure of the MWDL, the majority of the institutional repositories in the Utah Digital Repository use CONTENTdm. By aggregating the collections through an OAI harvester, we were able to include repositories that chose to use a different platform. In the initial set-up for the Utah Digital Repository, we harvested data from two CONTENTdm servers (representing six institutions) and two DSpace installations (Brigham Young University's IR and the Utah Heavy Oil Center Repository).
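As a rough illustration of what harvesting across platforms involves, the sketch below builds an OAI-PMH ListRecords request and pulls identifiers and Dublin Core titles out of a response. It is not the project's harvester code: the base URL and the sample XML are hypothetical placeholders, and only the ListRecords verb, the oai_dc metadata prefix, and the two XML namespaces come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response; a real harvest would fetch this over HTTP.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.edu:ir/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample senior honors project</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.edu/oai"))
print(extract_titles(SAMPLE))
```

Because both CONTENTdm and DSpace can expose records through the same protocol, a harvester written this way does not need to know which platform sits behind a given base URL.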
We implemented a modified version of the Public Knowledge Project (PKP) Open Archives Harvester[10] to aggregate the data. The initial harvests enabled us to identify metadata issues and other problems in the aggregated data. One significant problem was that large transcripts, primarily of theses and dissertations, had been mapped to the Dublin Core description field, which produced irrelevant search results and long harvest times. Because of this, the participating institutions agreed to unmap the transcripts from the description fields. As a result, the items are currently not full-text searchable through the Utah Digital Repository site. Continued development will look at correcting this problem.
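The effect of unmapping transcripts can also be approximated on the harvester side. The sketch below is a hypothetical filter, not something the project reports implementing: it drops any Dublin Core description value longer than an assumed threshold before indexing, so that a short abstract survives but a full transcript does not. The record layout (a dictionary mapping element names to lists of values) and the 2,000-character limit are illustrative assumptions.

```python
MAX_DESCRIPTION_CHARS = 2000  # assumed cutoff; would need tuning on real records


def strip_long_descriptions(record, limit=MAX_DESCRIPTION_CHARS):
    """Return a copy of a harvested record without oversized descriptions.

    `record` maps Dublin Core element names to lists of string values.
    The input record is left unmodified.
    """
    cleaned = {field: list(values) for field, values in record.items()}
    if "description" in cleaned:
        cleaned["description"] = [
            text for text in cleaned["description"] if len(text) <= limit
        ]
    return cleaned


# Hypothetical harvested record: one abstract and one full transcript.
record = {
    "title": ["A thesis"],
    "description": ["Short abstract.", "x" * 5000],
}
indexed = strip_long_descriptions(record)
print(len(indexed["description"]))
```

A filter like this carries the same trade-off described above: the indexed records become smaller and search results more relevant, but the full text is no longer searchable from the aggregated site.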
By using the OAI harvester, we were able to provide a central interface for searching while maintaining the identity of individual institutions. When a user selects an item from the results list, they are taken from the aggregated interface to the collection of that institution, showing all of its branding, a feature that was essential to the success of the MWDL. There are unresolved usability issues concerning navigation between the aggregated interface and individual collections. A prototype interface was created as a proof of concept in time for the grant's conclusion. Future usability testing will refine the interface and the capability to search the Utah Digital Repository from the MWDL interface.[11]
Feedback and Evaluation from Pilot Sites
The pilot sites at Westminster College and Utah Valley State College met targets for faculty outreach, with each contacting more than seven faculty about opportunities for placing their content in their new repositories. They explored possibilities for collaborating with departments and organizations outside the library for institutional repository work. Utah Valley State College recruited content in a variety of categories, ranging from faculty papers presented at Frankenstein: Penetrating the Secrets of Nature, a traveling exhibit produced by the National Library of Medicine and the American Library Association Public Programs Office, to senior honors projects and staff presentations. The college recruited a total of 31 items, with 17 currently available. They expect to continue their efforts recruiting content for their repository in the fall. Westminster College experienced an administrative delay in making their content available due to a request to establish a campus-wide committee to approve institutional repository content, but they recruited 28 items and have 8 uploaded in a test version of their new institutional repository. Most of the currently available Westminster College content consists of senior honors projects. Salt Lake Community College has chosen to focus on historical institutional content to learn more about digitization and how the systems function before recruiting academic content and marketing to faculty, and they have 135 historic photographs about the college in their collection.
Future of the Utah Digital Repository Project
Seven of the eleven library directors at Utah-based UALC institutions indicated that they plan either to maintain their existing repository into the future or, if they do not currently have one, to start one in the next three years. Pilot sites for the project plan to expand their repository efforts, and they will continue to be supported based on the existing framework of the Mountain West Digital Library, with a new MWDL program director providing institutional repository support and training.
In August 2007, 6,857 items were successfully harvested from operational repository targets. Repository targets included the University of Utah's Institutional Repository, Brigham Young University's Institutional Repository and Electronic Theses and Dissertations, Utah Valley State College's Institutional Repository, the Online Public Health Library, NOVEL (Neuro-Ophthalmology Virtual Education Library), and the Utah Heavy Oil Center Repository. By providing a unified way to search all of these resources, the Utah Digital Repository provides the state of Utah with a showcase for its higher education materials. A future challenge will be to create an effective and user-friendly search mechanism across the Mountain West Digital Library, Utah Digital Newspapers, Utah Manuscripts Association EAD Finding Aids, and the Utah Digital Repository.
During this initial phase of the Utah Digital Repository project, a large amount of time was dedicated to training staff before content could be recruited. Institutions without a dedicated digitization department needed extra time for staff to learn new roles and become comfortable with unfamiliar processes. Teaching institutions are likely to have a lower rate of recruitment for content such as scholarly journal articles, but a new IR provides them with an opportunity to collect and preserve the work of faculty who publish and to highlight undergraduate work. A consortial or regional repository will need to have broad criteria for the materials it collects in order to accommodate the diverse needs of the institutions it will serve.
By providing a toolkit, training, and support for the system and infrastructure, we were able to help three academic libraries create IRs and produce a framework for additional organizations to use in the future. As the Utah Digital Repository grows with both content and participation, it will benefit both researchers and the community as a gateway to the variety of materials produced by higher education institutions in Utah.
5. University of Utah IR Collection Policy, <http://ir.utah.edu/materials/utahIRcollectionpolicy.doc>.
6. Note: previous collections that were aggregated into the MWDL fit this definition for inclusion in the Utah Digital Repository.
7. University of Rochester Research, <https://urresearch.rochester.edu/>; University of Rochester Libraries. (2005, February 8). Institutional Repository (IR) Crib Sheet. Retrieved January 21, 2006 from: <http://docushare.lib.rochester.edu/docushare/dsweb/GetRendition/Document-17647/html>.
9. See, for example, ACRL's Scholarly Communication Toolkit, <http://www.ala.org/ala/acrl/acrlissues/scholarlycomm/scholarlycommunicationtoolkit/tools/tools.cfm>.
Arlitsch, Kenning, and Jeff Jonsson. "Aggregating distributed digital collections in the Mountain West Digital Library with the CONTENTdm Multi-site Server." Library Hi Tech, 23:2, June 2005, 220-232.
Bailey, Charles W., Jr. "The role of reference librarians in institutional repositories." Reference Services Review, 33:3, 2005, 259-267.
Boock, Michael, and Ruth Vondracek. "Organizing for digitization: a survey." portal: Libraries and the Academy, 6:2, 2006, 197-217.
Gibbons, Susan. "Establishing an institutional repository." Library Technology Reports, 40:4, 2004.
Lynch, Clifford. "Institutional repositories: essential infrastructure for scholarship in the digital age." ARL: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC, 226, 2003. <http://www.arl.org/newsltr/226/ir.html>.
Lynch, Clifford, and Joan Lippincott. "Institutional repository deployment in the United States as of early 2005." D-Lib Magazine, 11:9, September 2005. <doi:10.1045/september2005-lynch>.
Copyright © 2007 Karen Estlund and Anna Neatrour