Soo Young Rieh,* Karen Markey, Beth St. Jean, Elizabeth Yakel, and Jihyun Kim
Institutional repositories (IRs) have increasingly been deployed in academic institutions to organize, preserve, provide access to, and facilitate use of digital content produced by members of their communities. There are numerous definitions of IRs (e.g., Branin, 2005; Crow, 2002; Lynch, 2003); for the purposes of our survey, we adopted a broad definition from Lynch: "a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members." This broad view of IRs is supported by the wide range of IR activity evidenced in case studies (Buehler & Boateng, 2005; Graham, Skaggs, & Stevens, 2005; Nolan & Costanza, 2006). These case studies point to great uncertainties underlying IRs with regard to practices, policies, content, systems, and other infrastructure issues, both for institutions that have implemented IRs and for those that are just contemplating such implementation. This is primarily due to the nature of IRs, which must be locally developed to meet the needs of each institution. The purpose of this article is to discuss how five key components of IRs (leaders, funding, content, contributors, and systems) are perceived by IR staff at academic institutions where IRs have been implemented, pilot-tested, or planned. Findings are based on the Census of Institutional Repositories in the United States carried out by the Making Institutional Repositories A Collaborative Learning Environment (MIRACLE) project at the University of Michigan with funding from the Institute of Museum and Library Services (IMLS) (Markey, Rieh, St. Jean, Kim, & Yakel, 2007).
The discussion of IRs in this article focuses on a comparison across four categories of IR involvement: (1) no planning to date (NP); (2) planning only (PO); (3) planning and pilot-testing one or more IR systems (PPT); and (4) public implementation of an IR system (IMP). We believe that it is important to compare the IR census findings with respect to these four stages of IR development, because the experiences and lessons learned from IMP and PPT institutions can be useful for less mature IRs. The perceptions and expectations of NP and PO institutions are of interest as well, because they reveal both misconceptions about IRs and new approaches to them. This article addresses the following topics and questions:
The article begins with background on IR topics and issues, and introduces our methods and the characteristics of census respondents. It concludes with an examination of long-term issues pertaining to IRs.
Why Conduct a Census of Institutional Repositories?
There are several previous surveys of IRs in academic institutions. These surveys have been hampered by a small sample size (e.g., Lynch & Lippincott, 2005) or have focused solely on large research universities (e.g., Shearer, 2006; Bailey et al., 2006). The 2006 survey of ARL libraries by Bailey et al. reports that the library is the primary unit leading and supporting the IR effort. Bailey's respondents cite the following as the main reasons why they have implemented an IR: "to increase the global visibility of, preserve, provide free access to, and collect and organize the institution's scholarship" (Bailey et al., 2006, p. 20). Bailey's research team also notes differences between the perceptions of institutions planning IRs and those with operational IRs, particularly in the areas of resources, time, and levels of difficulty. They conclude that ARL libraries have demonstrated a strong commitment to IRs. Of 87 ARL libraries responding to the Bailey et al. survey, 37 (43%) have an operational IR, 31 (35%) are planning for one by 2007, and 19 (22%) do not anticipate developing an IR.
Shearer (2006) reports on a 2005 survey of IR projects in the Canadian Association of Research Libraries (CARL). Since 2003, CARL has conducted a periodic survey that contains questions about the content, policies, software platforms, and advocacy activities of member IRs. Out of 17 respondents in 2005, 9 had working repositories, up from 4 in 2004. Shearer notes that while the benefits of IRs have been widely acknowledged, the concept of an IR is still evolving in Canada. Many of the current practices employed in CARL IRs do not strictly adhere to the SPARC definition of an IR as "a digital archive of the intellectual product created by the faculty, research staff, and students of an institution" (Crow, 2002). Rather, each of the CARL libraries determines the scope of collection policies of its own IR.
The deployment of IRs in the academic sector is an international phenomenon (van Westrienen & Lynch, 2005). The van Westrienen and Lynch survey results reveal great diversity among IRs. In some countries, such as Germany, Norway, and the Netherlands, IRs have already become common infrastructure in academic institutions. In Finland, IRs are just getting started; in Norway, 90% of the current IR records are for books and theses; IRs in France tend to have more articles; and Australian IRs have more primary source data than those in other countries. Van Westrienen and Lynch also found that European IRs have been aided by national policies and other activities that promote the development of IRs.
To avoid duplicating previous IR surveys, we conducted a census of academic institutions in the United States that included institutions not yet involved in the IR movement. We are further intrigued by the notion of IR activity beyond research universities. While Lynch and Lippincott (2005) point out that "deployment of institutional repositories beyond the doctoral research institutions in the United States is extremely limited," they find that most of the IRs in non-research universities are in institutions with strong commitments to locally created materials for teaching and learning. Being inclusive in our census enabled us to identify the wide range of practices, policies, and operations in effect at institutions where decision makers are contemplating, planning, pilot-testing, or implementing IRs.
Picking up on the finding of Bailey et al. (2006) that college and university libraries are the driving force behind most IRs, for our census population we decided to target academic library directors in four-year colleges and universities in the United States. Through the American Library Directory Online and Thompson-Peterson mailing lists we identified 2,147 library directors. Then we conducted the census in two stages. First, we sent email messages to the library directors asking them to participate in the census by first characterizing the extent of their involvement with IRs with respect to:
Next, after we received a positive response, we sent those library directors a link to the survey questionnaire geared to their stage of IR development. Respondents were advised that they could complete the questionnaire themselves or could delegate the task to other administrators or staff in their library who were more knowledgeable about their institution's IR plans. Each institution responded to no more than one questionnaire. The IR census was conducted from April 2006 through June 2006.
Of the 2,147 library directors we contacted, 446 responded to our survey, resulting in a 20.8% response rate. Characterizing the extent of their involvement with IRs, 236 (52.9%) respondents have done no planning of IRs to date (NP), 92 (20.6%) respondents are only planning for IRs (PO), 70 (15.7%) respondents are actively planning and pilot-testing IRs (PPT), and 48 (10.8%) respondents have implemented an IR (IMP). When those in the latter group were asked how long their IR has been operational, 52.1% of respondents with operational IRs cite up to 12 months, 27.1% from 13 to 24 months, 4.2% from 25 to 36 months, and 16.6% for more than 36 months.
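The response rate and the stage percentages reported above can be recomputed directly from the counts given in the paragraph. The following is a minimal sketch for readers who wish to check the arithmetic; the variable names are illustrative assumptions and are not part of the census instruments.

```python
# Counts taken from the paragraph above; all names are illustrative only.
contacted = 2147
stage_counts = {"NP": 236, "PO": 92, "PPT": 70, "IMP": 48}

# Total respondents is the sum across the four stages of IR involvement.
respondents = sum(stage_counts.values())

# Response rate: respondents as a percentage of library directors contacted.
response_rate = round(100 * respondents / contacted, 1)

# Share of respondents in each stage, as a percentage of all respondents.
stage_shares = {
    stage: round(100 * count / respondents, 1)
    for stage, count in stage_counts.items()
}

print(respondents, response_rate)
print(stage_shares)
```

Rounded to one decimal place, these values agree with the figures reported in the text.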
Library directors were the primary respondents to our questionnaire (n=288, 73.7%). Other respondents included library staff (n=40, 10.2%), assistant or associate library directors (n=31, 7.9%), archivists (n=11, 2.8%), CIOs (n=8, 2.0%), a vice president (VP) or provost (n=1, 0.3%), and others (n=12, 3.1%). At IMP institutions, a larger percentage of library staff (43.3%) and associate or assistant library directors (27.0%) completed the questionnaires. Library directors were the most common respondents at NP (90.6%) and PO institutions (71.3%). In general, the percentage of library directors responding to the census decreases as IR activity at their institution increases, probably a result of IR staff becoming more knowledgeable about their institution's IR. It also appears that associate or assistant library directors get more involved with IR activities as the institutions move from the PO to the PPT stage. Associate or assistant library directors responded to the questionnaire at both PPT (26.7%) and IMP institutions (27.0%) at a much higher rate than those at PO institutions (0%).
Institutional Repository Leaders
A variety of different individuals lead IR efforts across the different stages of development: planning only (PO), planning and pilot-testing (PPT), and implementation (IMP) (Table 1). In general, librarians lead the IR effort in all stages of IR development. In the planning stage, library directors take the lead, but they relinquish it when the IR reaches the PPT and IMP stages, at which time one particular staff member or an assistant or associate library director takes over the IR. Archivists, CIOs, and faculty members are more likely to lead the IR effort in the planning stage. At NP institutions (not shown in Table 1), respondents report that the people who will be active in the IR effort are external to the library, and they list the following people as being active in their future IRs (ranked order): library directors, institution's VP or provost, faculty members, institution's chief information officer (CIO), and library staff members.
Table 1. Positions of People Leading the IR Effort
We asked who participates in IR planning and implementation committees. Figure 1 depicts the dynamics of committee membership at different stages of IR development. IR committees represent the broadest spectrum of the university community at the PO and IMP stages. Library directors are more likely to be present on IR committees during the PPT stage than in the PO and IMP stages. Library staff and assistant/associate library directors are likely to be on committees as the IR progresses toward implementation. The archival presence increases as the IR enters the planning and pilot-testing stage and nears implementation. The likelihood that staff members from the VP's or provost's office are on IR committees decreases as the IR moves from the PO stage to the IMP stage, as does faculty involvement.
Funding for the Institutional Repository
We queried respondents about the financing for the IR. Across the board, respondents agree that the funding comes or will come from the library. A typical strategy is to absorb costs into routine library operating expenses. Respondents also agree that funding does not and will not come from academic units. As seen in Table 2, the top-ranked funding sources do not differ greatly across different stages of IR development.
Table 2. Top-ranked Funding Sources
Curious about how IR funds are being spent, we asked respondents what percentage of their IR's annual budget was allocated to various line items. Costs for staff and vendor fees represent about 75% of the budget, with staff costs exceeding vendor fees during planning and pilot-testing (PPT) but with vendor fees exceeding staff costs during implementation (IMP). Hardware acquisition makes up approximately 10% of the budget while software costs are 7% and 2.5% of the PPT and IMP budgets, respectively. Together the costs of software and hardware maintenance and system backup account for only one-eighth (12.5%) of the IR budget (Figure 2).
When open-ended comments are offered, several IMP respondents note the informality of their IR's budget:
Digital Documents in Institutional Repositories
The small size of many IRs surprised us. About 80% of PPT respondents and 50% of IMP respondents report that their IRs contain fewer than 1,000 documents. Only four (8.3%) IRs in the PPT stage and seven (19.4%) in the IMP stage host over 5,000 documents. There is also no relationship between the number of digital documents held in the IR and its age or stage of IR development; some IRs in the pilot-testing stage have larger collections than those in the operational stage.
PPT and IMP questionnaires listed three dozen digital document types and asked respondents to estimate how many documents per type were in their IRs. At IMP institutions, top-ranked document types are closely related to the research enterprise of faculty and graduate students, such as doctoral dissertations, working papers, journal articles, and raw data files that result from doctoral dissertation and master's thesis research. Doctoral dissertations are also a very prevalent document type at PPT institutions; however, similarities with IMP digital content end there. At PPT institutions, the second most common document type is preprints, followed by "other learning objects prepared by faculty, lecturers, teaching assistants, etc." Master's theses and journal articles are ranked the fourth and the fifth most popular document types at PPT institutions. Respondents seldom give high estimates for document types that would be packaged in numeric and multimedia files, e.g., video recordings of performances, e-portfolios, raw data files, software, sound recordings of interview transcripts, and maps; however, there is evidence that numbers for non-text files will grow in the years to come (see, for example, Ithaka, 2006).
Initial IR proponents envision them as competition for peer-reviewed journals and as containing research results of the faculty (e.g., Johnson, 2002; Pinfield, 2002) and to a lesser extent student papers. The reality is a bit different. Dissertations and pre- and post-prints do frequently populate IRs, but other materials are also prevalent. We categorize some of the digital document types as archival (from special collections or the university archives). When we look at the content through this lens, we find that overall 20.5% of IR content is archival. Fifteen IRs have between 90% and 100% archival content, and 26 IRs contain over 50% archival content.
Digital Content Recruitment
Previous studies demonstrate that recruiting content is one of the most important factors determining whether an IR will be successful (Branin, 2005; Chan, 2004; Foster & Gibbons, 2005). Census respondents report that recruiting content for the IR is difficult. At institutions with operational IRs, IR staff are willing to consider institutional mandates that require members of their institution's learning community to deposit certain document types in the IR. Asked why they think people will contribute to the IR, respondents give high ratings to reasons that enhance faculty's scholarly reputations and that assign responsibility for research-dissemination tasks to others so that faculty can focus on intellectual tasks. Lower-ranked reasons pertain to enhancing the institution's standing (e.g., "boosting the institution's prestige") and reforming scholarly communication.
Census questionnaires asked respondents about their digital content recruitment methods. Differences emerge between what respondents at NP and PO institutions think would be good methods and what respondents at IMP, and to a lesser extent at PPT institutions, have found to be successful. The majority of PO, PPT, and IMP respondents give "very successful" ratings to only one of the nine listed methods: "Staff responsible for the IR working one-on-one with early adopters." IMP respondents are the least positive about this method, with 61.1% giving it a "very successful" rating compared to PO and PPT respondents at 72.5% and 79%, respectively. Another content-recruiting method asked about is "word-of-mouth from early adopters to their colleagues in the faculty and staff ranks." While most PO (92.3%) and PPT (90.3%) respondents (with less recruiting experience than IMP respondents) think that this is a very or somewhat successful method, IMP respondents have doubts; quite a few of the IMP respondents (19.4%) select either the "do not know" or the "no opinion" response category. As for other digital content recruitment methods, the same pattern emerges. PO and PPT respondents are more likely to see all types of content recruitment methods as potentially successful, whereas IMP respondents tend to view some of the methods with more skepticism.
Institutional Repository Contributors
We asked two questions about the contributors to the IR. The first question pertained to those who are authorized to contribute content to the IR (Table 3).
Table 3. Authorized Contributors to IRs
(A "T" in the Table above indicates tied rankings.)
Faculty and librarians top the list across all stages of IR development. Librarians and archivists are especially likely to be active contributors, because they may be proxies for faculty and research scientists or they may deposit digitized archival or manuscript materials into the IR, respectively. It is interesting to note that research scientists are the third most highly ranked active contributors at IMP institutions. They are ranked in the middle at PPT and PO institutions and toward the bottom of the list at NP institutions. This makes sense, since a large percentage of operational IRs are in institutions classed as research universities. Another interesting finding is that college and university administrators are more likely to be authorized contributors at NP, PO, and PPT institutions than they are at IMP institutions.
The second question asked respondents to choose the major contributor to their IR (Table 4).
Table 4. The Major Contributor to the IR
Although IMP respondents credit faculty more than anyone else with being the major contributor to their IRs, the percentage (33.3%) is lower than that of PO (48.1%) and PPT (59.7%) respondents. PO and PPT respondents do not foresee graduate students being the major contributor to their IR, even though we find that graduate students are major contributors to IRs at more than 20% of IMP institutions. A large percentage of PO respondents envision archivists as the major IR contributor; however, IMP respondents do not perceive archivists to be as active as other contributors. The large percentage of PPT respondents who choose librarians as the major contributor may be due to the fact that librarians are often surrogate contributors for faculty and other university staff. Only 10.3% of IMP respondents choose librarians as their IR's major contributor.
Institutional Repository Software and System Features
PPT and IMP respondents have pilot-tested and/or implemented numerous IR applications (Table 5).
Table 5. IR Systems Pilot-tested and Implemented
As expected, DSpace is the most prevalent system both in terms of pilot-testing and implementation. Fedora and ContentDM are regularly pilot-tested but rarely implemented. Bepress, developed by Berkeley Electronic Press, is also selected at eleven institutions where IRs have been implemented. ProQuest's Digital Commons, which also runs on the bepress system, is the third most often implemented IR application. The number of IR applications is growing, and IR staff at newer IR implementations have many more commercial software applications from which to choose.
Questionnaires asked PPT and IMP respondents to rate their IR system's adequacy with respect to different features. Overall they agree that the current level of support for different file formats and adherence to open-access standards are adequate. PPT respondents are less satisfied with the technical support and the scalability of their systems than IMP respondents. Generally IR-system functionality for browsing, searching, and retrieving digital content is satisfactory; however, end-user interfaces do not receive high ratings. Both PPT and IMP respondents rank controlled vocabulary searching and authority control the least satisfactory. Asked how likely they are to modify their IR's software, 75% of IMP respondents said that they are "very" or "somewhat" likely to do so.
The findings of the Census of Institutional Repositories in the U.S. build on previous research about IRs. Librarians are leading IR development and implementation, serving on planning and advisory committees, pilot-testing software, recruiting content, identifying early adopters, etc. Archivists' roles in the IR also gain strength over time, perhaps due to the critical IR needs for content recruitment and digital curation. Committee membership becomes increasingly less inclusive as the IR project progresses from pilot-testing to implementation. In terms of funding, our results verify previous studies (e.g., Bailey et al., 2006) that demonstrate that the typical approach to funding the IR is through a special initiative fund of the library or by absorbing its cost in routine library operating costs. From the perspective of sustainability, the lack of a strong funding model could potentially be a problem for IRs. Given the fact that most IRs promise long-term preservation to their community, limited funding based on simple reallocation from library budgets could possibly hinder the future migration of IR content and metadata to new versions of current systems as well as to entirely new systems.
The number of digital documents in pilot-testing and implemented IRs is very small (IMP IRs contain an average of 3,207 digital documents while PPT IRs contain an average of 2,313 digital documents), and the major contributors to operational IRs are faculty or graduate students. All respondents report having difficulty recruiting content from faculty and researchers. The low contribution rates to IRs found in the current census echo those from previous surveys (Lynch & Lippincott, 2005; Shearer, 2006; Bailey et al., 2006). Also, we find that the more mature an IR is, the more skeptical respondents have become about the success of any given content recruitment strategy. This challenges existing IR definitions that posit IRs as an alternative tool for the current scholarly publishing model. In the Census, we could not identify any evidence of IRs influencing the traditional scholarly publishing paradigm as anticipated by Crow (2002) and other authors (e.g., Chan, 2004; Prosser, 2003).
IR-system functionality is satisfactory, but the user interface, including controlled vocabulary searching and authority control, needs serious reworking. At this early point in the development and deployment of IRs, few people have searched these systems. Now is the time to make user-interface improvements before too many users have negative experiences and abandon IRs altogether.
IRs are likely to be a curious mix of primary, secondary, and tertiary sources (e.g., encyclopedias, annual reviews, yearbooks, bibliographies). IR system designers can expect people with varying levels of domain expertise, from undergraduate students to senior faculty members, to be potential users of IRs at academic institutions. A key objective for the designers should be to make IRs usable regardless of their users' domain expertise.
The majority of institutions where IRs have been implemented are research universities. Most of the institutions that are in the NP and PO stages are master's and baccalaureate colleges and universities. IRs serve diverse purposes in these different categories of higher education institutions. Lynch and Lippincott (2005) identify two types of IRs: those that serve as a dissemination tool for e-prints of faculty work and those that form a repository holding the documentation of intellectual work, including both teaching and research. In fact, the census (Markey et al., 2007) results indicate that one-quarter of institutions pilot-testing or implementing an IR have two or more IRs available to their institution's learning community. What will happen at institutions with multiple IRs? Will they join forces and consolidate their efforts? It appears that the majority of institutions have not yet decided their IR's focus in terms of digital content and its recruitment, or even which particular directions to take.
Once each academic institution has a clear vision and definition of what the IR will be for its own community, subsequent decisions such as content recruitment, software redesigning, file formats guaranteed in perpetuity, metadata, and policies can flow from that vision. In spite of this uncertainty, as of 2006 the library administrators we surveyed are generally positive and remain enthusiastic about the future of IRs. Perhaps this uncertainty and the ability to shape IR collections and services are creating an opportunity space, enabling IR staff and administrators to develop IR models specifically for their community by learning and adapting strategies, policies, and techniques from others' IR experiences.
This work was supported by the Institute of Museum and Library Services (IMLS) through its National Leadership Grant Program (grant number LG-06-05-0126-05).
Bailey, C. W., Jr., Coombs, K., Emery, J., Mitchell, A., Morris, C., Simons, S., & Wright, R. (2006). Institutional Repositories. SPEC Kit 292. Washington, D.C.: Association of Research Libraries.
Branin, J. (2005). Institutional repositories. In M. A. Drake (Ed.), Encyclopedia of Library and Information Science (Second ed., Vol. 2005, pp. 237-248). Boca Raton, FL: Taylor & Francis Group, LLC.
Buehler, M. A., & Boateng, A. (2005). The evolving impact of institutional repositories on reference librarians. Reference Services Review, 33(3), 291-300.
Chan, L. (2004). Supporting and enhancing scholarship in the Digital Age: The role of open-access institutional repositories. Canadian Journal of Communication, 29, 277-300. <http://eprints.rclis.org/archive/00002590/01/Chan_CJC_IR.pdf>.
Crow, R. (2002). The case for institutional repositories: A SPARC position paper. <http://www.arl.org/sparc/bm~doc/ir_final_release_102.pdf>.
Graham, J.-B., Skaggs, B. L., & Stevens, K. W. (2005). Digitizing a gap: A state-wide institutional repository project. Reference Services Review, 33(3), 337-345.
Ithaka (2006). Ithaka's 2006 librarian and faculty studies: Overview of key findings. <http://www.ithaka.org/research/Ithaka.Surveys.2006.Overview.pdf>.
Lynch, C. A. (2003). Institutional repositories: Essential infrastructure for scholarship in the Digital Age. ARL Bimonthly Report, 226, 1-7. <http://www.arl.org/resources/pubs/br/br226/br226ir.shtml>.
Markey, K., Rieh, S. Y., St. Jean, B., Kim, J., & Yakel, E. (2007). Census of institutional repositories in the United States: MIRACLE Project Research Findings. Washington, D.C.: Council on Library and Information Resources. <http://www.clir.org/pubs/reports/pub140/pub140.pdf>.
Nolan, C. W., & Costanza, J. (2006). Promoting and archiving student work through an institutional repository: Trinity University, LASR, and the Digital Commons. Serials Review, 32(2), 92-98.
Pinfield, S. (2002). Creating institutional e-print repositories. Serials, 15(3), 261-264.
Prosser, D. (2003). Institutional repositories and open access: The future of scholarly communication. Information Services & Use, 23(2-3), 167-170.
Shearer, K. (2006). The CARL institutional repositories project: A collaborative approach to addressing the challenges of IRs in Canada. Library Hi Tech, 24(2), 165-172.
Copyright © 2007 Soo Young Rieh, Karen Markey, Beth St. Jean, Elizabeth Yakel, and Jihyun Kim