D-Lib Magazine
The Magazine of Digital Library Research

July/August 2011
Volume 17, Number 7/8


Open Repositories 2011: Community Meet-up in the "Live Music Capital of the World"

Carol Minton Morris






The Sixth International Conference on Open Repositories convened in Austin, Texas, on June 8, 2011, bringing people from all over the world together to focus on how repositories might be more closely integrated into the technically and community-driven digital scholarly landscape. The program consisted of 24 general track presentations and four blocks of 24/7s (24 slides presented in 7 minutes) under the broad theme of "Collaboration and Community". Institutional repositories do not stand alone; they are mechanisms for advancing policy and best practices that must co-exist with other systems. This conference showed that, for repositories to continue to meet the changing needs of knowledge organizations, there is much interesting work to be done to develop repository communities and collaborations in support of preserving the scholarly record in open repositories.

Views of the University of Texas at Austin campus.


The Sixth International Conference on Open Repositories once again brought people from all over the world together to focus on how repositories might be more closely integrated into the technically and community-driven digital scholarly landscape. The main program got underway on June 8, 2011, in Austin, Texas, with introductions from Conference host Mark McFarland, University of Texas, and Conference Program Chair Tom Cramer, Stanford University. Cramer offered details about how the popular OR Conference had grown. This year OR11 received 160 submissions from more than 250 authors on 6 continents. The program consisted of 24 general track presentations and four blocks of 24/7s (24 slides presented in 7 minutes) under the broad theme of "Collaboration and Community." Meetings, workshops, a poster session, sponsor-led events and a CURATECamp were all part of pre- and post-conference activities. The ever-popular JISC-sponsored Developer Challenge yielded out-of-the-box demonstrations of new concepts around how repositories and evolving technologies might better meet the needs of knowledge organizations. The almost weeklong conference wrapped up with DSpace, Eprints and Fedora User Group meetings held on June 10 and 11.


Leading an open source meritocracy

Jim Jagielski, President, Apache Software Foundation (ASF), opened the program with a keynote address entitled "Open Source: It's Not Just for IT Any More." His talk focused on the democratic and community-driven open source approach to writing software, which has been used to develop many of the world's most successful technologies. He is a self-proclaimed open source and IT enthusiast in all things web and cloud-related, and a core developer of Apache.

He believes that information technology development is now being led by open source-style initiatives and projects. The ASF exists to support the open source software development model by providing:

  • A legal infrastructure so that developers can do what they want
  • Two related operational units: development and administration
  • A direction in the development of software code that puts it in the hands of developers
  • A vision that code will be community-created, and exceptional

Secret sauce for creating community

With more than 70 active ASF projects, Jagielski suggests that open source development allows each person who contributes code to have an impact on the IT industry. The draw of open source is that it creates an environment where the makers or developers are connected with the meaning behind their contributions.

Those outside the OS development world often wonder "how crazy are these people?" Money is not exchanged, but institutional resources do change hands. Access to source code and the "tinkering" factor draw geeks to a process that places their skills at the center of an iterative development process.

Jagielski suggests that there is an Open Source ladder of licensing — give me credit, give me fixes, give me everything — that is relative to how much intangible benefit a code contributor receives in exchange for work on a particular project.

Community-created software code works mainly because participants love writing code. By creating a grassroots, peer-based atmosphere of mutual respect and trust where all votes hold the same weight, the Apache Software Foundation (ASF) has enabled the development of valuable software by and for a community of developers.


A community of reporters

It may have been the dynamic conference content, or perhaps the ample quantities of southwest-style beef barbeque and music, that inspired an avalanche of OR11 posts and tweets from attendees. Their reports included on-the-spot reflections, reviews, analyses, and in-depth summaries of emerging technologies, trends, implementations and impressions.

This (partial) list of references is provided for those who want to learn more about specific aspects of the OR11 program as well as pre- and post- conference events, with apologies for any missing links, and thanks to the reporters: Richard Davis, Michael Giarlo, H.J. (Driek) Heesakkers, Leslie Johnston, Bram Luyten, Mahendra Mahey, Peter Murray, Peter Sefton, R. Sutton, and Elias Tzoc.

Slides and Abstracts

Slides and abstracts that are available are linked from the Presentations and Authors page on the conference website. OR11 presenters may still send their presentations to rsteans@austin.utexas.edu.

OR11 Tweets

Blog posts

da blog, ULCC Digital Archives Blog, by Richard Davis

CURATEcamp at OR11

Library Spring "Accentuate the positive." On innovation for academic research libraries, and keeping up with the Googles, by H.J. (Driek) Heesakkers

The Signal, Digital Preservation, Library of Congress, by Leslie Johnston

@mire, by Bram Luyten

DevCSI blog, by Mahendra Mahey

The Disruptive Library Technology Jester blog: We're Disrupted, We're Librarians, and We're Not Going to Take it Anymore, by Peter Murray

DuraSpace Blog, by Carol Minton Morris

ptsefton blog, by Peter Sefton

Emory Libraries Tech Know-how, by R. Sutton

Elias' blog: Just another blog/or a way to save/share an idea, by Elias Tzoc


Closing keynote: Clifford Lynch on repositories as a catalyst for policy evolution

Clifford Lynch, Executive Director of the Coalition for Networked Information (CNI), offered closing observations on the evolution of repositories over the last decade. In his view there is no good way to measure total content in all repositories, everywhere, because we have many different ways of determining what constitutes a digital object and measuring terabytes of data does not have real meaning. In his closing keynote address to a plenary session at OR11 on June 10, 2011, Lynch took a step back to examine what some of the discussions around the development of global repositories had accomplished.

"Repositories have provided a focus and a fulcrum for an absolutely critical series of policy discussions," he said. The university's role in the curation of knowledge, how the evidence upon which inquiry is based should be curated, and what the academy's role in the dissemination of knowledge should be are all key questions that the development of repositories has brought to the forefront at universities. Repository development has advanced important policy issues.

Repository growth has also led to the confusion of mechanism and policy. Institutional repositories are services that support policy choices and are not always the right mechanisms for advancing policies. Focusing on institutional assets and the balance between faculty control, institutional goals and open access are all adjacent points for discussion.

The movement towards open access to knowledge and research is another big win for repositories in the last decade. This debate has in turn surfaced questions about access to research data.

Repositories have also pointed to the need for universities to have a role in dissemination of educational resources.

Collaborations around technical and policy aspects of repository development have created ongoing dialogs and brought new voices into conversations around changing institutional scholarship.


Name authority is built on dirty data

Many bizarre practices have emerged, nationally and internationally, around name authority and identity management (IdM). (Wikipedia: "... identifying individuals in a system and controlling access to the resources in that system by placing restrictions on the established identities of the individuals.")

Looking back over the history of IdM, by the beginning of the last century there was already too much "stuff" to manage the identity of anything but books. There were too many authors to track for other types of materials, so a rational IdM system was established only for books and other materials that were shared among institutions.

Lynch feels that an authority file should be simple. National dictionaries of literary biographies do not couple well with name files. We can start doing things with author data from scientific and journal literature. We need to disambiguate by assigning unique author identifiers and cleaning up the current mess.

The ongoing development of institutional repositories can be a big part of matching literary names with identity files from an institution. Most big universities manage information systems that include course management systems, digital libraries, university records, active research, databases and more. Where does the institutional repository fit in and where are its boundaries?


In conclusion: notable trends and a few questions

  • You will see increasing convergence of the concepts of a digital library, institutional repository and/or digital collections. Who does the curation, how it is sourced and how it is scoped are all determining factors.
  • Learning Management Systems (LMS) have become ubiquitous at universities. What happens to the content in learning management systems when the course is over? Is this content institutional records or scholarly materials? Different institutions handle it differently. Currently we lack good taxonomies and policies. It is the exception rather than the rule for a university to regularly export LMS content to an institutional repository.
  • We do not understand where institutional repositories sit in the stream of active work that now includes computation and versioning. How volatile are the contents of institutional repositories? Sometimes high-performance, data-intensive researchers do not participate in those workflows because the available infrastructure does not support it. Big data sets can be too big to back up or replicate. How do these issues intersect with institutional repositories?
  • We are seeing the emergence of virtual organizations as collaborative efforts spanning both institutions and nations. We can mobilize groups of scholars to work on a problem in the sciences, social sciences and now, in the humanities as well. What are the rules of the road?
  • Scholarship is tied up in active, current software systems. "This makes me nervous," said Lynch. Reproducibility is lacking. We do not have clear ways of talking about the provenance of work that sits among a huge stack of software. We are going to have to deal with software in our repositories. Software can become obsolete very fast.
  • When faculty members retire what happens to their work? Repositories could have a role in taking stock of scholarly work and migrating it to an institutional repository.
  • Do repositories extend beyond universities? How should scholarly records from public libraries, historical societies and other civic organizations be included?
  • Geographic replication of repository content using cloud storage, which removes geographic points of failure, is on the horizon.
  • Distributed control and organizational redundancy go hand-in-hand with distributed content.
  • The contents of digital repositories did not all start life as published materials. Thinking in smaller chunks of time for preservation is a good idea. "Eternity," "duration of the species," and "perpetuity" are terms that are too big and not well understood.

It is clear that institutional repositories and society, writ large, will need to work together because IRs do not stand alone. They are mechanisms for advancing policy that sit among many adjacent systems. There is still work to be done on developing the appropriate scope for repository development while considering linkages that can work.


Open Repositories 2012

The Seventh International Conference on Open Repositories (OR12) will be hosted by the University of Edinburgh in Scotland from July 9-13, 2012. To get announcements about this event and other OR-related communications, you may join the OR announcements mailing list here: http://groups.google.com/group/open-repositories.


About the Author

Carol Minton Morris is Director of Marketing and Communications for DuraSpace, and is past Communications Director for the National Science Digital Library (2000-2009) and Fedora Commons (2007-2009) where she was a research assistant in the Digital Libraries group at Cornell University Computing and Information Science. She leads editorial content and materials development and dissemination for DuraSpace publications, web sites, exhibits, initiatives and online events, and develops strategies to connect open access, open source and open technologies people, projects and institutions to relevant news and information. She was the founding editor of NSDL Whiteboard Report (2000-2009) featuring information from NSDL projects and programs nationwide. Follow her at http://twitter.com/DuraSpace.
