
Conference Report


D-Lib Magazine
March 2006

Volume 12 Number 3

ISSN 1082-9873

Web Wise 2006 Inspiring Discovery - Unlocking Collections


Stuart L. Weibel
Senior Research Scientist, OCLC Research
Visiting Scholar, University of Washington iSchool
<stuart.weibel@gmail.com>


I was invited to do the end-of-conference summation for Web Wise 2006, the seventh conference in the series, cosponsored by the Getty Trust and OCLC, and supported by IMLS. The sold-out conference was held in Los Angeles, February 15-17, 2006, around the corner from the Disney Concert Hall, Frank Gehry's splendid ode to sheet music rendered in stainless steel (LA should be proud). It was a thrill to have the conference reception there, sort of like being inside an elaborate, oversized guitar (the interior, unless I'm mistaken, is trimmed throughout in tight-grained cedar... you have to wonder if they induced a shortage of materials in the world of luthiers).

Photograph by Stuart Weibel of the Disney Concert Hall

The Keynotes

Paul Courant of the University of Michigan highlighted changes in scholarship that Web technology has fostered and foisted. A few highlights:

Soon, that which is not online will be irrelevant. If there is but dross, then our culture will be drossful, and it is in part the mission of cultural heritage organizations to assure otherwise.

Collaboration across time and space is the essence of scholarship, and modern Web-based information infrastructure affords unparalleled support for that. Some of the changes underway (for example, Google Print and the Open Content Alliance) are truly revolutionary, but the current framework for intellectual property (IP) is broken (a common theme in the conference). Some salient sound bites:

  • Copyright is intended to give credit for ideas, not lock them up.
  • 95% of all copyrighted materials are out of print.
  • Business models for libraries look nothing like a business. We are in the business of creating public non-rival goods, whose values are undiminished (in fact, are enhanced) by wider use.
  • The cost of first-use may be very high, but the marginal cost for the next user rounds to zero.

Dan Greenstein's talk was a witty rendition of fear and loathing (of the Amazoogles) <http://orweblog.oclc.org/archives/000562.html> among cultural heritage institutions – it is hard not to feel threatened by much of what they do. My own thoughts during his talk turned to the Gary Larson Far Side cartoon of a dinosaur at a podium opining:

"Gentlemen, the picture is bleak. The earth's climates are changing, the mammals are taking over, and we all have a brain about the size of a walnut."

But meanwhile, Greenstein enjoined us:

  • Collect more data to support recommender systems.
  • Aggregate content ('rent' one another's data).
  • Create global value by aggregating unique, local collections.
  • Be less defensive, faster, more Amazoogle-like.
  • Study our users and learn what they need.
  • Remake the professional cultures of our fields.
  • Squeeze, squeeze, squeeze more efficiencies from our workflows.

Ken Hamma of the Getty Trust spoke Friday morning, suggesting that each of us has a different understanding of our obligations in the public commons, and often our understandings don't coincide, even within our own communities.

Policy and business decisions inform the shape of the future even more than technical constraints, and these business decisions should be aligned with the missions of our respective institutions and brought into productive alliance wherever possible.

In the course of his talk, Ken praised and commended to our attention a report on fair use from the Brennan Center for Justice at the NYU School of Law: Will Fair Use Survive? Free Expression in the Age of Copyright Control, <http://www.fepproject.org/policyreports/WillFairUseSurvive.pdf>.

Panel on Copyright and Intellectual Property

Ken's talk was followed by a panel on copyright, intellectual property, and related issues (Jim Gilson, Maureen Whalen, and Sara Hodson presenting), that addressed some of the IP issues attendant to mounting exhibits in the current age.

This panel immediately preceded my summary talk, and my editorial frenzy made it difficult to follow a dense and lawyerly discussion largely unencumbered by PowerPoint slides. I felt a little like another Gary Larson cartoon (what dogs really understand). Nonetheless, several ideas found a home in my short attention space:

  • Our challenge often lies in finding the balance among institutional mission, project objectives, contract law, fair use, IPR, privacy, and risk assessment.
  • Privacy rights mostly die with us (but not the ethical obligations of our survivors).
  • No (response) means "NO" in the search for orphan works permissions. Out of print or not, it is important to make a strong effort to secure appropriate permissions for the use of orphaned materials.
  • The most important (or at least most encouraging) idea of the session was that IPR laws are slippery and ambiguous, and policy decisions and legal advice are different animals. In planning museum projects, a well-structured, mission-justified plan is the most powerful tool in defense of fair use.

News You May Not be Able to Use

Paul Gherman of Vanderbilt spoke about the Television News Archive, which struck me as a perfect example of Paul Courant's public good and the dilemmas faced in the Commons:

  • Public assets are often costly to create and are undiminished by broad access and use (they are non-rival goods), but
  • Copyright and control issues limit access, and an unstable business model, combined with budgetary constraints, threatens this unique and irreplaceable asset.

Assuring persistent access to material of this type is among the most difficult tasks of our community, complicated as it is by both technological challenges and the IP climate of the current era.

In other news from the news sector, Victoria McCargar spoke about a range of newspaper digitization efforts, pointing out that business considerations (not public good creation) dominate decision making in this area, the result being that public access is often severely limited. There is poor coordination of standards and best practices within the sector, leaving interoperability among various systems low.

McCargar suggested that print drives most other news, but is imperiled by an unstable business climate. How many of us, in the face of models such as the New York Times Select program, simply look elsewhere for our online news? If the bastions of serious journalism are all laying off staff and curtailing coverage, are we seeing the Tragedy of the Commons unfold before our eyes? Whither (wither?) democracy in the face of decline of a strong Fourth Estate? And what will the bloggers have to talk about, if serious journalism fails?

Karen Cariana of the WGBH Media Library presented work being done to integrate finding aids from three disparate media programs of increasing sophistication. The project illustrates the difficulties inherent in integrating evolving technology as well as our evolving understanding of that technology. The content is complex multimedia, the metadata is rich, the users diverse, and understanding of their expectations rudimentary.

Automation

The need for greater automation emerged in a number of talks as an alternative to 'hand crafted metadata'. (Makes you think of quilts, doesn't it?)

Steve Mitchell of the University of California (Riverside) spoke about iVIA Data Fountains automated metadata creation tools (http://ivia.ucr.edu/). These open source tools are designed to assist in the creation of metadata in a portal environment and then do focused crawling for related resources about a given topic.

Doug Holland of The Missouri Botanical Garden described SciLINC (Scientific Literature Indexing on Networked Computers), an approach to processing large amounts of scientific literature by taking advantage of unused computer processing power, much like the SETI efforts to process signals for signs of extraterrestrial life. The expectation is that a rich annotation and linkage corpus on botanicals will result, improving access and coverage of materials for both public and scientific use.

Both of these projects emphasize new technical applications to open collaboration models that have long been our community mainstays.

Numbers are Important

Bill Moen described work on MARC Records as forensic evidence, if you will, of crimes against simplicity (my metaphor, not his). A couple of statistics:

  • 4% of all fields account for 80% of all occurrences
  • Field-subfield combinations in MARC have grown from 200+ to 2,000 in 30 years.
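
A concentration figure of the kind quoted above can be computed from simple field-occurrence counts. The following is a minimal sketch only, not Moen's method: the toy records, the tag values, and the function name `coverage_share` are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def coverage_share(tag_counts, target=0.80):
    """Return the fraction of distinct tags needed to reach `target`
    of all field occurrences, counting the most frequent tags first."""
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("no field occurrences to analyse")
    running = 0
    for needed, (_, count) in enumerate(tag_counts.most_common(), start=1):
        running += count
        if running >= target * total:
            return needed / len(tag_counts)
    return 1.0


# Hypothetical toy data: each record is reduced to a list of field tags.
sample_records = [
    ["245", "100", "260", "650"],
    ["245", "100", "260"],
    ["245", "260", "300"],
    ["245", "100", "700"],
]

# Count how often each tag occurs across all records.
tag_counts = Counter(tag for record in sample_records for tag in record)
share = coverage_share(tag_counts)
print(f"{share:.0%} of distinct fields cover 80% of occurrences")
```

On a real catalog the same calculation would run over parsed MARC records rather than hand-written lists; the toy data here is far too small to show the skew reported in the talk.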

Creeping elegance of this sort is a natural human foible, it would seem, but threatens the usefulness of our discovery systems and has implications for cataloging practice and rules that we can ill afford to ignore.

Bill's data tell us what we've been doing, but we must also find out what we should be doing. Quantitative studies of cataloging practice are as rare as successful business models during the high tech bubble. Sir... more, please!

Aggregation and Consolidation

These ideas emerged from many presentations at Web Wise 2006. Coincidentally, Lorcan Dempsey published a long piece the day before on his blog that would have fit very nicely indeed into the framework of this conference, and which I commend to the reader's attention: Libraries, logistics and the long tail (http://orweblog.oclc.org/archives/000949.html).

Katherine Walter of the Walt Whitman Project (http://www.whitmanarchive.org/) described the benefits and challenges of integrating several varieties of metadata and encoding standards to achieve the cross-community cooperation and integration that are essential for value creation.

Diane Hillmann of Cornell University Libraries reminded us that aggregation is a key strategy for bringing attention to our collections. Good quality metadata (consistent within a domain) is important to providing access to distributed collections such as are the common stock of the NSF's National Science Digital Library program. She identified a number of obstacles to effectiveness that resonated with other reports in the conference: resistance to sharing assets and metadata, and to giving up control of presentation (bespoke portals).

Elisa Lanzi of the Smith College Imaging Center suggested that, while a picture may be worth a thousand words, we still need the thousand words to fix the pictures in their context and to find them (metadata). Elisa pointed out that image aggregation and description in the academy is still nascent and in need of better standards and best practices. My own editorializing on this issue might be summed up as:

No more hand-wringing about authoritative metadata! There is room and value in variety and diversity.

As a community, we tend to be uncomfortable with multiple varieties of metadata. Authority control is part of our soul, and consistency (hobgoblin or no) is one of our core values.

These things have their place, but it is also true that handcrafted metadata is expensive, and our notions of useful metadata must expand to include diverse varieties and sources of metadata. We should:

  • Use everything we can to strengthen the fabric of public metadata linking to cultural heritage assets;
  • Provide incentives for its creation by users as well as professionals – reviews and recommender systems are essential parts of the public bibliography, and we need to provide better incentives; and
  • Make these things first class objects with stand-alone identity and make them harvestable.

Having said this, does it mean we don't have uses for expensive, handcrafted, highly detailed metadata? No, and several presentations on Friday testified to what can be done with rich, well-structured, authority-controlled metadata.

My colleague in OCLC Research, Diane Vizine-Goetz, spoke about Fiction Finder <http://fictionfinder.oclc.org/>, a treatment of the fiction represented in WorldCat which includes:

  • Clustered views of related records based on the IFLA Functional Requirements for Bibliographic Records (FRBR) model as implemented in OCLC's FRBR algorithm work;
  • Value based on high quality, authority-controlled data;
  • Better choices provided to users through the richness of FRBRized records – diverse choice and coherence.
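
The idea of a clustered view can be illustrated by grouping records that share a normalized author and title. This is a minimal sketch under stated assumptions, not OCLC's FRBR algorithm: the record layout, the field names, and the helper names `normalize`, `work_key`, and `cluster_by_work` are all hypothetical, and real work-level clustering relies on authority-controlled headings rather than simple string normalization.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def work_key(record):
    """Build a hypothetical work-level key from the author and title."""
    return (normalize(record["author"]), normalize(record["title"]))


def cluster_by_work(records):
    """Group records that share the same work-level key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[work_key(record)].append(record)
    return dict(clusters)


# Hypothetical toy records; the field names are assumptions for illustration.
records = [
    {"author": "Melville, Herman", "title": "Moby Dick", "form": "print"},
    {"author": "Melville, Herman", "title": "Moby-Dick", "form": "audiobook"},
    {"author": "Austen, Jane", "title": "Emma", "form": "print"},
]

clusters = cluster_by_work(records)
for key, members in clusters.items():
    print(key, [member["form"] for member in members])
```

The print and audiobook records land in one cluster because their normalized keys match, which is the kind of grouping that lets a user choose among versions of a single work.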

Lewis Lancaster and Michael Buckland described the Electronic Cultural Atlas Initiative (ECAI) <http://www.ecai.org/>, an ongoing effort that provides the means to address the what, where, when, and why questions for digital cultural content (especially the where and when parts).

Digital collections make it possible to pull together related information as never before, but only if we learn how to map across vocabularies, conceptual structures, context models, metadata, and more. The demo of the dynamic ECAI map of the expansion and collapse of the Mongol empire is a great illustration of what can be done with rich, bespoke metadata.

Jay Hoffman of Gallery Systems <http://gallerysystems.com/> described a vendor's approach to "Distributed Search and Sharing of Collaborative Collections" (renting our collections to one another in Dan's parlance). The business perspective on aggregation is an important one, reminding us that in metadata-intensive applications of this sort, it is a major challenge to keep costs manageable (both development costs and demands on staff).

Of course, the IPR issues are critical here as well, and the need for better collection-level descriptions becomes evident.

Knowledge to Wisdom?

The day and a half of keynotes and presentations at WebWise 2006 came together with a coherence that helped to clarify major issues and challenges facing the library, museum, and archives communities. These issues are not new to us, but bringing them into focus is an important step towards responding to them effectively.

Technology, workflow, standards, and best practices are important issues, and project presentations at the conference recognized these challenges and offered interesting approaches to coping with them.

Business models, intellectual property rights, policy, and community culture issues are the backdrop for the technology and are, unsurprisingly, the critical features of our future. Clever technology, deep numerical insights, and clear standards of practice are necessary but are not sufficient for success. Our organizational missions are founded in the collaborative creation of public goods. It is ironic that the Web, the world's most collaborative technology, threatens our business and cultural models, even as it affords extraordinary opportunities to deliver more value from our collections. If it is true, as Paul Courant suggests, that digital culture will be the only culture, then our professional futures and the cultural futures of our society are one and the same. The defining moment of WebWise 2006, for me, then, was Ken Hamma's question: Can we marshal:

"A will to share, equal to our ability to do so?"

(On April 4, 2006, the spelling of Victoria McCargar's name was corrected.)


Copyright © 2006 OCLC Online Computer Library Center, Inc.

doi:10.1045/march2006-weibel