Letters to the Editor


D-Lib Magazine
September/October 2008

Volume 14 Number 9/10

ISSN 1082-9873

To the Editor

The letter below was received as a rebuttal to a Letter to the Editor from Henry Gladney that appeared in the July/August issue of D-Lib Magazine. (Dr. Gladney's Letter to the Editor was in response to the two-part commentary, Rethinking Personal Digital Archiving, by Catherine C. Marshall in the March/April 2008 issue of D-Lib Magazine.)

To the Editor:
July 15, 2008
Dr. Gladney's response to my two-part commentary underscores the dangers of writing 15,000 words on a topic you care about: your audience may not walk away with a clear message. Certainly I did not intend for the stories and examples to obscure the article's main points – that home computer users are accumulating overwhelmingly large digital collections; that they are storing multiple copies of these collections distributed over on- and offline media; that they cannot and will not become digital curators; and that for several clear reasons, retrieving items from long-term storage is a different problem than the retrieval problems we've tackled to date.
Every researcher who presents qualitative study results is familiar with the accusation that their readers have no idea "how many people the anecdotes represent." First, any vignettes or examples I have recounted represent trends in my data; they are not anecdotal. Even quantitative research uses samples, not the entire population. Because this article fell under D-Lib's rubric of commentary, I felt it unnecessary to talk about methods, such as screening participants to represent a broader population; the papers I cited discuss how the individual studies were conducted.
So how many home computer users do these findings represent? How broadly do the implications apply? I'd assert that the vast majority of experienced home computer users have grappled with the four challenges that I outlined. The variation is only in the details. For example, while most home computer users have a de facto distributed store for their digital belongings, some of them implement it by writing lots of CDs and DVDs (and before that, floppies) while others scatter files among online services (Yahoo mail, Flickr, PhotoBucket, and countless others). It's less important where they put these items than it is that they've stored copies in many different places for many different reasons and they have no cause to centralize them. If a reader were concerned with numbers (e.g., just how many people have used an online service like Flickr or Facebook? Just how many consumers have had trouble with viruses?), they would have no trouble consulting the Pew Internet project's website1 or their own company's marketing figures. Or they could consult the services themselves: when the commentary was published, Flickr held 2.2 billion consumer photos; at this writing, it has more than 2.6 billion.
The charge that I have not discussed potential solutions comes as something of a surprise to me. Part 2 was intended as an exploration of technology directions, although I'd readily admit it is still rough and not close to being a software specification. Others have found this discussion sufficiently prescriptive to come back to me with their implementation plans or new technologies that would fit under this factoring of the issues. One of the strengths of the approach I discussed is that it may be used in conjunction with other archiving approaches (including Dr. Gladney's own!). For example, say a home computer user is already using Mozy as an informal archive. A union catalog of her digital belongings can include the Mozy store without assuming that it is the sole place where her digital belongings are stored. Mozy goes away? Fine – this approach anticipates the potential demise of individual stores, but it also allows consumers to take advantage of the existence of cool new products and services, services beyond those that either Dr. Gladney or I have been able to foresee.
One of the issues that I discuss – one that I've seen very little serious work on – is how we're going to handle the inconceivable accumulation of medium- and low-value items that Dr. Gladney alludes to in his third bullet. I'd argue that the specific size of these collections matters less than the fact that they are overwhelming; they are growing; and – except for a small number of high-value items – they are full of undifferentiated things: ten almost identical photos of the view off the balcony of our vacation rental, ten thousand email messages, thousands of CDs we have ripped and podcasts we have downloaded. Of course storage is becoming cheap enough to save all of it, but human attention is becoming more precious: we may have all 50,000 of our vacation photos, but we may be reluctant to browse them for fear of being swamped by their number.2 We'd like, in an ideal world, to have some but not all of these things and we fear we will not have selected the most evocative ones, since it is so difficult to assess value far in advance.
In the final regard, I'm mildly nonplussed by Dr. Gladney's response, not because I disagree with it, but rather because I agree with so much of it and feel like I've said as much in my commentary. It's inarguable that people will only use extremely convenient and very low cost (or apparently free) methods of preserving their data, and that automation of the clerical portions will be essential. Of course! (I believe that was the point of an entire section of Part 2.) I acknowledge that adoption of standard file formats is vital, and we're at least part of the way there.3 And as Neal Beagrie has reminded us in this very publication (in echo of Richard Feynman), 'There's plenty of room at the bottom.' We all have plenty of bits and ample places to store them; now the trick is to keep them, to use them, and to enjoy them going forward.
Cathy Marshall
Microsoft Research, Silicon Valley
1. <http://www.pewinternet.org/>
2. There are promising research prototypes for browsing large photo collections, but I'm arguing that as it stands, home computer users frequently feel their digital belongings are an overwhelming burden.
3. Here's where numbers would be nice: We encountered home computer users who stored most of their photos as JPEGs; however, the really important photos they saved in their camera's RAW format, believing this was the best way to ensure image quality going forward. Whether this is the prevailing practice, I cannot say. It seems important to find out and to work with consumers and the manufacturers of consumer electronics to ensure that consumers are aware of the long-term implications of their decisions.

Letters concerning articles selected for possible publication as Letters to the Editor will be forwarded to the article authors for response. If published, the Letter to the Editor will appear with the article authors' responses whenever possible. D-Lib Magazine reserves the right to edit or shorten letters. If you prefer, you may request that your letter not be published.

Letters to the Editor present the opinions of their authors. They do not necessarily reflect the views of D-Lib Magazine, its publisher, the Corporation for National Research Initiatives, or participants in the D-Lib Alliance.

Please send your Letters to the Editor to [email protected].

Copyright © 2008 Corporation for National Research Initiatives
