By digitizing physical holdings, museums have an important opportunity to provide patrons unprecedented access to manifold media. Digitization of existing collections, which is the museum's form of the content creation problem, is a crucial and technically challenging step in continuing to narrow the digital divide between the patron's desire for rich multi-media information and what the museum actually provides. We present a case study of the issues involved in executing a digital content creation program in partnership with the Instituto de Cultura Puertorriquena. We focus on the challenges involved in acquiring, organizing and accessing the collections to make them meaningfully available within typical budgetary and technical constraints.
Over the course of the first ten years of widespread Internet access, a primary obstacle for average users has been access infrastructure: computers and network connections. The term digital divide was taken to mean the gap that exists between and among households of various demographic groups when considering availability of and access to infrastructure. As Internet technology has matured, however, the gap in available infrastructure has rapidly narrowed. While it is clear that improvements in basic access can and should continue, the trend toward equalization at the entry point, where many more people now have available computers and network connections, serves to highlight crucial aspects of a deeper digital divide. In particular, at the crux of this disparity is content.
Diverse digital content, which we know as multi-media, is the responsibility of the content provider, not the end-user. While tools and access may aid as a way to seek specific content, the user has no control over content availability, structure, searchability, presentation, or provenance, which are all the sole responsibility of the content provider. Herein lies a chasm, which is just as effective as lack of access infrastructure in keeping users from finding what they expect: access is useless if good content is not available.
The museum's digital media divide is the gap between the museum's physical holdings and its digital offerings. This gap is a significant barrier and directly affects the needs of patrons to whom the museum is responsible and for whom it exists. We uniformly accept the assumption that the modern museum holds enormously valuable content, certainly in historical and cultural terms, and arguably by many other metrics as well. Yet compelling, rich digital representations of that content for the user, who increasingly has access to computers and network connections, are still not widely available. The exceptions tend to belong to the most prestigious institutions, where budgets allow the luxury of careful programs for digital multi-media content creation. Even though museums hold our deeply important cultural heritage, they lose ground in the struggle for education, enlightenment and access next to other more readily available content.
The difficult problem of content creation is made harder for museums since they shepherd collections that are complex and hard to model. This paper reports on an ongoing collaboration between researchers at the University of Kentucky and the Instituto de Cultura Puertorriquena. We discuss how the collaboration has approached the problem of creating accessible content. A major goal has been the development and application of novel techniques to streamline, improve, and reduce the cost of the process of acquiring, structuring, and presenting digital collections. We discuss our contributions and empirical results in these areas. We believe museums can and should meet the challenge of content creation and must work systematically toward bridging the digital media divide, which remains a significant gap between the emerging digital museum and its patrons.
Acquisition for Content Creation
We define content creation as data acquisition followed by data organization for access by users. For example, digital photography (or digitization of an existing photographic print) is data acquisition. Content creation is the subsequent handling of the data (mark-up, touch-up, etc.) in preparation for access, e.g., web-based presentation, or display in a virtual gallery. Complex acquisition setups are required for many collections, however, where three-dimensional representations must be constructed and where works need to be handled with special care.
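Data acquisition, the first step in creating a digital collection, requires physical access and the application of technology to produce the desired digital representation. We consider two key tradeoffs that must be decided as part of a data acquisition regimen: the degree of physical access granted to a collection, and the maturity of the acquisition technology relative to the expertise of the staff who operate it.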
The resources required to gain appropriate access to a collection for digitization can be substantial. The degree of access depends on the collection's characteristics (value, importance, storage, and handling protocols) as well as requirements for digitization (lighting, instrumentation, and required handling during digitization).
Since acquisition requires systematic access to collections, museums can facilitate development of digitization programs by lowering the barriers to physical access. One immediate way to facilitate digitization is to establish policies that combine physical access for digitization with other systematic, periodic handling that a collection naturally undergoes.
In museums, an exhibit prepared for loan from a storage facility must be carefully handled by curators. The storage building shown in Figure 1, an example of one of these facilities located in San Juan, is just one of many archives where the Instituto de Cultura Puertorriquena maintains collections for preservation and analysis.
At the point where a collection is being prepared for exhibition it may be cost effective for curators to systematically digitize each piece. In libraries, certain pages of books and other materials could be digitized as part of the policy associated with normal access or check-out.
For digitization processes that require special-purpose technology, such as 3D reconstruction based on laser scanners and image-based systems using background segmentation technology such as ChromaKey, the cost of physical access increases. Museums can address this by finding permanent studio space where practitioners can deploy and configure technology as needed without impact on other operations. True recognition of the importance of digital content creation will be demonstrated by updated museum policy that allocates appropriate space as well as staff trained for consistent digitization efforts.
Acquisition is the crucial starting point in the content creation chain and has a profound effect on data quality, fidelity, and resolution. Technology that is mature (digital cameras, flatbed scanners, etc.) can be reliably operated by curators and other staff with minimal training. However, technology under active research and development poses problems for staff who are not currently capable of using it reliably.
Museums seeking to establish digitization programs must employ practical strategies for improving digitization capabilities in the face of fast-moving technology and research programs using experimental digitization systems. As pilot programs with academic and industrial partners become more common, museums must dedicate additional resources for staff to exploit these partnerships as training opportunities. Even though it is common for museums to commit staff time in order to gain data acquired by technically equipped partners, when staff serve only as facilitators they miss an opportunity to obtain unique training as part of the partnership. New commercial systems for acquisition continue to become commonplace, and staff with exposure and training from collaborative partners can leverage their skills to begin their own in-house operations. As acquisition technology and follow-on algorithms for reconstruction and representation continue to improve, museum curators must be willing to systematically build digital collections.
Practice: Puerto Rico
In partnership with the Instituto de Cultura Puertorriquena, we have been given the opportunity to digitize three distinct collections. Each collection poses different problems for acquisition and for subsequent processing. We describe here how we performed data acquisition, with the ultimate goal of content creation for the museum. Later sections continue detailing our choice of data representations, methods for display, and strategies for providing access to these disparate digital collections.
The first collection consists of a set of sculptures that illustrates typical cultural celebrations and indigenous folk art found in Puerto Rico (see the example of the mask in Figure 5). These three-dimensional pieces require views from almost all directions in order to acquire a complete and accurate shape representation. Their physical condition varies from pristine to badly damaged.
The second collection is a set of paintings, most framed, some damaged, representing the variety and complexity of Puerto Rican painters over the past three hundred years (Figure 2). These pieces do not require 3D representations in order to capture the essence of the work, but do pose a challenge for lighting arrangements and spatial resolution of digitization.
The third collection is a number of petroglyphs, or symbols and figures carved into rock (Figure 3). These cultural heritage artifacts were carved by the Taino Indians before the Spanish discovered Puerto Rico, connecting the culture from before the colonial era to the present day. The petroglyphs, which are a cultural treasure within the Caribbean, are a good example of a set of objects that is very important to preserve. Furthermore, these artifacts cannot currently be viewed anywhere other than at the ceremonial park in Utuado which is located in the mountains of Puerto Rico.
We digitized the first collection (sculpture) using a calibrated rotating turntable (Magellan MDT-19) together with a set of calibrated stereo cameras (Canon XLS-1 DV; Sony DV) and a secondary digital "texture" camera (Olympus C-5050). The static, calibrated cameras working together with the rotating table allowed us to acquire images of each sculpture through a full 360-degree rotation. We acquired high-resolution (5.0 MPixel) texture shots from the Olympus every 90 degrees. The pieces, handled exclusively by the museum curator, were placed one at a time on the center of the turntable for data acquisition. To reduce capture time, only the acquisition of images was done on-line. Segmentation and reconstruction were completed in post-production. The cost of this type of system is very reasonable. It is likely that a museum will already own a digital camcorder and a high-resolution digital still camera. However, if these items need to be purchased, they are available for close to $300 each. The rotating turntable can also be purchased for a few hundred dollars.
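The turntable capture pattern described above can be sketched as a simple schedule. This is an illustrative sketch only, not the software used in the project: the function name and the 10-degree sampling step are hypothetical, and only the 90-degree texture interval comes from the text.

```python
def capture_schedule(step_deg=10, texture_every_deg=90):
    """Return (angle, take_texture_shot) pairs for one full turntable rotation.

    step_deg is a hypothetical sampling step for the stereo cameras;
    texture_every_deg is the interval for high-resolution texture shots.
    """
    if 360 % step_deg or texture_every_deg % step_deg:
        raise ValueError("steps must divide 360 and the texture interval evenly")
    # One entry per turntable stop; flag the stops where the texture camera fires.
    return [(angle, angle % texture_every_deg == 0)
            for angle in range(0, 360, step_deg)]


schedule = capture_schedule()
texture_angles = [angle for angle, take_texture in schedule if take_texture]
```

With these assumed defaults, the schedule has one stop per 10 degrees and flags four texture shots per rotation, which matches the reported "every 90 degrees" interval.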
The second collection (paintings) was digitized using the Olympus C-5050 with appropriate shuttering and a combination of indirect reflective surfaces and direct, diffuse lighting. This acquisition process is shown in Figure 2. We used long-lens shots to minimize lens distortions where possible, and adjusted lighting positions to minimize specularities from the highly reflective oil-on-canvas paintings. Post-processing included segmentation and removal of background. The cost of this acquisition technique is very low, since it only requires moderate lighting equipment and a high-resolution digital camera.
The third collection posed the biggest technical challenge. Because a purely 2D photographic technique cannot capture all the information contained in each petroglyph, a more advanced 3D acquisition process was needed. We considered commercial products for 3D acquisition, but many cost $50,000 or more. We therefore decided that a custom solution would be the most cost effective while still providing very accurate results. The system we developed is a multi-view structured-light reconstruction scanner. Two high-resolution digital still cameras, four IEEE-1394 DV cameras, and a pan/tilt-mounted laser line generator were all controlled by custom-developed software. This setup allowed us to reconstruct each of the rocks containing petroglyphs with tens of thousands of 3D points by sweeping the laser line across the surface of the rock and observing, from each camera viewpoint, how the line deforms as it progresses.
Content creation is the structured representation of acquired data. The digital museum must strive to create content that is true to the original object in appearance (shape, color, etc.) and that is structured, i.e., includes metadata describing aspects of the collection that cannot be automatically derived. For example, information about a piece's place in a larger collection, and its historical significance and cultural provenance, must be included in the structured representation as metadata along with data of the physical properties. The structured linking of metadata to highly realistic models/imagery, which is now widely recognized as a significant and important goal in the digital library community, builds a valuable collection that can be efficiently searched and at the same time remains as true as possible, at the lowest "data acquisition" level, to the appearance of the original.
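The geometric idea behind laser-line scanning can be illustrated with a single camera: each pixel lit by the laser defines a viewing ray, and intersecting that ray with the known laser plane yields one 3D point. The sketch below is a simplification under stated assumptions, not the multi-view system described above: it assumes an ideal pinhole camera at the origin, and the intrinsics and plane parameters are made-up values.

```python
def pixel_ray(u, v, fx, fy, cx, cy):
    """Direction of the viewing ray through pixel (u, v) for a pinhole camera."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def triangulate(u, v, intrinsics, plane_normal, plane_offset):
    """Intersect the pixel's viewing ray with the laser plane n . X = d.

    All quantities are expressed in the camera coordinate frame.
    """
    ray = pixel_ray(u, v, *intrinsics)
    denom = sum(n * r for n, r in zip(plane_normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the laser plane")
    t = plane_offset / denom          # distance along the ray, in ray units
    return tuple(t * r for r in ray)  # 3D point on the rock surface


# Hypothetical calibration values, for illustration only.
intrinsics = (800.0, 800.0, 320.0, 240.0)   # fx, fy, cx, cy in pixels
plane_normal = (1.0, 0.0, 0.5)              # laser plane: x + 0.5 z = 1
plane_offset = 1.0

point = triangulate(400.0, 240.0, intrinsics, plane_normal, plane_offset)
```

Sweeping the laser changes the plane parameters at each step; repeating the intersection for every lit pixel at every step is what accumulates the tens of thousands of points mentioned above.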
A third representational goal that must be considered, along with the twin goals of high fidelity and structured metadata, is efficient access. Access includes digital manipulation by curators and museum staff as well as access by patrons. Representational choices must support a large range of possible access modes, making this goal challenging to achieve.
As part of our work with the Puerto Rican collections, we have built layered representations to support a range of access policies. The layers are designed to coincide mainly with resolution constraints, most obvious in heterogeneous networked applications where available bandwidth can vary drastically from user to user.
We represent individual pieces in a collection as a set of descriptors. Each descriptor forms its own resolution hierarchy, allowing future access policies to select which descriptors to use, and for each descriptor to select which resolution is appropriate given resource constraints. This hierarchical structure is meant to include metadata, where data about a model is structured in a broad-to-fine hierarchy whenever possible. For certain kinds of metadata, such as regions of interest in the image of a manuscript, this hierarchy is very natural: letter, word, line, and paragraph.
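One way to picture the descriptor structure is as a small data model in which every descriptor carries its own coarse-to-fine list of levels and an access policy picks one level per descriptor. The class names, level names, sizes, and the even budget split below are all hypothetical choices made for illustration; the text does not specify the actual data format or policy.

```python
from dataclasses import dataclass, field


@dataclass
class Level:
    name: str
    size_kb: int  # hypothetical transfer cost of this level


@dataclass
class Descriptor:
    name: str
    levels: list  # ordered coarse to fine, with increasing size

    def best_level(self, budget_kb):
        """Return the finest level that fits the budget, or None if none fits."""
        chosen = None
        for level in self.levels:
            if level.size_kb <= budget_kb:
                chosen = level
        return chosen


@dataclass
class Piece:
    title: str
    descriptors: dict = field(default_factory=dict)

    def select(self, wanted, budget_kb):
        """Pick one level per wanted descriptor, splitting the budget evenly."""
        if not wanted:
            return {}
        share = budget_kb // len(wanted)
        return {name: self.descriptors[name].best_level(share) for name in wanted}


mask = Piece("mask", {
    "texture": Descriptor("texture", [Level("thumbnail", 20),
                                      Level("medium", 200),
                                      Level("full", 5000)]),
    "geometry": Descriptor("geometry", [Level("coarse", 50),
                                        Level("fine", 2000)]),
    "metadata": Descriptor("metadata", [Level("summary", 1),
                                        Level("full", 10)]),
})

# A bandwidth-constrained request versus a generous in-house request.
internet = mask.select(["texture", "geometry"], budget_kb=500)
in_house = mask.select(["texture", "geometry", "metadata"], budget_kb=15000)
```

Under these made-up sizes, the constrained request resolves to the medium texture with coarse geometry, while the in-house request receives the finest level of every descriptor. A real policy would likely weight descriptors unequally rather than split the budget evenly.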
We anticipate two broad access modes with museum content. The first is bandwidth-constrained access, where patrons view a collection over the Internet. The second is in-house access, where a collection is put on display with a special visualization system (as described in the next section on display) or where a dataset is manipulated by a number of experts who are editing and adding metadata to the collection over a dedicated network. In both of these cases, hierarchical data delivery and selection between and among a set of model descriptors can be very advantageous.
Results: Puerto Rico
We are representing the collection of sculpture using image textures, recovered 3D geometry, recovered camera geometry, and expert metadata. These descriptors are each meant to be hierarchical: textures are refined from low resolution to high resolution, for example.
Three-dimensional geometry is refined through surface simplification from coarse to fine, and camera geometry together with images form an image-based representation (texture onto geometry) that can be presented with or without geometry. Figure 4 shows the process of calibrating a camera so that geometry can be reconstructed from a scene. We envision access policies that will leverage the multi-resolution available in each descriptor set, in order to trade off resolution in geometry, texture, and number of distinct views as patrons access a collection.
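The coarse-to-fine refinement of textures can be illustrated with an image pyramid, in which each coarser level halves the resolution of the one below it. This is a generic sketch using 2x2 box averaging on a grayscale image stored as nested lists; it is an assumed stand-in, since the text does not state which filtering or storage scheme is used for the collections.

```python
def downsample(image):
    """Halve width and height by averaging each 2x2 block of pixels.

    A trailing odd row or column, if present, is dropped.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def build_pyramid(image):
    """Return levels ordered coarse to fine; the last level is the original."""
    levels = [image]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(downsample(levels[-1]))
    return levels[::-1]


# A tiny 4x4 test texture with pixel values 0..15.
texture = [[4 * y + x for x in range(4)] for y in range(4)]
pyramid = build_pyramid(texture)
```

A viewer can then start from the first (coarsest) level and request finer levels only as bandwidth and display resolution allow.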
For example, we use a pre-computed image-based object called the RPC (Rich Photo-realistic Content). This data model is compatible with many popular modeling/rendering applications such as 3D Studio Max and OpenGL Performer. The pre-calculated RPC allows the model to be delivered on demand in a visually rich context such as a virtual gallery, and supports interactive manipulation like navigation and exploration. Figure 5 shows how a piece is moved to a virtual gallery.
The painting collection is represented as an image set, refined by resolution and by restoration. We consider digital restoration attempts to be elements of metadata that should be viewable (or not) as desired. Although we have not incorporated multi-spectral lighting into our current acquisition strategy, the representational framework admits models consisting of descriptors based on lighting, which may be of interest in applications where, for example, erasures can be revealed through different spectroscopy signals.
The third digital collection, the petroglyphs, uses a combination of the above descriptors: image textures, recovered 3D geometry, pre-calculated RPC-based navigations, recovered camera geometries, and metadata. This metadata includes segmentation information showing parts of the petroglyph carvings that may not be obvious to the non-expert viewer. The collection of image textures (under varying lighting conditions), surface geometries and pre-calculated navigations provides a number of access and display options, starting with low-resolution access for Internet connections with client plug-ins and progressing all the way up to visually stunning worlds where users can navigate a virtual space populated with segmented petroglyphs.
We believe that layered representations are important for supporting and enabling a variety of access and display settings. Much depends on planning at the acquisition stage so that data can be properly registered. Multi-resolution hierarchies are increasingly important as acquisition technology outpaces the capacity of the average user's network access.
Display of content is arguably the most important goal for the digital museum. High fidelity in acquisition is meaningless when users are provided with only low fidelity at the display. In the case of Internet access for users with traditional displays (i.e., the computer monitor), as a user guides the way, software can help "drill down" in resolution, but this assumes the representational underpinnings for such operations. For patrons who visit museum-controlled exhibitions, it is possible to provide a display "milieu" that supports digital collections through a number of new visualization technologies [4, 12, 21]. It is an important challenge for the digital museum to equalize the disparity between the (relatively high) resolution stored in a digital collection and the (relatively low) available display resolution.
A second challenge for display of digital collections is the need for flexibility. Less can be done for the Internet user, who for the time being is locked into a traditional monitor. (This is slowly changing as High-Definition home theater systems become more affordable and versatile, including integrated connections to media centers and the Internet.) Within the museum, however, it is necessary to find a flexible display framework that can adapt to the kinds of environments that best showcase the collection at hand.
Technology for the Digital Museum Display
We are developing display systems for the digital museum that can address the issues of flexibility, cost, size and mobility [4, 5]. These systems provide low-cost scalability and reconfigurability, while leveraging the layered representation of the collections. Layering (and multi-resolution hierarchies) map nicely onto the scalable nature of the display, helping to find a way to take advantage of all available data resolution with available display resolution.
The dynamic properties of these display systems are enabled by a unique calibration process. Specifically, projectors are calibrated automatically to derive a coherent display space using a digital camera that monitors relative projector position. The image from the camera is used to extract projector positioning, which gives a basis for calculating a geometric warp and an intensity blend to make an entire group of projectors appear to be aligned seamlessly. Our prototype implementation can calibrate at the rate of one projector every 8-10 seconds.
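The intensity blend can be illustrated for the simplest case of two projectors whose images overlap horizontally. In the overlap, each projector's contribution is ramped so that the two weights always sum to one and the seam does not appear brighter than its surroundings. The linear ramp, the function name, and the pixel spans below are assumptions made for illustration; the actual blend function of the prototype is not given in the text, and a practical system would also need to account for projector gamma.

```python
def blend_weights(x, left_span, right_span):
    """Return (w_left, w_right) at horizontal display coordinate x.

    Each span is (start, end) in the shared display space recovered by
    calibration, with the left projector starting first.
    """
    l0, l1 = left_span
    r0, r1 = right_span
    in_left = l0 <= x <= l1
    in_right = r0 <= x <= r1
    if in_left and in_right:
        overlap = l1 - r0
        if overlap <= 0:          # spans merely touch at a single coordinate
            return (0.5, 0.5)
        w_right = (x - r0) / overlap   # ramps 0 -> 1 across the overlap
        return (1.0 - w_right, w_right)
    if in_left:
        return (1.0, 0.0)
    if in_right:
        return (0.0, 1.0)
    return (0.0, 0.0)             # outside both projectors


# Hypothetical spans: two 1024-pixel-wide projectors overlapping by 200 pixels.
left = (0, 1024)
right = (824, 1848)
mid_overlap = blend_weights(924, left, right)
```

Because the weights depend only on the calibrated spans, adding or moving a projector requires recomputing the spans but not changing the blending rule.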
This dynamic calibration process enables solutions to large-scale display problems such as scalability, positioning, and support for quickly-changing environments. For example, since the calibration process accepts arbitrary positioning of any number of projectors, a system starting with a few projectors can easily be scaled on-the-fly to many more. This adds a new dimension to the possibilities in resolution and size, and closes the gap between high-resolution data acquisition and lower-resolution display.
An additional benefit from the flexibility of auto-calibrated tiled displays is that they can incorporate different brands of commodity projection hardware. Sets of projectors with varying native resolutions, for example, can be used in one display setup with only minor configuration changes. Commodity projectors now start at about $600, so a museum can save money by using a mix of existing and newly-acquired equipment.
Results: Puerto Rico
The three digital collections we are building with the Instituto de Cultura Puertorriquena span a wide range of desirable display configurations. The differences in each of the collections encouraged us to leverage our flexible tile-based display system to highlight the strengths of each. In particular, the sculpture collection is strongly three-dimensional. The viewer profits from a higher degree of interaction and immersion for exploring the different views of each object and embedding them into a larger virtual gallery. Paintings are compelling as wall hangings, but it is challenging to create a display system that can be easily configured as a virtual gallery. The petroglyphs, like the 3D sculpture, are impressive when viewed at scale and in a number of modes simultaneously, such as textured with natural lighting with subtle enhancements to bring out the carvings.
For the sculptures, we are implementing a virtual exhibition hall where we can place models into a virtual space. The sculptures, placed in a scaled 3D environment, can give viewers a strong sense of immersion and realism. Naturally each model can be manipulated in 3D and explored in high-resolution detail.
The painting collection offers another challenge. The geometry of commodity projectors makes it difficult to align a projector so that its projection of a painting perfectly fits to scale within a rectangular region on the wall. Keystoning, a common type of projector distortion, and other effects make physical alignment tedious. We are implementing a gallery of projected paintings in which the digitized Puerto Rican paintings can be exhibited via rapidly configured projectors behind screens. In front of the screens, where viewers stand, we place "empty" picture frames. Using a variant of the camera-based calibration algorithm for tiled displays, we calculate the exact geometry required for projection of the painting into the frame, giving the effect of a correctly mounted, framed painting. Using a cluster of rendering PCs and projectors, each painting can be projected into the frame on the wall and quickly calibrated for exact shape, scale, and color. An example of the technique for a projected gallery is shown in Figure 6.
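The geometric correction that fits a painting into a frame can be illustrated with a planar homography: given the four frame corners as seen in projector pixel coordinates, a 3x3 projective map sends the painting's corners onto them, which cancels keystoning. The sketch below is the standard four-point solution, not the authors' calibration variant; the corner coordinates are made-up values standing in for positions that a camera-based calibration would measure.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 projective map sending four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, x, y):
    """Map the point (x, y) through the homography h."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Painting corners in normalized image coordinates (clockwise from top-left).
painting = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Hypothetical frame corners in projector pixels, skewed by keystoning.
frame = [(212.0, 148.0), (871.0, 190.0), (845.0, 660.0), (190.0, 640.0)]

H = homography(painting, frame)
center = apply_h(H, 0.5, 0.5)
```

Rendering the painting through `H` places each corner exactly on the corresponding frame corner, so the projected image fills the physical frame regardless of where the projector sits.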
This technique generates a truly unique experience that is amazingly realistic and compelling. A museum can save thousands of dollars in insurance, transportation, and handling expenses by simply transferring data files and commodity hardware. This removes the concern and risk attached to the handling of real paintings. Because such a display system can be rapidly configured, viewers can be given "tours" in any location where the display system can be arranged. Furthermore, it is possible to drastically increase audience size by exhibiting the same works simultaneously at different locations, an impossibility when using the actual pieces.
With display flexibility as a feature, we can embrace diverse display configurations and experiment with presentation modes. The petroglyph collection provides another opportunity to experiment with projector position, resolution, and interaction. By using a large flexible display, patrons can study a petroglyph at actual scale, as shown, for example, in Figure 6. Also, the high-resolution objects can be displayed at their native resolution, so that nothing is compromised from the actual acquisition.
Another option that we are exploring for exhibit displays, using commercial technology, is a virtual exhibit on DVD (Digital Video Disc) as seen in Figure 7. By taking advantage of standard menus in the DVD format, we are able to create a virtual museum tour. A user may select various pre-rendered paths through the museum and choose which pieces to study in more detail, all by navigating with a remote control. These DVDs can be created and populated with any of the objects that fit into the acquisition model that we have previously described. Well known and widely available consumer technology like this may even be used to create another source of revenue for a museum. For example, a museum may sell a virtual tour DVD of special exhibits or provide them as a gift with a monetary donation. Furthermore, distribution of this type of product into the community will also increase the perceived technological literacy of a museum.
We believe that the traditional view of the digital divide, which emphasizes the disparity across populations in access to computational resources and network connections, is only a first step in understanding the issue. The next great challenge, which is crucially important to the digital museum, is the digital media divide between the holdings of a museum and what it actually offers in the form of digital collections. Without compelling, accurate, structured digital content, digital access is meaningless. The digital museum must intentionally address the content creation problem on behalf of its patrons.
In this paper, we have discussed issues concerning the acquisition of digital content for sculptures, paintings, and off-site artifacts (petroglyphs). We offer a solution to the physical access problem and show that technology is emerging that can have a profound effect on how the digital museum acquires, represents, and displays its collections.
With respect to content creation, we note the difference between raw data acquisition and subsequent content representation. With rapidly evolving data acquisition technology, one must not mistake raw data for a digital collection. At the same time, museums must build and support sustainable acquisition programs with trained staff who understand how to run them.
In terms of representation, the museum must create layered content that has the potential to solve access problems (heterogeneous networks) and can incorporate valuable metadata. In the same way, scalable display systems that support a range of display modes and flexible deployment offer the ability to overcome issues that plague most commercially available high-resolution multimedia displays. By allowing the use of commodity hardware and offering an easily scalable way to increase size and resolution of a display, the museum can begin to build a compelling display milieu for digital collections.
Advances from research in acquisition, representation, and display must be captured to fuel the development of sustainable digitization programs for museums. The museum of the future will build on this and other progress toward the point where there is much less disparity between the richness of a museum's physical holdings and its offering of quality digital content.
3. D Bomford.
4. M. Brown and W. Seales.
5. Michael S. Brown, W. Brent Seales, Stephen B. Webb, and Christopher O. Jaynes.
6. Donatella Castelli and Pasquale Pagano.
7. Gregory Crane.
8. Katie Hafner.
9. M. Hereld, I. Judson, and R. Stevens.
10. Alcide L. Honore.
11. Greg Humphreys, Mike Houston, Ren Ng, Randall Frank, Sean Ahern, Peter D. Kirchner, and James T. Klosowski.
12. J. Kos, A. Barbosa, C. Krykhtine, E. da Silva, and R. Paraizo.
13. Marc Levoy, Kari Pulli, Brian Curless, Szymon Rusinkiewicz, David Koller, Lucas Pereira, Matt Ginzton, Sean Anderson, James Davis, Jeremy Ginsberg, Jonathan Shade, and Duane Fulk.
14. K. Li, H. Chen, Y. Chen, D. W. Clark, P. Cook, S. Damianakis, G. Essl, A. Finkelstein, T. Funkhouser, T. Housel, A. Klein, Z. Liu, E. Praun, R. Samanta, B. Shedd, P. J. Singh, G. Tzanetakis, and J. Zheng.
16. H. Maitre, F. Schmitt, and J. Crettez.
18. Thomas A. Phelps and Robert Wilensky.
19. C. Rocchini, Paulo Cignoni, C. Montani, P. Pingi, and Roberto Scopigno.
20. J. Rowe.
21. J. Severson.
23. R. Surati.
25. I.H. Witten and D. Bainbridge.
Copyright © 2005 W. Brent Seales and George V. Landon