B. S. Manjunath
Electrical and Computer Engineering Department
University of California at Santa Barbara
D-Lib Magazine, August 1995
The management of images, video, and, more generally, multimedia data is an important issue in the design of digital libraries. In particular, two problems stand out: efficient storage and fast retrieval. We outline below the general approach taken to address these two problems in the University of California at Santa Barbara (UCSB) Alexandria Digital Library project, whose goal is to create a database of spatially indexed data. Maps and satellite images are among the main data sets in this project.
In the Alexandria Digital Library (ADL) project, the image files tend to be extremely large -- from a few megabytes to hundreds of megabytes of data. Even with access to very high speed networks, it is often impractical to transmit a large image as a single item, particularly if the user is in a browsing mode and trying to find items of interest. A simple solution is to maintain low resolution "thumbnail" images (e.g., subsampled images) for each of the large images. The thumbnails may then be used to support such browsing. Although thumbnails consume additional storage space, this overhead is typically insignificant compared with the benefits of their use.
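The thumbnail idea can be illustrated with a minimal Python sketch. It assumes an image held as a list of pixel rows; the function names `subsample` and `storage_overhead` are hypothetical and are not part of any ADL software. Plain subsampling is used only because the text mentions it; in practice the image would normally be low-pass filtered first to limit aliasing.

```python
def subsample(image, factor):
    """Return a thumbnail that keeps every `factor`-th pixel in each direction."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [row[::factor] for row in image[::factor]]


def storage_overhead(factor):
    """Extra storage for an uncompressed thumbnail, as a fraction of the original.

    Exact when the image sides are multiples of `factor`.
    """
    return 1.0 / (factor * factor)


# A synthetic 8x8 "image" whose pixel value encodes its position (10*row + col).
image = [[10 * r + c for c in range(8)] for r in range(8)]
thumb = subsample(image, 4)

print(thumb)                 # [[0, 4], [40, 44]]
print(storage_overhead(4))   # 0.0625
```

With a subsampling factor of 4 in each direction, the thumbnail adds about 6% to the storage of the original, which is the sense in which the overhead is small.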
Storing thumbnails and the original data does appear to be useful. Low resolution images are currently being used in many existing geographic information systems (GIS) as well as in the rapid prototype of the ADL project. However, this strategy addresses only one specific issue -- that of fast browsing through a large number of images. In an interactive system, users are likely to do much more than make binary decisions based on simple thumbnail images. They may want to select a certain region within the image and zoom in on it. Or perhaps the thumbnail does not offer enough information to make such a binary decision, but a slightly higher resolution image might help.
Such operations are not possible using just these low resolution images as thumbnails. Further, different groups of users may have different requirements. For example, a school teacher using a LANDSAT image for a certain demonstration may not need the same high resolution image as a scientist trying to classify the image data. Clearly, what is needed is a means of storing images at different intermediate resolutions -- that is, a hierarchical multiscale representation of these images. An obvious solution to this problem is the use of wavelet transforms.
Figures 1a and 1b show an image and its wavelet transform.
Wavelets have been widely used in many image processing applications, including compression, enhancement, reconstruction, and image analysis. A wavelet transformation provides a multiscale decomposition of the image data. The lowest resolution image (top left-hand corner of Figure 1b) is now the thumbnail that can be used for browsing. Notice that the number of transform coefficients is exactly the same as the number of pixels in the original image. Fast algorithms exist for computing the forward and inverse wavelet transforms. Desired intermediate levels can be easily reconstructed as illustrated in Figure 2.
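As a rough illustration of the multiscale decomposition, the following Python sketch uses the Haar wavelet in its simplest average/difference form. The Haar filter is an assumption made here for brevity; the article does not say which filters the ADL uses. The sketch also assumes a square image whose side is a power of two, and all function names are hypothetical.

```python
def haar_1d(values):
    """One Haar analysis step: pairwise averages followed by pairwise differences."""
    half = len(values) // 2
    avg = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    diff = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def inverse_haar_1d(coeffs):
    """Undo haar_1d: rebuild each pair from its average and difference."""
    half = len(coeffs) // 2
    out = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        out.extend([a + d, a - d])
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def haar_2d_step(block):
    """Apply haar_1d to every row, then to every column."""
    rows = [haar_1d(r) for r in block]
    return transpose([haar_1d(c) for c in transpose(rows)])


def inverse_haar_2d_step(block):
    """Invert haar_2d_step: columns first, then rows."""
    cols = transpose([inverse_haar_1d(c) for c in transpose(block)])
    return [inverse_haar_1d(r) for r in cols]


def decompose(image, levels):
    """Multi-level transform; the coarsest image ends up in the top-left corner."""
    out = [[float(v) for v in row] for row in image]
    size = len(out)
    for _ in range(levels):
        sub = haar_2d_step([row[:size] for row in out[:size]])
        for r in range(size):
            out[r][:size] = sub[r]
        size //= 2
    return out


def reconstruct(coeffs, levels, target_level):
    """Rebuild the image at target_level (0 = full size, levels = thumbnail)."""
    size = len(coeffs) >> levels
    approx = [row[:size] for row in coeffs[:size]]
    for _ in range(levels - target_level):
        size *= 2
        block = [row[:size] for row in coeffs[:size]]
        for r in range(size // 2):
            block[r][:size // 2] = approx[r]   # insert the coarser approximation
        approx = inverse_haar_2d_step(block)
    return approx


image = [[4 * r + c for c in range(4)] for r in range(4)]
coeffs = decompose(image, 2)

print(sum(len(row) for row in coeffs))  # 16, same as the number of pixels
print(reconstruct(coeffs, 2, 2))        # [[7.5]]  (thumbnail)
print(reconstruct(coeffs, 2, 1))        # [[2.5, 4.5], [10.5, 12.5]]
```

Only the coefficients needed for the requested level are read in `reconstruct`, which is why a browsing client would not have to fetch the full-resolution data to display an intermediate scale.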
Storing the transformed images (wavelet coefficients) facilitates the design of hierarchical storage structures. Coarse resolution data are accessed more frequently than the higher resolution information, and hence can be stored in faster devices for efficient browsing.
Important issues related to wavelet based storage include the choice of decomposition (i.e., the choice of filters) appropriate for the different image databases. Image compression is also important in storing large amounts of data. Many GIS and medical imaging applications require lossless compression. Although the total number of wavelet coefficients equals the number of pixels in the image, their storage requirements differ. The original intensity data, in most cases, consist only of integers. Wavelet coefficients, in general, are real numbers and thus require more memory. Even when no compression is applied, these coefficients need to be quantized and encoded appropriately to ensure that they do not take more space than the original image data. How these coefficients can be quantized while maintaining near perfect reconstruction is an important research problem.
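One way to see that integer coefficients and exact reconstruction can coexist is a reversible integer variant of the Haar step, often called the S-transform. The Python sketch below is offered only as an illustration of the storage issue; it is not the quantization scheme of the ADL project, and the function names are hypothetical. It assumes rows of even length.

```python
def s_transform_pair(a, b):
    """Map two integers to an integer (floored) average and an integer difference."""
    s = (a + b) >> 1   # floor of the mean; >> floors for negative values too
    d = a - b
    return s, d


def inverse_s_transform_pair(s, d):
    """Recover the original pair exactly from (s, d)."""
    a = s + ((d + 1) >> 1)
    b = a - d
    return a, b


def forward(row):
    """Transform a row of even length: averages first, then differences."""
    pairs = [s_transform_pair(row[i], row[i + 1]) for i in range(0, len(row), 2)]
    return [s for s, _ in pairs] + [d for _, d in pairs]


def inverse(coeffs):
    """Invert forward()."""
    half = len(coeffs) // 2
    row = []
    for s, d in zip(coeffs[:half], coeffs[half:]):
        row.extend(inverse_s_transform_pair(s, d))
    return row


row = [12, 15, 200, 201, 7, 0, 255, 254]   # 8-bit intensities
coeffs = forward(row)

print(coeffs)                  # [13, 200, 3, 254, -3, -1, 7, 1]
print(inverse(coeffs) == row)  # True
```

For 8-bit inputs the averages stay in the range 0 to 255, while the differences range from -255 to 255 and so need one extra bit before entropy coding. This is a small instance of the storage concern raised above.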
Content based retrieval is about developing tools for intelligent browsing of the data. In traditional alpha-numeric databases, such as an on-line library catalog, we search using keywords, author names, or book titles. Similarly, generic image attributes useful for search include color, histogram, texture, and shape. However, research on content-based image retrieval is still in its very early stages. We now briefly describe our recent work on using texture information for image retrieval.
Examples of texture images include photographs of water, sand, a brick wall, a wire fence, or aerial photographs of agricultural regions. Textured images, in general, are hard to describe (i.e., they do not have good structure). Often, the resolution and distribution of objects in the scene determine whether it is "textured" or not. For example, consider a bunch of coffee beans spread on the ground. While each bean is clearly an identifiable object, the random distribution of the beans as a whole is more like a texture pattern. Natural textures tend to be more irregular than man-made ones. During the past six months, we have made considerable progress in developing algorithms for texture based search. The basic idea is to pre-process the images at the time of storage and extract the texture information. This is done using Gabor filters, which are modulated Gaussians. Processing through a bank of these Gabor filters is (approximately) equivalent to extracting line edges and bars in the images, at different scales and orientations.
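The phrase "modulated Gaussians" can be made concrete with a short Python sketch that builds the real (even) part of a Gabor kernel: an isotropic Gaussian envelope multiplied by a cosine carrier. This is a generic textbook form, assumed here for illustration; the exact parameterization used in the authors' work is not given in this article. The names `gabor_kernel` and `gabor_bank`, and the parameters `sigma` and `frequency`, are hypothetical.

```python
import math


def gabor_kernel(size, sigma, frequency, theta):
    """Real part of a Gabor filter on a size x size grid (size must be odd).

    sigma     -- standard deviation of the Gaussian envelope, in pixels
    frequency -- carrier frequency, in cycles per pixel
    theta     -- carrier orientation, in radians
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate measured along the carrier direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * frequency * xr))
        kernel.append(row)
    return kernel


def gabor_bank(size, scales, orientations):
    """Build kernels for every (sigma, frequency) scale and every orientation."""
    bank = {}
    for sigma, frequency in scales:
        for k in range(orientations):
            theta = k * math.pi / orientations
            bank[(sigma, frequency, k)] = gabor_kernel(size, sigma, frequency, theta)
    return bank


# Two scales and four orientations give a bank of eight filters.
bank = gabor_bank(9, [(2.0, 0.25), (4.0, 0.125)], 4)
print(len(bank))  # 8
```

Convolving an image with each kernel in the bank yields one filtered output per scale and orientation; those outputs are the raw material for the texture description.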
Simple statistical moments such as the mean and standard deviation of the filtered outputs can now be used as indices to search the database. Figure 3 (586 Kbytes) shows an application to browsing large air photos.
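A minimal Python sketch of indexing and search with such moments follows. It makes several assumptions that go beyond the article: the filter outputs are given as two-dimensional lists, two crude difference filters stand in for the Gabor bank so that the example stays short, and plain Euclidean distance is used to rank matches. All names are hypothetical.

```python
import math


def mean_std(response):
    """Mean and standard deviation of the magnitudes of one filter output."""
    values = [abs(v) for row in response for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def texture_index(responses):
    """Concatenate (mean, std) over all filter outputs into one feature vector."""
    features = []
    for response in responses:
        features.extend(mean_std(response))
    return features


def stand_in_responses(image):
    """Horizontal and vertical differences, standing in for Gabor filter outputs."""
    horiz = [[row[c + 1] - row[c] for c in range(len(row) - 1)] for row in image]
    vert = [[image[r + 1][c] - image[r][c] for c in range(len(image[0]))]
            for r in range(len(image) - 1)]
    return [horiz, vert]


def distance(f1, f2):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def search(query, database, top_k=1):
    """Return the names of the top_k stored images closest to the query vector."""
    ranked = sorted(database, key=lambda name: distance(query, database[name]))
    return ranked[:top_k]


# Index three synthetic 4x4 images at "ingest" time.
images = {
    "vertical_stripes": [[10 * (c % 2) for c in range(4)] for r in range(4)],
    "horizontal_stripes": [[10 * (r % 2) for c in range(4)] for r in range(4)],
    "flat": [[5] * 4 for r in range(4)],
}
database = {name: texture_index(stand_in_responses(img))
            for name, img in images.items()}

# Query with a fainter vertical-stripe pattern.
query_image = [[8 * (c % 2) for c in range(4)] for r in range(4)]
query = texture_index(stand_in_responses(query_image))

print(search(query, database, top_k=2))  # ['vertical_stripes', 'flat']
```

The expensive filtering happens once, when an image is stored; a query then only compares short feature vectors, which is what makes this kind of search fast at retrieval time.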
Instead of Gabor filters, one may also use the same orthogonal wavelet transform that was used for storing the image data. However, extensive experiments on a large set of textured images show that retrieval performance is better with Gabor filters than with conventional orthogonal wavelets. Why not use Gabor transforms for storage? Because Gabor functions do not form an orthogonal basis set, and hence the representation will not be compact. Further, no efficient algorithms exist for computing the forward and inverse Gabor transformations, and fast computation is important in a digital library context. While data ingest is off-line and can be computationally intensive, data retrieval should be fast and be performed in real time using existing hardware. Orthogonal wavelets are well suited to such implementations, whereas non-orthogonal Gabor wavelets are promising for image analysis.
Efforts are currently underway to incorporate wavelet based storage and texture feature based search using Gabor filters into the main testbed of the ADL. Many of the issues related to multiresolution browsing appear to be design problems. These include the choice of wavelet filters and storage of the different subbands on the disk. The parallel processing group is investigating efficient parallel algorithms for computing the wavelet transforms. Our initial results on texture based search are very encouraging, and in collaboration with the database researchers, we are investigating methods for indexing using these features.
We have given here only a brief outline of image processing issues related to the ADL. Many excellent books and journal articles are available on the topic of wavelet transforms (see, for example, [1]). More details on content based search using texture features can be found in [2].
Thanks to Norbert Strobel and Wei-Ying Ma for generating the results, to Christoph Fischer for the air photo image, and to Sanjit Mitra and Terry Smith for their help in writing this article.
[1] M. Vetterli and C. Herley, "Wavelets and filter banks: theory and design," IEEE Trans. Signal Process., vol. 40, Sept. 1992, pp. 2207-2232.
[2] B. S. Manjunath and W. Y. Ma, "Texture features for browsing and retrieval of image data," Technical Report CIPR-TR-95-06, July 1995.