David E. Fenske
Head, Music Library
VARIATIONS Project Director
Indiana University
[email protected]
Jon W. Dunn
VARIATIONS Project Technical Director
Indiana University
[email protected]
D-Lib Magazine, June 1996
The VARIATIONS Project is best known for the distribution of high-quality digital audio via an ATM network from servers and storage systems having some special characteristics to Intel-based and Macintosh clients. The evolution of this project from its beginnings in the late 1980's to its initial operational state today is inextricably connected with the design and construction of a new School of Music Library at Indiana University, and with the opportunities presented by a new design. It also addresses some pedagogical and library preservation problems. This article describes the motivation for the project and its history, its operation and experiences to date, and its future goals. Although the project is now operational, this report should not be viewed as a final one. VARIATIONS is a work in progress and represents several partnerships within Indiana University and our partnership with IBM. Information can be obtained about our internal partnerships by following links to the Indiana University School of Music, the Indiana University Libraries, and Indiana University's University Computing Services. The VARIATIONS Project, as a result of its partnership with IBM, uses many of the IBM Digital Library technologies. Information about the Indiana University School of Music Library's relationship to IBM's plans is publicly available at the IBM Digital Library site.
Common knowledge has it that university buildings take a long time to accomplish. We can validate this observation. The first internal documents for a new music library were written in 1977. The officially endorsed proposal was first produced in 1983 with subsequent revisions in 1986 and 1989. It was with the 1989 version that the new Music Library was built.
In the earlier versions, a traditional library of the time was envisioned. The principal issue was providing twenty years of collection growth without compromising the available number of readers and listeners. The debate in 1983 was over allocating space to the listeners versus the readers.
The 1989 plan addressed the same issues for collection growth: twenty years of growth and the need to unify collections, particularly score and recorded sound collections. However, the 1989 plan also completely reexamined the issue of patron spaces. The new patron spaces envisioned a unification of listening and computing spaces (not even mentioned in the 1983 plan) and the ability to reallocate reader spaces to digital library spaces as the need arose.
Why the change? Starting in the mid-1980's, the Music Library had asked the question: if information was going to become increasingly digital in the future, what would be required to continue the place of the Indiana University Music Library at the center of an information hub in the School of Music? Starting in about 1987, the Music Library installed its first Novell server. Distributing information over a network (as opposed to standalone workstations) seemed to us the only appropriate choice for a library. Initially, this network served only a few public workstations and Music Library faculty and staff computing.
The Novell-based server combined public and staff computing and continued to evolve over the years, gradually extending to all six buildings of the School of Music complex. During these years we found new ways to distribute text-based sources, computer-assisted programs and music notation sources. Since about 1990, CD-ROM products have also become an important part of this program.
Supported by a computing vision inherent in the 1989 version of the building program, we realized that we had not yet succeeded in distributing sound or video sources. Recorded sound had accounted for more than 50% of the items used in the Music Library for the previous 20 years. We realized that we could not move into a digital environment until we addressed the central issue of the network distribution of time-dependent data (e.g. sound).
The term VARIATIONS was first used in a joint paper (David Fenske and Michael Burroughs) presented to the International Computer Music Association at its meeting in Glasgow, Scotland, in 1990. The term has a clear musical allusion to the form, theme and variations. The term was also meant to imply the musician's need for various data formats--text, sound, video, music notation and images--in an integrated setting. Instruction and research are dependent on the aural analysis of music while simultaneously reading a score. This analytical act is supported by text-based and music notation-based research.
The technical challenges in distributing sound over a network became the focus of the VARIATIONS Project from 1990 until its successful operational deployment on April 1, 1996. In several respects, the new Music Library building and the VARIATIONS Project are both focused on the same issue, differing only in their environments: unifying and integrating collections of information principally in text, score and recorded sound formats.
From 1992, the VARIATIONS Project examined server, network and client technologies from all of the principal computing companies. We found many worthy products addressing one or more of our requirements. It became clear to us, however, that our concept was, in 1992, beyond the capabilities of technology. We were encouraged that parts of our vision had, then recently, come into existence and that all of these companies were emphasizing at least some of the concepts that then came to be known as the digital library.
One of our operating principles in examining technology from many companies was that it had to be shown to work at Indiana University, including servers, networks and clients. The computing environment on the Bloomington campus and in the School of Music is heterogeneous. Intel-based and Macintosh-based machines abound in about equal number. The campus network supports a variety of network protocols: IP, IPX, AppleTalk and others. UNIX and Novell servers are common throughout the campus and in the Music Library. UNIX workstations from a variety of vendors are more common elsewhere on the campus than they are in the School of Music.
Many products were examined that were by themselves exciting but failed our needs for a network-based distribution of information in a time-dependent form (i.e., sound). We examined a UNIX-based workstation that had better sound support than any other platform, but added nothing to the network distribution solution. Regrettably, this company no longer exists. We examined UNIX-based products from other vendors, some of which did address our need to distribute information over a network. Most of these products failed to integrate well into our campus network or they failed to scale to a level meeting our needs.
For a couple of years, the solutions seemed to lie with UNIX-based clients and servers and it looked as though our problem would be to entice our users away from their Intel and Macintosh-based computers. The reasons were the networking tools native to the UNIX environment and the early deployment of high-level sound manipulation tools combined with high-quality sound. However, this proved to be impossible, as many of the applications needed by our users, especially music-related applications, were only available for Intel and Macintosh platforms. Windows and Macintosh emulators for UNIX workstations could not deal well with sound or MIDI (Musical Instrument Digital Interface) connections to synthesizers. Our examination gradually shifted from one focused primarily on clients to one focused on the network and the servers, with Macs and PC's as the clients. In the process, the contending technology companies were quickly narrowed down to two and then one.
The existing Ethernet-based campus networking solutions present on the campus did not deal well with real-time audio or video streams, due to the fact that their bandwidth is shared in a building by potentially hundreds of stations. For our new building, we had to look to switched networking technologies to accommodate our needs. We considered a number of networking schemes but only two were serious contenders: ATM and switched Ethernet. Switched Ethernet had several initial advantages. It substantially increased the bandwidth dedicated to each workstation. It could be combined with yet-higher bandwidth building backbones even involving the promise of ATM's eventual quality of service and resource reservation from the server to the switch. Switched Ethernet was at the time a more established technology and would have been the more conservative choice. There were some who also argued that it was cheaper than ATM to deploy.
We chose ATM over switched Ethernet for several reasons. While switched Ethernet does provide sufficient bandwidth for some of our immediate needs, there were questions about how long switched Ethernet would serve our purposes. Because of the scale of the VARIATIONS Project, one of our choices was the use of ATM in the building backbone. As 25 Megabit/second (Mbps) ATM adapters were released and dropped in price, the issue became one of ATM to the desktop. The ability of ATM to reserve bandwidth via quality of service guarantees also formed part of this argument. Sound alone is a more critical network problem than video despite today's video-driven development of networking technologies. Video over a network degrades for a while before stopping altogether. During this degradation, annoying as it may be, information context is not lost. High-quality audio on its own does not degrade gradually; it simply stops. In less than a second, information context is lost when audio over a network breaks. In view of this critical observation, ATM was the only networking technology that promised guaranteed service through resource reservation. Based on the functional requirements addressed in this paragraph, ATM came out somewhat ahead of switched Ethernet, but there were other issues as well.
Even given the declining costs of 25 Mbps adapters, an ATM adapter is more expensive than an Ethernet adapter for switched Ethernet. The same comparison holds true for the rest of the networking environment. The question for us became: Was switched Ethernet really the most economical choice? As a wiring plant, we chose category 5 unshielded twisted pair to the desktop, already a more economic choice than many new buildings built only a few years earlier. (They often chose fiber optic to the desktop, which costs more as a wiring plant and as a desktop device, but delivers high bandwidth.) Although 25 Mbps will work over the category 3 twisted pair common in most buildings on the Bloomington campus, category 5 meant that we could deploy higher-speed ATM in the future without rewiring and that we would not have to install fiber to the desktop. (Fiber does connect the ATM switches and the VARIATIONS Project's servers.) Still, switched Ethernet would have worked over category 5 as well and even over category 3 wiring.
The critical components of the economic question became long-term bandwidth needs, which indicated category 5 and ATM, and what might be called the replacement factor. Switched Ethernet might have won the economic argument if we were retrofitting an existing building and needed to make the minimum amount of physical alteration in order to increase bandwidth to the desktop. Installing switched Ethernet in a new facility, combined with the functional arguments articulated previously, suggested that we would want to replace switched Ethernet within a couple of years for functional reasons. The combined costs of installing switched Ethernet and then replacing it within a short period of time were judged much more expensive, and the replacement itself unlikely to succeed in a university context. In short, ATM, although initially more expensive, provides a much longer service life than switched Ethernet and was, therefore, for us the economical choice.
There was also another argument in favor of ATM: technology development. While we were in the process of the preceding network examination, we were also carrying out an examination of servers (and to a lesser extent clients). IBM could offer the greatest number of components in essentially an end-to-end installation. IBM stayed with us through years of examination and allowed us to influence their choices in the digital library environment. Many of the decisions we were making were high-risk ones. The ability to influence technology development and to reduce the vendor-to-vendor finger pointing typical of mixed vendor deployments made IBM the logical choice. Although IBM could have provided either a switched Ethernet solution or an ATM one, it was clear the future belonged with ATM.
The end-to-end solution became additionally important when one also considers the technological challenges involved with serving audio and video data and with storing this data.
In describing how VARIATIONS works, we can break the system into
three primary parts: content creation, content storage, and content
distribution.
Student workers (under the direction of Constance Mayer, Head of Circulation Services) use specially-equipped personal computers to create CD-quality sound files in Microsoft's .WAV format from original analog or digital media. We are using a 16-bit sample size at a sampling rate of 44.1 kHz, the same quality used by audio compact discs and typical commodity sound cards for personal computers. In the case of CD's, the sound is already in digital form and can be transferred directly from CD to hard disk without any loss of quality using Microtest's Disc-to-Disk software on Macintosh and Intel-based workstations equipped with CD-ROM drives. Records, cassette tapes, open-reel tapes, and other analog media must be converted to digital form using Intel-based workstations equipped with Turtle Beach and Roland sound cards. Analog recordings require more attention in order to get good quality results, as one must carefully set recording levels and monitor recording progress.
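To make the storage cost of these uncompressed files concrete, the following is a rough back-of-the-envelope sketch in Python. The sample size and sampling rate come from the text above; the two-channel (stereo) figure is an assumption based on the audio CD format, and the function names are illustrative only.

```python
# Rough size estimate for uncompressed CD-quality audio.
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz, as stated in the text
BITS_PER_SAMPLE = 16      # 16-bit samples, as stated in the text
CHANNELS = 2              # assumption: stereo, as on an audio CD


def pcm_bytes_per_second(sample_rate_hz=SAMPLE_RATE_HZ,
                         bits_per_sample=BITS_PER_SAMPLE,
                         channels=CHANNELS):
    """Bytes of raw sample data produced per second of sound."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def pcm_megabytes(duration_seconds):
    """Approximate size in decimal megabytes, ignoring file headers."""
    return pcm_bytes_per_second() * duration_seconds / 1_000_000


if __name__ == "__main__":
    print(pcm_bytes_per_second())   # bytes per second
    print(pcm_megabytes(3600))      # megabytes per hour of sound
```

Under these assumptions, one hour of sound occupies roughly 635 megabytes before compression.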
In addition to simply creating a sound file copy of the original recording, the students also enter the index or band information from the original recording. This is information which is not available in the existing online library catalog record for the item, but is necessary to provide a level of access for the patron which approaches that of having the actual item with CD booklet or record jacket in hand. For this task as well, CD's are easier to work with; a locally-written Macintosh program can extract precise index timing information from the CD itself, requiring the worker to only input the description of each track from the CD booklet. Analog recordings require that the worker carefully identify the exact locations of track breaks and enter this timing information for each track in addition to the descriptions.
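As an illustration only, the track description information could be modeled as a list of start times and descriptions. The sketch below is not the project's actual file format, which is not described here; the `Track` class, its field names, and the sample titles are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    start_seconds: float   # offset of the track break from the start of the file
    description: str       # text entered from the CD booklet or record jacket


def track_at(tracks, position_seconds):
    """Return the track that contains the given playback position.

    `tracks` must be sorted by start time. Positions before the first
    track break are treated as belonging to the first track.
    """
    starts = [t.start_seconds for t in tracks]
    index = bisect_right(starts, position_seconds) - 1
    return tracks[max(index, 0)]


if __name__ == "__main__":
    # Hypothetical recording with three tracks.
    recording = [
        Track(0.0, "I. Allegro"),
        Track(312.5, "II. Andante"),
        Track(790.0, "III. Presto"),
    ]
    print(track_at(recording, 400.0).description)
```

A structure of this kind is enough to let a player jump to the start of any track or report which track is currently sounding.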
After creating a sound file and a track description file on one
of the digitizing workstations, these files are transferred via
FTP to a central IBM RS/6000 archive server (discussed further
below). At night, a batch job runs which compresses these files
into MPEG format, using a
3.6:1 compression ratio. MPEG audio
compression works by eliminating frequencies in the sound which
cannot be perceived by the human ear and mind. Most listeners
have found the MPEG-compressed audio to be of more than acceptable
quality for day-to-day use, and the original full-quality uncompressed
files are always kept for preservation purposes. Another advantage
of MPEG beyond the decreased file size is that it provides a common
file format for Intel, Macintosh, and UNIX workstations.
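The effect of the 3.6:1 ratio on data rate and file size can be estimated as follows. This is a rough sketch: the source rate assumes stereo CD audio (an assumption, as the channel count is not stated above), and the function names are illustrative.

```python
# Estimate the compressed data rate implied by a 3.6:1 compression ratio.
CD_BITS_PER_SECOND = 44_100 * 16 * 2   # assumption: stereo CD-quality source
COMPRESSION_RATIO = 3.6                # ratio stated in the text


def compressed_bits_per_second(ratio=COMPRESSION_RATIO):
    """Approximate bit rate of the compressed audio stream."""
    return CD_BITS_PER_SECOND / ratio


def compressed_megabytes_per_hour(ratio=COMPRESSION_RATIO):
    """Approximate decimal megabytes needed per hour of compressed audio."""
    return compressed_bits_per_second(ratio) / 8 * 3600 / 1_000_000


if __name__ == "__main__":
    print(round(compressed_bits_per_second()))        # bits per second
    print(round(compressed_megabytes_per_hour(), 1))  # megabytes per hour
```

Under these assumptions the compressed stream runs at roughly 392 kilobits per second, or about 176 megabytes per hour of sound.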
There are two primary servers in the VARIATIONS system for storage of digital audio: a playback server and an archive server. The playback server consists of an IBM RS/6000 Model 59H with 120GB of hard disk storage. This server can store over 600 hours of MPEG-compressed CD-quality audio on file systems managed by an IBM software product known as Multimedia Server for AIX. Via a filesystem technology known as Tiger Shark, Multimedia Server provides for striping of audio and video files across multiple disks, which provides load balancing and guaranteed real-time delivery of these files.
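As a sanity check on the capacity figure, the sketch below estimates how many hours of compressed audio fit on the playback server's disks. It assumes a stereo CD-quality source compressed at 3.6:1 (about 176.4 megabytes per hour), decimal gigabytes, and no file system overhead; all three are assumptions rather than figures from the project.

```python
# Hours of compressed audio that fit in a given amount of storage.
# Assumption: 392,000 bits/s compressed stream (stereo CD audio at 3.6:1),
# i.e. 49,000 bytes/s * 3600 s = 176,400,000 bytes per hour.
COMPRESSED_BYTES_PER_HOUR = 176_400_000

PLAYBACK_DISK_BYTES = 120 * 10**9   # 120GB, read here as decimal gigabytes


def hours_of_audio(capacity_bytes, bytes_per_hour=COMPRESSED_BYTES_PER_HOUR):
    """Upper bound on hours of audio, ignoring file system overhead."""
    return capacity_bytes / bytes_per_hour


if __name__ == "__main__":
    print(round(hours_of_audio(PLAYBACK_DISK_BYTES)))
```

The estimate comes out somewhat above the "over 600 hours" quoted above, which is consistent once some space is set aside for the operating system, striping, and other overhead.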
The archive server is an IBM RS/6000 Model J30 with an attached IBM 3494 Optical Tape Library Dataserver containing two IBM 3590 tape drives, which is managed by IBM's ADSTAR Distributed Storage Manager software. This library can hold up to two terabytes of content, or over 9000 hours of compressed audio. The 3590 drives, with a nine megabyte/second transfer rate, allow for fast access and retrieval of large multimedia files. Currently, this server is being used to archive the uncompressed sound files and store backups of the compressed files residing on the playback server. Later this year, software will be added so that the archive server will be able to transfer MPEG-compressed audio files to the playback server on demand to provide a larger amount of online storage for audio files being accessed by patrons. At that point, the playback server will essentially be acting as a most recently used cache for the sound files residing in the archive server.
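The planned cache behavior can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the software that will be deployed: the `PlaybackCache` class, the `fetch_from_archive` callable (a stand-in for a retrieval from the tape library), and the choice to evict the least recently used file are all hypothetical.

```python
from collections import OrderedDict


class PlaybackCache:
    """Keeps the most recently used files on disk, evicting the oldest."""

    def __init__(self, capacity_bytes, fetch_from_archive):
        self.capacity_bytes = capacity_bytes
        # Hypothetical callable: copies a file from the archive to disk
        # and returns its size in bytes.
        self.fetch_from_archive = fetch_from_archive
        self._files = OrderedDict()   # name -> size, least recently used first
        self.used_bytes = 0

    def open(self, name):
        """Make a file available for playback; return 'hit' or 'miss'."""
        if name in self._files:
            self._files.move_to_end(name)   # mark as most recently used
            return "hit"
        size = self.fetch_from_archive(name)
        if size > self.capacity_bytes:
            raise ValueError("file is larger than the whole cache")
        # Evict least recently used files until the new one fits.
        while self.used_bytes + size > self.capacity_bytes:
            _, evicted_size = self._files.popitem(last=False)
            self.used_bytes -= evicted_size
        self._files[name] = size
        self.used_bytes += size
        return "miss"

    def resident(self):
        """Names of files currently on disk, least recently used first."""
        return list(self._files)


if __name__ == "__main__":
    sizes = {"a": 40, "b": 40, "c": 40}   # made-up file sizes
    cache = PlaybackCache(100, sizes.__getitem__)
    for name in ["a", "b", "a", "c"]:
        print(name, cache.open(name))
    print(cache.resident())
```

A real implementation would also have to avoid evicting files that patrons are currently playing, which this sketch does not attempt.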
Tape was chosen over optical technology for this application because
of its higher transfer rates and better cost/megabyte ratio. While
optical storage media offer the advantage of faster seek times
than tape, their data transfer rates and seek times are so much
slower than disk that sound files would still have to be copied
to disk in order to provide multiple simultaneous access to the
same file.
Library patrons access sound recordings in the system from 45 IBM Pentium computers located throughout the library and in a teaching classroom/cluster on the third floor of the library. These stations all currently run Microsoft Windows 3.1, but an upgrade to Windows NT is anticipated in the near future. Each of these stations is equipped with a sound card (IBM Mwave), MPEG audio decoder software from Xing Technology, CD-ROM drive, Kurzweil K2000 synthesizer/keyboard, and headphones. Beyond the access to digital audio, these stations also deliver general computing functions (word processing, e-mail, spreadsheets, etc.), library computing functions (access to CD-ROM databases and the library catalog), and music computing functions (ear training, music notation, composition).
Two scenarios exist for locating and playing recordings in VARIATIONS. The first case is that of a student who needs to listen, for a class assignment, to a particular recording which has been placed on reserve by the instructor. The student sits down at a workstation and launches Netscape, which is set to use the Music Library home page as its starting page. From this page, the student selects "Course reserves," which takes the student to a list of courses being offered in the current semester. The student selects the proper course to obtain a list of recordings on reserve for that course, and then selects the recording to which he or she wishes to listen. This launches a locally-written VARIATIONS Player application which begins playing the sound file from the playback server across the building ATM network. The student has full control over playback of the recording, with the ability to stop, start, rewind, and fast forward. He or she can easily move through the tracks of the recording to get to the particular work or section desired.
The second case is that of a patron who wishes to listen to a particular recording independent of any course assignment. In this case, the user would select IUCAT, Indiana University's NOTIS-based online catalog system, from the Music Library home page, and perform a search of the catalog to find the item desired via the standard NOTIS terminal-based online public catalog interface. If the item is available online, a URL pointing to the online copy of that item will be displayed along with the rest of the catalog record. The user can then cut and paste that URL from the terminal window into Netscape to access the item. Indiana University is planning to implement Ameritech Library Services' WebPac World Wide Web to Z39.50 gateway software to provide a true web interface to the catalog later this year. With WebPac, the user will simply be able to click on the URL when viewing the catalog record in order to access the online copy of the item.
A screen shot of the VARIATIONS Player application
One of the reasons, along with copyright, that VARIATIONS is only accessible within the new Music Library building is that of networking. The existing Ethernet-based campus and building networks at Indiana University are not capable of dealing with large numbers of real-time multimedia sessions.
In the building, we are using an IBM ATM network with a combination of 100 and 155 megabit/second links over fiber-optic cabling to servers, and 25 megabit/second links over copper unshielded twisted pair wiring to client PC's. ATM was chosen as the network technology for the new building due to its long-term advantages for real-time multimedia traffic. Currently, audio data is delivered from server to client via the NFS (Network File System) protocol running over ATM via Ethernet LAN Emulation. We hope to be able to transition to using native ATM services with the ability to reserve bandwidth via quality-of-service guarantees. This will require, however, that the Multimedia Server product be adapted to support this and that API's and drivers which support quality-of-service become available for Windows and UNIX operating systems.
Our experience in running a production ATM network has been, for
the most part, positive. We have not run into any significant
management or stability problems, although it is admittedly more
difficult to troubleshoot problems when they do occur, because ATM
lacks the diagnostic tools and the extensive pool of knowledge which
have been built up for older technologies such as Ethernet.
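To give a sense of scale for these link speeds, the sketch below compares them with the data rate of the compressed audio streams. The per-stream rate of 392 kilobits per second is an assumption derived from stereo CD audio at a 3.6:1 compression ratio, the figure of 45 stations is the number of public workstations in the building, and protocol overhead (ATM cells, LAN Emulation, NFS) is deliberately ignored, so the real utilization would be somewhat higher.

```python
# Rough link utilization for compressed audio streams, ignoring all
# protocol overhead (an intentional simplification).
STREAM_BPS = 392_000            # assumption: stereo CD audio at 3.6:1
CLIENT_LINK_BPS = 25_000_000    # 25 megabit/second link to each client
SERVER_LINK_BPS = 100_000_000   # slower of the two server link speeds
CLIENT_STATIONS = 45            # public workstations in the building


def link_utilization(streams, link_bps, stream_bps=STREAM_BPS):
    """Fraction of a link consumed by a number of simultaneous streams."""
    return streams * stream_bps / link_bps


if __name__ == "__main__":
    # One stream on a client link.
    print(f"{link_utilization(1, CLIENT_LINK_BPS):.2%}")
    # Every station playing at once, all through one server link.
    print(f"{link_utilization(CLIENT_STATIONS, SERVER_LINK_BPS):.2%}")
```

Under these assumptions a single stream uses under two percent of a client link, and all 45 stations playing at once would use under a fifth of a 100 megabit/second server link, which suggests that raw bandwidth is less of a constraint than guaranteed, uninterrupted delivery.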
VARIATIONS was up and running for public access for the first time on April 1, 1996, delivering course reserves for two undergraduate Music Theory classes, one containing about 20 students and the other containing about 150 students. Training sessions were conducted (by Jon Dunn and Constance Mayer) for both classes; a hands-on session was used for the smaller class while a demonstration/lecture was used for the larger one. In both cases, step-by-step instructional handouts were provided. Students seemed to be able to pick up quickly on how to use the system, most having had some computing experience (word processing, e-mail, web browsing) previously. By the end of final exams in early May, sound files were being launched over 1000 times per day. For the summer session beginning in June, we plan to provide at least fifty percent of reserve listening materials via VARIATIONS.
A feedback form is provided on the web for students to submit
questions, comments, or problem reports regarding the system.
Most questions have been of the form, "Why can't I access
the recordings from my home/dorm room/favorite campus computer
lab?" Now that students have had a taste of what electronic
access to sound recordings can provide, their desires for more
capability have increased faster than technology can respond.
Many faculty members have also been intrigued by the possibilities
of VARIATIONS. A number of faculty members, with varying degrees
of computer background, are very interested in using VARIATIONS
in their instruction.
There are a number of library-related aspects not necessarily apparent in a discussion driven by technology: access and preservation. For the first time, digital preservation practices mean greatly improved analysis and restoration capabilities and increased access to information in all formats.
Digital preservation standards are still evolving. For music, as for all areas in the humanities, preserving information is a crucial component. Readers whose disciplines lie outside of the humanities may not always appreciate the fact that information, for the humanist, retains its research value for extremely long periods of time. It is axiomatic that as publication activities passed through the Industrial Revolution in the early 19th century, the longevity of publications actually decreased due to changes in paper manufacturing processes.
Society at large may generally regard recorded sound as largely entertainment. For the musician, it represents nearly 100 years of changing performance practice now available for research. With the exception of the compact disc, all recorded sound media are regarded as fragile since they deteriorate with each use even under the best of conditions. Even compact discs are not indestructible. Digital preservation captures manuscript, print and recorded sounds in their current state. In all of these cases, we see now the development of digital tools to restore the image or the sound so that nuances crucial to the scholar can again be observed.
The VARIATIONS Project as a digital library project not only means better instructional and research tools, it also means improved access to information: 1) retrieval of the full information object is linked to its corresponding bibliographic record in the online catalog; and 2) in many cases, particularly with graphic images and textual data, the information can be distributed to users elsewhere on the campus, on other campuses and potentially the world.
There are a number of immediate term goals we are pursuing in the library information delivery phase. The solutions to these goals are not yet known:
Since we are still early in the operational deployment of the VARIATIONS Project, the reader may have the impression that this is a project driven largely by pedagogical and library information delivery goals in a single building. This impression would be largely incorrect. It is merely the point where we needed to start.
Having accomplished the network distribution of high-quality sound data within a single-building ATM network, we will investigate wide area distribution. This distribution ranges from campus academic buildings (and some dormitories) attached to the network usually via Ethernet and FDDI to services delivered to other campuses of the university via an ATM wide-area network, to services delivered via modem connections. Eventually, the VARIATIONS Project's services will range from guaranteed under ATM or other network technologies providing quality of service guarantees, to best effort for high quality information over non-switched Ethernet and to degraded quality over modems. In order for the project to deliver these services, we will need to develop more intelligent software determining the network capabilities of the requesting user as well as tools to gracefully degrade the quality of the sound to match the quality of the network connection. We have actively discussed these plans with our technology partners and will be pursuing solutions in the coming months and years.
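The kind of decision such software would have to make can be sketched as follows. This is purely illustrative and assumes far more than is known today: the `Service` names, the way a connection is described (an available bit rate plus a flag for quality-of-service support), and the 392 kilobit/second threshold (stereo CD audio at 3.6:1) are all hypothetical.

```python
from enum import Enum


class Service(Enum):
    GUARANTEED = "guaranteed"     # full quality with reserved bandwidth
    BEST_EFFORT = "best effort"   # full quality, no delivery guarantee
    DEGRADED = "degraded"         # reduced quality to fit the connection


# Assumption: bit rate needed for full-quality compressed audio.
FULL_QUALITY_BPS = 392_000


def choose_service(available_bps, supports_qos):
    """Pick a service level for a requesting user's connection."""
    if available_bps < FULL_QUALITY_BPS:
        return Service.DEGRADED
    if supports_qos:
        return Service.GUARANTEED
    return Service.BEST_EFFORT


if __name__ == "__main__":
    print(choose_service(25_000_000, supports_qos=True))    # ATM desktop
    print(choose_service(10_000_000, supports_qos=False))   # shared Ethernet
    print(choose_service(28_800, supports_qos=False))       # modem
```

The hard part, which this sketch leaves out, is measuring the available bit rate reliably and producing a degraded version of the sound that is still useful.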
One aspect of the above wide area distribution problem is technical (as described in the preceding paragraph) and another is mission-oriented. Our purpose in distributing information is to support the educational mission of Indiana University in the School of Music, on the Bloomington campus, and on other campuses of Indiana University as well as distance-based education. We do not provide free copies of content to users even in this environment. Only a couple of minutes of sound are in memory at any one time. The VARIATIONS Project never distributes a full copy of a work for use by the end user.
For end-user desktops on the campus network, we check the location of the user by the Internet Protocol address before allowing use of commercially-produced sound. Degraded quality content distributed via modems will not be any more attractive to users than are low quality images from art museums. The problem will come with inter-library sound requests now in the digital environment. (It should be noted that we presently honor most inter-library loan requests for out-of-print recordings by sending a cassette copy: a common practice.) Inter-library loan requests for a digital copy of an out-of-print recording (such requests have already been received) are inevitable and would require the transmission of a full copy. Notice that we have restricted this discussion to out-of-print material, since in-print material should always, in our view, be purchased by the requesting library. In order for out-of-print digital material to be distributed with the minimum of difficulties between copyright owners and libraries, we must have this data encrypted with a time-limited (such as 30 days) key. After the key has expired, the data is useless. The borrowing library will presumably erase the file since it is useless. Note that this arrangement provides for tighter controls in the digital world than those of the analog world.
The VARIATIONS Project will create databases of score notation. The primary difficulty in scanning musical scores is the size of the publication, which often exceeds the standard sized 8.5" X 11" scanner. Scanned images of musical scores are useful for reserves, incorporation into HTML files, etc. While these are useful activities, they do not themselves particularly advance research. Within the last two years, a few music character recognition programs have come into existence. One of these, from AR Editions of Madison, WI, does a particularly good job of converting images of printed music into notational files. Unfortunately, the resulting notational file is thus far readable only by AR Editions' music editor. The goal of most music character recognition programs is to reproduce as completely as possible the printed page including not only pitch, meter and rhythm, but also many other facets of music publication such as slurs, accents, ornamentation, etc. The objective is to facilitate further editorial work or publication-related activities.
Also in the past two years, Prof. David Huron (from the University of Waterloo) has released his Humdrum Toolkit permitting queries of notational files in a variety of notational file formats and in a variety of operating environments.
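A location check by Internet Protocol address might look something like the following sketch, which uses Python's standard `ipaddress` module. The network ranges shown are reserved documentation addresses standing in for real campus networks, and the function name is hypothetical; this is not the check the project actually runs.

```python
import ipaddress

# Hypothetical stand-ins for campus networks (reserved documentation ranges).
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_on_campus(client_ip, networks=CAMPUS_NETWORKS):
    """Return True if the client address falls inside a permitted network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False   # malformed addresses are refused
    return any(address in network for network in networks)


if __name__ == "__main__":
    print(is_on_campus("192.0.2.17"))
    print(is_on_campus("203.0.113.5"))
```

An address check of this kind identifies the machine rather than the person, so it complements, rather than replaces, the other controls described here.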
All of the above are promising signs. One of the VARIATIONS Project's goals is to be able to convert large amounts of printed works into a database. Queries could be formed requesting stylistic information from a large data-set on a scale not previously attempted. The requirements for these activities are different from the kind of ongoing activities listed above:
The VARIATIONS Project will create or assemble tools to directly analyze and query sound databases. Musical scholars at least since the initiation of musicology in the latter part of the nineteenth century have focused on the score when studying or analyzing music. When members of the general public study a piece, the reference is usually to the score. In both of the preceding instances, the musical sound, live or recorded, is often used to reinforce or confirm score-based observations.
Music score notation, despite the best efforts of composers and other musicians is, at best, an approximation of the composer's intention. Musical conventions (performance practices) have an immediate impact on the process, then and now. The focus of much of musical scholarship for the last 150 years has been to recreate a critical text that will interpret the composer's text and the then contemporary performance practices for modern musicians, who are themselves the product of a differing set of musical conventions.
We now have nearly 100 years of recorded music. Each fixed performance is itself an interpretation of a musical work which differs by necessity and by intent from all other recorded (and live) performances of that work. This observation is also true of the composer performing his/her own works. We are no more likely to identify the "perfect" performance, than the perfect and final critical text. By being able to more directly query recorded sound without any reference to the score, we will gain a view of performance practice and of interpretation that is event driven and takes into account the defining characteristic of music, sound.
While we know how to represent sound, we have not had subtle enough
tools to query representations for the small nuances in frequency,
duration and amplitude that allow us to study one event as different
from another of the same musical work in ways that are insightful.
In short, we do not have the tools that allow us to study with
any level of discrimination approaching the level of the human
ear and intellect. While not trying to displace the role of the
human mind, we are limited in our abilities to recall accurately
large amounts of sound. Database software that allows us to manage,
organize, compare, query and report sound directly promises the
potential for new perspectives and new research.
The VARIATIONS Project at Indiana University's School of Music
Library has recently achieved initial operational success. The
Project will be addressing issues of wider distribution of sound
and data in other formats and of score and sound database creation.
hdl://cnri.dlib/june96-variations