Metadata Fundamentals for All Librarians
By Priscilla Caplan.
While we frequently hear the naïve remark (not infrequently from people who should know better) that libraries have been all about metadata from their beginnings and that librarians have a good grip on metadata principles and practices, you don't have to scratch much below the surface of the notion of metadata to discover the true state of affairs. While some librarians have bridged the gap between a relatively well organized world of tightly federated systems and practices in generating bibliographic records (unquestionably metadata) and the far less cohesive universe of metadata being generated by many of today's cultural heritage institutions and other public and private sector entities, far too many librarians haven't got a metadata clue. Many gaze across the emerging metadata landscape like deer in the headlights of an automobile, at best baffled and more frequently absolutely frightened. The unruly, untidy nature of this landscape offends the sensibilities of those more at ease with well articulated principles and practices. For many, the metadata landscape appears to be strewn with defunct initiatives (all with varying degrees of bright promise in their infancy) and myriad mindless, uninteroperable variations on the same themes.
This book by Priscilla Caplan works hard to make sense of it all, first by presenting a useful set of principles and practices in the book's early chapters, followed by a number of chapters that examine a variety of metadata initiatives and related schemas, some designed to address the metadata needs of particular discourse or practice communities (e.g., metadata for education and government) and others to fulfill specific functions across such communities (e.g., rights and structural metadata). Given the complexity of the topics addressed and the relationships among them, the author's hard work pays off. Caplan capably explains various metadata notions, from semantics and syntactic bindings to application profiles, metadata registries and mechanisms to enhance metadata interoperability, in ways that clearly define both the topics and the relationships among them.
Readers need to be mindful, however, of the ramifications of the book's title and the explicit scope of Caplan's text. This is a work intended for librarians and, therefore, cannot be viewed as any sort of survey of the full emerging landscape of metadata for discovery, management and use. This relatively singular focus makes the text highly useful and coherent for librarians and students of librarianship who seek to understand metadata solutions within traditional and networked libraries. However, the text is of less value to individuals working in private and public sector environments with less cohesive (or totally absent) metadata histories.
Early in her text, Caplan states her reasons for using the term "scheme" when speaking of both element sets (i.e., the bounded sets of permissible metadata statements) and value spaces (i.e., controlled vocabularies and encoding schemes) as opposed to the more fashionable use of the term "schema" to denote element sets and "schemes" for value spaces. While her reasoning is sound, the result is an occasional pause on the part of the reader in order to determine the sense in which the term is being used. This would not be a problem but for the fact that Caplan thinks (and rightly so in this reviewer's opinion) that the notion of metadata embraces both sides of the attribute/value pair as opposed to the more limited view of metadata as a concept concerned solely with elements and element sets. Thus Caplan devotes a chapter to the discussion of vocabularies and thesauri.
Part one, on principles and practices, begins with a chapter on basic concepts, including a definition of metadata, its various types and roles, and a careful statement of the aspects of metadata schemas: semantics, content rules, and syntax. Caplan uses the IFLA Functional Requirements for Bibliographic Records as the means for driving home the need for "a rigorous data model underlying the scheme" that reflects the entities to be described and the relationships among them.
The statement of principles and practices is followed by a discussion of various syntactic bindings for metadata, including MARC, HTML, SGML, and XML. While some might disagree with Caplan's treatment of the Resource Description Framework (RDF) as yet another syntactic binding, her explanation of the XML serialization of RDF is clear.
As noted earlier, Caplan includes a chapter on value spaces with descriptions of the characteristics of simple controlled vocabularies, thesauri, and classification schemes. While the explanations of each are fairly elementary, they provide sufficient background for librarians and students of librarianship to place their significance within the larger framework of metadata. There is a substantial section on identifiers ranging from familiar public identifiers such as ISBNs and ISSNs to the less familiar such as URLs and URNs.
Part one closes with two chapters that focus heavily on metadata problems in a network environment that strives for interoperability among repositories containing metadata drawn from a rich variety of schemas. In the first of these chapters, mechanisms for achieving some degree of interoperability are briefly explored. Those mechanisms range from system interoperability stemming from strong federations and relatively homogeneous metadata (such as traditional library union catalogs and the cross-system searching of such systems using ANSI/NISO Z39.50), to cross-walks among more heterogeneous schemas. The chapter on interoperability concludes with a discussion of metadata registries, including several projects of note and the role such registries will play in promoting metadata interoperability. The second of the concluding chapters on principles examines how metadata is being used on the Web and by internet search engines, and considers the role of metadata in building the semantic Web.
Part two of the text examines a number of metadata initiatives and schemas. Caplan notes that neither the enumeration of projects and schemas nor their treatment is exhaustive. It is clear from Caplan's narrative that many of the initiatives and schemas she highlights are works-in-progress. Even so, they are very useful in casting light on functional areas of concern as well as the metadata needs of various discourse and practice communities. A number of the initiatives discussed illustrate the evolving nature of metadata in the networked environment, pointing clearly to the fact that the era of monolithic, all-purpose metadata schemas that have served libraries well in the past is giving way to a broad range of experimentation among disparate communities to develop metadata schemas that meet their particular needs.
In conclusion, Caplan's text is extremely worthwhile reading, particularly so for the librarian or student of librarianship wanting to get a solid, high-level view of the metadata landscape. Even for the relatively seasoned metadata practitioner or scholar, Caplan provides interesting insights into the history of a number of the initiatives she highlights. The text is well written, authoritative, and timely.