This article identifies certain architectural principles for the deployment of networked information [Note 1] that, when applied, should contribute to an environment in which digital objects can be readily discovered, retrieved and consumed in ways that encourage the free flow of information while being consistent with individual and organizational intellectual property rights (IPR) policies and preferences. I attempt to reconcile the disparate forces driving the dissemination of information today, including open and free access to materials; interoperability of networked information systems; collection, deployment and maintenance of metadata services; naming infrastructures for information objects; relationship-based policy management; and practical aspects of today's rapidly evolving applications and services. Although the central focus of this article is to confront current information-opaque approaches to digital rights management (DRM), I hope the principles presented here are broader in scope and will suggest solutions elsewhere.
This article poses the question: if we could engineer the Web to have a built-in awareness of IPR policies, what would it look like? How might we extend the Web in ways that would leverage IPR information and policies to actually improve our information experiences, from initial creation, through discovery, to consumption and re-use? In a broader view, how might we engineer policy-compliant systems that have an appropriate level of contextual awareness, including the ability to maintain a safe and comfortable level of individual privacy?
There are no obvious paths toward making IPR policy expression, enforcement and compliance an integral part of the Web; designing systems for policy awareness is difficult, as the W3C discovered with its earlier efforts in the privacy domain [P3P]. In this article I assert that the representation of information objects in ways that accommodate the consideration of policies might require a fundamental shift in how we currently model information objects. But the promise of the Web as an important, pervasive conduit for the creation, sharing, trading and accessing of information demands that we rise to such challenges.
The global information network is a network of intellectual property. All information made accessible on this network has been disseminated within some IPR context, where relationships between information objects and parties (such as creators, publishers and consumers) have been defined in some way, either implicitly or explicitly. The manifestation of those relationships in terms of specific objects, parties and actions can be thought of as IPR policies.
Today there is no consistent mechanism for the expression of IPR policies on the Web, nor do we have ways for information consumers (or agents operating on their behalf) to easily and automatically discover, access and interpret such policies for information of interest. The quality of the Web suffers by not having an open and accessible way to persistently associate IPR policies with information objects, before and after their dissemination.
In this article I explore the development of a conceptual "platform" for IPR policy expression, discovery and interpretation; in an earlier paper [Erickson] my co-authors and I referred to this as a Policy and Rights Expression Platform (PREP). PREP is a set of guiding principles; it should not be thought of as a digital rights management (DRM) system, but rather as a basis for interoperability for DRM systems and services. I believe these principles are complementary to advanced metadata expression and transport mechanisms currently in development [RDF], and indeed may suggest ways for DRM technologies to leverage those mechanisms. In the second article in this series, I will examine some of the architectural implications of these principles, and will provide an example of implementation.
2. Foundation Principles
This discussion is framed by reflecting on a few core concepts from the Kahn/Wilensky architecture [Kahn]. Next, certain requirements are developed for rights management mechanisms suggested by the W3C's current Goals of the Web and the Design Principles of the Web, as well as requirements for the general problem of information dissemination. I then explore several complementary developments, including an earlier treatment of information objects as services and a relationship-based approach to DRM interoperability. Throughout this discussion I will advocate overlaying concepts from Kahn/Wilensky, arguing that this is possible even with today's fragmented Internet technologies.
2.1 Kahn/Wilensky Digital Object Services
"...There is no requirement that a digital object be stored in a repository in any particular manner. Conceptually, the description of a digital object is strictly a logical one and is not intended to describe any particular implementation. In particular, it is possible that, in response to a request to access a particular digital object, a server runs a program that computes the digital object on the fly. It is possible for multiple digital objects to be embedded in a program (e.g., a data base manager or knowledge-based system) that emits them upon request. The program may itself be a digital object. Thus, accessing and depositing are virtual processes, and may or may not involve the actual depositing and retrieval of actual objects per se, although such actual storage and retrieval is likely to be prevalent..." [Kahn]
An important aspect of Kahn/Wilensky is that it advocates a logical modeling of objects. From my perspective, this logical description injects the level of abstraction (or indirection) necessary to integrate rights management services with discovery and retrieval services, aided by one or more mediation services. Such a logical description of an object might originate with a simple text file, database queries or other "active" mechanisms; it can be serialized using a variety of syntaxes, appropriate to the messaging protocol being used.
Logical descriptions of information objects contribute to interoperability, in particular by enabling requesting agents to discover an object's available dissemination services, allowing those agents to interact with the object as appropriate to obtain the manifestations of the resource that they prefer and are entitled to get. Dissemination services are not limited to providing different versions (e.g., different media formats) of an object; perhaps more important is the role that this logical description can play in relating or federating available metadata services to facilitate resource discovery, commerce and rights management.
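To make the idea of discoverable dissemination services concrete, here is a minimal sketch of what a logical description might look like as a data structure. It is an illustration under stated assumptions rather than anything prescribed by Kahn/Wilensky: the class names, the two-way split into "content" and "metadata" services, the identifier and the endpoints are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Dissemination:
    """One service an object exposes: a rendering or a metadata view."""
    name: str        # hypothetical service name
    kind: str        # "content" or "metadata" (an assumed two-way split)
    media_type: str
    endpoint: str    # where a requesting agent would send its request


@dataclass
class LogicalObject:
    """Logical description of an object, independent of how it is stored."""
    identifier: str
    disseminations: List[Dissemination] = field(default_factory=list)

    def discover(self, kind: Optional[str] = None) -> List[Dissemination]:
        """Return the available disseminations, optionally filtered by kind."""
        return [d for d in self.disseminations if kind is None or d.kind == kind]


# Made-up identifier and endpoints, for illustration only.
article = LogicalObject(
    identifier="doi:10.9999/example.1",
    disseminations=[
        Dissemination("fulltext-html", "content", "text/html",
                      "https://repo.example/a1.html"),
        Dissemination("fulltext-pdf", "content", "application/pdf",
                      "https://repo.example/a1.pdf"),
        Dissemination("rights", "metadata", "application/xml",
                      "https://rights.example/a1"),
    ],
)

content_names = [d.name for d in article.discover(kind="content")]
metadata_names = [d.name for d in article.discover(kind="metadata")]
```

A requesting agent would call `discover` before asking for any particular dissemination; nothing in the description says how, or whether, the underlying bits are stored.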
In early 1998 the use of persistent, unique identifiers, especially the DOI [DOI], as bases for object-specific services was discussed:
"…Every digital object must be viewed as having the potential to offer a world of services to its community of users, whatever segment those users may come from. In addition to the obvious and intended end use (such as reading or viewing), every object may serve as a portal to a variety of available secondary services, including bibliographic information, production credits, content management services, rights management and license administration. Every object is also typically a container for a hierarchy of constituent objects, each of which may also expose its own selection of services to the user…" [DOIServ]
That paper contained a discussion of how a web of metadata services would facilitate the provision of services for the object, and suggested that a uniform metadata exchange format and exchange protocol would facilitate creating such a web. The paper did not, however, provide a specific model for how these services could be discovered and presented and, in particular, did not examine how to construct logical expressions of the information objects.
In terms of Kahn/Wilensky, each of the contributing metadata streams can be seen as an individual dissemination. An object's repository is not required to actually host that data; for example, we would expect most value-added metadata (e.g., reviews and commentary) to be supplied by remote Web services that the provider and/or consumer subscribes to or is otherwise aligned with. The question then becomes one of how to reliably describe this web of services and persistently relate it to the information object.
Throughout the remainder of this article, I will develop the idea of a generic "wrapper" that models what amounts to a virtual information object [Note 2], enabling a mediator or "handler" agent to draw upon a (potentially) rich set of web-based metadata and dissemination services to create the desired digital object services.
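The role of the handler can be sketched in a few lines. This is only an assumption about one possible shape: remote metadata services are modeled as plain functions, and the `Handler` class, the service names and the returned records are all hypothetical stand-ins.

```python
from typing import Callable, Dict

# A metadata service is modeled as a function from identifier to a record.
MetadataService = Callable[[str], dict]


def bibliographic_service(identifier: str) -> dict:
    """Stand-in for a remote bibliographic service (hypothetical data)."""
    return {"title": "An Example Article", "creator": "A. Author"}


def rights_service(identifier: str) -> dict:
    """Stand-in for a remote rights metadata service (hypothetical data)."""
    return {"rightsholder": "Example Press", "permits": ["view", "print"]}


class Handler:
    """Mediator that assembles a virtual object from federated services."""

    def __init__(self) -> None:
        self._services: Dict[str, MetadataService] = {}

    def register(self, name: str, service: MetadataService) -> None:
        self._services[name] = service

    def virtual_object(self, identifier: str) -> dict:
        """Query every registered service and wrap the results."""
        return {
            "identifier": identifier,
            "services": {
                name: service(identifier)
                for name, service in self._services.items()
            },
        }


handler = Handler()
handler.register("bibliographic", bibliographic_service)
handler.register("rights", rights_service)
wrapped = handler.virtual_object("doi:10.9999/example.1")
```

The repository hosts none of this metadata itself; the wrapper exists only as the handler's federation of whatever services happen to be registered.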
2.2 DRM and the "Goals for the Web"
Universal Access: To make the Web accessible to all by promoting technologies that take into account the vast differences in culture, education, ability, material resources, and physical limitations of users on all continents… [WebGoals]
When information objects are disseminated without due consideration of the specific context of their rendering or interpretation, they risk appearing opaque or otherwise inappropriate to the user. Simple examples of context include the capabilities of the rendering application or the preferred language of the information consumer. More advanced forms of context might include credentials reflecting particular organizational affiliations that the consumer might have, personal policies for how the user prefers information to be presented (including transcodings or other forms of content adaptation), and policies for how solicitations for personal information will be considered.
To maximize the opportunities for successful access by the broadest possible set of client agents, information objects (and the repository services that make them available) must provide as many clues as possible about their contents, from descriptive information to technical requirements or specifications and available options for dissemination. The latter might include alternative renderings or simply different metadata views on the object.
How is this relevant to DRM? In a limited way, technical protection mechanisms are analogous to digital object repositories; attempts by users or client services to exercise certain privileges are roughly equivalent to requests for particular disseminations of the objects held by those repositories. Meaningful disseminations of these objects are not possible unless the context matches what the repository service can deal with, which may include the particular technical protection mechanism, the content format, user and system credentials, etc.
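The matching of context to dissemination can be illustrated with a small sketch. It assumes a deliberately simple notion of context (accepted media types, one preferred language, and a set of credential strings); the class names and credential format are hypothetical, and a real service would weigh many more factors.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Rendering:
    media_type: str
    language: str
    required_credential: Optional[str] = None  # None means open to anyone


@dataclass(frozen=True)
class RequestContext:
    accepted_media_types: Tuple[str, ...]
    preferred_language: str
    credentials: FrozenSet[str] = frozenset()


def select_rendering(renderings: Iterable[Rendering],
                     context: RequestContext) -> Optional[Rendering]:
    """Return the first rendering whose requirements the context satisfies."""
    for r in renderings:
        if r.media_type not in context.accepted_media_types:
            continue
        if r.language != context.preferred_language:
            continue
        if (r.required_credential is not None
                and r.required_credential not in context.credentials):
            continue
        return r
    return None  # no meaningful dissemination is possible in this context


renderings = [
    Rendering("application/pdf", "en",
              required_credential="member:example-university"),
    Rendering("text/html", "en"),
    Rendering("text/html", "fr"),
]

member = RequestContext(("application/pdf", "text/html"), "en",
                        frozenset({"member:example-university"}))
visitor = RequestContext(("application/pdf", "text/html"), "en")
german_reader = RequestContext(("text/html",), "de")
```

Here the member receives the credentialed PDF, the visitor falls through to the open HTML rendering, and the German-language reader receives nothing, which is the "opaque" outcome described above.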
This repository analogy breaks down quickly in today's environment, because current DRM technologies are actually rather incomplete as digital object repositories. In particular, they do not provide general interfaces to information about the underlying information objects, nor do they have the ability to present alternative interfaces for accessing the object (or reference services, following the Kahn/Wilensky model). We see that a federation of metadata services could supply much of this capability; again, the question is how to persistently associate these services with the object.
It was established earlier that the significance of the proposed "wrapper" is that it provides a logical description and general interface to an object's services. Using today's techniques, this description of the information object might be serialized using XML syntax and would appear in the context of an appropriate messaging protocol, which itself would be layered on HTTP. Introducing a logical model of the information object on top of this "stack" increases the likelihood of broad and long-term access to the object; underlying layers, especially specific mechanisms for expressing and transporting information, are likely to mutate more rapidly than the information itself.
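As a rough illustration of such a serialization, the sketch below writes a logical description as XML and reads it back using Python's standard library. The element and attribute names (`object`, `service`, `id`, `name`, `href`) are my own assumptions for the example and do not come from any existing schema; the messaging and HTTP layers are omitted.

```python
import xml.etree.ElementTree as ET


def serialize_description(identifier, services):
    """Serialize an object's logical description as an XML string."""
    root = ET.Element("object", {"id": identifier})
    for name, endpoint in services.items():
        ET.SubElement(root, "service", {"name": name, "href": endpoint})
    return ET.tostring(root, encoding="unicode")


def parse_description(xml_text):
    """Recover the identifier and service map from the XML form."""
    root = ET.fromstring(xml_text)
    services = {s.get("name"): s.get("href") for s in root.findall("service")}
    return root.get("id"), services


# Made-up identifier and endpoints, for illustration only.
services = {
    "fulltext-pdf": "https://repo.example/a1.pdf",
    "rights": "https://rights.example/a1",
}
xml_text = serialize_description("doi:10.9999/example.1", services)
identifier, recovered = parse_description(xml_text)
```

Because the logical model survives the round trip unchanged, the syntax underneath it could be swapped for another serialization without touching the description itself.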
Semantic Web: To develop a software environment that permits each user to make the best use of the resources available on the web… [WebGoals]
Bitstreams are inherently opaque, so a federation of metadata services is essential to enabling the formulaic discovery and retrieval of information. Discovery and retrieval agents must access and interpret a rich set of metadata prior to their request of a particular dissemination, in order to receive the desired information appropriate to their individual requirements. In the previous paragraphs I considered this from a format and encoding perspective; in this section I extend this to enabling semantic discovery and evaluation of information objects.
Elsewhere [Erickson] the case was presented in which users would want any personal or organizational IPR policies that might apply to particular information objects to be weighed (alongside other query factors) by search agents acting on their behalf. In other words, an important aspect of how we declare our requirements for information resources in the future is likely to be what we have a right to access, in a particular way. One can see that this need for "rights awareness" in information retrieval is another manifestation of the appropriate copy problem [Caplan], also referred to as localization in reference linking [NISO], [Lannom].
This leads to two conclusions: First, expressions of IPR policies must be available in a format consistent with other metadata sources and expressive mechanisms that will be the basis for formulaic information discovery on the Web (and in other information retrieval contexts). Second, policy expression mechanisms are likely to serve as the basis for personal or organizational IPR policies, which may take the form of declaring the types of policy terms individuals or organizations will accept or reject.
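A search agent that weighs such accept/reject declarations might work roughly as follows. This is a sketch under assumptions: policy terms are reduced to plain strings, the term names are invented, and unknown terms are treated conservatively as unacceptable, which is only one of several defensible choices.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class PolicyPreferences:
    """Terms a person or organization will accept or reject (hypothetical)."""
    accepted_terms: FrozenSet[str]
    rejected_terms: FrozenSet[str] = frozenset()


def is_acceptable(object_terms: Iterable[str],
                  prefs: PolicyPreferences) -> bool:
    """True only if no term is rejected and every term is explicitly accepted."""
    terms = set(object_terms)
    if terms & prefs.rejected_terms:
        return False
    return terms <= prefs.accepted_terms


def filter_results(results: Iterable[Tuple[str, List[str]]],
                   prefs: PolicyPreferences) -> List[str]:
    """Keep the identifiers whose asserted policy terms are acceptable."""
    return [identifier for identifier, terms in results
            if is_acceptable(terms, prefs)]


library_prefs = PolicyPreferences(
    accepted_terms=frozenset({"view", "print", "site-license"}),
    rejected_terms=frozenset({"per-use-fee"}),
)

# Each result pairs an identifier with the policy terms asserted for it.
results = [
    ("obj:1", ["view", "print"]),
    ("obj:2", ["view", "per-use-fee"]),
    ("obj:3", ["view", "identity-tracking"]),
]
usable = filter_results(results, library_prefs)
```

Only `obj:1` survives: `obj:2` carries a rejected term, and `obj:3` carries a term the preferences say nothing about. Both sides must draw their terms from a shared vocabulary for the comparison to mean anything.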
Web of Trust: To guide the Web's development with careful consideration for the novel legal, commercial, and social issues raised by this technology… [WebGoals]
In this article I develop requirements for accessing information objects based upon a rich federation of metadata resources, which I have suggested could be drawn together by mediator services that virtualize digital information objects. A critical missing link in this is the element of trust, not simply in the integrity of the messages exchanged between various agents, but also in the appropriateness of services that these agents may attempt to render (and whether these agents are even capable of properly rendering the services that may be asked of them).
A fundamental concern in rights management is that policy enforcement mechanisms can only approximate the actual policies that should be enforced in any particular context. This is especially true in the case of copyright protection mechanisms. In the general case, the policy terms that should be applied at any given instant will be a balance between specific terms that apply to a particular use, the user or client's logical and geographic context, and terms from personal and organizational use policies that should be imposed. Each of these policy expressions must be supplied in a trusted way; furthermore, the validity of each expression depends in part upon the vocabulary (schema) used, and whether the receiving agent accepts that vocabulary as valid: whether the agent trusts the policy expression to mean what it assumes it to mean.
Those of us in the rights management field are only just beginning to develop frameworks for this sort of semantic trust, or trust in the meaning of metadata assertions. One proposed way to construct such a semantic web of trust is through the development of rights data dictionaries, which would combine with rights expression languages to enable the reliable declaration and exchange of policies between agents [MPEGCFR]. Data dictionaries will provide a basis for semantic interoperability, or an agreed-upon equivalence of terms, between the different expressive dialects of competing rights languages. This work was initiated by the <indecs> project in 1998 [indecs], and serves as a fundamental basis for the more recent Open Digital Rights Language work by Renato Iannella [ODRL]. In the next section I will discuss the role that rights data dictionaries play in interoperability.
In the general case, trust management comes down to finding mutually acceptable credentials, and binding those credentials to particular assertions [Blaze]. The problem lies in finding approaches to establishing trust with minimum overhead, in ways that thrive in a highly decentralized networked environment. This becomes even more critical as we consider models for information retrieval that depend upon a federation of services in order to deliver a result, as I have suggested. In the second article in this series I will present an example that uses the safe dealing approach [SafeDeal] for authentication and access control, an exciting decentralized model that directly addresses the unique cross-organizational access control issues of institutional users [Lynch].
2.3 DRM and the "Design Principles of the Web"
I have found the following design principles, fundamental to the Internet engineering community, to provide practical guidance for thinking about mechanisms for IPR policy expression, interpretation and enforcement. If the Goals for the Web suggested strategies for creating a rights-aware infrastructure, then these Design Principles provide a set of tactics.
Interoperability: Specifications for the Web's languages and protocols must be compatible with one another and allow (any) hardware and software used to access the Web to work together… [WebDesgn]
Interoperability in this sense does not suggest that all Web-based systems and services must be able to talk meaningfully to each other. It does mean, however, that Web-based systems must be able to peacefully coexist, and should leverage each other's capabilities to work together whenever possible.
In the rights management arena there will be opportunities for services to interoperate at several logical levels, analogous to the various steps in the information value chain. As discussed earlier, rights languages will provide the basis for expressing rights information and policies, and will be useful for a variety of rights messaging applications including IPR information discovery, simple policy expression, rights negotiation and trading, and the expression of rights agreements and electronic contracts (eContracts).
Minimally, a rights language is expected to provide vocabulary and syntax for the declarative expression of rights and rights restrictions. In order to satisfy the design principles of Interoperability and Evolvability (next section), we would also expect a rights language to provide an open-ended way to express new rights (e.g., new operations on content, new contextual constraints), and to allow extensions from external vocabularies.
The concept of interoperability through a shared ontology, as embodied (for example) by a rights data dictionary, is similar to what the ebXML working group has been trying to achieve with their core components approach to building interoperable business objects [ebXML]. Following that model, existing or future IPR expression languages would be able to interoperate through translation via this shared semantic layer, rather than necessarily forcing applications and services to use a single common language.
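Translation through a shared semantic layer can be sketched very simply. The two rights languages, their terms and the shared concepts below are all invented for illustration; a real rights data dictionary would be far richer than a lookup table.

```python
from typing import Optional

# Hypothetical rights data dictionary: each language maps its own terms
# onto concepts in a shared semantic layer.
RIGHTS_DICTIONARY = {
    "lang-a": {"play": "render", "copy": "duplicate"},
    "lang-b": {
        "display": "render",
        "reproduce": "duplicate",
        "lend": "transfer-temporarily",
    },
}


def translate(term: str, source: str, target: str) -> Optional[str]:
    """Translate a term between languages by way of the shared concept."""
    concept = RIGHTS_DICTIONARY[source].get(term)
    if concept is None:
        return None  # the source language does not define this term
    for target_term, target_concept in RIGHTS_DICTIONARY[target].items():
        if target_concept == concept:
            return target_term
    return None  # the target language has no term for this concept


a_to_b = translate("play", "lang-a", "lang-b")
b_to_a = translate("reproduce", "lang-b", "lang-a")
untranslatable = translate("lend", "lang-b", "lang-a")
```

Neither language needs to know about the other; each needs only a mapping into the dictionary. The `None` result for "lend" shows the limit of the approach: a concept that one language cannot express has no translation.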
Roscheisen [FIRM] presented an appealing relationship-based approach to DRM interoperability. In his model, all communications happen within the context of established relationships, which are modeled as electronic contracts reified as distributed objects. Interoperability within this framework is achieved through the abstract separation of contract and promise objects from specific implementations of rights and obligations objects, which may include bindings to enforcement and payment components. Within FIRM, multiple proprietary rights management components and systems interoperate by referencing common higher-level promise and contract expressions, making semantic and even functional interoperability possible.
Challenges in implementing FIRM include the cost of discovering existing relationships; "home-running" authorization decisions to the contract manager service; and implementing proprietary DRM components that are semantically consistent. FIRM, as presented, is an all-in-one model that does not immediately suggest obvious points of interoperability with other models, but it does contribute in important ways; in the second article in this series, I will discuss specific ways that FIRM can contribute elements to an interoperable architectural model.
Evolution: The Web must be able to accommodate future technologies. Design principles such as simplicity, modularity, and extensibility will increase the chances that the Web will work with emerging technologies such as mobile Web devices and digital television, as well as others to come… [WebDesgn]
Today's information is disseminated in a brittle environment full of implicit contextual assumptions. If we were to radically perturb the context, as we will surely do when we attempt to render today's disseminations some years from now, our agents will likely fail because they will have insufficient clues to provide their services in ways appropriate to the new context. Deployed information objects must provide sufficient clues in their inherent structure to endure migration away from today's contextual assumptions.
We can readily approximate today what many of the technical aspects of our future knowledge-based infrastructure will look like. For example, we can envision a community of interoperating mediator services that leverage information wrappers projecting knowledge interfaces and thus facilitating dissemination.
An invariant in network evolution is that there will be, must be, an annealing of the current infrastructure in order to admit what happens next. We will first adopt virtualized methods for doing this, which in this case will enable our current data-and-communications fixated infrastructure to act as if it is knowledge-based. There may (and probably will) be multiple ways to do this; as is always the case, we should expect this to be accompanied by techno-bureaucratic battles. But eventually we'll see the emergence of applications, then systems that will be native to knowledge inter-working.
Revolution in our networked infrastructure has never happened overnight. Rather, innovations are disseminated through a combination of incremental changes, built upon previous work, combined with the occasional annealing.
Decentralization: Decentralization is without a doubt the newest principle and most difficult to apply. To allow the Web to "scale" to worldwide proportions while resisting errors and breakdowns, the architecture (like the Internet) must limit or eliminate dependencies on central registries… [WebDesgn]
Mechanisms for IPR information and policy expression should give rightsholders maximum choice and flexibility in disseminating their IPR information, and should likewise give information consumers maximum choice in discovering and interpreting that information. This means that IPR languages and protocols, through their fundamental modularity and extensibility, should enable a variety of applications accommodating many modes of use, if it is the choice of the parties to take advantage of these modes.
For example, we would expect some rightsholders or agents to choose disseminating IPR information through Web services [WebServ], while others might elect to bind IPR policies to information packages, perhaps with references to the network-distributed version. Binding their information to packages ensures the accessibility of the information under all circumstances, while disseminating through web services ensures that the most current information is always available to the widest variety of users.
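The two modes can be combined, as the following sketch suggests: prefer the network-distributed policy when it can be reached, and fall back to the copy bound to the package otherwise. The package layout, the policy strings and the `fetch` callable are hypothetical; `fetch` is passed in by the caller so that the sketch makes no assumption about the transport.

```python
from typing import Callable, Tuple


def resolve_policy(package: dict,
                   fetch: Callable[[str], str]) -> Tuple[str, str]:
    """Return (policy, source), preferring the network-distributed version."""
    url = package.get("policy_url")
    if url is not None:
        try:
            return fetch(url), "network"
        except OSError:
            pass  # offline or service unavailable; fall back to the bound copy
    return package["bound_policy"], "package"


# Made-up package contents, for illustration only.
package = {
    "policy_url": "https://rights.example/policies/a1",
    "bound_policy": "permit: view, print",
}


def online_fetch(url: str) -> str:
    """Stand-in for a reachable policy service with a newer policy."""
    return "permit: view, print, excerpt"


def offline_fetch(url: str) -> str:
    """Stand-in for an unreachable policy service."""
    raise OSError("network unreachable")


current = resolve_policy(package, online_fetch)
fallback = resolve_policy(package, offline_fetch)
```

The choice between modes stays with the parties: a package with no `policy_url` is served entirely from its bound policy, and the source label lets a consuming agent judge how current the result is.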
A logical extension of this principle is that the digital, networked environment isn't just about the Web. Information will be used and deployed in a variety of modes, not always networked and not always digital.
Note that the principle of decentralization is in fundamental opposition to traditional and even role-based authentication and access control mechanisms, which are inherently based upon centralized authority control. Clifford Lynch [Lynch] provides a comprehensive summary of the issues faced by libraries and other cross-organizational users, who were among the communities to identify the problems with such centralized management. Lynch's analysis suggests that trust models based upon one or more additional levels of indirection might contribute to a more decentralized form of trust management and thus would be more practical in the future.
In the next article in this series I will examine ways that Gladney's safe dealing approach, inspired in part by Lynch's requirements, can be combined with the mediation approach for describing information objects to achieve a comprehensive architecture that fits with principles for interoperability.
3. Summary: Principles for Interoperability in DRM Systems
Building upon the previous analysis and originally inspired by the W3C's work in the privacy domain [P3PFAQ], we can now present the following set of guiding principles for interoperability in the expression, exchange and interpretation of IPR information and policies. Note that these are revised from the version offered in Erickson et al. [Erickson] as a possible basis for a W3C Recommendation directed at policy expression and enforcement for Web-based information:
Increase Trust: Any standard for rights expression and enforcement should increase trust and confidence in the digital, networked environment as a medium for the expression, exchange and commerce for goods in which IP rights are asserted.
Standard Formats: Information providers should be able to express their IPR information and policies in standard formats that may be retrieved automatically and interpreted easily by user agents and client services.
Machine/Human Readable: Information-consuming agents should be able to retrieve rightsholders' IPR policies in both machine- and human-readable formats. IPR policies should be expressed in a form that enables automated interpretation, whenever possible and appropriate.
User Notification: Users and client services should be able to review and interpret the IPR information and policies asserted by rightsholders prior to potentially infringing use.
Interoperability and Openness: A standard should not adopt a single technical enforcement mechanism for ensuring that users and services act according to rightsholders' IPR policies. Rather, products and services should implement an open specification that establishes common points of interoperability. A successful specification will provide enough room to encourage competition and, therefore, innovation and differentiation.
Complementary to Laws: A standard for rights expression and enforcement must be complementary to laws and self-regulatory programs that also may provide appropriate IPR enforcement mechanisms within arbitrary contexts of use.
Transport Independent: A standard for rights expression and enforcement may include a model protocol for transferring IPR information and policy expressions, but such a model should be abstract and should not assume a particular form of transport. In particular, a standard should not introduce its own mechanism for securing IPR information during transport or storage. Rather, such a standard should extend tools and protocols that provide data transport, which themselves should be extended by appropriate security safeguards.
Privacy: A standard for rights expression and enforcement should extend previous work in privacy. For example, such a standard should be compatible with mechanisms for declaring the degrees of identity tracking and degrees of anonymity and disclosure of identity. Such a standard should also be compatible with mechanisms for communicating the degree to which privacy policies are complied with.
4. Looking Ahead: Expressing Principles as Architecture
In a future D-Lib Magazine article I will provide a more detailed treatment of the architectural implications of these interoperability principles. This section highlights some of the points that I plan to address in that discussion:
In this article I have attempted to identify certain architectural principles for the deployment of networked information; I hope these principles will contribute to an environment in which digital objects can be readily discovered, retrieved and consumed in ways that encourage the free flow of information while being consistent with individual and organizational IPR policies and preferences. I have attempted to reconcile the disparate forces driving the dissemination of information today, and have suggested a way to improve current information-opaque approaches to digital rights management.
The principles developed here are intended to guide the development of reliable and efficient mechanisms for expressing and exchanging rights information and policies, which I see as a necessary first step for the Internet community. I believe they also pave the way toward establishing open interfaces for mechanisms of policy compliance and enforcement, which may be considered later as extensions.
[Note 1] In order to keep the language of this article fluid, I will usually refer to the Web when I actually mean the digital, networked environment.
[Note 2] Purists may question the introduction of the term virtual here, since I am simply defining a new information object that, through indirection and the help of mediation services, is able to provide the desired capabilities. I wouldn't disagree; I have chosen this wording simply to distinguish between the services that an object provides and the more limited services that a "legacy" object makes available.
[Note 3] I question the declaration of predicates within the object description such as in [MPEG21], since I see this as more appropriate to the configuration of client or repository services.
[Blaze] Matt Blaze, Joan Feigenbaum and Angelos Keromytis, "KeyNote: Trust Management for Public-Key Infrastructures," Cambridge 1998 Security Protocols International Workshop, England (1998).
[Caplan] Priscilla Caplan and Dale Flecker, "Choosing The Appropriate Copy: Report of a discussion of options for selecting among multiple copies of an electronic journal article," Digital Library Federation Architecture Committee (September 1999).
[DOI] International DOI Foundation, "The DOI Handbook, Version 1.0.0," (February 2001).
[ebXML] ebXML Technical Architecture Specification, v0.9 (October 2000).
[Erickson] John S. Erickson, et al., "Principles for Standardization and Interoperability in Web-based Digital Rights Management," A Position Paper for the W3C Workshop on Digital Rights Management (January 2001).
[FIRM] R. Martin Roscheisen, "A Network-Centric Design For Relationship-Based Rights Management," Ph.D. Dissertation (Stanford University, 1997).
[indecs] Godfrey Rust and Mark Bide, "The
[Handle] Handle System. <http://www.handle.net/index.html>.
[Kahn] Robert Kahn and Robert Wilensky, "A Framework for Distributed Digital Object Services," (1995).
[Lannom] Larry Lannom, "Handle System Overview," Presented at the first DOI-EB working group meeting, Reston, VA (March 2001).
[Lynch] Clifford Lynch, "A White Paper on Authentication and Access Management Issues in Cross-organizational Use of Networked Information Resources," Coalition for Networked Information (1998).
[MPEG21] Vaughn Iverson et al., "MPEG-21 Digital Item Declaration WD (v1.0)," ISO/IEC JTC 1/SC 29/WG 11/N3825 (January 2001, Pisa, IT).
[MPEGCFR] MPEG, "Call for Requirements for Rights Data Dictionary and Rights Expression Language" (2001).
[NISO] Meeting Report, NISO/DLF/CrossRef Workshop on Localization in Reference Linking, Reston, VA (July 2000).
[ODRL] Renato Iannella, "Open Digital Rights Language Specification v0.8."
[P3P] W3C, "Platform for Privacy Preferences (P3P) Project."
[P3PFAQ] W3C, "P3P and Privacy on the Web FAQ," (2001).
[RDF] W3C, Resource Description Framework (RDF) Model and Syntax Specification (1999).
[SafeDeal] Henry M. Gladney and Arthur Cantu, Jr., "Safe Deals with Strangers: Authorization Management for Digital Libraries," to appear in Comm. ACM (April 2001).
[WebDesgn] W3C, "Design Principles of the Web" (2001).
[WebGoals] W3C, "W3C's Goals" (2001).
[WebServ] "…A Web service is simply an application delivered as a service that can be integrated with other Web Services using Internet standards. In other words, it is a URL-addressable resource that programmatically returns information to clients who want to use it. One important feature of Web Services is that clients don't need to know how a service is implemented…"
(17 April 2001, a correction was made to the last sentence in the final paragraph of section 2.3. The sentence now reads: "Lynch's analysis suggests that trust models based upon one or more additional levels of indirection might contribute to a more decentralized form of trust management and thus would be more practical in the future.")
Copyright 2001 John S. Erickson