D-Lib Magazine
The Magazine of Digital Library Research

I N   B R I E F

September/October 2010


S-Match: an open source framework for matching lightweight ontologies

Contributed by:
Aliaksandr Autayeu
Postdoctoral Research Fellow
University of Trento
Trento, Italy

Interoperability among different representations of the same knowledge is a difficult problem because of the use of different terminology and a great variety of ways knowledge can be expressed. This problem has become more important in the Web era with the consequential information explosion the Internet has brought to our lives. Lack of interoperability and information explosion make it much more difficult to retrieve, disambiguate and integrate information coming from a wide variety of sources. The common characteristic among all these sources of information is that they can be represented using lightweight ontologies, which provide the formal representation upon which it is possible to reason automatically about hierarchical structures such as classifications, database schemas, library catalogs, and file system directories.

Semantic matching is a fundamental technique applicable in areas such as resource discovery, data integration, query translation, schema and ontology merging, peer to peer networks, data migration, and agent communication. Semantic matching is a type of ontology matching technique that uses semantic information encoded in lightweight ontologies. It operates on graph-like structures and finds the nodes that are semantically related. It has been proposed as a valid solution to the semantic heterogeneity problem, namely managing the diversity in knowledge.

S-Match is an open source semantic matching framework that provides several semantic matching algorithms and facilities for the development of new ones. It includes components for transforming tree-like structures into lightweight ontologies, where each node label in the tree is translated into a propositional description logic (DL) formula that unambiguously codifies the meaning of the node. S-Match implements the basic semantic matching, minimal semantic matching, and structure preserving semantic matching (SPSM) algorithms. The basic semantic matching algorithm is a general purpose matching algorithm, very customizable and suitable for many applications. The minimal semantic matching algorithm exploits additional knowledge encoded in the structure of the input and is capable of producing both the minimal mapping and the maximal mapping at almost no extra cost. SPSM is a type of semantic matching that produces a similarity score and a mapping that preserves structural properties: (i) one-to-one correspondences between semantically related nodes; (ii) functions matched to functions and variables to variables.
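
As a rough illustration of the idea in the paragraph above, the sketch below treats the meaning of each node as the conjunction of the words on its path from the root, then compares two small trees. It does not use the S-Match API: the `Node` class, the `concepts`, `relation` and `match` functions and the sample labels are all hypothetical, and plain set inclusion stands in for the logical reasoning and linguistic knowledge that a real matcher would rely on.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    """A labelled tree node (hypothetical structure, not an S-Match class)."""
    label: str
    children: list = field(default_factory=list)


def concepts(root):
    """Map each node path to the set of words on the path from the root.

    The set stands for a conjunction of atomic concepts, a toy substitute
    for the propositional formula attached to each node.
    """
    result = {}

    def walk(node, path, atoms):
        path = path + (node.label,)
        atoms = atoms | {word.lower() for word in node.label.split()}
        result["/".join(path)] = atoms
        for child in node.children:
            walk(child, path, atoms)

    walk(root, (), frozenset())
    return result


def relation(a, b):
    """Relate two conjunctions: more conjuncts means a narrower meaning."""
    if a == b:
        return "="   # equivalent
    if a > b:
        return "<"   # a is less general than b
    if a < b:
        return ">"   # a is more general than b
    return None      # nothing can be concluded from the labels alone


def match(source, target):
    """Return (source path, relation, target path) for every related pair."""
    src, tgt = concepts(source), concepts(target)
    mapping = []
    for (s_path, s_atoms), (t_path, t_atoms) in product(src.items(), tgt.items()):
        rel = relation(s_atoms, t_atoms)
        if rel is not None:
            mapping.append((s_path, rel, t_path))
    return mapping


source = Node("Images", [Node("Europe", [Node("Italy")])])
target = Node("Europe", [Node("Images")])
for row in match(source, target):
    print(row)
```

In this toy run, `Images/Europe` in the first tree and `Europe/Images` in the second come out as equivalent, while `Images/Europe/Italy` is reported as less general than both nodes of the second tree.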

The key contributions of the S-Match framework are:

  1. a working open source implementation of semantic matching algorithms;
  2. several interfaces, ranging from an easy-to-use Graphical User Interface (GUI) and Command Line Interface (CLI) to an Application Programming Interface (API), suitable for purposes varying from running a quick and easy experiment to embedding S-Match in other projects;
  3. the implementation of three different versions of semantic matching algorithms, each suitable for different purposes, and the flexibility for integrating new algorithms and linguistic oracles tailored for particular domains;
  4. an open architecture extensible to work with different data formats with a ready-to-run implementation of the basic formats.

On the semantic matching project home page you can find the project documentation and downloads, subscribe to our S-Match Announce mailing list, leave us feedback or request a feature in the S-Match Trac system, and stay up to date with the S-Match news feed.


Launch of the University of Central Lancashire Research Repository: CLoK

Contributed by:
Helen Cooper
Repository Manager
LIS, University of Central Lancashire
Preston, United Kingdom

In September 2010 the University of Central Lancashire Learning and Information Services will be launching the Research Repository for UCLAN with CLoK – a repository service designed to complement scholarly workflows, policies and practices. This is part of an ambitious project to develop a culture of digital preservation and self-archiving at UCLAN, with institutional commitment to Open Access at the heart of the University Community.

The objectives of the initial JISC start-up funding were to create:

  • an operational and sustainable institutional repository for UCLAN hosting a variety of digital assets to consistent and interoperable standards
  • a robust technological and governance infrastructure according to national and international information-environment standards
  • a culture that accepts the principles of an Open Access approach to the management of research and learning

The main thrust of the start-up phase of the project has been to support the UCLAN Research Strategy in providing a tool that will "boost the quality, scope and capacity of our research activities; ... (supporting) ... genuine evidence-based claims to be world-class (and) focusing especially on the enhancement of each member of staff's abilities and range of activities."

With the first two targets already achieved, by March 2011 we will have the research publications produced since the 2008 Research Assessment Exercise from all schools and units across the University recorded on the repository, with a large proportion of full text included. This includes an opt-out mandate for postgraduate theses in place from September 2010, moving to an e-only submission requirement in 2011.

The third part of the project, to engender a culture that adopts Open Access archiving of all digital intellectual products, is already being developed. Digital preservation of the archive has moved up the teaching and learning agenda at UCLAN, and later this year we will be developing a learning resource repository to complement the research archive.

Details of the CLoK project can be found at http://www.uclan.ac.uk/clok/ and the repository is at http://clok.uclan.ac.uk.


Logins for Life Project

Contributed by:
Leo Lyons
Analyst, Logins for Life Project Team
University of Kent
Kent, United Kingdom

The University of Kent is running a JISC-funded project, Logins for Life, to explore the provision of lifelong digital identities and online services to staff and students. Inherent in such provision is a method for dealing with the changing roles – and possibly multiple roles – of the ID holder during their relationship with the University.

Creating a relationship with prospective students – maybe whilst still at school – is beneficial for both parties. Maintaining a relationship beyond graduation or retirement also has benefits.

Making it easier during initial enquiries to find and bookmark information is an obvious way in which the relationship with prospects can be enhanced. Similarly, providing targeted suggestions, links and opportunities for discussion with other enquirers, with current students and alumni through forums, may be attractive and improve recruitment rates. Hard facts about courses and facilities are not the only criteria, and a chance to learn something about the culture of a place might come from joining an online community.

Providing lifelong email – some HEIs (Higher Education Institutions) are already doing this – helps maintain contact with graduates and may influence their choice of post-graduate studies. It enhances links with alumni, allows them to keep their contact details up to date, and ensures their addresses on published papers do not become obsolete as soon as they leave the university. However, there is obviously a cost involved.

Some suggestions are more controversial – and could have a higher cost. Is there a case for providing alumni, some of whom may have gone on to post-graduate studies elsewhere, with continuing access to online resources? What are the licensing implications for such a move and who pays?

If HEIs are to attract people to engage at an earlier age, we need to explore how they might want to do that. The project will examine how the integration of social networking applications (Facebook, Twitter, etc.) and protocols like OpenID might be appropriate at different stages of a user's relationship with HEIs. Are registration forms and usernames and passwords essential in order to get a prospectus, or could we accept a Facebook account or OpenID credentials?

The Logins for Life project is a collaboration between teams from the Information Services department (led by John Sotillo) and the School of Computing (led by Professor David Chadwick). David's team are building a linking application, using Open Source protocols, that allows users to 'marry' existing digital identities to a University account, enabling access, with a predetermined level of assurance, to online resources via Facebook, OpenID, etc.
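
The linking idea described above can be pictured with a small sketch. This is not the project's application: the `LinkRegistry` class, the numeric assurance levels, the account name and the identifier URL are all hypothetical, chosen only to show how an external identity might be tied to a University account and checked against a required level of assurance.

```python
class LinkRegistry:
    """Maps external identities to university accounts (illustrative only)."""

    def __init__(self):
        # (provider, external id) -> (university username, assurance level)
        self._links = {}

    def link(self, username, provider, external_id, assurance):
        """Record that an external identity belongs to a university account."""
        self._links[(provider, external_id)] = (username, assurance)

    def authorise(self, provider, external_id, required):
        """Return the linked username if the link is trusted enough, else None."""
        entry = self._links.get((provider, external_id))
        if entry is None:
            return None
        username, assurance = entry
        return username if assurance >= required else None


registry = LinkRegistry()
registry.link("abc123", "openid", "https://id.example.org/alice", assurance=1)
registry.link("abc123", "university", "abc123", assurance=3)

# A prospectus request might need only level 1; licensed resources might need more.
print(registry.authorise("openid", "https://id.example.org/alice", required=1))
print(registry.authorise("openid", "https://id.example.org/alice", required=2))
```

Keeping the assurance level on the link rather than on the account lets the same person reach low-risk services through a social login while still needing the University credential for anything more sensitive.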

Information gathering for the project is drawing on interviews with stakeholders both externally and internally. Library staff in particular and all staff managing and administering systems are informing the project, and the team are also looking at best practice in other HEIs and beyond. The views of students already at Kent and the new intake in the autumn will be sampled via an online survey.

Contact from interested parties is welcome via the Logins for Life blog. Further information is available on the project web pages.


The National e-Infrastructure for Social Simulation (NeISS) Project

Contributed by:
Mark Birkin
Professor of Spatial Analysis and Policy
University of Leeds
Leeds, United Kingdom

Social simulation is an increasingly important and valuable technique in both research and policy-making. In particular, methods involving the use of individual 'agents' have been used to study problems such as the formation of economic markets, transmission of diseases, social and political negotiations, criminal behaviour, and housing preferences. In practice, however, realising the full potential of social simulation is often hampered by one or more of the following factors: lack of appropriate data; access to suitable models; availability of computation; and the ability to understand, visualise and share results. Crucially, individual users will rarely have the full range of skills or access to supporting technologies for the whole social simulation process. The National e-Infrastructure for Social Simulation (NeISS) is a project that aims to address these problems through development and application of the latest computational technologies.

The project is supported by JISC for three years as part of its Information Environments programme and is led by Professor Mark Birkin of the School of Geography, University of Leeds. In addition to geographers, the participants include sociologists, computer scientists, and e-health specialists from the Universities of Glasgow, Stirling, Southampton, Manchester, University College London and Daresbury Laboratory.

Development work so far has concentrated on linking together the processes of data extraction, simulation, visualisation and analysis as e-research 'portlets'; on publishing the resulting simulation experiments and workflows as objects that may be edited, transferred and shared; and on combining these portlets into web-enabled portals that can support social simulation research, policy analysis and decision-making. Applications have focused on the deployment of social simulation to support health care delivery, housing, land-use and transport policy. Whilst a process of requirements gathering and evaluation with prospective users has already begun, the NeISS team is eager to hear from individuals or groups who desire to engage more closely with this exciting research agenda. For more information, see the NeISS website (http://www.neiss.org.uk/) or contact the project leader (m.h.birkin@leeds.ac.uk).


The Xpert Project: A Distributed Repository for OER

Contributed by:
Patrick Lockley (Learning Support Development Officer)
and Steve Stapleton (Open Learning Support Officer)
The University of Nottingham

XPERT (Xerte Public E-Learning ReposiTory) was a JISC-funded project to explore the potential of delivering and supporting a distributed repository of e-learning resources, created and seamlessly published via the commonly used web technology of RSS Feeds. RSS feeds are common across newspaper and magazine style websites, allowing a subscriber to receive updates when new content is added to the site.

The addition of RSS feed generation to the Xerte suite of tools allows several partner institutions to jointly produce content for publication in the open-access Xpert repository. Xerte Online Toolkits is a suite of open-source browser-based tools that allow content developers to develop rich, interactive and highly accessible learning content quickly and easily, and to seamlessly publish that content online. Collaboration with other content developers is supported, and it is easy to share and re-use content with other users. The system is extensible by developers and provides a proven, flexible and powerful solution, and it is in use all over the world.

The Xerte tools have been designed to make the addition of metadata to the learning resources easy and to expose them for harvesting by open-access repositories. Using RSS feeds that carry the richer Dublin Core metadata allows more metadata to be searched and learning objects to be categorised easily in the repository. Crucially, resources remain with the institution that developed them, removing many of the barriers to repository use that other initiatives have encountered when requiring that the content itself be submitted (usually via uploading a file).
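
To make the harvesting model above concrete, here is a minimal sketch of reading Dublin Core elements out of an RSS 2.0 feed with Python's standard library. The feed, its items and the `example.org` URLs are invented for illustration, and the `harvest` and `by_subject` functions are hypothetical; this is not the code behind the Xpert repository. Note that each record stores only metadata and a link, so the resource itself stays with the institution that published it.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, in ElementTree's {uri}tag form.
DC = "{http://purl.org/dc/elements/1.1/}"

# A made-up feed; the items and URLs are placeholders, not real repository data.
SAMPLE_FEED = """
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example institution: learning objects</title>
    <item>
      <title>Introduction to Statistics</title>
      <link>http://example.org/objects/stats</link>
      <dc:creator>A. Lecturer</dc:creator>
      <dc:subject>Mathematics</dc:subject>
      <dc:subject>Statistics</dc:subject>
    </item>
    <item>
      <title>Reading Medieval Maps</title>
      <link>http://example.org/objects/maps</link>
      <dc:creator>B. Tutor</dc:creator>
      <dc:subject>History</dc:subject>
    </item>
  </channel>
</rss>
"""


def harvest(feed_xml):
    """Turn feed items into metadata records that point back to the source."""
    root = ET.fromstring(feed_xml.strip())
    records = []
    for item in root.iter("item"):
        records.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),  # the resource itself stays remote
            "creator": item.findtext(DC + "creator"),
            "subjects": [s.text for s in item.findall(DC + "subject")],
        })
    return records


def by_subject(records, subject):
    """Return the records tagged with the given subject, ignoring case."""
    wanted = subject.lower()
    return [r for r in records if wanted in (s.lower() for s in r["subjects"])]


records = harvest(SAMPLE_FEED)
for record in by_subject(records, "statistics"):
    print(record["title"], record["link"])
```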

Building on the work done with a number of partner institutions, creators of learning resources can contribute to XPERT via RSS feeds, either created seamlessly (and automatically) through local installations of Xerte Online Toolkits, or through alternative RSS (or OAI) feed generators. Learners and educators can use XPERT to search a growing database of open learning resources suitable for students at all levels of study in a wide range of different subjects. XPERT also includes RSS feeds from many Open Educational Resources (OER) sites worldwide and therefore provides a rapidly growing, rich collection of high-quality e-learning resources from multiple institutions. To date (Summer 2010) the Xpert repository contains over 66,000 items.

Useful links

The Xpert repository: http://www.nottingham.ac.uk/xpert

The Xerte project pages: http://www.nottingham.ac.uk/xerte

Presentation on Xerte and Xpert at the Open Learning Conference November 2009: http://www.youtube.com/watch?v=LewfluR6leg&feature=PlayList&p=1ADDE924D7A740F2&index=5


I N   T H E   N E W S

September/October 2010


Introducing 'DuraCloud Open Source': Preservation in the Cloud

September 9, 2010 announcements from Carol Minton Morris, DuraSpace:

1) "The DuraSpace organization is pleased to announce the release of DuraCloud version 0.6. This latest version of the DuraCloud software, a platform which utilizes commercial cloud infrastructure, is now available for download from subversion: https://svn.duraspace.org/duracloud/tags/duracloud-0.6.0/. The DuraCloud software is deployed as a service hosted by DuraSpace using a cloud server environment and is integrated with multiple cloud storage providers, including Amazon AWS and Rackspace. The source code for the DuraCloud platform is made available to encourage community involvement in the development of the software, as well as to allow the creation of services which can be integrated with the DuraCloud system. Detailed information about the latest release may be found at the DuraCloud wiki: https://wiki.duraspace.org/display/duracloud06/DuraCloud+Release+0.6."

"A full list of DuraCloud Open Source features may be found here: https://wiki.duraspace.org/display/duracloud/DuraCloud+Features."

2) "The Fedora Repository team announced the release of Fedora 3.4 led by Steve Bayliss. This version includes a number of exciting new features and bug fixes that make Fedora an even more compelling repository platform. Read the post: http://expertvoices.nsdl.org/duraspace/2010/08/23/now-available-fedora-repository-34/. Related link: http://wiki.duraspace.org/x/AgAU."


Notice of cessation of PADI gateway and padiforum-l

September 2, 2010 announcement from Maxine Davis, National Library of Australia — "The PADI subject gateway http://www.nla.gov.au/padi/ and associated list padiforum-l http://listserver.nla.gov.au/wws/info/padiforum-l were initiated as a service to the digital preservation community and have been maintained by the National Library of Australia since 1997 in order to collocate selected information on digital preservation. For background information see 'About PADI', http://www.nla.gov.au/padi/about.html."

"As is to be expected with any portal to Web based documents maintenance of web links becomes progressively more demanding over time. Websites are redesigned, migrated to new platforms, URLs are changed, projects and their websites cease, so called persistent identifiers are not, and even when web documents or pages are archived in a web archive, questions arise as to which version of an archived page to link to (which date or even which archive as copies may be held in multiple web archives with different levels of completeness). The current structure of PADI requires the Library to commit around 0.5 of a fulltime staff member to locate, describe and enter links to new information sources and to maintain links to existing resources. Although originally conceived as a cooperative contribution model, increasingly the burden of adding material to PADI has fallen to the NLA as input from elsewhere has almost ceased."

"The information-seeking and information-providing mechanisms of a community also change over time. After reviewing the gateway service the Library has concluded that the existing website, database and list no longer meet the current needs and that the Library's resources are best invested elsewhere. While there may be more efficient ways of building a service like PADI today, using Web 2.0 tools, the Library is unable to make the investment in converting the existing service."

"Reluctantly – because we still find PADI useful ourselves – we believe we cannot sustain PADI and have decided to cease maintaining it."

"A copy of the website has been archived in PANDORA, Australia's Web Archive. The existing live website will remain available until the end of 2010; however no new resources have been added since the start of July 2010 and the existing links will not be actively managed. The archives of the padiforum-l list will continue to be available, however no new postings will be accepted from 30 September 2010."

For more information, please contact PADI (Preserving Access to Digital Information) at padi@nla.gov.au.


Eyes on the Prize Interviews: The Complete Series Collection

September 2, 2010 announcement from Cassandra Stokes, Washington University in St. Louis — "This summer, the Washington University Libraries are making available a vast collection of complete, unedited transcripts of nearly 300 interviews recorded for the award-winning PBS documentary series Eyes on the Prize."

"The transcripts, many of them never before seen, document the memories and first-hand accounts of hundreds of men and women who served on the front lines of the American Civil Rights movement. Taken together, they comprise a rich collection of valuable primary source material that is sure to be of interest to historians, teachers, filmmakers, and scholars of the Civil Rights era."

"The Eyes on the Prize Interviews: The Complete Series Collection, a digital resource on the Libraries' website (http://digital.wustl.edu/eyesontheprize/), features the fully searchable text of every interview conducted for the 14-episode television series..."

"...The Eyes on the Prize Interviews Collection [also] contains a wealth of material that never made it into the final television production. Some interviews were used only in part, while others were consigned entirely to the cutting-room floor. Because they have never been widely available until now, their potential value as historical and documentary source material is inestimable...."

"...For more information, visit the Eyes on the Prize Interviews Collection at http://digital.wustl.edu/eyesontheprize/. Or visit the Washington University Film & Media Archive at http://library.wustl.edu/units/spec/filmandmedia/index.html."

For the full press release, please contact Cassandra Stokes at cstokes@WUSTL.EDU.


IMLS Calls for 2011 Museums for America Grant Applications

Deadline: November 1, 2010

August 31, 2010 — "The Institute of Museum and Library Services (IMLS) is accepting applications to its largest museum grant program, Museums for America (MFA), for fiscal year 2011. Museums for America grants provide up to $150,000 in funding and support projects that strengthen a museum's capacity to serve its community."

"Museums for America grants are awarded in the following areas:

  1. Engaging communities (education, exhibition, and interpretation)
  2. Building institutional capacity (management, policy, and training)
  3. Collections stewardship (management of collections)"

"Through these broad categories, IMLS supports a full range of museum activities including digitization of collections, staff training, research, exhibitions, educational programs, Web site enhancement and development, collections management, and other similar activities."

For more information, please see the full press release.


NISO Publishes Cost of Resource Exchange (CORE) Protocol as a NISO Recommended Practice

August 31, 2010 — "NISO is pleased to announce the publication of its latest Recommended Practice, CORE: Cost of Resource Exchange Protocol (NISO RP-10-2010). This Recommended Practice defines an XML schema to facilitate the exchange of financial information related to the acquisition of library resources between systems, such as an ILS and an ERMS."

"CORE identifies a compact yet useful structure for query and delivery of relevant acquisitions data."

"CORE was originally intended for publication as a NISO standard. However, following a draft period of trial use that ended March 2010, the CORE Working Group and NISO's Business Information Topic Committee voted to approve the document as a Recommended Practice. This decision was in part based on the lack of uptake during the trial period as a result of recent economic conditions, and was motivated by the high interest in having CORE available for both current and future development as demand for the exchange of cost information increases. Making the CORE protocol available as a Recommended Practice allows ILS and ERM vendors, subscription agents, open-source providers, and other system developers to now implement the XML framework for exchanging cost information between systems."

For more information, please see the full press release.


IMLS Seeks Comments on Public Library Survey

August 27, 2010 — "In a Federal Register notice published on Monday August 23, 2010 (see http://www.imls.gov/pdf/PLS_60-day_notice_2010.pdf), IMLS issued a call for comments on its Public Library Survey. The survey provides essential data about public library service in the US including service hours, book circulation, number of librarians, service area population, technology and more."

"The survey is made possible through the cooperation of state libraries that provide data each year and through an agreement with the US Census Bureau. Analysis of survey data has led to policy and funding decisions that improve library service in the United States."

"For results and analysis of the most recent survey, see the IMLS FY2008 Public Libraries Survey Report at http://www.imls.gov/news/2010/063010.shtm. For recent analysis, see Service Trends in US Public Libraries at http://www.imls.gov/pdf/Brief2010_01.pdf and Libraries Use Broadband to Serve High Need Communities at http://www.imls.gov/pdf/DataNote2009_01.pdf."

For more information, please see the full press release.


Webinar Available Online: Helping Job Seekers: Using Electronic Tools and Federal Resources

August 18, 2010 — "The August 11 webinar, Helping Job Seekers: Using Electronic Tools and Federal Resources, is now available online at http://tiny.cc/d6fjb. In the presentation, Department of Labor Employment and Training Administration (ETA) staffers Lauren Fairley-Wright and Tracie Hamilton provide an overview of the public workforce system and share electronic tools most helpful to library staff who assist unemployed workers. The archived webinar includes the PowerPoint presentation and the participants' dialogue. Hosted by WebJunction, the webinar was made possible by a partnership between the Institute of Museum and Library Services (IMLS) and ETA announced June 25, 2010 http://www.imls.gov/about/workforce.shtm."

For more information, please see the full press release.


IU, NISO receive Mellon grant to advance tools for quantifying scholarly impact from large-scale usage data

August 12, 2010 — "A $349,000 grant from the Andrew W. Mellon Foundation to Indiana University Bloomington will fund research to develop a sustainable initiative to create metrics for assessing scholarly impact from large-scale usage data."

"IU Bloomington School of Informatics and Computing associate professor Johan Bollen and the National Information Standards Organization (NISO) will share the Mellon Foundation grant designed to build upon the MEtrics from Scholarly Usage of Resources (MESUR) project that Bollen began in 2006 with earlier support from the foundation. Bollen is also a member of the IU School of Informatics and Computing's Center for Complex Networks and Systems Research (CNetS) and the IU Cognitive Science Program faculty."

"The new funding for "Developing a Generalized and Sustainable Framework for a Public, Open, Scholarly Assessment Service Based on Aggregated Large-scale Usage Data," will support the evolution of the MESUR project to a community-supported, sustainable scholarly assessment framework. MESUR has already created a database of more than 1 billion usage events with related bibliographic, citation and usage data for scholarly content."

"The project will focus on four areas in developing the sustainability model – financial sustainability, legal frameworks for protecting data privacy, technical infrastructure and data exchange, and scholarly impact – and then integrate the four areas to provide the MESUR project with a framework upon which to build a sustainable structure for deriving valid metrics for assessing scholarly impact based on usage data. Simultaneously, MESUR's ongoing operations will be continued with the grant funding and expanded to ingest additional data and update its present set of scholarly impact indicators."

For more information, please see the full press release.


IMLS Podcast Provides Plain Language Facts about 21st Century Skills

August 11, 2010 — "The Institute of Museum and Library Services (IMLS) proudly announces release of a podcast on 21st century skills by IMLS Acting Director Marsha L. Semmel."

"To listen to the podcast, click here http://www.imls.gov/resources/podcasts_Aug10.shtm."

"In the 4.30 minute podcast, Semmel explains:

  • What are 21st century skills?
  • Where did the 21st century skills movement come from?
  • Where do museums and libraries fit in the 21st century skills movement?"

"Semmel also describes Making the Learning Connection (http://www.imls.gov/news/2010/061610.shtm), IMLS's national campaign to better understand the opportunities, challenges, and key issues facing today's museums and libraries in their efforts to meet their communities' 21st century learning needs. The campaign includes an eight-city workshop tour, a national contest, new online tools and resources, and a series of interactive webinars."

For more information, please see the full press release.


DTIC Launches Aristotle

August 6, 2010 — "The Defense Technical Information Center (DTIC) [has] launched Aristotle, a professional social networking site for the Department of Defense (DoD) science and technology (S&T) community. Aristotle provides a secure environment for scientists, engineers, researchers and program managers to network, create and collaborate with other experts in the S&T community."

"Aristotle is a Web-based social media tool that adds a new dimension to professional social networking for DoD employees. Users not only network with other individuals; they can link to Topics, Projects and Documents. Aristotle provides situational awareness of the larger DoD S&T community."

"Federal government and DoD employees and their contractors must register with DTIC to access Aristotle. In addition to the security provided by the requirement to sign on with a userid and password or by using a registered Common Access Card (CAC), users can assign permissions to everything they create in or upload to Aristotle."

For more information, please contact Sandy Schwalb.


IMLS Awards National Leadership Planning Grants to 13 Institutions, $763,715 Distributed

July 30, 2010 — "The Institute of Museum and Library Services (IMLS), the primary source of federal support for the nation's museums and libraries, announces that 13 institutions are receiving National Leadership Collaborative Planning Grants (NLG) totaling $763,715. Grantees will contribute $491,995 in matching funds. There were 62 applications to the program with requests totaling $3,752,309."

"The NLG program includes two types of collaborative planning grants, which enable multi-institution project teams to work together to either plan a single project or to produce a white paper that will encourage multiple projects; and project grants, including both research and implementation grants, for which that preliminary work has already been done. National Leadership Grant research and implementation awards will be announced in September."

For more information, please see the full press release.


CLOCKSS adds its 11th archive node in Italy

July 26, 2010 — "CLOCKSS is very pleased to announce that it has added an eleventh archive node to its network of worldwide, redundant, and distributed archive nodes."

"The new node is hosted at the Universita Cattolica del Sacro Cuore, in Milan, Italy. The node will store a complete version of the CLOCKSS archive content. Like all CLOCKSS nodes, it uses award-winning open-source LOCKSS technology to automatically and continually check with the other nodes, audit itself, and repair any differences. If content in the archive becomes no longer available from any publisher ('trigger' event), CLOCKSS will release the content and make it available for free to the entire world."

"...The archive node at Universita Cattolica del Sacro Cuore brings the CLOCKSS network of archive nodes to a total of eleven. CLOCKSS node locations are selected to be geographically and geopolitically diverse, and housed at major libraries with many centuries of experience and continuity protecting the world's scholarship."

For more information, please see the full press release.
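
The audit-and-repair behaviour mentioned in the CLOCKSS announcement above can be pictured with a deliberately simplified sketch. It is not the LOCKSS protocol: the `audit_and_repair` function, the node names and the plain majority vote over SHA-256 digests are assumptions made for illustration only.

```python
import hashlib
from collections import Counter


def digest(content):
    """Fingerprint a copy of the content so that copies can be compared."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(nodes):
    """Repair, in place, every node whose copy differs from the majority copy.

    `nodes` maps a node name to the bytes it holds. Returns the names of the
    nodes that were repaired. Raises ValueError when no copy is held by more
    than half of the nodes, since no version can then be trusted.
    """
    votes = Counter(digest(content) for content in nodes.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(nodes) // 2:
        raise ValueError("no majority copy; cannot repair safely")

    good_copy = next(c for c in nodes.values() if digest(c) == winner)
    repaired = []
    for name in list(nodes):
        if digest(nodes[name]) != winner:
            nodes[name] = good_copy
            repaired.append(name)
    return repaired


# Three hypothetical nodes, one of which holds a damaged copy.
archive = {"node-a": b"article text", "node-b": b"article text", "node-c": b"artic1e text"}
print(audit_and_repair(archive))
```

Comparing digests rather than full content keeps the exchange between nodes small, and refusing to repair without a clear majority avoids spreading a damaged copy.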


The eighth edition of the Intellectual Freedom Manual

July 20, 2010 — "ALA Editions, the publishing imprint of the American Library Association, announces the release of the Intellectual Freedom Manual, Eighth Edition, edited by the ALA Office for Intellectual Freedom (OIF). Updated for the first time since 2005, this indispensable volume includes revised interpretations of the Library Bill of Rights along with key intellectual freedom guidelines and policies, including:

  • a new chapter, 'Interactivity and the Internet,' and other fresh material on intellectual freedom and privacy in online social networks;
  • an examination of intellectual freedom for disabled library patrons;
  • coverage of the latest USA PATRIOT Act debates and extensions."

"Established Dec. 1, 1967, OIF is charged with implementing the intellectual freedom policies of the American Library Association by educating librarians and the public about the concept of intellectual freedom as embodied in the Library Bill of Rights and the Association's basic policy on free access to libraries and library materials. In order to meet its educational goals, the office undertakes information, support and coordination activities."

For more information, please see the full press release.
