Necessary but Not Sufficient: Modelling Online Archive Development in the UK

D-Lib Magazine
January/February 2008

Volume 14 Number 1/2

ISSN 1082-9873

Ian G. Anderson ¹
HATII
University of Glasgow
<I.Anderson@hatii.arts.gla.ac.uk>

Introduction

Since the mid 1990s archives have greatly expanded their online information and services, yet this is an area of archive development about which we know little. We do not know the range of online services, features and information available, how these vary between archives or how they are delivered. The process of adding to our knowledge is hampered by the lack of a template for identifying and evaluating online archive information and services or a model for understanding their development. Without this information it is difficult to identify development paths, user requirements and appropriate technology, as well as the activities that can be supported or how archives compare to other information providers. It is only by developing and testing such templates and models that we can begin to understand the current state of online archives, how they might develop and the challenges they face.

Despite the hyperbole surrounding the development of the World Wide Web, and now Web 2.0, there are surprisingly few models for evaluating the evolution of online services, although criteria to judge 'good' and 'bad' web design proliferate.² Typically, web site reviews consider content, services and functions alongside graphics, page layout, navigation, interactivity, human computer interaction and accessibility. Moreover, they do so at a particular point in time rather than taking an evolutionary approach. This article is not concerned with issues of interface design and aesthetics, although these are important issues, and poorly designed interfaces can certainly prove a barrier to users. Instead, this article concentrates on the range and depth of information and services provided. This approach is justified, because in the development of archives' online presence it is how they translate their analogue information, services and functions into the online world that will ultimately determine their relationship with virtual users. In this regard, research by Anderson and Tibbo has revealed the diverse information retrieval behaviour of one user group, academic historians.³ The complexity of a system to meet the needs of this user group, let alone multiple constituencies, immediately raises the question of how well developed archives' current online services are and whether archives have the capability to develop their services further. Studies in the fields of teaching and learning, business, information technology, digital libraries, scientific information and online museums provide some pointers to the way in which a model for online archive services might develop but none provides a holistic solution.

In the field of learning and teaching, Chickering and Ehrmann relate the American Association for Higher Education's (AAHE) 'Seven Principles for Good Practice in Undergraduate Education' to new information technology. They argue that any use of new technology needs to be consistent with these seven principles.⁴ Where information technology is used appropriately, it will encourage contact between students and faculty; develop reciprocity and cooperation amongst students; use active learning techniques; give prompt feedback; emphasize time on task; communicate high expectations and respect diverse talents and ways of learning. Applications of this model emphasise learning outcomes above information technology per se, but nevertheless, relate these outcomes to a range of technologies from simple email and non-interactive web pages through computer-based conferencing to interactive web sites and multimedia.⁵ The value in this model is in its association of new technology with accepted, generic, learning outcomes. The model is also useful in recognizing that ease of implementation for faculty and institutions is a key factor. Whilst it identifies a number of the seven principles that can be achieved using relatively simple technology, such as email and non-interactive web pages, a greater range of activities and benefits are available with more dynamic and interactive technology. Adopting a similar principle for an online archive model would suggest the necessity of aligning services, features, content and technology with accepted archival functions.

Staying in the field of education, Clyde's content analysis of school library web sites over a six-year period provides the closest example of the methodological approach adopted in this study.⁶ Clyde examined 50 sites in 1996 and revisited them in 1999 and 2002, by which time the number of live sites from the original cohort had fallen to 32. Clyde initially examined 26 content and feature categories, with a further 28 being added in 1999 and 12 more in 2002. Aside from the longitudinal aspect, Clyde's approach indicates the value of having an evolving set of criteria by which to analyse web sites. Providing a degree of 'head room' in the criteria enables a model to be predictive, rather than just descriptive.

Jacob and Huxley, who emphasise the changing nature of the technology in learning resources, take a slightly different approach. They characterise the evolution of online educational resources in four stages, emphasising the enabling nature of technological change. The sequence runs from a range of static resources, through archives of these resources, to mediated online professional networks and, eventually, to viable online communities.⁷ The linking of static to dynamic and content to communities in this model represents concepts shared by both educational technologies and online business applications.

In the business world, web site evolution is typically represented in either three or four stages. WestNet Learning and Carson use slightly different terms, but both have simple three-stage models evolving from an electronic brochure site, through an e-commerce site, to a dynamic business application model.⁸ The transition from Stage 1 to Stage 2 is characterised by a move from static information to simple interactivity, and from Stage 2 to Stage 3 by more dynamic content, separation of interface and data, and application development environments that embody software engineering concepts. WestNet Learning suggests there could be a fourth stage of business web site, but gives no indication as to what form this might take. Today, one might surmise that this next stage is Web 2.0 technology, such as web services rather than software, participation, user-generated content and networking. Talisayon, on the other hand, has put forward a four-stage business model.⁹ This does not advance the technological boundary beyond the three-stage models, but splits Stage 2 into interactive and transaction models. Talisayon's interactive stage is characterised by the two-way flow of information between web site and surfer, using email, chat rooms, bulletin boards, surveys, online games, etc. The other stages in Talisayon's model remain essentially the same as the three-stage versions.

A variation of these business models is provided by Chu et al.¹⁰ They provide a framework and longitudinal study of e-commerce sites and identify four stages of development (pre-web, reactive-web, interactive-web and integrative-web) over the period 1993 to 2001. The pre-web phase need not concern us here. The reactive-web phase is characterised by providing static information to users; communication is one-way and businesses can only react to requests. The interactive-web phase is characterised by secure online transactions, increased personalisation and a two-way commercial process. The integrative-web phase represents the management of business processes online, characterised by data sharing and extraction, and the integration of front-end with back-office functions.

In contrast to the evolution of web sites in learning and teaching, business models make a more explicit association between technological development and the commercial services provided, with user outcomes implied. Whilst one can attain positive learning outcomes with relatively simple technology, this is less so in business models. Indeed, the notion of a web-based business application model is entirely predicated upon the development of appropriate technology. Although the emphasis in business and learning and teaching models may be different, they each suggest, typically, a four-stage evolution where technological development supports increasing levels of communication and interaction between service provider and end user, culminating in a process that approximates to a face-to-face transaction.

WestNet Learning recognises that for the user the difference between a Stage 2 and a Stage 3 site may be fuzzy, if apparent at all, as the key change is not what is done, but how it is done. This raises a fundamental problem for any model that attempts to represent the evolution of the web. Do you model services, technology or both? Although closely related, these two aspects are not entirely dependent upon one another. Furthermore, can either approach accommodate the 'outcomes' seen in learning and teaching, or is this a third and separate way of modelling the evolution of web sites?

As the models for the evolution of business web sites indicate, there is a strong relationship between the provision of services and the development of back-office systems to make this commercially viable. If one looks at the field of information technology, one can find another perspective on the evolution of web services. This approach concentrates on the development and application of software engineering principles to web services. As a large proportion of software costs are incurred in the maintenance phase, early, static HTML sites – whilst simple and flexible – were often low quality and expensive to update once content grew beyond a few pages.¹¹ Consequently, the evolution of web sites can be seen as a rapid succession of technologies that increased control over content creation with greater separation of data from its process and representation.¹² This approach can be expanded to form a conceptual web services architecture. In this model different roles (interaction, description and discovery) are defined as a series of functional layers, behind which are three overarching concerns of quality of service, security and management. This model then facilitates the evaluation of specifications, standards and interoperability of the technology in each particular layer.¹³ Even more so than business models, this approach puts technological change as the driver of web site evolution.

Storey and Jahnke, however, expand on the traditional software engineering model to define web site evolution as user interface and data driven.¹⁴ In the web environment both of these paths pose additional challenges. In terms of data, one of the goals of the web is to integrate distributed and heterogeneous data sources, often on a large scale. This raises issues of harmonising data structures, resolving overlapping data and enforcing consistency.¹⁵ This is also something the archives community have been striving towards with the development of standards such as the General International Standard Archival Description (ISAD(G)) and interoperable implementations such as Encoded Archival Description (EAD). As far as interface design is concerned, there are competing demands between novice users who require simple and intuitive interfaces and experienced users who demand more sophisticated features and tools. Added to this is a general and persistent pressure to evolve web sites.¹⁶ Archive users are no less diverse, and the challenge of providing appropriate features and tools is just as great.

Other studies have evaluated web evolution in terms of the rate and extent of page updates in order to produce more effective indexes,¹⁷ or have examined long-term trends in the growth and decay of web structure and content in scientific directories and information services.¹⁸ Similar studies have been undertaken in order to inform search engine design.¹⁹

If one looks at the field of digital libraries and online museums, one can find a further range of web site evaluations. Within the field of digital libraries there is a vast literature on evaluation, but surprisingly little of this deals explicitly with modelling the evolution of digital library systems as a whole. Digital library evaluation studies have hitherto been primarily concerned with a wide range of components, including systems performance, conceptual models, evaluation methods, metrics, collection properties, usage types and test beds. This diversity is reflected in the extensive survey of digital library evaluation literature conducted by Zhang and Saracevic.²⁰ Of the 89 works covered, only two deal with the development of digital libraries in an evolutionary sense. Fox and Hix proposed nine principles for digital library development under the categories of representation, architecture, and interfacing.²¹ Meanwhile, Saracevic suggested a conceptual framework for digital library development.²²

The latest work from the evaluation strand of the DELOS Network of Excellence reflects the varied evaluation approaches in the rapidly changing field of digital library development.²³ In addition to providing a comprehensive picture of the state of the art of digital library evaluation, the authors propose a framework for digital library evaluation and a methodology for classifying evaluation procedures. Although evaluating the development paths of digital libraries over time is not a specific component, this is certainly not precluded by the model. Moreover, the article raises the question of how digital library models relate to other complex information systems, such as archives. It is beyond the scope of this article to speculate on such issues, but the possibility of creating a generic version of the model used in this research or a digital library version to enable comparison between archives and libraries within the same institution are possible areas for future research.

The coverage of D-Lib Magazine provides a further indication of the breadth of studies in digital libraries, covering training,²⁴ infrastructure,²⁵ architecture,²⁶ and design and evaluation.²⁷ Whilst these articles address many features of digital libraries, none attempt to survey the sector as a whole. In contrast, Greenstein and Thorin's The Digital Library: A Biography attempts a wider survey of the digital library sector.²⁸ Based on survey returns from 21 members of the Digital Library Federation, the Greenstein and Thorin work concentrates on the aims, context and organisation of digital libraries. As such, they do not deal explicitly with the variety of digital library features and services or how these might be modelled. Nevertheless, characteristics of digital libraries are addressed and a developmental overview is provided. The authors recognise that early digital libraries are fundamentally experimental, and hence difficult to summarise, but three broad phases are defined.

The early digital libraries of the 1990s are characterised as ambitious, experimental, competitive and seduced by the holy grail of 'killer apps' that would avoid the restructuring forced on other organisations. However, these expectations could never be realised, because early collections suffered from being too small, too idiosyncratic and too passive. Furthermore, the competitive spirit contradicted the benefits of shared networked resources and hindered the development of standards.²⁹

The maturing digital library is characterised by an interest in modular systems architecture, not unlike the development of web application systems for online businesses, rather than 'killer apps'. A strategic approach is taken to collection retro-conversion, and there is a desire for established common standards rather than bold claims of innovation. Furthermore, the maturing digital library re-discovers the importance of user studies and seeks core technical solutions and organisational integration with the main library services. Lastly, the maturing digital library promotes itself as complementing, not threatening, traditional library services, fostering innovation, fulfilling traditional services and capitalising on successful services.³⁰

In the final stage of digital library development, Greenstein and Thorin see the 'Adult Digital Library' move from integration to interdependency, although the authors state it is much too soon to describe any digital library as fully mature. The key trends in this phase are the organizational, functional, and budgetary integration of the digital library into the main library. Other common trends are continued experimentation, with more emphasis on research and development, and interdependency with on and off-campus information organisations. Some of the challenges to be faced include responsibility for the long-term preservation and maintenance of born digital information, and the failure to integrate instructional technology with the digital library.

Perhaps the work closest to the archival web model outlined below is Still's. Here a survey of library web sites in the USA, UK, Canada and Australia is used to develop a set of 'core' elements. For US libraries, 16 core elements are identified, ranging from an update date and physical address, through OPAC links and subscription resources, to request forms, instructional material and remote access information.³¹

Rather like digital libraries, the literature on virtual museums has concentrated on the educational benefits and evaluations of site structure, multimedia, interface design and usability.³² The features and evolution of virtual museums have received relatively little attention. In part this is explained by museums prioritising educational materials, particularly for the K-12 market. There are of course exceptions. Museum International devoted two issues to 'Museums and the Internet', examining the range of activities being undertaken online. Avenier identifies four key roles for museums:

disseminate information and news,
access databases and image banks,
display virtual exhibitions, and
communicate, conduct research and provide instruction.³³

Meanwhile, Diaz and del Egido propose two means of classifying museum Web sites. The first is by end user: general public, teachers or specialists. The second method is by the information the museum Web sites present, such as advertisement (address, opening times etc.), educational tools (extensive collections, knowledge for teaching, pedagogical thinking, editorials) and historic value (resources of the institution) aimed at specialists and written by museum staff.³⁴ Where a longer, developmental perspective is taken, this is again concerned with usability.³⁵

The various approaches to web site features and development taken in other disciplines give some indication as to how an archival model for web site development may look. It should have at least four distinct phases that accurately reflect the different types of development within the domain. It needs to cover a wide enough range of web site content and functions to enable differentiation, but also needs to include features that suggest future development paths. Different technologies need to be accommodated, but a crude technical determinism should be avoided by linking these to archival functions and user needs. Moreover, the model needs to work irrespective of the type of archive, be it university or local record office, its location or organisation. Lastly, the model needs to indicate clear developmental routes that recognise the breadth and depth of online information but also provide key minimum criteria for each phase or type.

The Model

The Model for Archive Web Development (MAWD) described in this article was developed from data collected in July 2004 on 17 content and function features from each of 25 US and 25 UK archive web sites.³⁶ The 25 UK archives were selected by random number from 70 archives that had responded to a 2003 questionnaire on their online service provision. The US archives were sampled from institutions listed in the Carnegie Classification of Institutions of Higher Education that had also been interviewed about their online archive development. The same 25 archive web sites were then revisited in September 2007, and data collected again on the same 17 content and function features, thus enabling a measure of the extent to which they had developed over the intervening three-year period.

The questionnaire on UK archives had been sent to a random sample of 150 repositories from the National Register of Archives. The sample comprised national, regional and local archives, university archives, business archives and specialist repositories. Archives were sent a questionnaire by mail but were given the option of returning an electronic version or completing the questionnaire online. Seventy responses were received, representing a response rate of 47%. The questionnaire asked 55 questions regarding archives' online presence. Questions covered:

when the web site was first established,
who was responsible for design and content, frequency of updates,
how many hours per week were spent working on the web site,
what the goals of the site were,
what discovery tools were available, including the extent and nature of online finding aids,
what standards and formats were used for finding aids, reference services, user education and user evaluation.³⁷

The information collected provides a valuable context to help interpret the data collected from the subsequent web site evaluation.
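The sampling figures reported above can be checked with a short sketch. The counts (150 sent, 70 returned, 25 selected) come from the text; the archive identifiers and the random seed are hypothetical placeholders, and the code is only one plausible way of making a selection "by random number", not a record of how the original draw was made.

```python
import random

SENT = 150        # questionnaires sent (from the text)
RETURNED = 70     # responses received (from the text)
SAMPLE_SIZE = 25  # UK web sites evaluated (from the text)


def response_rate(returned, sent):
    """Response rate as a percentage of questionnaires sent."""
    return 100 * returned / sent


def draw_sample(respondents, k, seed):
    """Draw k respondents without replacement, reproducibly for a given seed."""
    rng = random.Random(seed)
    return sorted(rng.sample(respondents, k))


# Hypothetical identifiers standing in for the 70 responding archives.
respondents = [f"archive-{i:02d}" for i in range(1, RETURNED + 1)]
sample = draw_sample(respondents, SAMPLE_SIZE, seed=2004)  # arbitrary seed

print(f"{response_rate(RETURNED, SENT):.0f}%")  # 47%
print(len(sample))                              # 25
```

Fixing the seed makes the draw repeatable, which would matter if the same cohort had to be reconstructed for a later revisit.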

For the web site evaluation itself data was collected on 17 features: hours and physical location, contact email address, staff directory, reference form, user services, donor information, collection description, electronic finding aids (including the number and format), tips, aids and resources, exhibits, the structure of the web site, whether it is searchable, whether it has a links page, how it is hosted and date of the last update.

The model itself has been iteratively developed, tested and revised over the last three years.³⁸ The initial version of the model in 2004 established five types of online archive and a sixth type was added in 2007. These types are outlined below in Table 1 and the full model in Appendix 1.

Table 1.
Model Type Summary

Type	Description
Type 1: Poster	Must Have: Essential location and opening time information is available. At least one contact email is provided for enquiries. Only a brief description of overall holdings is provided.
Type 2: Brochure	Must Have: In addition to Type 1 features, the Type 2 archive provides at least some collection/fonds level finding aids, some of which may extend to series level. Simple lists (such as shelf lists) and indexes providing the location of material are also included in Type 2. The descriptions are static and the repository does not provide a capability to search them. Should Have: A downloadable enquiry form for mail-in requests is available. Repository rules and regulations are provided.
Type 3: Interactive Brochure	Must Have: The archive provides at least the same extent and depth of finding aids as Type 2, but these are searchable using tools built into the finding aid. Having a facility to search the web site is not considered equivalent to being able to search the finding aid itself. Should Have: Some advice on conducting archival research is provided as well as a submittable web form for online enquiries.
Type 4: Interactive Finding Aid	Must Have: The archive provides the same searchable functions as Type 3, but a majority of collections are represented in finding aids with some descriptions extending to folder/file level. Online exhibitions and/or educational resources can be found. Should Have: Tips, hints and advice on conducting archival research, and online help pages for using finding aids, FAQs and links pages to related collections and repositories are provided. Information on the archive's policies, practices and standards is available.
Type 5: Transaction Service	Must Have: The archive provides all of the services of Type 4, but the majority of finding aids extend to folder/file level with some descriptions extending to item level. A wider range of search options is available in the finding aid (e.g. cross-collection, free-text, person, place and corporate names, period, exclude, and Boolean operators). Finding aids provide links to at least some digital surrogates. Should Have: A range of additional online services is offered. These include pre-ordering of material, reprographic services and online payment. Context sensitive help should be available and services are tailored to specific archival user groups, such as family and house historians, local historians, academics, professionals, first time users, students, school children or teachers.
Type 6: Interactive User Community	Must Have: The emphasis in Type 6 is on Web 2.0 interactive services that support information sharing, user-generated content and social navigation. Type 6 archives still provide searchable finding aids, but these may not be as detailed as those found in Type 5 archives. Instead, archives provide users with the tools that enable them to contribute to the development of finding aids, resources and services and share information and research. Should Have: Type 6 archives should provide users with both a public space in which to share and contribute information and a private workspace in which to save, annotate and customise relevant information. Such services might be delivered through saved searches, tag clouds, comments, notes, ratings, live help, forums, chat or blogs.

The model described in Table 1 aims to be both descriptive and predictive, to reflect not only what is, but also what could be. The first five types in the model are based on the evaluation of web site data collected in 2004. Although no archive matched Type 5 criteria, transaction service was included in the model to indicate one potential direction in which online archive services might develop. In 2007, Type 6, Interactive User Community, was added as an alternative development route to Type 5 and to reflect the growing importance of Web 2.0 technologies. Given the wide range of criteria evaluated and the varying extent to which archives matched these, a refinement of the model was to split criteria for each type into 'must have' and 'should have' fields, with classification based on the extent to which an archive met 'must have' and then 'should have' criteria.
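The two-step classification rule can be illustrated with a minimal sketch. The feature names below are hypothetical paraphrases of the Table 1 criteria, not the coding scheme used in the study, and the sketch assumes that Types 1 to 5 are strictly cumulative. Type 6 is left out because the model treats it as an alternative route rather than a further step on the same ladder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveType:
    number: int
    name: str
    must_have: frozenset
    should_have: frozenset = frozenset()


# Hypothetical feature flags paraphrasing Table 1 (Types 1 to 5 only).
TYPES = (
    ArchiveType(1, "Poster",
                frozenset({"location_and_hours", "contact_email", "holdings_overview"})),
    ArchiveType(2, "Brochure",
                frozenset({"collection_level_finding_aids"}),
                frozenset({"downloadable_enquiry_form", "rules_and_regulations"})),
    ArchiveType(3, "Interactive Brochure",
                frozenset({"searchable_finding_aids"}),
                frozenset({"research_advice", "web_enquiry_form"})),
    ArchiveType(4, "Interactive Finding Aid",
                frozenset({"majority_of_collections_described", "exhibitions_or_education"}),
                frozenset({"finding_aid_help", "faqs", "links_page", "policies_and_standards"})),
    ArchiveType(5, "Transaction Service",
                frozenset({"majority_folder_level", "advanced_search", "digital_surrogates"}),
                frozenset({"pre_ordering", "reprographics", "online_payment", "context_help"})),
)


def classify(features):
    """Return (highest type whose cumulative must-haves are met, should-have share)."""
    matched = None
    for archive_type in TYPES:
        if not archive_type.must_have <= features:
            break  # assumed: each type builds on the one before it
        matched = archive_type
    if matched is None:
        return None, None
    if not matched.should_have:
        return matched, None  # Type 1 lists no should-have criteria
    share = len(matched.should_have & features) / len(matched.should_have)
    return matched, share


site = {"location_and_hours", "contact_email", "holdings_overview",
        "collection_level_finding_aids", "searchable_finding_aids", "web_enquiry_form"}
archive_type, share = classify(site)
print(archive_type.name, share)  # Interactive Brochure 0.5
```

Stopping at the first unmet set of 'must have' criteria means a site cannot skip a type. The 'should have' share then distinguishes sites within the same type without changing their classification.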

Type 1 and Type 2 are analogous to the 'brochure' sites found in early e-commerce sites. Both provide the minimal information required to contact the archive or make a physical visit. Type 2 sites, whilst still static, are more likely to enable the user to identify relevant collections – or parts of collections – prior to a visit, but the lack of search facilities inhibits this process. The crucial distinction in Type 3 sites is that the finding aids can be searched directly within the repository web site, hence the designation Interactive Brochure. This interactivity is far more likely to assist users in identifying relevant collections, or parts of collections, quickly and effectively. The online enquiry form and additional advice provided by Type 3 sites also help users refine and expedite enquiries.

The additional depth and breadth of the finding aids in Type 4 sites should enable most users to identify some relevant material in advance of a visit, and the additional research resources help focus their work. In Type 5 sites, where at least some digital surrogates are provided and where these are linked to sufficiently detailed finding aids, the possibility of the online archive becoming a Transaction Service arises. That is, some users may be able to achieve their information-seeking goals without physically having to visit the archive at all. They can identify, locate and download the information they require, and sufficient guidance and help is provided online to assist with understanding and interpretation.

The sixth type of online archive service, Interactive User Community, is one that incorporates features of Web 2.0 technology. Anderson suggested features of an 'archival expert system' that harnessed the Web's ability for user-generated content.³⁹ It was suggested that users could populate online finding aids, rate the usefulness of sources and provide comments on them. Live help, intelligent agents, chat and forums are all additional features that can be imagined having applications in the online archive environment. It has also been possible to demonstrate that innovative structures and interfaces can be developed that present users with a radically different view of the archive.⁴⁰

Although the Type 5 online archive provides one potential development path, until and unless archives have significant portions of their collections digitised they will not be able to exploit the two-way, online transactions that have driven e-commerce and digital library development. Furthermore, such a development would require archives to develop finding aids to a sufficiently detailed level to enable them to be effectively linked to the digital surrogates. Therefore, given archives' difficulty in cataloguing their existing collections to any level of detail, let alone digitising them, it may be more realistic to develop services along the lines of Type 6, rather than Type 5. The cultural and technological shift would be greater than developing to Type 5, but the long-term resource implications may well be less. Indeed, it may even be possible to harness the user-generated content aspects of Web 2.0 technology to overcome some of the problems of cataloguing backlogs and the limited range of digitised content.

In developing the model two key challenges arose: the number of types to include and the criteria for distinguishing one from another. One of the main criteria by which the model categorises online archives is the scope and depth of the finding aids provided. The term 'finding aid' can of course cover a wide range of tools from collection-level descriptions through container lists to full-fledged, multi-level EAD finding aids. In turn, finding aids may be represented in anything from PDF files, through relational databases, to complex XML, and reflect original or intellectual order of the collection. Consequently, the use of the term 'finding aid' in the model has had to be flexible and inclusive. As a basic criterion, those that appear to conform to the six minimum required ISAD(G) elements have been considered as finding aids.⁴¹ It has not been possible within the scope of this project to verify whether every finding aid conforms to minimum ISAD(G) standards but it has served as a useful reference point.

On a more practical note, whilst it is relatively easy to establish the level to which a finding aid has been created, it is far harder to establish the extent to which an archive's collections are represented in online finding aids. Because Types 2, 3, 4 and 5 in the model require increasing coverage (as well as greater depth) of finding aids, this was potentially a major stumbling block. Almost without exception, online archives give no indication as to the extent to which their holdings are represented in online finding aids. Aside from hindering this research, this raises the possibility that online users will assume that if information cannot be found online, the archive does not hold it. However, the 2003 questionnaire sent to UK archives asked 'For approximately what percentage of your collections do you have finding aids online?'. This figure provided a baseline that could then be correlated with the number of finding aids found in 2004 and 2007 to provide at least an approximate measure of the extent of collections that were represented online. The same questionnaire also asked archives to what level their finding aids had been created and the standards used, providing an additional check on the information found online.
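One way such an approximate measure could be derived is sketched below. This is an assumption for illustration, not the calculation used in the study: it supposes that each finding aid describes one collection and that an archive's total holdings stayed roughly constant between the two counts, so that coverage scales with the number of finding aids found online. The function name and the example figures are invented.

```python
def estimate_coverage(reported_pct, aids_baseline, aids_later):
    """Scale a self-reported coverage percentage by growth in finding-aid count.

    Assumes one finding aid per collection and stable total holdings;
    both are simplifications made for this sketch.
    """
    if not 0 < reported_pct <= 100:
        raise ValueError("reported_pct must be in the range (0, 100]")
    if aids_baseline <= 0 or aids_later < 0:
        raise ValueError("finding-aid counts must be positive")
    # Coverage cannot exceed 100% even if the count grew faster than expected.
    return min(100.0, reported_pct * aids_later / aids_baseline)


# Invented example: 40% coverage reported, 80 finding aids at the first
# count and 120 at the second.
print(estimate_coverage(40, 80, 120))  # 60.0
```

An estimate of this kind is only as good as the self-reported baseline, which is why the questionnaire's separate questions on finding aid level and standards are useful as a cross-check.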

A further challenge in terms of UK archives was that a large proportion had searchable, online finding aids, but these were accessed through a variety of external archive hub services, not through the archives themselves. While individual archives provided data for these services, the services are not developed, hosted or controlled by them. This raised the question as to whether the model should differentiate archives according to whether they hosted searchable online finding aids themselves.

The argument in favour of doing so is that archives that host searchable finding aids themselves are more likely to have built up an in-house technical capability compared to those that do not. On the other hand, the ability of UK archive hub services to support searching across multiple collections and repositories often enhanced the service level beyond that typically provided by individual archives. To differentiate archives according to their ability to host searchable online finding aids would emphasise technical capabilities above information and service provision. Whilst there is clearly a relationship between the two, the model does not seek to enshrine technology for technology's sake. Moreover, an archive's true technical capacity could only be implied from the type of online finding aid provided. Therefore, it was decided to base the ranking of archives purely on the online services, irrespective of whether they were provided directly by them or not, while technical capacity was regarded as part of the broader developmental context of online archive services.

In order to emphasise service and information-led aspects in the model a generic set of archive functions and related user needs was developed, see Table 2 below. Using iterations of these it was then possible to establish a link between the online content and features, and the functions and needs that they could support, a similar approach to educationalists' modelling of online learning and teaching. It would be impossible to include all potential archive functions or user needs, but by working from generic examples it is hoped that the model of online archives can be situated in current archival practice. The final version of function and user needs can be found in the model in Appendix 1.

Table 2.
Key Archival Functions and User Needs

*Function*	*Description*	*User Need*
Collecting	Collecting materials from individuals, families, and organizations other than the parent organization	Greater range of relevant collections
Conservation and Preservation	Repairing, stabilising and/or preventing damage to materials	Collections can be used
Management	Administration, organisation and conduct of archive operations	Archives are open and services available
Access	Provision and development of indexes, finding aids or other tools to locate information, including related collections and repositories	Identifying and finding relevant material
User Support	Provision of archive, subject or thematic guides, advice and help	Better understanding of how to use archival material
Education and Outreach	Materials and services to schools, colleges, universities, lifelong learners and under-served user groups	Using archival material to understand broader issues
Reference and Research Service	Responding to information requests or research questions	Remote enquiry
Professional Development	Use, development and dissemination of best practice, standards, policies and experience	Archive seen as a trusted repository

Source: The above functions and descriptions are based as far as possible on the SAA Glossary of Archival and Records Terminology, <http://www.archivists.org/glossary/index.asp>.

UK Archives Online

In categorising online archives according to the Model for Archive Web Development (MAWD), a clear distribution pattern is evident (see Table 3 below). For UK archives in 2007, the majority (13 out of 25) were Type 3. None of the UK archives surveyed were Type 5 or Type 6 and only eight were Type 4. Two archives were Type 1 and one was Type 2. One other archive was un-classifiable, although it did have a web page.

When comparing 2007 data with that from 2004, little significant change is apparent. Although all the web sites evaluated had been updated in some shape or form, significant content, feature or service additions were rare. Only three archives were reclassified based on 2007 data; in all cases these moved from Type 3 to Type 4. All three of these archives were local authority archives, and the reclassification was based on the availability of more extensive online finding aids. In two cases this included the development of in-house searchable finding aids. Smaller, but still significant, changes were evident in additional online exhibitions and research and study guides.

Although there is always an element of serendipity in sampling intervals (for example, it is known that two Type 3 university archives are about to launch searchable, in-house finding aids online), the 2007 analysis suggests that a change from Type 3 to Type 4 is the most likely transition for archives to make. Given the lack of movement between other types, overall the rate of change points to an incremental, evolutionary development of online archive services rather than rapid or significant step changes.

Table 3.
Types of UK Online Archives

Type	2004	2007
None	1	1
1	2	2
2	1	1
3	16	13
4	5	8
5	0	0
6	0	0
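The distribution in Table 3 can be checked with a few lines of arithmetic. The Python sketch below simply transcribes the table's counts; the function and variable names are illustrative and are not part of the MAWD itself.

```python
# Counts transcribed from Table 3 (MAWD type -> number of archives).
counts_2004 = {"None": 1, "1": 2, "2": 1, "3": 16, "4": 5, "5": 0, "6": 0}
counts_2007 = {"None": 1, "1": 2, "2": 1, "3": 13, "4": 8, "5": 0, "6": 0}


def shares(counts):
    """Return each type's share of the sample as a percentage."""
    total = sum(counts.values())
    return {mawd_type: 100 * n / total for mawd_type, n in counts.items()}


def net_change(before, after):
    """Return the net change in the count of each type between two surveys."""
    return {mawd_type: after[mawd_type] - before[mawd_type] for mawd_type in before}


change = net_change(counts_2004, counts_2007)
for mawd_type in counts_2004:
    # Columns: type, 2004 count, 2007 count, net change.
    print(mawd_type, counts_2004[mawd_type], counts_2007[mawd_type], change[mawd_type])
```

Both columns sum to 25, and the only non-zero net changes are -3 for Type 3 and +3 for Type 4, which is consistent with the three reclassifications described above.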

Although the rate of change observed was not great, the rate of decay of the sites evaluated was minimal. None of the sites had gone off line and only four of the URLs had changed – in all cases this occurred where the archive site was part of a parent organisation. Although this project did not count the number of pages on each site, the structural change of sites was minimal, with the odd section added or merged between 2004 and 2007.

Further analysis reveals other patterns. UK archives were split into four categories: university, local authority (county, city and local archives), institutional and business.⁴² Type 3 and 4 archives consisted entirely of local authority, institutional and university archives. Business archives were Type 1 or Type 2. The full breakdown of archives by type and stage is provided in Table 4 below.

Although all but one of the business archives' parent companies have a significant online presence, their archives do not appear to benefit from any particular e-commerce synergies or technological advances that the development of business web sites might suggest. However, the heaviest users of business archives are more likely to be in-house clients than members of the public or researchers. Therefore, developing extensive outward facing services is not necessarily the most logical deployment of resources. In contrast, publicly funded university and local authority archives, with their more diverse users, may need to provide a wider range of functions and services online.

Table 4.
MAWD Types by Archive Types

*Type*	*Archive*	*2004 Count*	*2007 Count*
0	Local Authority	1	1
1	Institutional	1	1
1	Business	1	1
3	University	4	3
3	Institutional	11	3
3	Local Authority	11	8
4	University	1	1
4	Institutional	2	2
4	Local Authority	2	5

Using the project's questionnaire data, the online archives can be analysed in other ways. Looking at when an archive first established a web site reveals that Type 4 archives were not necessarily early adopters. Although one Type 4 web site was established in 1996 and one in 1997, the remaining three were established in 1999, 2000 and 2002, respectively. In contrast, no Type 3 site was set up after 2001, with the majority established between 1997 and 2000. Archives at Type 1 and Type 2 established web sites between 1995 and 2000.⁴³

Certainly, establishing an early web presence does not appear to have conferred any advantages as far as long-term development is concerned. Indeed, it may be the case that those who delayed their first web presence could take advantage of maturing technology and gain from the experience of early adopters. Looking at the questionnaire data, it is difficult to state with certainty that Type 4 archives spend significantly more time on their web sites than Type 3 archives do. Two Type 4 archives spent 5 hours per week on this activity, with one archive stating 'little' time was spent. Two others did not provide this data. This compares to an average time of 3.18 hours per week spent by Type 3 archives.

A more revealing difference occurs when looking at the number of staff employed who create electronic finding aids, one of the key criteria that distinguished Type 4 from Type 3. For Type 3 sites the average of those who supplied data is 4.25 people, whilst for Type 4 archives it is 7.25, with two of the archives employing over ten people.

The distinction between Type 3 and Type 4 sites continues when the responsibility for updates is examined (see Table 5 below). In Type 4 sites the responsibility for content updates rests entirely with a team or committee, whereas in Type 3 sites there is a split between the teams and the head of the archive. What we do not know yet is whether those archives that were reclassified as Type 4 in 2007 achieved this through devoting more staff to developing finding aids and sharing the responsibility for website content updates.

Table 5.
Archive Web Site Content Update Responsibility

*Type*	*Head*	*Team/Committee*
3	8	8
4	0	5

In terms of the standards used in developing finding aids, ISAD(G) is used by 53 of the 70 respondents to the 2003 questionnaire, with one other using SPECTRUM.⁴⁴ All but three of the Type 3 archives use ISAD(G), and all of the Type 4 archives use it. Fourteen archives use ISAAR(CPF), twelve the NCA rules and five LCSH.⁴⁵ EAD has had relatively little uptake, with only 13 of 70 respondents to the questionnaire using it, 8 of which were university archives. This may suggest that the use of standards, particularly ISAD(G), goes hand in hand with the development of the more extensive and detailed finding aids found in Type 3 and 4 archives. Indeed, this would be a necessary pre-requisite for the consistent searching found in Type 4 archives.

Although a reasonable number of archives that responded to the questionnaire (29%) had conducted some form of user survey, only 16% had done so to establish user needs or inform service development, and only 4% had done so to establish user categories. Certainly, there is little evidence of the adoption of market segmentation techniques that are an essential component of developing online archive systems, or indeed any kind of archive service.⁴⁶ Whilst it is possible that archives' online services will develop organically to meet the needs of their users, there remains the risk that without a better understanding of who their users are and what those users need, online archive services will fail to meet the needs of key user groups.

If one looks beyond the basic location and finding aid information, the provision of other types of online information and services is highly variable. Of the 25 archive web sites evaluated, 20 had some form of FAQ, tips, source or research guides. These resources ranged from a single genealogy guide on one site to 82 source lists on another. Variations include guides on palaeography, reading room rules, records management and information for new users. Guides on genealogy, local history and house history were by far the most common, being found on 15 of the 25 sites, with other subjects including banking and transport history.

The same number of archives, 20 out of 25, provided some form of online exhibition, usually of digital images, but these were far less numerous than resource guides. Whilst one web site had 26 'learning resources', another a very comprehensive educational web site, and one site 300 sample film clips, the remainder had between one and seven exhibitions online. This aspect remains under-developed and only one archive added significant exhibition content between 2004 and 2007. Furthermore, in all cases surveyed exhibition content was very much 'stand alone', with no cases linking exhibition content to finding aids.

All but four archives had information on the user services they provided, but once again this information was highly variable. Twelve of the archives had information on their research or search services and nine on reprographics, all but one of them being Type 3 or 4 archives. Only three archives described a wide range of services, including talks, tours, preservation and conservation, records management, shop and support for education, teachers and faculty.

Other information or services one might expect to find online were scarce. In only four cases, all Type 3 or 4 archives, was information for donors found, although two of these did include the archive's collection policy document. Although 15 archives provided external links on their sites, only six of these could be considered comprehensive in any way, providing links to related collections and resources, repositories and organisations. Once again these were all Type 3 or 4 sites. Only seven of the web sites (as opposed to finding aids) could be searched and in all but one case this was a function of the parent organisation hosting the site. Only two sites provided a staff directory and in no instance was extensive information found on the standards and practices employed by the archive. One could argue that users do not necessarily care about the finer points of ISAD(G), EAD, appraisal or description but this would be one way of communicating the professionalism of archivists and it should be remembered that other archivists are also users of archive web sites.

Development Paths

Seventeen of the 25 UK archive web sites evaluated provide searchable online finding aids, of varying levels of detail and extent, through an archival hub service.⁴⁷ However, all Type 4 archives provided their own online searchable finding aids, although whether the finding aid was provided in-house or externally is not a criterion. This might suggest that developing the breadth and depth of finding aids and other resources that are Type 4 criteria goes hand in hand with the ability to deploy them in-house. However, the pattern of using hub services to provide online finding aid access has a number of advantages. Even a very small number of hubs providing access makes it easier to establish standards, data controls and consistency across the community and provide the sort of data exchange that forms part of the integrative phase seen in online business development. There are also distinct advantages from a user's perspective. In particular, they can search multiple archives simultaneously and do not need to know which archives may hold relevant collections in advance. Lastly, it would be very difficult to apply cross-archive searching on hundreds of individual finding aids from each individual repository. It would also seem more likely that the sort of interactive community services envisaged in Type 6 archives could best be supported by a centralised service, given that their success depends not only on a degree of technical expertise but also a large enough user community to make them viable.

The drawback is that without online finding aids of their own, archives will find it more difficult to add on particular services and features of their own or to customise services to particular user groups. Linking digital surrogates to a hub service would also need careful consideration, as the data storage and preservation burden could be great. Nor does this line of development solve the problem of archives finding the staff resources to create the data required by the hub services. The adoption of something akin to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), with distinct data and service providers and automated metadata harvesting, may provide an efficient means to overcome this problem.⁴⁸ In the meantime, providing a customised front end for the hub service that could be integrated within existing archive sites would avoid the necessity of users being transferred to a separate site with a generic interface to undertake catalogue searches. Indeed, contributors to the Archives Hub are able to deploy their part of the finding aid locally as a 'spoke', combining the advantages of both local and hub distribution.
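To give a sense of what automated harvesting involves, the Python sketch below builds an OAI-PMH ListRecords request and reads record identifiers from a response. It is a minimal illustration, not a description of how any UK hub service works: the base URL, the sample response and the function names are all assumptions made for the example, and no network request is made. Only the verb, the argument names and the XML namespace come from the OAI-PMH specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by OAI-PMH version 2.0.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    A resumptionToken is an exclusive argument: when it is supplied,
    no other argument apart from the verb is sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Extract record identifiers from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:record/oai:header", OAI_NS)
    ]


# Invented response from a hypothetical data provider, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:fa-001</identifier>
        <datestamp>2007-01-01</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://archive.example.org/oai"))
print(record_identifiers(SAMPLE_RESPONSE))
```

In this arrangement the archive acts as the data provider, exposing its descriptions at a single address, while the hub acts as the service provider that harvests them on a schedule, so staff would not need to resubmit data by hand each time a finding aid changes.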

It remains to be seen whether or not archives develop the range of finding aids, content and services that match Type 5 criteria. Widening the sample may identify such archives but on the evidence so far it seems likely that such an archive would only be encountered today at the national level. Although the time and resource implications of developing both finding aids and digitised content are considerable, Type 5 remains the most evolutionary path for those archives currently categorised as Type 4.

Responses to the questionnaire item on archive web sites' goals bear out that something akin to Type 4 is the objective of many archivists. Repository and collection information were almost universal goals. Twenty-five of the 70 respondents also included online finding aids as a goal. Fourteen included patron communication, whilst social inclusion, lifelong learning and digital content access were also objectives for 9, 15 and 12 archives, respectively. Publicity, marketing, professional communication, links, guides, advice and resources were other goals mentioned. Only one archive included anything akin to Web 2.0 technology, wanting to develop a 'forum for researcher discussion'.

It is also possible that archives will develop along a path not yet predicted or in some unforeseen hybrid of types identified in the model. Nor is an archive that develops to a Type 2 or 3 site necessarily inferior. Whatever development paths online archives take, however, they must be in response to what their users need, as well as what functions, resources and technology the archive can provide. It should be borne in mind that any development path or type suggested in this article is only 'better' than another in so far as it serves the needs of its users.

Conclusion

The Model for Archive Web Development (MAWD) described here provides a framework with which to evaluate the state of online archives and indicate potential paths for future development. The small number of archives that are Type 1 and Type 2 may indicate that these two types could be combined. However, the fact that the model picked up the archives that developed from Type 3 to Type 4 between 2004 and 2007 would suggest that there is sufficient discrimination and gradation elsewhere.

Although more advanced technology can support a wider range of archival functions and user needs, this is not necessarily appropriate for all archives, in all cases and for all users. However, this does crucially depend on archives having undertaken sufficient user needs analysis to identify what type of online information and functions are relevant for particular user groups. Archives need to ensure that the technological choices they make are enabling, not deterministic. While a hierarchy of online archives is implicit in the model, what ultimately matters is whether or not archives provide the services their users need. If all that their users require are directions and opening times, then developing sophisticated online systems would be pointless. What is open to question is whether, from the archives surveyed, there have been sufficient user evaluations to establish users' online needs, or clearly segmented user groups.

The MAWD also suggests developmental paths for online archives, based on their current status and potential for future development. It is also hoped the model will facilitate comparative analysis, not only between different types of archives, but between archives in different countries, although it remains to be seen to what extent the model will need to be adapted to this purpose. Of course, the model has so far only been applied to a limited, if representative, sample of UK archives. Extending the number of archives surveyed would provide more robust evidence, even if it would be unlikely to fundamentally alter the basic components of the model. A repeat of the 2003 questionnaire is also overdue to provide additional context and explanation for the development of online archives, and a comparison with the library services, particularly where these are provided by the same institution, would be enlightening.

As it stands, the majority of UK archives are providing the necessary, if sometimes basic, range of information and services that support different archive functions and user needs. Of these, a large number (Type 3 and Type 4 archives) have gone some way toward enabling users to progress their enquiries without having to physically visit the archive. However, UK archives overall are far from providing online services that are sufficient to reflect the full range of their services in the analogue world. In particular, online finding aids, digitised content and material to support different types of user require more development. Unfortunately, these are also the areas that archives find most difficult to progress rapidly given the resource implications. This may suggest that UK archives need to adopt a more radical approach to their online services than the incremental development that this analysis indicates. Making more use of hub services (particularly to implement Web 2.0 technologies), integrating local versions of hub finding aids back into repository web sites and harnessing user input may lift the burden of service delivery sufficiently to enable archives to accelerate the development of their online presence.

In the meantime, there is a range of relatively minor steps that a large number of archives could take that would greatly enhance their online presence. Indicating the extent to which holdings are represented in online finding aids is one obvious example. Linking existing exhibition material to finding aids, tailoring research and source guides to particular users, providing more extensive external links and providing more information on the range of services, policies and practices that archives have are all small, but valuable, steps archives could take.

Appendix 1: Model for Archive Web Development (MAWD)

Notes

1. The author would like to thank the Gladys Krieble Delmas Foundation, which funded the initial part of this research under the auspices of the Primarily History project. Particular thanks are due to Prof. Helen Tibbo of the School of Information and Library Science, University of North Carolina Chapel Hill with whom the initial research and modelling was undertaken and who provided invaluable comments on drafts of this article. Thanks are also due to Casey Roberson at SILS, UNC who undertook a search of recent literature and made valuable suggestions on the use of Web 2.0 in online archives. Any errors or failings remain the sole responsibility of the author.

2. <http://www.webpagesthatsuck.com> is one of the longest running and best known of these web sites.

3. Anderson, I. 'Are you being Served? Historians and the Search for Primary Sources', Archivaria, No 58, (Fall 2004) & Tibbo, H. 'Primarily history: historians and the search for primary source materials', Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, 2002.

4. Chickering, A. W. and Ehrmann, S. C. 'Implementing the Seven Principles: Technology as Lever', AAHE Bulletin, October, 1996, pp. 3-6.

5. Caputo, A. J. Using Internet Technology in Teaching and Learning, <http://www.tss.uoguelph.ca/ltci/TGuides/using-tech.pdf>, 2000.

6. Clyde, L. A. 'School Library Web Sites: 1996-2002', The Electronic Library, 22(2), 2004. pp. 158-167.

7. Jacobs, N. and Huxley, L. 'From static content to dynamic communities: the evolution of networked educational resources' Online Information Review, 26(1), 2002.

8. WestNet Learning, Web Strategies for Business, Unit 1, pp. 4-7, <http://www.westnetlearning.com>, 2003 and Carson, L. Web Site Evolution, University of Minnesota, <http://www.westnetinc.com/mkt/catalog/sampleunit/WS4B.pdf>, 2002.

9. Talisayon, S.D. 'Knowledge Networks 3', Business World Online, <http://www.geocities.com/serafintalisayon/KNetworks.html>, 2001.

10. Chu, S.-C. Leung, L.C. Hui, Y.V. Cheung, W. 'Evolution of e-commerce Web sites: A conceptual framework and a longitudinal study', Information and Management, 44, 2007, pp. 154-164.

11. Warren, P., Boldyreff, C. and Munro, M. 'The Evolution of Web Sites', Proceedings of the 7th International Workshop on Program Comprehension, (1999) IEEE Computer Society, Washington DC, p. 178.

12. Storey, M.A. and Jahnke, J.H. 'Web Site Evolution - Towards a flexible integration of Data and its Representation', First Workshop on Web Site Evolution (WSE'99), October 1999. p. 3.

13. Kreger, H. 'Fulfilling the Web Services Promise', Communications of the ACM, June 2003, 46(6). pp 29-30.

14. Ibid. Storey and Jahnke.

15. Op cit pp. 3-4.

16. Op cit p. 4.

17. Fetterly, D., Manasse, M., Najork, M. and Wiener, J. 'A Large-Scale Study of the Evolution of Web Pages', Software: Practice & Experience, 34(2), February 2004, pp. 213-237.

18. Ortega, J.L., Aguillo, I. and Prieto, J.A. 'Longitudinal study of content and elements in the scientific web environment', Journal of Information Science, June 2006.

19. Ntoulas, A., Cho, J. and Olston, C. 'What's New on the Web? The Evolution of the Web from a Search Engine Perspective', Proceedings of the World-Wide Web Conference (WWW), May 2004.

20. Zhang, Z. and Saracevic, T. <http://www.scils.rutgers.edu/~miceval/research/DL_eval.html>.

21. Fox, E.A., Hix, D., et al. 'Users, user interfaces, and objects: Envision, a digital library', Journal of the American Society for Information Science, (1993) 44(8), pp. 480-491.

22. Saracevic, T. 'Digital library evaluation: toward an evolution of concepts', Library Trends, (2000) 49(3), pp. 350-369.

23. Fuhr, N., et al, 'Evaluation of Digital Libraries', International Journal on Digital Libraries, 8(1), November, 2007.

24. Hastings, K. and Tennant, R. 'How to Build a Digital Librarian', D-Lib Magazine, 2(11), November 1996, <http://www.dlib.org/dlib/november96/ucb/11hastings.html>.

25. Flecker, D. 'Harvard's Library Digital Initiative', D-Lib Magazine, 6(11), November 2000, <http://www.dlib.org/dlib/november00/flecker/11flecker.html>.

26. Arms, W.Y., Blanchi, C. and Overly, E.A. 'An Architecture for Information in Digital Libraries', D-Lib Magazine, 3(2), February 1997, <http://www.dlib.org/dlib/february97/cnri/02arms1.html>.

27. Computer Science and Telecommunications Board, 'Design and Evaluation: A Review of the State-of-the-Art', D-Lib Magazine, 4(7/8), July/August, 1998, <http://www.dlib.org/dlib/july98/nrc/07nrc.html>.

28. Greenstein, D. and Thorin, S.E. The Digital Library: A Biography, Digital Library Federation, 2002.

29. Ibid. pp. 9-11.

30. Ibid. pp. 11-22.

31. Still, J. M. 'A content analysis of university library Web sites in English speaking countries', Online Information Review 25(3), 2001.

32. Dyson M.C. and Moran K. 'Informing the Design of Web Interfaces to Museum Collections' Museum Management and Curatorship, 18(4), 2000, pp. 391-406 & Sabin R. 'Museums and their Websites: An examination and assessment of how museums are coping with the challenge of the World Wide Web', Journal of Conservation and Museum Studies, No. 2, May 1997.

33. Avenier, P. 'Putting the Public First: The French Experience', Museum International, 204(4), Oct-Dec 1999, pp. 31-34.

34. Diaz, L.A.B and del Egido, A. 'Science Museums on the Internet', Museum International, 204(4), Oct-Dec 1999, p. 37.

35. Cunliffe, D., Kritou, E. and Tudhope, D. 'Usability Evaluation for Museum Websites' Museum Management and Curatorship, 19(3), 2001, pp. 229-252.

36. The 17 content and function features and the structure of the model were developed in conjunction with Prof. Helen Tibbo of the School of Information and Library Science, University of North Carolina, Chapel Hill.

37. The full survey can be found at: <http://www.hatii.arts.gla.ac.uk/research/historians/primarily_history.html>.

38. Thanks are due to the AX-SNet members (http://www.axsnet.org) of the Archive User Research Workshop held at the University of Mid-Sweden in June 2005 for their feedback on this model.

39. Anderson, I. 'Are you being Served? Historians and the Search for Primary Sources', Archivaria, No. 58, (Fall 2004).

40. See <http://www.hatii.arts.gla.ac.uk/research/visual/visual.htm>.

41. The six minimum ISAD(G) elements are: Reference code, Title, Creator, Dates, Extent of the unit of description and Level of description.

42. These categories are according to the type of organisation, not holdings. So business archives are part of a corporation, not an archive that happens to hold business records. Institutional archives are non-profit organisations such as churches, hospitals, administrative bodies, etc. Local Authority archives are those at the administrative level beneath national government.

43. There are three Type 3 archives that could not provide the date of their first web page. One Type 3 archive reported establishing its first web page in 1992, which would have made it one of the first web pages in the world!

44. SPECTRUM is the UK Documentation Standard for Museums, see: <http://www.mda.org.uk/spectrum.htm>.

45. ISAAR(CPF) is the International Standard Archival Authority Record for Corporate Bodies, Persons and Families. NCA rules are the National Council on Archives Rules for the Construction of Personal, Place and Corporate Names.

46. Hallam Smith, E. 'Customer Focus and Marketing in Archive Service Delivery: theory and practice', Journal of the Society of Archivists, 24(1), April 2003, pp. 35-53 & Yeo, G. 'Understanding Users and Use: A Market Segmentation Approach', Journal of the Society of Archivists, 26(1), April 2005, pp. 25-53.

47. These include: A2A (Access to Archives), Archives Hub (for university archives), SCAN (Scottish Archives Network), ANW (Archives Network of Wales), GASHE (Gateway to Archives of Scottish Higher Education) and NAHSTE (Navigational Aids for the History of Science, Technology and the Environment).

48. Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), <http://www.openarchives.org/pmh/>.

doi:10.1045/january2008-anderson