Archives Described at Collection Level


Meg Sweet and David Thomas
Public Record Office, United Kingdom

What makes archives different?

At the Public Record Office, the UK National Archives, three keywords sum up our role: selection; preservation; access. Access applies not only directly to the records in whatever form but also via an intermediary, the finding aid or catalogue.

Archives have their own very particular features which are explicitly recognised in their description. First and foremost, they are hierarchical: one person's/family's/corporate body's archives are broken down into components which can themselves be further broken down, layer after potential layer into the smallest describable component. Archival description is multi-level. It is also based firmly on the concept of provenance. What is being described has been created and accumulated by an identifiable body (or bodies).

The General International Standard for Archival Description, ISAD(G), published in 1994, with a second edition due out in September 2000, is widely recognised and increasingly adhered to. Its popularity stems from its being firmly rooted in tradition and from its being a fairly permissive compilation of best practice. ISAD(G) incorporates a wide range of data elements for archival description, a small number of which it prescribes as mandatory for international data exchange. It embodies the principle of multi-level description and takes as its underlying premise four rules of multi-level description: go from the general to the specific; provide information relevant to the level of description; link descriptions; avoid redundancy of information. These rules have, to date, served archivists well.
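
The four rules can be illustrated with a small data model. What follows is a minimal sketch in Python, not anything prescribed by ISAD(G): the class name, the field names, the level labels and the file reference "HO 45/EXAMPLE" are all assumptions made for illustration. Each description carries only the information relevant to its own level, is linked to the level above, and looks upwards for anything (here, the creator) that has already been recorded at a more general level.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Description:
    """One unit of description at a single level (illustrative model only)."""
    level: str                      # e.g. "fonds", "series", "file"
    reference: str
    title: str
    creator: Optional[str] = None   # recorded once, at the most general level
    parent: Optional["Description"] = field(default=None, repr=False)
    children: List["Description"] = field(default_factory=list, repr=False)

    def add_child(self, child: "Description") -> "Description":
        # Rule: link descriptions. Each unit knows the unit above it.
        child.parent = self
        self.children.append(child)
        return child

    def inherited_creator(self) -> Optional[str]:
        # Rule: avoid redundancy. The creator is not repeated at lower
        # levels; it is found by walking up towards the general.
        node: Optional[Description] = self
        while node is not None:
            if node.creator is not None:
                return node.creator
            node = node.parent
        return None


# Rule: go from the general to the specific.
home_office = Description("fonds", "HO", "Home Office", creator="Home Office")
ho45 = home_office.add_child(
    Description("series", "HO 45", "Main series of Home Office papers")
)
# "HO 45/EXAMPLE" is a made-up reference, not a real catalogue entry.
example_file = ho45.add_child(
    Description("file", "HO 45/EXAMPLE", "Hypothetical individual file")
)

print(example_file.inherited_creator())
```

The design choice worth noting is that the file-level entry stores no creator of its own; the link to its parent is what makes the non-repetition rule workable.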

Traditional use of collection level description

Unlike books, which are stand-alone products, archival documents can only be understood in the context in which they were created. T.S. Eliot's Four Quartets by itself is readily accessible to users. A file of miscellaneous correspondence, on the other hand, can only be understood if it is known who wrote it and when it was written. Even then, the information needs to be qualified: a Treasury file of correspondence would have a different significance if it were produced by that part of the Treasury which dealt with public expenditure rather than by the part which dealt with the management of the economy. For this reason, archives have traditionally been described in terms of the organisation (usually a public body or private company) that created and accumulated them.

The fashion for describing archives has gradually changed over time. The Public Record Office (PRO), London, was founded in 1838 and moved into its first building in Chancery Lane in the early 1850s. At the same time, F.S. Thomas produced the first guide to the PRO. He used what would now be seen as a curious hybrid system. The records were described by their creating department - largely the medieval courts - but within those courts they were described by subject. So, the records of the Exchequer were described at collection level as Exchequer records and a short administrative history was given. The records were then described by subject - 'Abbeys, Accounts, Acquitances', etc.

This pattern of describing records by collection and then by subject was the norm at the PRO for the next 70 years. It was not until 1923 that M.V. Giuseppi produced what was the first modern guide to the Public Record Office. This gave the administrative history of each department and then went on to describe the divisions into which it was organised and then individual series of documents, ranging from medieval rolls to modern files. For the Exchequer, there was a detailed administrative history that focused on the records it generated, followed by an administrative history of each of the Exchequer's divisions. Finally, all the Exchequer's file series were described.

Giuseppi's model continued up to the 1960s when the last printed guide to the PRO was published. The great value of printed, high level guides to holdings was, before the advent of the World Wide Web, their provision of remote access (however limited) to the holdings of record offices. By the 1960s, however, the volume of modern records flowing into the PRO was so large that conventional print media could not keep up. The new style Current Guide of the 1970s onwards was produced annually but only for a strictly limited circulation at the PRO and a couple of other London institutions. A microfiche version was published at irregular intervals. For its first 150 years, the PRO, like most other archives, had a dual approach to describing its holdings: separate, paper-based systems for providing access to collection-level descriptions and to individual items.

Why go beyond collection level description?

Although collection level descriptions are of enormous value to some researchers, particularly those who are conducting research into the history of individuals or of institutions, they are of limited value on their own to the broad range of researchers. This is because what most researchers wish to see are individual files, and it is only possible to identify individual files from multi-level catalogues.

Some academic and other researchers would be interested to learn from the PRO's collection level catalogue entry that the PRO has the records of service of officers in World War One. The PRO's series level scope and content note indicates as much:

This series (WO 339) contains records and correspondence for Regular Army and Emergency Reserve officers who served in the First World War. The content of the files varies enormously, from a note supplying date of death, to a file of several parts containing attestation papers, record of service, personal correspondence and various other information. Records of British reserve officers who were commissioned into the Indian Army were originally held separately, but later added to this series. For the majority of the series there is no correspondence date range, and the nominal description has been abbreviated to surname and initial.

However, our experience is that most of our family historian users are interested in the records of individual officers, whether it is because they are tracing their ancestors or pursuing an interest in World War One poets. Consequently, they are far better served by having access both to the series entry and to an online list of all those officers whose records survive.

Similar problems are posed by very large series. For example, the main series of Home Office papers at the PRO has 26,000 files. The series level scope and content note is relatively comprehensive:

This is the main series of Home Office papers. The subject matter of the files reflects the diversity of domestic matters dealt with by the Home Office. These have included aliens, betting and gaming, borstals, building societies, burials and cremations, bye-laws, changes of name, the Channel Islands, charities, children, civil juries, drugs, ecclesiastical matters, elections, explosives, extradition, factories, fire services, firearms, honours, Ireland and subsequently Northern Ireland, the Isle of Man, magistrates, markets and fairs, Lords Lieutenant, mental patients and criminal lunatics, naturalization, pardons, petitions of right, poisons, police, prisons, prostitution, public order, use of royal title by institutions and companies, universities, vivisection, wartime measures, warrants and wild birds.

Further Home Office papers are in a supplementary series, HO 144:

These are files on criminal and certain other subjects, separated from the main file series in HO 45 because of their sensitivity at the time of transfer.

Both series are described clearly and accurately in conformance to standards and would be of great value to any researcher who is interested in broad issues of policy. They are of less use to people interested in particular individuals: there is nothing to tell the researcher that HO 45 contains the papers relating to the conviction of Timothy Evans for a murder he did not commit nor that HO 144 contains the Home Office's files on Dr. Crippen and the Jack the Ripper murders in Whitechapel.

In practice many archive users require clear, accurate and searchable descriptions of individual files (or their equivalents). They then move 'bottom upwards' to see the context in which the documents were created and used. A researcher into the Welsh Aberfan tip disaster may start with a keyword search for all references to the disaster but will very quickly see the value, if not the necessity, of knowing exactly which of all the bodies connected to the disaster and subsequent inquiry produced the records in question.
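
The 'bottom upwards' movement can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of the PRO's catalogue system: the `RECORDS` table, its field names, the level labels, the titles and the file reference "HO 144/EXAMPLE" are all hypothetical. A keyword search finds a file-level entry, and the chain of parent links is then followed upwards to recover the series and the creating body.

```python
# Hypothetical flat catalogue table: each entry points to the level above it.
RECORDS = {
    "HO": {
        "level": "department",
        "title": "Home Office",
        "parent": None,
    },
    "HO 144": {
        "level": "series",
        "title": "Supplementary series of Home Office papers",
        "parent": "HO",
    },
    # Made-up reference and title, for illustration only.
    "HO 144/EXAMPLE": {
        "level": "file",
        "title": "Papers relating to Dr. Crippen",
        "parent": "HO 144",
    },
}


def search(keyword):
    """Return the references of all entries whose title contains the keyword."""
    needle = keyword.lower()
    return [ref for ref, rec in RECORDS.items() if needle in rec["title"].lower()]


def context(reference):
    """Walk from one entry up to the top level; return the chain top-down."""
    chain = []
    while reference is not None:
        record = RECORDS[reference]
        chain.append((record["level"], reference, record["title"]))
        reference = record["parent"]
    return list(reversed(chain))


for hit in search("crippen"):
    for level, reference, title in context(hit):
        print(f"{level:<10} {reference:<16} {title}")
```

Returning the whole chain with each hit means the user sees the file, the series it belongs to and the body that created it together, rather than an isolated title.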

Technology in the late twentieth century made it possible to unite description of the whole with description of the component layers making up that whole. The emergence of the World Wide Web made remote access to full, up to date catalogues a possibility, and a ready-made means of providing multi-level description in an automated environment was established with the advent of Encoded Archival Description (EAD) in the mid 1990s. EAD, a document type definition (DTD) written in Standard Generalized Markup Language (SGML), was specifically designed for multi-level archival description. Largely because it maps so well to ISAD(G), there has been a large international take-up of EAD. Many institutions have mapped existing finding aids to EAD; many have developed EAD templates for new cataloguing. Even where database solutions have been adopted for archival catalogues, a requirement has often been that the system should be capable of EAD import and export. In a very short space of time EAD has taken on the role of chief means of data exchange for archival descriptions. It has proven popular because it can provide one stop access to the description of the whole archive together with its parts.
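
To show how EAD nests the parts inside the whole, the following Python sketch builds a small EAD-style fragment with the standard library's xml.etree.ElementTree module. The element names used (archdesc, did, unitid, unittitle, dsc, c01, c02) and the level attribute come from EAD, but the fragment is deliberately incomplete: it omits the enclosing ead element and header and has not been validated against the DTD. The helper function name, the titles and the file reference "HO 144/EXAMPLE" are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def add_component(parent, tag, level, unitid, unittitle):
    """Append a component (e.g. c01, c02) with its own identification block."""
    component = ET.SubElement(parent, tag, {"level": level})
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unitid").text = unitid
    ET.SubElement(did, "unittitle").text = unittitle
    return component


# Description of the whole.
archdesc = ET.Element("archdesc", {"level": "fonds"})
top_did = ET.SubElement(archdesc, "did")
ET.SubElement(top_did, "unitid").text = "HO"
ET.SubElement(top_did, "unittitle").text = "Home Office"

# Description of the subordinate components, nested level within level.
dsc = ET.SubElement(archdesc, "dsc")
series = add_component(
    dsc, "c01", "series", "HO 144", "Supplementary series of Home Office papers"
)
# Made-up file-level entry, for illustration only.
add_component(series, "c02", "file", "HO 144/EXAMPLE", "Hypothetical individual file")

xml_text = ET.tostring(archdesc, encoding="unicode")
print(xml_text)
```

Because each component sits inside its parent, a single document carries both the collection level description and every layer beneath it, which is what makes EAD suitable for exchange between systems.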

Why stay with collection level description?

Real life for all too many archival institutions consists of running to try to stay still. Too few resources, in terms of money and skilled staff, combine with increased pressure for service. Very often the immediate has to take precedence over the important or, at the very least, the demands of the immediate preclude a fully worked through strategy of balancing priorities. Cataloguing backlogs build up. In almost every case, a decent collection level description of each of an institution's holdings is prepared as a necessary control in the accessioning process. Very often, though, more detailed cataloguing has to take its turn in the queue. Priority there may be determined strictly chronologically or may be decided by such factors as importance, user demand or particular staff skills available.

Even when the whole archive has been fully catalogued, there may be great dissatisfaction with the results. With legacy finding aids, the standard of archival description may be judged too poor or too idiosyncratic for a modern global audience. Too much may have been dependent on the human eye's interpretation of layout on a printed page or understanding of particular typing conventions. Data elements may have been mixed in together to form one mass of 'description', or levels may have been jumbled in a way that a machine will all too quickly expose but which a person browsing through page after page may not pick up.

When thinking in terms of retroconversion from paper based finding aids to electronic form, and especially for presentation on the Web, difficult choices about priorities sometimes have to be made. One such choice can be between depth and breadth: whether in the short term it is more beneficial to end users to have multi-level catalogues of some archives available remotely or whether it is more useful to view across the board collection level descriptions that indicate the locations of the archives (together with more detailed listings when these exist).

For one or more of the reasons noted above, a significant number of individual record offices, or of bodies working together to form a network, have decided to focus, at least in the short term, on providing collection level descriptions of their holdings on the Web.

A further consideration may be the applicability of ISAD(G) to cataloguing in an automated environment. ISAD(G) emerged from the world of paper based finding aids. There, avoiding redundancy of information, that is, avoiding repetition of information from one level to another, made total sense. But what of the world of automation? What of the possibility of an isolated 'hit' in response to a particular search? How does the end user make sense of the 'hit' without adequate contextual information being returned as part of or with the hit? Do we need brand new rules of archival description for an automated environment, with a global, and largely unknown, audience? Is it safer for now to stick with collection level description for any other context than the local one, where human intervention is to hand?

Why multi-level description?

We may well have to rethink or readjust some of the standards we follow as we take on fully the promises and challenges of online delivery of our archival catalogues. What we do know is that existing users want full catalogues on the Web and that new users (previously hampered by geography, limited mobility, lack of resources or simple unawareness of the possibilities offered by archival research) will benefit most from remote access to the whole together with its parts.

Copyright © 2000 Meg Sweet and David Thomas

