D-Lib Magazine
The Magazine of Digital Library Research

March/April 2015
Volume 21, Number 3/4

Storage is a Strategic Issue: Digital Preservation in the Cloud

Gillian Oliver
Victoria University of Wellington, New Zealand
gillian.oliver@vuw.ac.nz

Steve Knight
National Library of New Zealand
steve.knight@dia.govt.nz

DOI: 10.1045/march2015-oliver

 


Abstract

Worldwide, many governments are mandating a 'cloud first' policy for information technology infrastructures. In 2013, the National Library of New Zealand's National Digital Heritage Archive (NDHA) outsourced storage of its digital collections. A case study of the decision to outsource and its consequences was conducted, involving interviews of the representatives of three key stakeholders: IT, the NDHA, and the vendor. Clear benefits were identified by interviewees, together with two main challenges. The challenges related to occupational culture tensions, and a shift in funding models. Interviewees also considered whether the cultural heritage sector had any unique requirements. A key learning was that information managers were at risk of being excluded from the detail of outsourcing, and so needed to be prepared to assert their need to know based on their stewardship mandate.

 

1 Introduction

Internationally, government 'cloud first' mandates are forcing serious consideration by many public service organisations of outsourcing information technology (IT) requirements to external providers. Cultural heritage institutions are no exception to this, but there is a paucity of advice and experience to draw on to inform decision making. The purpose of this paper is to document the decision taken to outsource the storage of the National Library of New Zealand's National Digital Heritage Archive (NDHA), and in so doing provide some empirical evidence to assist other institutions worldwide facing similar decisions. The tendency to equate digital preservation with cold storage could lead to incorrect assumptions about the outsourcing solution required. It is important to articulate the symbiotic relationship between access and preservation. Those providing digital preservation services need to provide access and preservation management to materials in active storage (i.e. active retrieval and active management of collections over time).

The paper begins by providing the background of the international and national governmental context with regard to cloud computing, and then reports on the literature relating to use of cloud computing by cultural heritage institutions. This is followed by the case study of the NDHA context and the decision to outsource, drawing on interviews conducted with the key individuals involved. The paper concludes with discussion of findings and draws out some key recommendations for others considering outsourcing the storage of digital collections.

 

2 Background — The Global Context

The United States National Institute of Standards and Technology (NIST) has developed a concise definition of cloud computing to serve as a basis for comparison of services and deployment strategies:

"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." (Mell and Grance, 2011, p.2)

The NIST definition further identifies service and deployment models. Of relevance to this paper is the Infrastructure as a Service (IaaS) model, where

"The capability provided to the consumer is to provision processing, storage, networks and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components (e.g. host firewalls)" (Mell and Grance, 2011, p.3).

Four deployment models are identified: private, community, public and hybrid clouds. It is the private cloud model that is most relevant to this paper, where

"The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g. business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises" (Mell and Grance, 2011, p.3).

Internationally, governments have seen the potential for cloud computing to enable the delivery of more efficient and effective public services, with compelling cost economies (Irion, 2012). It has been argued that cloud computing architectures are fundamental in the necessary transformation of governments to provide citizens with services in the digital age (Fishenden and Thompson, 2013). New Zealand is of course not immune from these global trends. In 2011, the New Zealand government embarked on a programme of transformation of the public sector to achieve significant cost savings and economies of scale via the use of shared services (Guy, 2011).

 

3 Cultural Heritage Institutions and the Cloud

Despite the ubiquity of cloud computing and its promotion by governments worldwide, concerns about trustworthiness and the mandate for digital archives to preserve unique treasures in perpetuity have meant that cultural heritage institutions have scarcely been early adopters of this innovation. New Zealand's NDHA is possibly the first, and perhaps the only, national cultural heritage initiative to outsource the storage of its collections. Nonetheless, the first published reports of outsourcing are starting to appear, notably from Britain. In 2014, the United Kingdom's National Archives released a set of guidance documents for archives considering taking this step (National Archives, 2014). The guidance includes three provisos which should be considered the bottom line — they must underpin any negotiation of an outsourcing contract:

"First, data held in archives must be expected to be both preserved and accessible beyond the commercial lifespan of any current technology or service provider.

Second, an approach to addressing serious risks, such as loss, destruction or corruption of data that is based purely on financial compensation will not be acceptable, as this takes no meaningful account of the preservation and custodial role of archives; and,

Third, in order to reinforce the criticality of the first two elements, explicit provision must be made for pre-defined exit strategies ... and effective monitoring and audit procedures" (p.10).

The guidance document provides a long list of the benefits of outsourcing. Perhaps the most significant of these from a digital archiving perspective is the potential for improved capability in digital preservation. Because of the feasibility of automated replication in multiple locations and the specialized expertise of vendors in terms of digital storage and integrity checking, it may be possible to achieve improvements at bit preservation level (p.11). This would also need to be reinforced through referential integrity checks across the database and METS files, as well as any other systems supporting the digital preservation programme.
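The kind of integrity checking referred to above can be illustrated with a minimal sketch. It assumes that a checksum was recorded for each file at ingest and that the same files can be read back from a storage replica. The function names, the manifest structure and the choice of SHA-256 are illustrative assumptions; they do not describe how the NDHA or any particular vendor implements fixity checking.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str], replica_root: Path) -> dict[str, str]:
    """Compare every file listed in the manifest with one storage replica.

    `manifest` is a hypothetical structure mapping a relative file path to
    the checksum recorded at ingest (in practice this information might be
    held in preservation metadata such as METS files).
    Returns a report mapping each path to "ok", "missing" or "corrupt".
    """
    report = {}
    for relative_path, expected in manifest.items():
        candidate = replica_root / relative_path
        if not candidate.is_file():
            report[relative_path] = "missing"
        elif sha256_of(candidate) != expected:
            report[relative_path] = "corrupt"
        else:
            report[relative_path] = "ok"
    return report
```

Running the same check against each replica location, and comparing the reports, is one simple way to detect a copy that has silently diverged from the others.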

The TNA guidance is accompanied by four case studies of outsourcing by cultural heritage institutions. The settings of these case studies range from a local history centre to the British Parliamentary Archives, but do not include a national library or archives.

 

4 New Zealand

In 1965, the National Library of New Zealand (NLNZ) was established as a standalone government department. In 2003, New Zealand's legal deposit legislation was updated to include digital resources (New Zealand Government, 2003). This was significant as it meant that any New Zealand digital content created was required by law to be deposited at the national library, to be kept in perpetuity. Therefore it was necessary to build a repository to serve as a digital archive, and in 2008 the NDHA was launched (see Knight, 2010 for background on development and implementation).

In 2011, NLNZ (together with Archives New Zealand) was integrated into the Department of Internal Affairs (DIA). The Government Chief Information Officer (GCIO) is also part of DIA. The GCIO is charged with delivering sustainable business savings of NZ$100 million per year by 2017 (New Zealand Government, 2013). One strategy identified to achieve this is discontinuing in-house owned and operated technology assets, and moving to a service based model instead (New Zealand Government, 2013). As a first step towards this goal, a contract was negotiated with three vendors to act as approved data centres, which effectively constituted a private cloud. The DIA was tasked with leading the implementation of infrastructure as a service (IaaS) across the whole of government (incorporating Storage as a Service and Backup as a Service amongst other potential services), and the NDHA agreed to be one of the initial pilot groups, with a specific focus on outsourcing storage. The size of the NDHA collection can be seen in the following figures:

Figure 1: Total intellectual entities and files in permanent repository

Figure 2: Size of permanent repository, in terabytes

The background to the decision to pilot outsourcing, including the concerns identified by National Library stakeholders and details of the migration method, has been documented by Cynthia Wu (Wu, 2013).

 

5 The Case Study

The motivation for this case study was to contribute to a much larger project: InterPares Trust, led by Professor Luciana Duranti of the University of British Columbia. The goal of this international research agenda is to "generate theoretical and methodological frameworks to develop local, national and international policies, procedures, regulations, standards and legislation" (InterPares Trust, 2015) for trustworthy digital records and data in a global, networked environment. Of special concern is the need to ensure a persistent digital memory, hence the decision to outsource the storage of New Zealand's digital memory seemed particularly worthy of investigation. The purpose of the NDHA case study therefore was to contribute to InterPares goals, in particular to inform the development of policy relating to the use of cloud storage providers.

Data were collected by interviewing seven individuals in May and June 2014. Interviewees were representatives of the three parties involved in deciding to outsource, negotiating the contract and implementing the change. The three parties were Government Technology Services (the branch of DIA with responsibility for implementing IaaS across government), the NDHA and the vendor. Snowball sampling identified those individuals who could comment from a strategic perspective, as opposed to a focus on operational detail. This was interpretive research, so the findings cannot be generalized, but they do nevertheless provide a rich picture of the benefits and challenges of outsourcing. Quotations used below are attributed to one of these three groups by the use of initials: IM (for information managers working with NDHA), IT (for members of Government Technology Services) and VE (for individuals associated with the vendor).

 

5.1 Benefits

There were clear benefits identified by interviewees, and very genuine enthusiasm for the features of an outsourced environment. It was explained that the opportunity to outsource storage to the Cloud was presented at a very good time — the existing in-house IT infrastructure needed to be upgraded, and its capacity to store increasing amounts of data was of concern. One respondent noted that the decision to outsource was an "opportunity to look at the whole topography of how the NDHA was laid out" (IM2), in other words to revisit original design decisions and to refine where necessary. This is a very significant factor given that the initial design and implementation of the NDHA was a pioneer endeavor — there were no pre-existing digital preservation systems that could be used as templates. In outsourcing the storage component it was possible to take advantage of the vendor's experience in managing large sets of data and load balancing.

Other benefits identified pointed to the fact that the hardware used would be state of the art, and of a consistent standard not likely to be seen in an in-house IT facility. A contractor whose business depends on the quality of service provided will have a tailor made modern facility, whereas in-house IT services are likely to be characterized by organic and possibly haphazard growth over long periods of time. It was also pointed out that a contractor will be much more attuned to customer service in the sense that if greater capacity was required, it would be made available as soon as possible. In contrast, in-house requests for greater capacity could involve a lengthy negotiation process navigating internal approval channels.

Another benefit that attracted comment was greater transparency about the costs involved in digital preservation activities, which makes it possible to make informed decisions about particular courses of action, for instance whether to manage digitised content in the same way as born-digital content. One respondent made an impassioned plea for those working in the cultural heritage sector to seize the opportunity represented by this new service and funding model, to escape from the victim mentality that is characteristic of this sector when it comes to the storage of collections, both digital and physical:

"... we tend to kind of preload, get some capacity and fill it and then panic and try to get some more investment. Which I think in the longer term is unhelpful because it masks the true year on year costs of operating the business and makes it easier for government or other investors once every 10 years to say here's a bit of money, go away — and managing within that becomes the challenge for the institution, rather than continually having a genuine conversation ..." (IT3).

The point being made was that those working in the cultural heritage sector make decisions reactively rather than proactively, and so are always on the back foot. Worse, this condition is accepted as a way of life.

 

5.2 Challenges

Two main challenges emerged from the interviews, relating to occupational cultures (tensions between the information management and information technology perspectives) and to funding. These two areas were touched on by all interviewees.

From the information managers' perspective, being taken seriously and being included in decision making was a major challenge initially:

"In the beginning we weren't even invited to meetings, we were just told this stuff was happening" (IM1)

From the information technology management perspective, the mindset of cultural heritage professionals was perplexing, to say the least. There was little understanding of why information managers would want to be concerned with the detail of outsourcing, rather than just leaving everything to the experts (i.e. those working in IT).

"Library and archives ... have a different view of control than other branches and departments. They want to have a lot more control over where their information assets are, they want to know about them ... to a level of detail I think is unnecessary." (IT1)

This interviewee compared the attitudes encountered in the cultural heritage sector with other branches of government:

"... other people are prepared to give up a bit more control and trust ... provided their criteria are met." (IT1)

In other words, from the IT perspective the information managers were perceived as crossing boundaries into areas which were not of their concern. The vendor also commented on the differences encountered from their perspective:

"... the business was very much involved right from the outset to understand the solution we were putting up, the level of discussion and technical diligence that they went through was a little bit more in depth than the typical engagement you would have." (VE1)

Another IT respondent drew an analogy to the infrastructure of the physical world saying that the provision of IT services should be viewed in the same way as the services provided by electricians and plumbers:

"you are always reliant on third parties, so recognize the professional disciplines there" (IT3)

This difference in perspective is not necessarily one of trust but may be traced back to the emergence of digital preservation and the need for libraries and archives to understand technology and infrastructure more deeply in order to be able to attest to the authenticity and integrity of their digital collections over time, as they have historically done in the physical world.

Eventually, the gap in understanding between information management and information technology professionals was bridged by an intermediary. This was an IT contractor who had been heavily involved in the initial design and subsequent implementation of the NDHA. The contractor was someone therefore who had a deep understanding of the mission and purpose of the NDHA and consequently why decisions to do things a certain way had been made. In addition, the contractor was well known and respected by colleagues in Government Technology Services. This 'honest broker' role was felt to be essential to the success of the project.

The other main challenge concerned a change in the funding model. This change was a consequence of moving from the purchase of equipment for storage for use in house, to the provision of storage as a service. In the former case, funding would be drawn from an organisation's capital expenditure (CapEx). In the latter, the costs are accrued to the organisation's operating budget (OpEx). And it's not simply a case of funding being reallocated from capital to operating — a much more complicated scenario is at play. This, it must be stressed, is not a situation unique to the New Zealand government environment, as was made clear by an interviewee:

"It's ironic, in the UK, the States and the Asia Pacific [region] there is this big push to consume things as a service, which everyone knows is moving you down an OpEx route. And the financial models aren't there to let you do it. And in most places it's the governments that are pushing you down this route, and yet the same people are going — well, we can't support funding it. So — it will be sorted, because it's the way the world is going, it's just a case of when. It will be awkward for a period of time" (IT1).

This interviewee went on to say "We've created this whole new channel with no new funding. That's a real pressure. ... the actual unit prices of storage are cheaper. The security and the service we're getting is better. The offering stands in its own right but we've got these funding issues around it" (IT1).

It is indeed particularly ironic given that the push for the adoption of cloud computing was motivated by the government's need to realize significant cost savings and economies of scale via shared services (Wu, 2013). As another respondent made clear, it was extremely difficult to make the decision to move to an outsourced model on the basis of cost savings, as the operating costs for in-house storage were not transparent. So rather than making an informed decision, those involved had to take a leap of faith in deciding to make the change.
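The difference between the two funding models discussed in this section can be made concrete with a small arithmetic sketch. Every figure below (storage demand, block size, purchase price and service fee) is hypothetical and chosen purely for illustration; none comes from the NDHA case. The sketch also deliberately ignores in-house running costs such as power and staff which, as noted above, were not transparent.

```python
def capex_spend(demand_tb, block_tb, price_per_tb):
    """Yearly spend when capacity is bought up front in fixed-size blocks.

    A new block is purchased only when demand exceeds installed capacity,
    so spending arrives in occasional large lumps.
    """
    spend, capacity = [], 0
    for needed in demand_tb:
        outlay = 0
        while capacity < needed:
            capacity += block_tb
            outlay += block_tb * price_per_tb
        spend.append(outlay)
    return spend


def opex_spend(demand_tb, fee_per_tb_year):
    """Yearly spend when storage is paid for as a service, per TB used."""
    return [needed * fee_per_tb_year for needed in demand_tb]


# Hypothetical five-year demand in terabytes.
demand = [100, 150, 200, 250, 300]

print(capex_spend(demand, block_tb=300, price_per_tb=1000))
# [300000, 0, 0, 0, 0]
print(opex_spend(demand, fee_per_tb_year=300))
# [30000, 45000, 60000, 75000, 90000]
```

In this illustration the capital model shows a single large outlay followed by years of apparently free storage, whereas the service model produces a cost in every year that rises with the size of the collection. That recurring, visible cost is what lands on the operating budget.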

 

6 Unique Requirements of the Cultural Heritage Sector

Interviewees were asked if they considered that the cultural heritage sector had any different or unique requirements that needed to be taken into account when moving to an outsourced model for storage. The vendor and the information managers were in agreement that a higher level of assurance was needed:

"IT have a very vanilla view of servers, infrastructure, applications ... [in contrast] the NDHA, they are very passionate about the data that sits in there, on the servers, the longevity of it, the reputable data, ensuring that it doesn't change over time, probably more than what IT folk typically are." (VE1)

The vendor went on to explain that "... the business was very much involved right from the outset to understand the solution we were putting up, the level of discussion and technical diligence that they went through was a little bit more in depth than the typical engagement you would have." (VE1)

"If most organisations lose a document, so long as they get the document back they're pretty happy. But because of digital preservation being what it is, you don't want to lose or corrupt any of the bits, they have to be exactly the way they were before" (IM2).

Another unique feature identified by interviewees related to the mission and purpose of the NDHA. An all of government contract had been negotiated which included a catalogue of specific services that were to be provided. The contract was understandably written for generic workloads — reflecting the types of information transactions taking place every day in government offices. In the case of the NDHA however the amount of data needing to be stored was significantly larger than the norm, and needed to be kept in perpetuity rather than for a limited period of time. The NDHA workload was described as having:

"very high initial throughput, and very peaky workloads at ingestion and ... very random recalls and the capacity is much larger than what the service catalogue was originally written for" (VE2).

The volume of data and longevity requirements, together with the need for greater assurance, affect the nature of the backups carried out, as well as protection and retention regimes.

 

7 Discussion

Findings from this case study indicate that very clear benefits are possible for cultural heritage institutions in moving to the Cloud, particularly in terms of gaining access to state of the art equipment and facilities, and expertise in dealing with extremely large datasets. In the NDHA case, risks associated with data sovereignty were minimized, as the only option was the service provided by what was in effect a private cloud.

Notwithstanding those benefits, the decision to outsource did surface a number of other issues which are likely to be internationally relevant. The over-arching issue, which was identified by one interviewee as an opportunity, is the paradigm shift in terms of control and financial model. To really understand the profound nature of this change in model, and respond to it as an opportunity rather than a threat, it will be essential for information managers to understand the strategic dimensions of storage and of broader infrastructure decisions. Storage, whether for physical or digital collections, has tended to be regarded as purely an operational concern, but the decisions made about storage have the potential to influence the very core of the institution, and may impact its ongoing viability. Hence taking a long term view, and understanding the consequences of outsourcing from a control and financial management perspective, are essential.

Andrew Abbott, in his theory of professions, articulated the idea of a competition for jurisdiction (1988). Abbott argues that professions arise as a result of system disturbance, and eventually establish their jurisdiction over a particular problem area or, put another way, their responsibility for a given set of issues. As society develops new technologies, new problems emerge, and occupations either respond or lose ground to other, newer professionals: a question of survival of the fittest. This idea has been explored in the past in the library and information science domain, with the various professions active in the information environment viewed as engaging in a competition for jurisdiction (Van House and Sutton, 1996). Given the complexity of today's information environment, the competition for ownership of specific domains has become more and more acute.

Establishing a claim to specialist expertise, and being acknowledged as having a particular perspective to contribute to decision making, is often an uphill battle for information managers. One clear prerequisite is for information managers to understand their own responsibilities, and to be ready to explain repeatedly why their concerns matter. In so doing, they open up the potential to shift the thinking of the other professions involved. Thinking needs to move beyond a narrow competition, to viewing the information environment as expansive and complex enough to need a network of independent specialists, similar to the healthcare environment for instance. Faced with intractable views, however, the best option might be to try to identify an 'honest broker', someone who understands both sides and is respected by everyone involved. A key learning from this case study is that information managers should expect to be excluded from the detail of outsourcing, and so must be prepared to be assertive and to establish their need to know based on their mandate to act as stewards of information as an authoritative resource.

 

8 Conclusion

Cultural heritage institutions should investigate using storage as a service offerings, and also look ahead to utilizing other cloud based services. The short term consequences of cost saving (i.e. increased burden on operating budgets) must be factored into decision making, and set against potential long term benefits.

Although it is not possible to generalize from the one instance explored in this paper, it seems likely that the requirements of cultural heritage institutions differ in terms of data quantity, longevity required, and spikes in activity level from those expected in generic, everyday office situations. Being able to articulate this difference, and to explain stewardship responsibilities, will assist in negotiating appropriate service levels. The ideal situation is one where a trusted individual can be identified, who can act as broker between information management and information technology professionals to assist in raising awareness of the different perspectives involved.

The nature of this changing environment, where in-house operations can be delivered as a service by third parties, is one where opportunities can be threats if information managers are not equipped to respond appropriately. Much of the responsibility for ensuring that information managers can adapt and be effective in this complexity rests with educators. It is imperative that new entrants to the information professions are equipped with the knowledge and skills necessary to approach and understand technology and infrastructure as a strategic issue within their sphere of influence.

 

References

[1] Abbott, A. (1988). The system of professions: An essay on the division of expert labor. University of Chicago Press.

[2] Fishenden, J., & Thompson, M. (2013). Digital government, open architecture, and innovation: why public sector IT will never be the same again. Journal of Public Administration Research and Theory, 23(4), 977-1004. http://doi.org/10.1093/jopart/mus022

[3] Guy, N. (2011, August 18). Speech - The future of government ICT.

[4] InterPares Trust (2015).

[5] Irion, K. (2012). Government Cloud Computing and National Data Sovereignty. Policy & Internet, 4(3-4), 40-71. http://doi.org/10.1002/poi3.10

[6] Knight, S. (2010). Early learning from the National Library of New Zealand's National Digital Heritage Archive project. Program: Electronic Library and Information Systems, 44(2), 85-97.

[7] Mell, P. and Grance, T. (2011). The NIST Definition of Cloud Computing. Special Publication 800-145, National Institute of Standards and Technology.

[8] National Archives (2014). How cloud storage can address the needs of public archives in the UK.

[9] New Zealand Government (2003). National Library of New Zealand (Te Puna Mātauranga o Aotearoa) Act 2003.

[10] New Zealand Government (2013). Government ICT Strategy and Action Plan to 2017.

[11] Van House, N., & Sutton, S. A. (1996). The panda syndrome: an ecology of LIS education. Journal of Education for Library and Information Science, 131-147.

[12] Wu, C. (2013). Adoption of infrastructure-as-a-Service at the National Library of New Zealand. Paper presented at Archiving 2013, Imaging Technology and Science, Washington, DC. Published in Final Program and Proceedings, pp. 176-182.

 

About the Authors

Gillian Oliver is the Programme Director, Master of Information Studies at Victoria University of Wellington, New Zealand. She is Honorary Research Fellow at the Humanities Advanced Technology and Information Institute, University of Glasgow and at The Open Polytechnic of New Zealand. Her professional practice background spans information management in the United Kingdom, Germany and New Zealand. Her research interests reflect these experiences, focusing on the information cultures of organisations.

Steve Knight is the Programme Director, Preservation Research and Consultancy at the National Library of New Zealand. PRC's primary focus is preservation of and access to New Zealand digital content with a particular view to modelling and developing solutions that can be scaled to national level.
