Digitization Prioritization

Digitization Return on Investment (ROI): Benefit/Cost

Often, the total size of a collection is monumental, even when measured at the pace of a rapid mass digitization program. In nearly all cases, the collection will have to be segmented and prioritized, as comprehensive digitization may take years or decades. Both the costs of digitization (in terms of hardware, personnel, storage outlays, and time spent) and its benefits must be considered, resulting in a traditional Return on Investment (ROI) calculation.

“Image quality has been paramount throughout the history of our digitization program. We select projects carefully according to agreed upon criteria. These include our current level of access to the material, whether digitization will help promote the preservation of the current state of the material, the quality of digital surrogate we can produce, the research value of the content, the practicality of the project’s scope, the state of the cataloging records for a collection, handling considerations for the materials, the perceived audience for the resulting Preservation Digital Objects, and whether the project will build on projects previously completed. We measure these factors against the investment required for the project.”

– Julie Ainsworth, Head, Photography and Digital Imaging, Folger Shakespeare Library

Integration with Other Activities

It is often possible, with proper planning, to coordinate the digitization of a collection with other required activities or initiatives. One of the most common approaches is to combine the rehousing, cataloging, or conservation treatment of a collection with its digitization. This not only saves time and resources but also reduces the net handling that a collection experiences. Digitization that is integrated with other activities has an inherently reduced cost/investment and therefore an inherently higher ROI.

“Digitization is a great motivator for bringing special collections material through the conservation lab. When items or collections come up to the lab for digitization we can take care of both conservation treatment and digitization concurrently. This way the items are only coming into the preservation and conservation department once.”

– Bethany Davis, Digital Processing Coordinator Librarian, University of Iowa Libraries

Provenance, Copyright, Location

The provenance, copyright status, and storage location of a collection all affect its digitization ROI. Provenance can determine whether digitization is contractually required or restricted. For instance, many collections are donated on the condition that they will be made available to the public, sometimes with specific riders that they be made available publicly via an online portal or within a specified period of time. Any implicit or explicit commitment that an institution has to digitize a particular collection should be strongly considered in its prioritization.

Copyright status can legally limit the creation and use of digital copies of a collection in ways that differ from the limits on access to the physical originals. For example, contemporary works are generally under copyright, and capturing and using digital reproductions of them may require a specific release. If an institution has two collections that are of similar overall value and condition but one has passed into the public domain, it may be reasonable to prioritize the digitization of the public domain collection, as the derivatives of this collection will be easier to use.

Location can pose obvious logistical challenges. For instance, digitization may be more costly (in terms of time and money) if the target collection is held at a branch or satellite collection away from the main digitization lab. It may be sensible to prioritize the digitization of collections that do not present logistical complications.

Uniqueness of the Collection

Collections that are not duplicated at other institutions can be considered to have a higher ROI for digitization. For instance, circulation material that is already available in publicly accessible digital collections provides minimal ROI for duplicate digitization. In contrast, a one-of-a-kind document with no peers or equivalents, or a copy in better condition than those held at other institutions, provides obvious research value.

Suitability to Today’s Technology

Today’s technology can achieve rapid digitization at preservation-grade standards for many, but not all, objects. For instance, an A1-size map can be recorded in a single capture, allowing rapid digitization of a collection of that size, while an A0-size map requires resolution only available by stitching or by using slower scanning or multi-shot approaches. Digitization of such oversized material today is certainly possible, but comes at a significantly higher time cost than digitization of smaller material.

Given the constant advancement of technology, it may make sense for some institutions to prioritize digitization of those parts of a collection that can be done rapidly with today’s technology. By the time the institution turns to the larger material, technology may be available to digitize it without the higher time cost.

Consider that in early 2015 the highest-resolution single-shot digital back available was the 80mp series of digital backs from Team Phase One, allowing a maximum object length of 37” for preservation-grade imaging (assuming 300ppi @ 93% sampling efficiency). Any object larger than 37” on the long side currently requires multiple captures, multi-shot, or scanning capture, all of which are significantly more time-consuming and error-prone than a standard single-shot capture. If an institution had two collections, one a large set of 20”x30” posters and the other a large set of 30”x40” posters, it may make sense to prioritize the systematic digitization of the 20”x30” collection: every poster can be captured in a single shot, so the time and cost per item digitized will be significantly lower. In the future, when higher-resolution backs become available (e.g. 100mp), the institution could upgrade its digital back and digitize the 30”x40” posters at a similarly low per-object cost.

  • Example 1:
    • Ten thousand 20”x30” posters are digitized with 80mp single captures in 2015. Rate is 100 per day.
    • Ten thousand 30”x40” posters are digitized with 100mp single captures in 2016. Rate is 100 per day.
      • Total project time is 200 days.
  • Example 2:
    • Ten thousand 30”x40” posters are digitized with stitched 80mp captures in 2015. Rate is 20 per day.
    • Ten thousand 20”x30” posters are digitized with 80mp single captures in 2017. Rate is 100 per day.
      • Total project time is 600 days.

In both of the above examples the institution has successfully created 20,000 Preservation Digital Objects from its poster collection. However, in Example 1 it has saved 400 days of labor and overhead by prioritizing its digitization program based on the suitability of the collection to today’s technology, and then upgrading its digital back to tomorrow’s technology for the sake of efficiency.
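The arithmetic behind the two examples can be checked with a short sketch. The Python below is illustrative only: the `Phase` class and the `total_days` helper are hypothetical names introduced here, and the quantities and daily rates are simply those stated in Example 1 and Example 2.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One stage of a digitization project (hypothetical helper type)."""
    label: str
    items: int          # number of objects to digitize
    items_per_day: int  # sustained capture rate

    def days(self) -> int:
        # Round up so that a partial day counts as a full working day.
        return -(-self.items // self.items_per_day)


def total_days(phases: list[Phase]) -> int:
    """Total working days when the phases run one after another."""
    return sum(phase.days() for phase in phases)


# Example 1: single captures now, single captures after a back upgrade.
example_1 = [
    Phase("20x30 posters, 80mp single capture", 10_000, 100),
    Phase("30x40 posters, 100mp single capture", 10_000, 100),
]

# Example 2: the larger posters are stitched with today's 80mp back.
example_2 = [
    Phase("30x40 posters, stitched 80mp captures", 10_000, 20),
    Phase("20x30 posters, 80mp single capture", 10_000, 100),
]

days_1 = total_days(example_1)
days_2 = total_days(example_2)
print(days_1, days_2, days_2 - days_1)  # 200 600 400
```

The sketch assumes the phases run sequentially on a single capture station and ignores setup, quality control, and the cost of the upgrade itself, all of which a real plan would need to account for.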

Temporal Relevance

Often a collection will vary in relevance to researchers and the public based on anniversaries or a relationship with current events. For instance, the centennial of the start of the First World War in 2014 created heightened activity around collections from that period. Likewise, the ongoing destruction of sites of Cultural Heritage significance in Syria might change the ROI of collections from that region.

In addition, collections can often have limited periods of relevancy; such was the case with military personnel service records at the National Archives of the United States (NARA). This collection has a hybrid of cultural and practical value, but the latter is fading away as the individuals referenced in the records pass away.

“We are responsible for preserving millions of fire-affected Military Service record documents that were damaged in a fire in 1973 at our St. Louis, Missouri facility. These records are of immediate concern to aging veterans seeking proof-of-service for access to veterans’ benefits. NARA is currently imaging the most heavily damaged and fragile subset of these records with the DT Multispectra Camera; creating digitized versions of these records that reveal lost content from charred areas of documents. Before working with Digital Transitions, NARA tested a number of other imaging systems but could not get the project off the ground due to the lack of quality and information recovery effect in the IR images produced. The DT Multispectra Camera was the only system that met our production requirements of rapidly capturing high quality images, straight from the camera, in both visible and IR. Without the engagement of Digital Transitions in solving this problem, the NARA project would not have proceeded.”

– Noah Durham, Supervisory Imaging Specialist, National Archives of the United States


Political Factors

In an ideal world, political factors could be entirely ignored in deference to the substantive value of a collection. However, the practical reality is that benefactors, grant controllers, and the public will not view the digitization of all collections as equally valuable. The PR and fundraising value of digitizing a collection may be worth considering as part of the ROI calculations when prioritizing digitization initiatives [see Marketing, Branding, Reach, & Politics]. This is especially true with collections that are controversial in nature. For instance, some funding entities may have ideological objections that discourage them from funding digitization of a collection related to the science of evolution or the history of transgender individuals, thereby potentially raising the cost (e.g. more time required to find grant sources). However, broadening access to such collections may serve institutional goals such as public education, and may therefore influence the prioritization of digitization regardless.

Many, including the authors of this paper, would prefer that Cultural Heritage institutions entirely transcend politics and cultural sensitivities in their digitization ROI analysis. Institutional missions such as preserving and presenting unbiased and accurate historical records should not take a back seat to realpolitik. We include the topic here since it is clearly part of the conversation that must be had during the planning of a digitization program. If significant controversy is likely to arise from the digitization of a particular collection, the appropriate stakeholders (e.g. Public Relations and Legal Affairs departments) should be made aware in advance so they may be appropriately prepared.

Physical State of the Collection & Risk of Future Degradation

Some collections are in excellent condition and are inherently stable. Stone statues in good condition, properly stored in a preservation environment, are at very low risk of future degradation. Unfortunately, this is not typical of every collection – sometimes, objects in a collection are already in a poor physical state or are inherently unstable, and are likely to deteriorate further. It is sensible that these at-risk collections receive priority in the planning of a digitization program.

Collection Preparedness & Curatorial Understanding

The creation of a proper digital collection relies on an understanding of the meaning and context of the physical collection. Preliminary curatorial research can often reveal aspects of a collection that influence details of its digitization protocols. For example, a curator might, after research, find that a particular painter used unusual substrates whose textural properties added to the meaning of the subject matter; knowing this, the curator might recommend that the paintings receive both standard and texture-enhanced digitization.

It is generally expected that the materials in a collection slated for digitization are stable, are organized in a manner that optimizes preparatory processing, and have supporting collection records to help facilitate collections processing workflows. If a collection is not physically ready, adequately organized and documented, or sufficiently understood, digitization should be delayed, regardless of priority status. The surest path to an inefficient digitization program is needless duplication of effort; “do it once, and do it right.”