Digitization Process

IN Harmony Project Description

In October 2004, the Indiana University Digital Library Program received a grant from the Institute of Museum and Library Services to fund IN Harmony: Sheet Music from Indiana, a three-year project to catalog and digitize sheet music from the collections of four partner institutions: Indiana University's Lilly Library, the Indiana State Museum, the Indiana Historical Society, and the Indiana State Library. The purpose of the project is to create an online sheet music collection that will demonstrate how museums and libraries with complementary materials can work cooperatively to create shared digital resources. By concentrating initially on the collections of American sheet music owned by each of the partners, the project will accomplish two highly adaptable goals: 1) it will demonstrate how approximately 10,000 digitized pieces of sheet music and their attendant metadata can be presented on a single Web site, offering federated searching of all collections or access to one or more selected collections; and 2) it will demonstrate how collaborative digital library development can provide online access to the important regional collections of museums, libraries, and historical societies. These collections may be sheet music, or they may be important materials in other formats, such as photographs, maps, manuscripts, or artifacts.

Digitization Process

All digital files included in the IN Harmony project must adhere to stringent guidelines. All color balancing is done prior to scanning using the Silverfast software product as the driver for Epson 10000XL flatbed scanners. All scanners and monitors used for the IN Harmony project are color calibrated regularly to ensure accuracy and consistency of the digital images.

All images are created at 400 dpi. All covers, as well as all pages with color, are scanned in 24-bit color with an embedded Adobe RGB (1998) color profile. All other score pages are digitized in 8-bit grayscale with an embedded Gray Gamma 2.2 profile. Including the profiles helps ensure that each image reproduces as accurately as possible. All master files are scanned at 100% of the page size and saved as uncompressed TIFF files. To aid visual consistency and to ensure that no information is cropped, all pages in a score must have exactly the same pixel dimensions.

Quality Control

All IN Harmony master files must pass a two-step quality assurance procedure. The first step is an automatic quality control process that ensures the files are valid and well-formed. A set of computer programs systematically examines the embedded TIFF tags of every digital file to verify that all files are named according to convention, that they are uncompressed TIFF files, that each file has an embedded profile appropriate to its bit depth, that all images in the same score have identical pixel dimensions, and that all images were scanned at the appropriate resolution.
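The automatic checks described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual programs: it assumes the TIFF tags have already been read into a simple record (the hypothetical PageInfo class), the profile name strings are placeholders for whatever a tag reader reports, and the file-naming check is left out because the naming convention is not stated here.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Metadata already read from one TIFF's tags (hypothetical record)."""
    filename: str
    compression: str     # "none" for an uncompressed TIFF
    bits_per_pixel: int  # 24 for color pages, 8 for grayscale pages
    icc_profile: str     # name of the embedded color profile
    width: int
    height: int
    dpi: int


# Profile expected for each bit depth (placeholder strings, an assumption).
EXPECTED_PROFILE = {24: "Adobe RGB (1998)", 8: "Gray Gamma 2.2"}
REQUIRED_DPI = 400


def check_score(pages):
    """Return a list of human-readable problems found in one score's pages."""
    if not pages:
        return ["score has no pages"]
    problems = []
    first_size = (pages[0].width, pages[0].height)
    for page in pages:
        if page.compression != "none":
            problems.append(f"{page.filename}: not an uncompressed TIFF")
        expected = EXPECTED_PROFILE.get(page.bits_per_pixel)
        if expected is None:
            problems.append(f"{page.filename}: unexpected bit depth")
        elif page.icc_profile != expected:
            problems.append(f"{page.filename}: profile should be {expected}")
        if (page.width, page.height) != first_size:
            problems.append(f"{page.filename}: pixel dimensions differ")
        if page.dpi != REQUIRED_DPI:
            problems.append(f"{page.filename}: resolution is not {REQUIRED_DPI}")
    return problems
```

An empty list means the score passes; any entry would send the item back for rescanning.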

Screenshot of Quality Control Process

Once the files pass the automatic quality control, a portion of the files is manually examined to ensure good visual quality. Each selected file is examined at 100% pixel display to confirm that the page orientation is correct, that the color balance matches the original item as closely as possible, that the scan is sharp and in focus, that no digital scanning artifacts are visible, that no pages were accidentally skipped, and that the image is properly aligned. To aid the manual quality control process, the physical scores are examined at the same time to compare color fidelity and page order. If any inconsistencies are found in either the automatic or manual quality assurance checks, the item in question is rescanned until it is acceptable.

Web Derivatives

After the sheet music passes both automatic and manual quality control, derivatives are created for web display. We create three web-deliverable files: a thumbnail, a small-screen image, and a full-screen image. The thumbnail is 200 pixels high. The small-screen image is 600 pixels wide, and the full-screen image is 1000 pixels wide. For both screen images, the height is scaled proportionally to avoid any visual distortion. The derivative files are saved as JPEG files. The master files are sent to long-term tape storage.
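As a rough illustration of the proportional scaling, the sketch below computes the pixel dimensions of the three derivatives from a master image's dimensions. The function name and the example master size (a 9" x 12" page at 400 dpi) are assumptions made for illustration; the thumbnail width is assumed to scale with its fixed height in the same way.

```python
def derivative_sizes(master_width, master_height):
    """Return (width, height) for each web derivative, keeping the aspect ratio."""

    def fit_height(target_height):
        # Fixed height; width follows the master's aspect ratio.
        width = round(master_width * target_height / master_height)
        return (max(1, width), target_height)

    def fit_width(target_width):
        # Fixed width; height follows the master's aspect ratio.
        height = round(master_height * target_width / master_width)
        return (target_width, max(1, height))

    return {
        "thumbnail": fit_height(200),
        "small": fit_width(600),
        "full": fit_width(1000),
    }


# Example: a hypothetical 3600 x 4800 pixel master (9" x 12" at 400 dpi).
sizes = derivative_sizes(3600, 4800)
```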

PDF Creation

People want to print sheet music for a variety of reasons: the cover art, the lyrics, the advertisements, and, most importantly, the music. People do not want to print each page separately; they want a full score sized to print on 8.5" x 11" paper. The best way to fulfill that functional requirement is via PDF. We want to provide a PDF of the full score optimized for both printing speed and readability. After much experimentation, we developed a process for creating PDFs that have sufficient clarity for performance while keeping the file size to approximately 1 megabyte. Using a temporary intermediate JPEG file for each page image, we resample down from the original 400 PPI to 125 PPI while maintaining the original page dimensions. We compress each page with ImageMagick at a compression setting of 50. For grayscale images, we include two extra steps to increase the printability and readability of the scores: a 60% threshold to increase the disparity between black and white, and a contrast filter to sharpen the edges.
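The resampling arithmetic and the per-page conversion can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the ImageMagick options shown (-resample, -threshold, -contrast, -quality) are one plausible way to express the steps described, and their order and the use of the convert command are guesses rather than the project's actual invocation.

```python
def resampled_size(width_px, height_px, source_ppi=400, target_ppi=125):
    """Pixel size after resampling, with the physical page size unchanged."""
    scale = target_ppi / source_ppi  # 125 / 400 = 0.3125
    return (round(width_px * scale), round(height_px * scale))


def pdf_page_command(src, dst, grayscale):
    """Build an ImageMagick argument list for one intermediate JPEG page.

    Assumed mapping of the described steps to options; not the project's
    documented command line.
    """
    args = ["convert", src, "-resample", "125"]
    if grayscale:
        # Extra steps for grayscale pages: 60% threshold, then contrast.
        args += ["-threshold", "60%", "-contrast"]
    args += ["-quality", "50", dst]
    return args


# Example: a hypothetical 3600 x 4800 pixel master (9" x 12" at 400 PPI)
# becomes 1125 x 1500 pixels at 125 PPI, still 9" x 12" when printed.
page_size = resampled_size(3600, 4800)
command = pdf_page_command("page_002.tif", "page_002.jpg", grayscale=True)
```

The argument list could be passed to subprocess.run on a machine where ImageMagick is installed; building it separately keeps the settings easy to inspect and adjust.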