Technical Implementation

The IN Harmony project has three major technology deliverables: an automated quality control process, a cataloging tool, and an online web-accessible search and discovery tool. The automated quality control process is described in the Digitization Process section of this website. The cataloging tool and the search and discovery tool are discussed below.

IN Harmony Technical Overview diagram
IN Harmony Data Flow

IN Harmony combines distributed data collection, multiple data types, and a centralized data access system, so its data flow is complicated. There are two distinct types of data to collect: images of the sheet music and metadata about the sheet music. To keep the two synchronized, we use identical identifiers to match the images to the metadata.

Each institution contributing to IN Harmony has its own workflow. An institution could catalog its metadata and then scan; it could scan and then catalog; or it could create metadata in a completely different system and batch load the data into IN Harmony. We support all three workflows.

When an institution catalogs first, the online cataloging tool creates a record in the Oracle database with a status of "cataloged but not scanned." The identifier created at that point is then used in the digitizing process. Once the images have completed the quality control process, they are linked to the metadata record using that identifier. When an institution scans before cataloging, the quality control process creates a "stub" record in the Oracle database with a status of "scanned but not cataloged."
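The way a shared identifier keeps the two workflows in step can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the project's code: an in-memory map stands in for the Oracle table, the class and method names are invented, and the COMPLETE status is an assumed name for a record that has both metadata and images (the text above names only the two intermediate statuses).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical driver showing the catalog-first workflow.
class DataFlowSketch {
    public static void main(String[] args) {
        RecordRegistry registry = new RecordRegistry();
        String id = "example-0001"; // made-up identifier, for illustration only
        System.out.println(registry.catalog(id)); // metadata arrives first
        System.out.println(registry.scan(id));    // images pass quality control later
    }
}

// The first two values mirror the statuses quoted in the text;
// COMPLETE is an assumed name, not taken from the source.
enum RecordStatus {
    CATALOGED_NOT_SCANNED,
    SCANNED_NOT_CATALOGED,
    COMPLETE
}

// In-memory stand-in for the database table, keyed by the shared identifier.
class RecordRegistry {
    private final Map<String, RecordStatus> records = new HashMap<>();

    // Called when metadata is saved for an identifier.
    // Creates the record if it is new, or completes an existing stub.
    RecordStatus catalog(String id) {
        return records.merge(id, RecordStatus.CATALOGED_NOT_SCANNED,
            (old, incoming) -> old == RecordStatus.SCANNED_NOT_CATALOGED
                ? RecordStatus.COMPLETE : old);
    }

    // Called when images for an identifier finish quality control.
    // Creates a stub if no metadata exists yet, or completes a cataloged record.
    RecordStatus scan(String id) {
        return records.merge(id, RecordStatus.SCANNED_NOT_CATALOGED,
            (old, incoming) -> old == RecordStatus.CATALOGED_NOT_SCANNED
                ? RecordStatus.COMPLETE : old);
    }

    RecordStatus statusOf(String id) {
        return records.get(id);
    }
}
```

Because both operations are keyed on the same identifier, the order in which an institution catalogs and scans does not matter: whichever step arrives second finds the existing record and completes it.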

On a regular schedule, the data is exported from the Oracle database and loaded into the Fedora Digital Repository system. As part of the ingest process, two METS documents are created. The first describes the logical object and includes a MODS metadata record and a mapping for all of the images. The second drives our page-turning application, which works in conjunction with the Fedora web delivery services. When the data is ingested into Fedora, the master image files are stored on the HPSS tape system, while the deliverable images are stored on disk controlled by Fedora.
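To make the shape of the logical-object METS document more concrete, the following Java sketch builds a minimal one with the standard DOM API: a MODS record wrapped in a descriptive metadata section, a file section listing the images, and a structural map that puts the pages in order. The element and attribute names follow the public METS and MODS schemas, but the actual IN Harmony profile is not described here, so the class name, the IDs, the single title field, and the image URLs are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class MetsSketch {
    static final String METS = "http://www.loc.gov/METS/";
    static final String MODS = "http://www.loc.gov/mods/v3";
    static final String XLINK = "http://www.w3.org/1999/xlink";
    static final String XMLNS = "http://www.w3.org/2000/xmlns/";

    // Builds a minimal METS document: one MODS title plus one file and one
    // page division per image URL. Illustrative only, not the IN Harmony profile.
    static Document build(String objectId, String title, List<String> imageUrls) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
        Element mets = doc.createElementNS(METS, "mets:mets");
        mets.setAttributeNS(XMLNS, "xmlns:mets", METS);
        mets.setAttributeNS(XMLNS, "xmlns:mods", MODS);
        mets.setAttributeNS(XMLNS, "xmlns:xlink", XLINK);
        mets.setAttribute("OBJID", objectId);
        doc.appendChild(mets);

        // Descriptive metadata: the MODS record embedded in the METS wrapper.
        Element dmdSec = child(doc, mets, METS, "mets:dmdSec");
        dmdSec.setAttribute("ID", "DMD1");
        Element mdWrap = child(doc, dmdSec, METS, "mets:mdWrap");
        mdWrap.setAttribute("MDTYPE", "MODS");
        Element xmlData = child(doc, mdWrap, METS, "mets:xmlData");
        Element mods = child(doc, xmlData, MODS, "mods:mods");
        Element titleInfo = child(doc, mods, MODS, "mods:titleInfo");
        child(doc, titleInfo, MODS, "mods:title").setTextContent(title);

        // File section and structural map: the "mapping for all of the images".
        Element fileSec = child(doc, mets, METS, "mets:fileSec");
        Element fileGrp = child(doc, fileSec, METS, "mets:fileGrp");
        Element structMap = child(doc, mets, METS, "mets:structMap");
        structMap.setAttribute("TYPE", "physical");
        Element score = child(doc, structMap, METS, "mets:div");
        score.setAttribute("DMDID", "DMD1");

        for (int i = 0; i < imageUrls.size(); i++) {
            String fileId = "FILE" + (i + 1);
            Element file = child(doc, fileGrp, METS, "mets:file");
            file.setAttribute("ID", fileId);
            Element flocat = child(doc, file, METS, "mets:FLocat");
            flocat.setAttribute("LOCTYPE", "URL");
            flocat.setAttributeNS(XLINK, "xlink:href", imageUrls.get(i));

            Element page = child(doc, score, METS, "mets:div");
            page.setAttribute("ORDER", String.valueOf(i + 1));
            child(doc, page, METS, "mets:fptr").setAttribute("FILEID", fileId);
        }
        return doc;
    }

    // Creates a namespaced element and appends it to the parent.
    static Element child(Document doc, Element parent, String ns, String qname) {
        Element e = doc.createElementNS(ns, qname);
        parent.appendChild(e);
        return e;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder identifier, title, and URLs, invented for this example.
        Document doc = build("example-0001", "Example Waltz",
            Arrays.asList("http://example.org/img/0001-01.jpg",
                          "http://example.org/img/0001-02.jpg"));
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        t.transform(new DOMSource(doc), new StreamResult(System.out));
    }
}
```

Each page division points at its image through a file ID, which is what lets a page-turning viewer walk the structural map in order without knowing anything about file names.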

The IN Harmony Cataloging Tool

The cataloging tool is a separate application from the online web-accessible catalog. It is a Java Swing application that uses an Oracle database for data management and the Indiana University Central Authentication Service (CAS) for authentication. The data is stored in a proprietary schema designed to maintain data integrity, and is exported from the database as MODS for ingest into the Fedora Digital Repository system. Source code and installable versions of the cataloging application for Windows and Macintosh are available from SourceForge.

Online Web Accessible Search and Discovery

The online web-accessible search and discovery functions are built on the Fedora Digital Repository system. The MODS records are indexed for searching. The interface is a Java Struts application that uses CQL queries to access the Fedora repository. The scores are linked via PURLs (persistent URLs). The page images are delivered via the Indiana University METS Navigator software.
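As a rough illustration of how a search form might be turned into a CQL query, the sketch below joins the non-empty fields into "index = term" clauses combined with "and", quoting each term. The index names ("title", "name") and the class itself are hypothetical; the indexes actually exposed by the IN Harmony search service are not listed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class CqlSketch {
    // Wraps a term in double quotes, escaping backslashes and embedded quotes.
    static String quote(String term) {
        return "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds one "index = term" clause per non-empty field and joins them
    // with the CQL boolean operator "and". Empty fields are skipped.
    static String build(Map<String, String> criteria) {
        StringJoiner query = new StringJoiner(" and ");
        for (Map.Entry<String, String> field : criteria.entrySet()) {
            String value = field.getValue();
            if (value == null || value.trim().isEmpty()) {
                continue;
            }
            query.add(field.getKey() + " = " + quote(value.trim()));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Hypothetical index names and search values, for illustration only.
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("title", "moonlight");
        criteria.put("name", "");
        System.out.println(build(criteria));
    }
}
```

Quoting every term keeps multi-word values and values containing reserved words such as "and" from being parsed as query syntax.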