Transforming a paper legacy into a live database

When the Archivio Storico Ricordi was created in 1994, all the catalogues and metadata relating to the collection existed only in paper form.  However, the unique and comprehensive archive is of critical interest to scholars all over the world, and it receives daily requests for access to its documents.  Providing this access is not straightforward – the archive is not open to the public and its collections can only be accessed by transporting documents to the reading room of the National Library (the Biblioteca Nazionale Braidense, housed in the same building as the archive).  From the earliest days of the Archivio, the use of digital techniques to establish electronic catalogues and metadata was recognised as crucial, both for providing scholars with information about what the archive contained and for tracking where the archive’s documents actually were.

The labour-intensive process of inputting data from paper records into Excel and Access files then began.  These formats were in widespread use at the time, but they have not provided the sustainability or functionality required for the Archivio’s very complex collection, which comprises not only documents but also photographs, scores and iconographic documents.  The need to apply a more coherent structure to the archive’s dataset was later recognised, and a FileMaker database was created; in the process of ingesting the existing files, the data was standardised, reorganised and restructured.  But even this approach proved problematic, as it was difficult to make the contents available online.

This paper examines how the same fundamental concerns – wider public access, and creating comprehensive catalogues and metadata about the archive’s collection – have driven the archive’s digital endeavours over the past two decades, and how our approach has evolved as technology has changed.  It describes the challenges of creating a dataset that incorporates but also supplements previous incarnations of that dataset.  It discusses the difficulty of integrating datasets and the alternative challenge of synchronising data across different technical solutions.