How discoverable are your digitised collections? HRI Digital is developing a tool (internal name "Dewdrop") which will support discovery solutions for online research resources at national and/or institutional level.

Project Status:

In Progress




University of Sheffield


online resource, software development, standards and best practice


API, Java, Natural Language Processing, XML

Project Description

As part of the Jisc project Spotlight on the Digital, HRI Digital developed a technical specification for a tool that could address the problem of digital orphans. Following the successful completion of that project, HRI Digital has been contracted to undertake the technical design, build, documentation and testing of the proposed tool (internal name “Dewdrop”). The technical work will run alongside a process of demand investigation, business planning, ‘testing in the wild’ and communication planning, which Jisc will undertake with the support of HRI Digital.

Digital orphans are online assets (in this case, research resources) that are deemed to be undiscoverable, unused, unknown or forgotten by the wider research community because they are invisible or inaccessible to the normal mechanisms of discovery, such as search engines, subject catalogues, aggregation sites and other subject-specific websites. The invisibility of online resources can be due to a combination of factors such as poor technical design, poor presentation of content, poor marketing and an absence of individual and/or institutional support.

The tool proposed in the specification is intended to address these problems by producing a discovery-friendly version of a resource’s textual content at the record level. This version, presented as a set of optimised data records, will then mediate between the resource and discovery services. The tool will do this in two stages: a Crawler will retrieve a copy of the resource’s textual content, including data held in databases; an Analyser will then generate discovery-friendly records from that content using Natural Language Processing techniques.
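The Crawler and Analyser stages described above can be illustrated with a minimal sketch. It is not the Dewdrop implementation: the class name, the record fields, the example URL and the sample text are all hypothetical, and a naive term-frequency count stands in for whatever Natural Language Processing techniques the real Analyser applies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class DewdropSketch {

    // Hypothetical shape of a discovery-friendly record (not taken from the specification).
    static final class DiscoveryRecord {
        final String sourceUrl;
        final String title;
        final List<String> keywords;

        DiscoveryRecord(String sourceUrl, String title, List<String> keywords) {
            this.sourceUrl = sourceUrl;
            this.title = title;
            this.keywords = keywords;
        }

        @Override
        public String toString() {
            return sourceUrl + " | " + title + " | " + keywords;
        }
    }

    // Small illustrative stop-word list; a real analyser would use a fuller one.
    static final Set<String> STOP_WORDS =
            Set.of("the", "and", "for", "with", "that", "this", "are", "was");

    // Stand-in for the Crawler: returns record-level text keyed by URL.
    // A real crawler would fetch pages or query the resource's database.
    static Map<String, String> crawl() {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("https://example.org/records/1",
                "Letters of a Sheffield cutler. The letters describe the cutlery "
                        + "trade and trade disputes in Sheffield.");
        return pages;
    }

    // Stand-in for the Analyser: takes the first sentence as a title and the
    // most frequent non-stop-words as keywords (assumption: simple term frequency).
    static DiscoveryRecord analyse(String url, String text, int maxKeywords) {
        String title = text.split("\\.", 2)[0].trim();

        // Count each lower-cased word longer than two letters that is not a stop word.
        Map<String, Long> counts = Arrays.stream(text.toLowerCase().split("[^a-z]+"))
                .filter(w -> w.length() > 2 && !STOP_WORDS.contains(w))
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));

        // Sort by descending count, breaking ties alphabetically for stable output.
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> keywords = entries.stream()
                .limit(maxKeywords)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        return new DiscoveryRecord(url, title, keywords);
    }

    public static void main(String[] args) {
        // Run both stages: crawl, then analyse each retrieved record.
        crawl().forEach((url, text) -> System.out.println(analyse(url, text, 3)));
    }
}
```

In this sketch the two stages are kept separate so that the records handed to discovery services depend only on the crawled text, not on how that text was retrieved.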

Duration: November 2015 – July 2016

Image Credits: Partial map of the internet developed by

Project Team

  • Jamie McLaughlin (Digital Humanities Developer – University of Sheffield)
  • Katherine Rogers (Digital Humanities Developer – University of Sheffield)
  • Matthew Groves (Digital Humanities Developer – University of Sheffield)
  • Michael Pidd (Principal Investigator – University of Sheffield)
  • Ryan Bloor (Developer – University of Sheffield)