Connected Histories: Sources for Building British History, 1500-1900

Summary:

Sophisticated, federated searching of a wide range of electronic sources on the subject of early modern and nineteenth-century British history.

Project Status:

Completed

Funders:

JISC

Partners:

University of Sheffield
Institute of Historical Research (University of London)
University of Hertfordshire

Subjects:

British Isles, early modern period, early printed books, federated searching, historical records, history, large datasets, linked data, online resource, social history, text and image analysis, text data mining, transcriptions

Technologies:

Ajax, API, CSS, HTML, Java, Lucene, MySQL, Natural Language Processing, XML

HRI Online Publication

Project Description

This project has created a federated search facility, ‘Connected Histories’, which brings together a critical mass of quality content drawn from a wide range of electronic sources on the subject of early modern and nineteenth-century British history. More than simply creating a portal for accessing these historical resources, this project combines web crawling with Natural Language Processing techniques in order to remotely ‘tag’ previously unstructured texts and allow consistent, structured searching of names, places and dates. In so doing the project has added a new level of precision and intellectual rigour to the search process.
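The idea of tagging unstructured text so that names, places and dates can be searched consistently across sources can be illustrated with a minimal sketch. This is not the project's actual pipeline: the tiny gazetteers, the regular expressions, the sample records and the function names (`tag_text`, `build_index`, `search`) are all assumptions made for illustration, and a production system would rely on far richer Natural Language Processing and a full-text engine such as Lucene.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny gazetteers (assumptions for illustration only).
PLACES = {"London", "Sheffield", "York"}
FORENAMES = {"John", "Mary", "Thomas", "Elizabeth"}

# A known forename followed by one capitalised word is treated as a personal name.
NAME_RE = re.compile(r"\b(?:%s)\s+[A-Z][a-z]+\b" % "|".join(sorted(FORENAMES)))
# Any capitalised word; kept as a place only if it is in the gazetteer.
WORD_RE = re.compile(r"\b[A-Z][a-z]+\b")
# Four-digit years within the 1500-1900 range covered by the project.
YEAR_RE = re.compile(r"\b(1[5-8]\d{2}|1900)\b")


def tag_text(text):
    """Return (entity_type, value) pairs found in a piece of unstructured text."""
    tags = [("name", m.group(0)) for m in NAME_RE.finditer(text)]
    tags += [("place", m.group(0)) for m in WORD_RE.finditer(text)
             if m.group(0) in PLACES]
    tags += [("date", m.group(1)) for m in YEAR_RE.finditer(text)]
    return tags


def build_index(documents):
    """Map each (entity_type, value) tag to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for tag in tag_text(text):
            index[tag].add(doc_id)
    return index


def search(index, **criteria):
    """Return sorted ids of documents matching every criterion, e.g. place='London'."""
    results = None
    for entity_type, value in criteria.items():
        matches = index.get((entity_type, value), set())
        results = matches if results is None else results & matches
    return sorted(results or [])


# Invented sample records standing in for texts drawn from separate sources.
DOCS = {
    "source-a/1": "In 1745 John Brown was tried in London for theft.",
    "source-b/7": "Mary Smith of York married in 1802.",
    "source-c/3": "John Brown, labourer, was buried at Sheffield in 1745.",
}

INDEX = build_index(DOCS)
print(search(INDEX, name="John Brown", date="1745"))   # ['source-a/1', 'source-c/3']
print(search(INDEX, name="John Brown", place="London"))  # ['source-a/1']
```

Because every source is reduced to the same set of tags, a single query can be answered across all of them without each source needing its own search form or schema.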

The Connected Histories search engine was developed by the Humanities Research Institute (HRI) and is hosted by the Institute of Historical Research (IHR) within the University of London, sitting as an ‘umbrella’ over all the sources in the cluster. Testing was carried out by historians at Sheffield, Hertfordshire, and the Institute of Historical Research. Evaluation was conducted by the Centre for Computing in the Humanities, King’s College London.

In the first instance, ‘Connected Histories’ incorporated a range of distributed historical sources. In total, these provided access to fourteen major databases of primary source texts, containing more than 412 million words, plus 469,000 publications, 3.1 million further pages of text, 87,000 maps and images, 254,000 individuals in databases, and over 100 million name instances.

Connected Histories has since gone on to incorporate many more datasets and continues to grow, providing access to in excess of 10 billion words.

Duration: 1st October 2009 – 31st March 2011

Project Team

  • Prof. Robert Shoemaker (University of Sheffield)
  • Prof. Tim Hitchcock (University of Hertfordshire)
  • Dr Jane Winters (Institute of Historical Research, University of London)
  • Dr Sharon Howard (Project Manager – Humanities Research Institute)
  • Katherine Rogers (Digital Humanities Developer – Humanities Research Institute)