Linguistic DNA of Modern Western Thought

The aim of this project is to understand the evolution of early modern thought by modelling the semantic and conceptual changes which occurred in English discourse (c.1500-c.1800). The project will use information extraction techniques and data visualisation to identify lexical patterns in 250,000 texts.

The project will pursue this aim through the following objectives:

1. Using information extraction techniques, identify lexical patterns within approximately 37 million pages, drawing on 48,327 re-keyed texts from Early English Books Online (EEBO) and approximately 205,000 OCR-ed texts from Gale Cengage’s Eighteenth Century Collections Online (ECCO). The total dataset comprises over 250,000 texts.

2. Using data visualisation, evaluate the accuracy of the information extraction techniques against the project’s research questions. These research questions fall into three Research Themes:

  • Contexts of Semantic Change will explore the historical and discursive circumstances of concept development.
  • Lexical Families and Conceptual Fields will explore the linguistic characteristics of concepts and their constituent keywords.
  • Lexicalisation Pressure will explore the characteristics of word formation and vocabulary size within conceptual fields.

3. Develop and share knowledge about the project’s methodologies via two workshops that will be focussed on computer-assisted language analysis, language change and data visualisation.

4. Present the results of the Research Themes as a series of published outputs: a volume of essays on paradigmatic terms between 1500 and 1800; a co-authored book on Language and Conceptual Change; refereed articles; and an online, open-source collection of essays on technical methodologies.

5. Make the resulting database of lexical patterns, plus the search and visualisation features, available to the wider community for their own research purposes. This will be in the form of a public website, as well as a Web API and data download feature that will enable the entire dataset to be shared and re-used.

6. Demonstrate the wider applicability of the information extraction and concept modelling techniques by developing a demonstrator for a modern body of scholarship (e.g. JSTOR) and hosting two Knowledge Modelling Impact Workshops.

Website

Digital Outputs

Project Team

  • Professor Susan Fitzmaurice (Principal Investigator – University of Sheffield)
  • Michael Pidd (Co-Investigator – The Digital Humanities Institute)
  • Dr Justyna Robinson (Co-Investigator – University of Sussex)
  • Dr Marc Alexander (Co-Investigator – University of Glasgow)
  • Dr Iona Hine (Research Associate – University of Sheffield)
  • Dr Seth Mehl (Research Associate – University of Sheffield)
  • Dr Fraser Dallachy (Research Associate – University of Glasgow)
  • Matthew Groves (Developer – The Digital Humanities Institute)
  • George-Andrei Ionita (Developer – The Digital Humanities Institute)
  • Brian Aitken (Digital Humanities Research Officer – University of Glasgow)