Session 12

Saturday 09:30 - 11:00

High Tor 2

Chair: Michael Pidd

Text-mining, geo-coding and mapping historic smells

  • Deborah Leem,
  • Daniele Quercia

Cambridge University

We live in the era of big data: by 2002, digital data storage had overtaken analogue storage. Digitising humanities source material has already produced large datasets, and this wealth of digital information offers unprecedented opportunities to shed new light on humanities research in ways that were not previously possible.

The MOH reports were published annually by the Medical Officers of Health (MOHs) employed by local authorities. These reports provided vital statistics and a general statement on the health of the population in each borough. In 2012 the Wellcome Library digitised 5,500 MOH reports for London, spanning 1848 to 1972.

Although there were attempts at standardisation, the reports display each MOH’s interests, idiosyncrasies and particular strengths. Because the reports are not standardised, creating a geo-coded dataset of smell-related vocabulary is a challenging task. We will present the results of the first phase of our research: text-mining and geo-coding unstructured text by means of a novel geoparser.
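As a rough illustration of what geoparsing unstructured report text involves, the following Python sketch matches place names against a small gazetteer and pairs them with smell terms found in the same sentence. It is not the project's geoparser: the gazetteer entries (with approximate coordinates), the smell vocabulary, the sample sentences and the names `GAZETTEER`, `SMELL_TERMS`, `SmellMention` and `geoparse` are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer: place name -> rough (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "westminster": (51.50, -0.13),
    "bermondsey": (51.50, -0.06),
    "hackney": (51.55, -0.06),
}

# Hypothetical smell vocabulary; a real list would be far larger.
SMELL_TERMS = {"smell", "smells", "stench", "odour", "odours", "effluvia"}


@dataclass(frozen=True)
class SmellMention:
    place: str
    lat: float
    lon: float
    term: str
    sentence: str


def geoparse(text):
    """Return one SmellMention per (place, smell term) pair sharing a sentence."""
    mentions = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        places = [t for t in tokens if t in GAZETTEER]
        terms = [t for t in tokens if t in SMELL_TERMS]
        for place in places:
            lat, lon = GAZETTEER[place]
            for term in terms:
                mentions.append(SmellMention(place, lat, lon, term, sentence))
    return mentions


# Invented sample text, not a quotation from any MOH report.
sample = (
    "Complaints were received of an offensive stench from the tan yards "
    "in Bermondsey. The death rate in Hackney fell during the year."
)
results = geoparse(sample)
for m in results:
    print(m.place, m.lat, m.lon, m.term)
```

Exact single-word matching is the simplest possible design. OCR errors, multi-word place names and historical spelling variants would all need fuzzier matching than this sketch attempts.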

Data-mining the OCR’d text of the MOH reports for London will, for the first time, produce models that facilitate new kinds of humanities research. Analysing the reports (the second phase of the project) will reveal the intimate narratives of the everyday experiences of 19th- and 20th-century Londoners through the ‘smellscape’. It will also enable us to run various comparisons and assess whether there are any links to the socio-economic identity of areas in London.

This project builds upon a recent project by Daniele Quercia and his team on mapping smells using social media. The MOH smell data will be available via their existing website (http://www.goodcitylife.org/), which offers an opportunity to engage the public. Text-mining and geo-coding techniques add value to humanities research by demonstrating how new knowledge and insights arise from the use of digital applications.

Visualising Convict Lives

  • Bob Shoemaker,
  • Richard Ward

University of Sheffield

Visualisations are a key tool for the Digital Panopticon project, which is linking together a vast range of digital records about the 90,000 people who were convicted at the Old Bailey in London between 1780 and 1865, and subsequently imprisoned in England or transported to Australia. The project is using visualisations for two purposes: to better understand our core datasets, and to summarise our findings. The first has allowed us both to identify errors and limitations in the underlying data, and to see overall patterns. The second is more challenging: in order to map convict journeys we need to summarise relationships between datasets and the variety of life events they record, both criminal (trials, punishments, convictions and reconvictions) and personal (dates of birth, marriage and death; places of residence, family circumstances and occupations in census records), and involving tens of thousands of people.

This paper will discuss our choice of visualisation tools, assessing their suitability for summarising key information and allowing web-based manipulation by users, and then present some preliminary visualisations of our data. First, we will demonstrate how visualisations have helped us see distortions in individual datasets, such as ‘age heaping’. Second, we will examine the results of our ongoing record linkage activities by focusing on the relationship between the judicial sentences recorded in court records and the actual penal outcomes found in the records of executions, transportation and imprisonment. Third, we will discuss our strategies for using visualisations to summarise individual convict lives. Finally, we will reflect on the strengths and limitations of visualisations as tools for understanding and analysing large volumes of historical data.
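Age heaping, mentioned above, is the tendency for recorded ages to cluster on round numbers. One standard way to quantify it, alongside plotting the age distribution, is Whipple's index, which compares the share of ages ending in 0 or 5 within the 23 to 62 range against the one in five expected if ages were evenly spread. The Python sketch below is an illustration only and is not drawn from the project's own tooling; the function names `whipple_index` and `terminal_digit_counts` are assumptions for this example.

```python
from collections import Counter


def whipple_index(ages):
    """Whipple's index: 100 means no heaping, 500 means every age ends in 0 or 5."""
    in_range = [a for a in ages if 23 <= a <= 62]
    if not in_range:
        raise ValueError("no ages between 23 and 62")
    heaped = sum(1 for a in in_range if a % 5 == 0)
    # Ages ending in 0 or 5 should make up one fifth of an unheaped sample,
    # so multiply the observed share by 5 (and by 100 to get the index).
    return 500 * heaped / len(in_range)


def terminal_digit_counts(ages):
    """Count how often each final digit (0-9) occurs; useful for a bar chart."""
    return Counter(a % 10 for a in ages)


# Every age from 23 to 62 exactly once: no heaping.
evenly_spread = list(range(23, 63))
print(whipple_index(evenly_spread))

# Everyone recorded as exactly 30: extreme heaping.
print(whipple_index([30] * 10))
```

A bar chart of `terminal_digit_counts` makes the same distortion visible at a glance, while the index gives a single number for comparing datasets.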

Invisible Interpretations: Quantitative Text Analysis and Intellectual History

  • Mark Hill

London School of Economics

This paper has two aims. First, it questions the potential relationship between intellectual history (in particular, the methodology which emerged following Quentin Skinner’s “Meaning and Understanding in the History of Ideas,” and is now referred to as the Cambridge School) and new tools and techniques in quantitative text analysis. Specifically, it asks whether a discipline which is contextual and structuralist can make use of an empiricist and positivist methodology.

To these ends, the paper examines Richard Steele and Joseph Addison’s eighteenth-century periodical The Spectator. This source is of interest for two reasons. First, it has been lauded as hugely important historically and politically: it has been credited with heralding a new era and leading to the emergence of the “public sphere.”[1] To the reader today, however, The Spectator is little more than an amusing historical pamphlet, attacking contemporary affectations and superstitions, and purposefully ignoring political discussions. Thus, one may find it difficult to locate claims of profundity within it. The reason for this is the second point of interest: as Peter Gay wrote, “eloquent as it is, the Spectator does not speak to us directly; we must know something of its time before we know something of its significance.”[2] It is therefore of particular interest to the intellectual historian concerned with contextual meaning.

The paper uses a number of “distant reading” techniques (largely using R and Ken Benoit’s package quanteda) to attempt to extract contextually relevant and historically interesting information from the corpus. The success (or failure) of these techniques, it is hoped, can inform the development of best practices for intellectual historians interested in the digital humanities.
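The abstract does not say which techniques are applied. One common distant-reading step that keeps context in view is a keyword-in-context (KWIC) concordance, which quanteda provides through its `kwic()` function. The sketch below is a plain-Python stand-in written for this illustration, not the paper's R code; the function name `kwic`, its `window` parameter and the sample sentence are assumptions.

```python
import re


def kwic(text, keyword, window=5):
    """Return (left context, keyword, right context) for each keyword hit."""
    # Lowercase word tokens, keeping simple apostrophes inside words.
    tokens = re.findall(r"\w+(?:'\w+)?", text.lower())
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


# Invented placeholder text, not a quotation from The Spectator.
sample = (
    "The club met to discuss politics and manners "
    "while the town ignored politics"
)
for left, kw, right in kwic(sample, "politics", window=2):
    print(f"{left:>20} | {kw} | {right}")
```

Reading the contexts around a term such as "politics" is one way a quantitative pass can direct attention back to the contextual evidence that intellectual historians care about.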

[1] Jürgen Habermas, The Structural Transformation of the Public Sphere (Cambridge, Mass.: MIT Press, 1989).

[2] Peter Gay, “The Spectator as Actor”, Encounters (1967).