DHC 2016

Distributional semantics as a tool for the humanities: Compatible frameworks or unbridgeable gaps?

Seth Mehl, University of Sheffield

This paper critically interrogates the theoretical frameworks and goals of computational distributional semantics in relation to the study of historical concepts in the humanities. Distributional semantics involves the automatic analysis of word meaning in texts, and can in turn identify various types of concepts related to those words (cf. Manning and Schuetze 1999, Turney and Pantel 2010). The method involves statistical analysis of lexical co-occurrence patterns in texts. The tradition of distributional semantic analysis emerged in the humanities in the mid-20th century (Geeraerts 2010, cf. Firth 1962, Austin 1962), and has been adapted by computational linguists since the 1990s (cf. Turney and Pantel 2010). Today, distributional semantic studies can be divided between those that aim to learn about language and concepts (cf. Heylen et al. 2008), and those that aim to complete an engineering task (cf. Tahmasebi et al. 2013). Even as those categories of research diverge, their methods remain remarkably similar (cf. Peirsman et al. 2007). This similarity raises important questions:

What are the theoretical frameworks and goals that underpin distributional semantics in computational linguistics today?
What are some frameworks and goals in the humanities that can accommodate the frameworks and goals of computational linguistics?
Are the two categories of research compatible?

The paper argues that a careful consideration of both semantics and concepts is necessary to link the two research categories meaningfully, and we propose ‘discursive concepts’ as a useful bridge.

Finally, we present findings from a major research project that uses distributional semantics alongside close reading to map conceptual change in Early Modern English. We present outputs of distributional semantic methods as the basis for discerning patterns of meaning in historical discourse cultures, and link those findings to the theoretical perspective we have proposed.

References

Austin, J. L. 1962. How to do things with words. Cambridge, Massachusetts: Harvard University Press.

Firth, J. R. 1962. A synopsis of linguistic theory. In Studies in linguistic analysis. Oxford: Basil Blackwell. 1-31.

Geeraerts, Dirk. 2010. Theories of lexical semantics. Oxford: Oxford University Press.

Heylen, Kris, Yves Peirsman, Dirk Geeraerts, Dirk Speelman. 2008. Modelling word similarity: An evaluation of automatic synonymy extraction algorithms. In Proceedings of the Sixth International Conference on Language Resources and Evaluation, 3243-49.

Manning, Christopher and Hinrich Schuetze. 1999. Foundations of statistical natural language processing. Cambridge, Massachusetts: MIT Press.

Peirsman, Yves, Kris Heylen, Dirk Speelman. 2007. Finding semantically related words in Dutch: Co-occurrences versus syntactic contexts. In Proceedings of the CoSMO workshop, Roskilde, Denmark, 9-16.

Tahmasebi, Nina, Kai Niklas, Gideon Zenz, Thomas Risse. 2013. On the applicability of word sense discrimination on 201 years of modern English. International Journal on Digital Libraries 13, 135-53.

Turney, Peter D. and Patrick Pantel. 2010. From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research 37, 141-188.

Session 6 — Text Analytics 2: Identifying complex meanings in historical texts

Friday 11:30 - 13:00

High Tor 2

Papers:
- Distributional semantics as a tool for the humanities: Compatible frameworks or unbridgeable gaps?
- The Utility of Count-based Models for the Digital Humanities
- Developing an interface for historical sociolinguistics
