Session 16

Saturday 10:00 - 11:30

High Tor 2

Chair: Jamie McLaughlin

Before, During and After: A Bilingual Temporal Sentiment Analysis of the Media Coverage of Rio and London Olympic Legacies

Caio Mello, Gullal Singh Cheema

School of Advanced Study, University of London

Keywords: Olympic Legacy; Sentiment Analysis; Discourse Analysis.

The Olympic Games take place every four years, in a different city around the world, for three weeks. These three weeks, however, are planned long in advance in order to provide not only spectators but also citizens with the best possible experience. Yet what is left behind after the Games, their legacy, has often been criticized.

This paper aims to analyse the narratives about the legacy of both the Rio 2016 and London 2012 Olympics over a period of six years (three years before and three years after each event). We will conduct temporal sentiment analysis on a bilingual dataset (English and Portuguese) of news articles published by the biggest media companies in Brazil and the UK: Globo and the BBC. Besides their size and importance to public opinion, both companies hold the official rights to broadcast the Games in their countries. We will also include the three most-read newspapers in each country. Texts will be extracted from the websites and analysed first through distant reading, using sentiment analysis, and subsequently through close reading.

The objective of this analysis is to use the measures provided by the sentiment analysis to compare the range of positivity and negativity in the coverage of two very different events, considering that the Rio Olympics was the first Olympic Games hosted by a country in the Global South. The results will be used to define the concept of Legacy by analysing which kinds of outcomes are mentioned in the news. The study will then apply the Discourse Analysis framework, looking at how the use of certain words, mainly adjectives, builds a narrative of either a disastrous event or a satisfactory and successful one.

For the automated analysis, the best-suited text analytics and multimodal machine learning techniques will be studied and applied to obtain meaningful results and test the hypothesis.
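As an illustration only, the temporal design described above (coverage grouped into the periods before, during and after each Games, within a three-year window on either side) could be sketched as follows. The abstract does not name any tooling, so everything here is an assumption: the function names, the input format, and the idea that each article already carries a sentiment score in the range -1 to 1 from some upstream classifier. The only fixed facts are the opening and closing dates of the two Games.

```python
from datetime import date
from statistics import mean

# Opening and closing ceremony dates of each Games.
GAMES = {
    "London 2012": (date(2012, 7, 27), date(2012, 8, 12)),
    "Rio 2016": (date(2016, 8, 5), date(2016, 8, 21)),
}
WINDOW_YEARS = 3  # three years before and three years after, as in the abstract


def shift_years(d, years):
    """Move a date by whole years (safe here: no date falls on 29 February)."""
    return d.replace(year=d.year + years)


def period_for(event, published):
    """Return 'before', 'during' or 'after', or None if outside the window."""
    start, end = GAMES[event]
    if shift_years(start, -WINDOW_YEARS) <= published < start:
        return "before"
    if start <= published <= end:
        return "during"
    if end < published <= shift_years(end, WINDOW_YEARS):
        return "after"
    return None


def mean_sentiment_by_period(articles):
    """Average hypothetical sentiment scores per (event, period) pair.

    `articles` is an iterable of (event, publication_date, score) tuples;
    the scores are assumed to come from a separate sentiment model.
    """
    buckets = {}
    for event, published, score in articles:
        period = period_for(event, published)
        if period is not None:
            buckets.setdefault((event, period), []).append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


if __name__ == "__main__":
    # Made-up scores, purely to show the shape of the input and output.
    sample = [
        ("Rio 2016", date(2015, 3, 1), -0.2),
        ("Rio 2016", date(2018, 6, 1), -0.6),
        ("London 2012", date(2012, 8, 1), 0.5),
    ]
    for key, value in sorted(mean_sentiment_by_period(sample).items()):
        print(key, round(value, 2))
```

Comparing the per-period averages for the two events is one simple way to read the range of positivity and negativity over time; the close reading described in the abstract would then look at the articles behind each bucket.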

Between Hermeneutics and Deceit: Keeping Natural Language Generation in Line

Albert Meroño-Peñuela, Leah Henrickson

King’s College London

Advances in machine learning techniques and the wide availability of data and computing power have given rise to a new generation of AI and Natural Language Processing (NLP) approaches, which have achieved unprecedented performance in tasks such as question answering and Natural Language Generation (NLG). In fact, NLG engines can create texts so readable that they can deceive readers into thinking they were written by a human, effectively passing a hypothetical Turing test. This prompts important questions that speak directly to the core of hermeneutics (the study of the meaning and interpretation of texts), which has traditionally relied on a perceived social contract between authors and readers (Henrickson, 2021, p. 4). It has been shown that when text creation is carried out by an NLG engine, the contract holds, with readers still perceiving elements of authorship even in generated texts (Henrickson and Meroño-Peñuela, forthcoming).

However, this perception seems to occur when we detach AI and NLG engines from their broader societal contexts. These systems are put in place by someone (e.g. a company, an individual, a government), and they are trained on data created and curated by humans. In the general narratives that permeate society, which are usually rich in hype about new technologies (Milne, 2020), AI is presented as useful and objective, but this contrasts with the human intervention just described. Such intervention often focuses on optimisation for profit, and these optimisation efforts contribute to AI that is as biased, fallible and subjective as humans. In this paper, we investigate what it means to have NLG ‘authors’, as well as the ability of the hype surrounding NLG and AI to deceive and mislead. We highlight the need to find ways of keeping NLG in line and accountable through regulation, provenance, and dataset documentation.

References

Henrickson, L., 2021. Reading Computer-Generated Texts. Cambridge University Press.

Henrickson, L., Meroño-Peñuela, A., 2021. The Hermeneutics of Computer-Generated Texts. Configurations — Journal of the Society for Literature, Science, and the Arts (SLSA). JHU Press (in press).

Milne, G., 2020. Smoke & Mirrors: How Hype Obscures the Future and How to See Past It. Hachette UK.

DHC 2022