Caroline Sporleder

University of Göttingen (Germany)

During March 2018, the labex TransferS and Thierry Poibeau (Lattice) are hosting Caroline Sporleder, director of the Centre for Digital Humanities at the University of Göttingen (Germany).

Computational Linguistics for the Digital Humanities

Caroline Sporleder currently directs the Centre for Digital Humanities at the University of Göttingen in Germany. She is a specialist in natural language processing: her work includes the analysis of figurative language, word sense disambiguation and discourse analysis (argument structures, links between sentences, text cohesion). For several years, her contribution to the field has been particularly notable in the application of natural language processing to the digital humanities.

In this context, Caroline Sporleder has developed techniques for analysing the typical structure of novels (a "distant reading" approach), tracking topics over time in large historical corpora, and recognising the names of people mentioned in newspaper corpora.

During her visit, Caroline Sporleder will give four lectures:

Tuesdays 6, 13, 20 and 27 March 2018
ENS, 45 rue d’Ulm, 75005
Salle Cavaillès (1st floor, staircase A)

Tuesday 6 March, 10:00-12:00, ENS, Salle Cavaillès
Network Analysis for Literature

The automatic analysis of works of literature such as novels or poems is an interesting and intricate application for natural language processing. Literature not only poses a challenge by being a text type considerably different from the newspaper texts that most NLP tools are trained on, it also opens up intriguing new research questions, including how to detect regions of high suspense and how to model the particular footprint of an author or literary genre. A research topic that has sparked the interest of literary scholars and computational linguists alike is the analysis of character interactions, which are usually modelled as social networks. I will describe analogue and digital approaches to the creation and analysis of character networks, focusing on whether such networks can be exploited to detect the author or genre of works of literature.

Tuesday 13 March, 10:00-12:00, ENS, Salle Cavaillès
Figurative Language in Discourse

Figurative language poses a serious challenge to NLP systems. The use of idiomatic and metaphoric expressions is not only extremely widespread in natural language; many figurative expressions, in particular idioms, also behave idiosyncratically. These idiosyncrasies are not restricted to a non-compositional meaning but often also extend to syntactic properties, selectional preferences, etc. To deal appropriately with such expressions, NLP tools need to detect figurative language and assign the correct analyses to non-literal expressions. In this talk, I will look at how figurative expressions behave in discourse and how they are chosen for a deliberate effect in a genre, in particular political discourse. I will also report on research dedicated to detecting and analysing figurative expressions automatically.

Tuesday 20 March, 10:00-12:00, ENS, Salle Cavaillès
Recognising and Disambiguating Named Entities in Historical Texts

When humanities scholars explore a text collection, they are often particularly interested in named entities, in particular locations and people. Named entity recognition tools are consequently in high demand in the Digital Humanities. Off-the-shelf tools, however, often provide sub-optimal results. This is partly due to problems relating to a change in domain, which not only tends to decrease recognition performance but also often means that the named entity inventories themselves are inadequate for the target domain. On top of this, named entity recognition is typically not sufficient on its own but needs to be complemented with named entity disambiguation, which links entity expressions to their real-world referents. Finally, historical texts offer particular challenges, especially for place names. In this talk, I will summarise the challenges and potential solutions for recognising and disambiguating named entities in texts from the Humanities.

Tuesday 27 March, 10:00-12:00, ENS, Salle Cavaillès
Analysing Social Media Data for the Digital Humanities and Social Sciences

Microblogging platforms (Twitter) and other social media formats (Facebook, discussion forums) provide data that can be of interest to the Humanities and especially the Social Sciences. In this talk, I will discuss different approaches to analysing social media content in order to detect echo chambers, perform opinion and topic mining, or detect fake accounts. The focus will be on the political domain, i.e. on how political affairs are discussed by social media users.