Logometry: Corpus, Processing, Models

Project Manager: Laurent Vanni

Permanent members

Brunet, Étienne – Kor Chahine, Irina – Lavigne, Frédéric – Magri, Véronique – Mayaffre, Damon – Poudat, Céline – Rojas, Minerva – Ruggia, Simona – Vanni, Laurent

Non-permanent members

Babault, Sophie – Beghini, Federica – Bouzereau, Camille – Chandelier, Marie – Haris, Sofiane – Kamagate, Karfa – Longrée, Dominique – Maciel, Carlos – Mahmoudi, Hadi – Maurer, Julia

Presentation

The team studies textual corpora (contemporary political speeches, media performances, and ancient and modern literary works) using automatic or semi-automatic linguistic processing methods that draw on computer science, textual statistics and deep learning.

The team works within the fields of discourse analysis, textual linguistics and corpus linguistics. Its main objectives are to reflect on discursivity and textuality; to reveal the internal organisation of texts; to propose a controlled description of their linguistic composition (recurring vocabulary, dominant grammatical tone, preferred syntactic structures); to establish textual typologies that take into account, in particular, the genre of the discourse, the conditions of enunciation and the socio-historical positioning of the speaker; and, finally, to objectify reading or interpretative paths.

To do this, the team works with digital text corpora (which involves considering how to capture, store, format, lemmatise and tag texts) and develops methods and tools for automatic or semi-automatic processing.

In line with the work of Étienne Brunet, the team favours a quantitative, statistical or mathematical approach to large corpora using the HYPERBASE logometry software. It seeks to improve the software's performance and to diversify its applications, particularly in the field of contemporary political discourse.
