Corpus, textometry, didactics and foreign languages

This area focuses on the relationships between corpora, textometry, teaching and foreign languages. Work will continue on the textometric exploration of the CEFR (Common European Framework of Reference for Languages) levels, drawing in particular on deep learning. It is based on corpora of texts from textbooks and teaching methods on the one hand, and on learner corpora on the other. This work was initiated by S. Ruggia during the previous contract and is already underway for French as a foreign language (FLE) and Russian, with the development and launch of the DeepFLE and Russian Wheel platforms.
The annotations in the learner corpora will be put to a new use: automatic text processing will be applied to identify errors in learners’ texts and correct them. Preliminary research has shown that such a project is feasible, and we will focus on developing a tool for this purpose.

Finally, we will develop the perspective of a tool-based linguistics whose issues and questions stem from didactic requirements, with a focus on argumentation. What constructions and discursive strategies do L2 learners need to master in order to understand argumentation in the context of language certifications, the CLES in particular? To address this question, authentic corpora, mainly television debates, are analysed statistically in terms of pragmatic-enunciative phenomena and argumentative strategies. Statements deemed salient and transferable are extracted and then presented to learners according to the discursive functions identified beforehand.