
Syntactic Complexity of Written Texts in Russian and English as Foreign Languages

Student: Klykova Elizaveta

Supervisor: Olga Lyashevskaya

Faculty: Faculty of Humanities

Educational Programme: Computational Linguistics (Master)

Final Grade: 10

Year of Graduation: 2024

This study offers a comprehensive perspective on syntactic complexity in English and Russian texts written by L1 and L2 speakers. We analyze 20 syntactic complexity measures pertaining to the sentential, clausal, and phrasal levels, and explore their interrelationships and their correlation with proficiency, task type, and genre. We propose a new measure of syntactic complexity based on Levenshtein distance at the clausal level. Our findings reveal strong correlations among length-based measures and highlight the problematic nature of the Coordination Index commonly used in the literature. We also find support for the idea that complexity generally increases with proficiency, with some measures plateauing at advanced levels. Syntactic complexity measures can also reliably distinguish between texts of different genres and task types; some values are language-specific, differing between the two languages considered. Despite the challenging nature of our data, some complexity features, namely length-based indices and phrasal complexity measures, are useful for automatic proficiency prediction. As a practical application of our research, we introduce syntaxcomp, a Python library for extracting syntactic complexity measures from CoNLL-U annotations.
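
For illustration only, the sketch below shows the two kinds of computation the summary mentions: a length-based index read from CoNLL-U annotations, and a Levenshtein distance between two clauses represented as sequences of dependency labels. This summary does not define the thesis's actual measure or the syntaxcomp API, so the function names, the choice of dependency labels as the compared units, and the sample data are all assumptions made for this example.

```python
from typing import List, Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two label sequences (insert, delete, substitute; cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def sentence_lengths(conllu: str) -> List[int]:
    """Count word tokens per sentence in CoNLL-U text.

    Comment lines, multiword token ranges (IDs like "1-2") and empty
    nodes (IDs like "1.1") are skipped.
    """
    lengths, count = [], 0
    for line in conllu.splitlines():
        line = line.strip()
        if not line:  # a blank line ends a sentence
            if count:
                lengths.append(count)
            count = 0
        elif not line.startswith("#"):
            if line.split("\t")[0].isdigit():
                count += 1
    if count:
        lengths.append(count)
    return lengths


def row(*fields: str) -> str:
    """Build one tab-separated CoNLL-U token line (hypothetical helper)."""
    return "\t".join(fields)


# Hypothetical two-sentence sample, invented for this example.
SAMPLE = "\n".join([
    "# text = She reads.",
    row("1", "She", "she", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    row("2", "reads", "read", "VERB", "_", "_", "0", "root", "_", "_"),
    row("3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
    "",
    "# text = He sleeps well.",
    row("1", "He", "he", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    row("2", "sleeps", "sleep", "VERB", "_", "_", "0", "root", "_", "_"),
    row("3", "well", "well", "ADV", "_", "_", "2", "advmod", "_", "_"),
    row("4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
])

lengths = sentence_lengths(SAMPLE)
mean_sentence_length = sum(lengths) / len(lengths)

# Two clauses as dependency-label sequences (assumed representation).
clause_a = ["nsubj", "root", "obj"]
clause_b = ["nsubj", "root", "advmod", "obl"]
clause_distance = levenshtein(clause_a, clause_b)
```

Here `mean_sentence_length` is a simple length-based index, and `clause_distance` counts how many label edits separate the two clause structures; how such distances are aggregated into a text-level score is left open, since the summary does not say.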

Full text (added June 7, 2024)

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
