Applying Graph Neural Networks and Reinforcement Learning for Abstractive Summarization of Long Texts

Student: Vasiliy Kakurin

Supervisor: Sergey Slastnikov

Faculty: HSE Tikhonov Moscow Institute of Electronics and Mathematics (MIEM HSE)

Educational Programme: Applied Mathematics (Bachelor)

Year of Graduation: 2024

Text data analysis tasks have attracted significant attention from researchers in recent years, and models such as ChatGPT have achieved notable advances in applications including dialogue generation and question answering. Reinforcement Learning, and specifically Reinforcement Learning from Human Feedback, has emerged as a key contributor to their success. However, some tasks, particularly the processing of long text sequences, have received comparatively less attention. The prevalent Transformer architecture, a cornerstone of state-of-the-art (SOTA) models, has quadratic complexity with respect to sequence length, which makes it poorly suited to handling long sequences directly. In this paper we propose an approach to handling long text sequences in the summarization task. By combining pre-trained SOTA models fine-tuned specifically for text summarization with a Graph Neural Network model, the approach processes long sequences with lower computational complexity with respect to input length than other Natural Language Processing models. To evaluate the approach, we fine-tuned several models on publicly available datasets of texts and their summaries (ArXiv, Gazeta). We also collected our own dataset of long texts in Russian, in which each text comprises all news published over a 24-hour period and is paired with a summary of the main news for that period. Applied to pre-trained models, the proposed approach improves their quality as measured by the metrics that the models' authors used in the corresponding articles. Since applying the approach to an already trained model does not require significant computational resources, researchers and developers can easily modify their existing models, increasing the maximum length of processed texts and improving the quality of processing.
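The quadratic-complexity claim can be illustrated with a short calculation. The sketch below counts pairwise attention scores for full self-attention over a whole sequence and for attention restricted to fixed-size chunks. The abstract does not describe how the thesis splits or connects the text, so the chunking scheme, the function names, and the chunk size of 512 are assumptions made only for illustration, and the cost of any graph model over the chunks is not included.

```python
from typing import List, Sequence


def full_attention_pairs(n_tokens: int) -> int:
    """Number of query-key scores when every token attends to every token."""
    return n_tokens * n_tokens


def split_into_chunks(tokens: Sequence, chunk_size: int) -> List[list]:
    """Split a token sequence into consecutive chunks (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def chunked_attention_pairs(n_tokens: int, chunk_size: int) -> int:
    """Number of query-key scores when attention is limited to each chunk.

    For a fixed chunk size this grows linearly with n_tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    full_chunks, remainder = divmod(n_tokens, chunk_size)
    return full_chunks * chunk_size ** 2 + remainder ** 2


if __name__ == "__main__":
    n = 8192  # illustrative sequence length, not taken from the thesis
    print("full attention pairs:   ", full_attention_pairs(n))
    print("chunked attention pairs:", chunked_attention_pairs(n, 512))
```

Under these assumptions, doubling the input length quadruples the cost of full attention but only doubles the chunked cost, which is the kind of saving the abstract refers to.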

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
