
Development of LLM Compression Approaches and Their Automation

Student: Ilya Kozulin

Supervisor: Egor Churaev

Faculty: Faculty of Informatics, Mathematics, and Computer Science (HSE Nizhny Novgorod)

Educational Programme: Applied Mathematics and Information Science (Bachelor)

Year of Graduation: 2024

Large language models (LLMs) are highly relevant to text generation tasks: document summarization, topic-oriented dialog systems, machine translation, and code generation. These architectures quickly became widespread in industry and raised the bar for almost every dialog system and service that involves natural language processing. Their effectiveness comes from the large number of model parameters (billions) and from training on large, heterogeneous text corpora, which lets them generalize well across many tasks at once. There is a downside, however: LLMs require substantial computing resources, typically at least several modern graphics accelerators (GPUs). Moreover, even when the necessary hardware is available, it may not deliver the required latency and throughput, which can degrade the user experience and make the network difficult to use in practice. Compression is one of the most fundamental methods for reducing resource consumption and improving the performance of large language models. Because a compressed model consumes less memory, it can be served on fewer GPUs, and because computations run at reduced precision (for example, int8 x int8 matrix multiplications), throughput can increase significantly.
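To make the int8 x int8 idea concrete, here is a minimal NumPy sketch of symmetric per-tensor quantization followed by an integer matrix multiplication. It is an illustration under stated assumptions, not the method developed in the thesis: the helper names `quantize_symmetric_int8` and `int8_matmul`, the per-tensor scaling scheme, and the random test matrices are all hypothetical choices made for this example.

```python
import numpy as np


def quantize_symmetric_int8(x):
    """Map a float array to int8 using one scale for the whole tensor.

    Assumed scheme (not taken from the thesis): symmetric, per-tensor,
    with values clipped to [-127, 127].
    """
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale


def int8_matmul(a_q, a_scale, b_q, b_scale):
    """Multiply two int8 matrices and rescale the result to float32.

    Products are accumulated in int32 so the sums cannot overflow int8.
    """
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return acc.astype(np.float32) * np.float32(a_scale * b_scale)


# Hypothetical stand-ins for a layer's activations and weights.
rng = np.random.default_rng(0)
activations = rng.standard_normal((4, 64)).astype(np.float32)
weights = rng.standard_normal((64, 32)).astype(np.float32)

a_q, a_scale = quantize_symmetric_int8(activations)
w_q, w_scale = quantize_symmetric_int8(weights)

reference = activations @ weights                  # float32 baseline
approx = int8_matmul(a_q, a_scale, w_q, w_scale)   # quantized path

# int8 stores one byte per value instead of four for float32.
memory_ratio = weights.nbytes / w_q.nbytes
rel_error = float(np.linalg.norm(approx - reference) / np.linalg.norm(reference))

print(f"memory reduction: {memory_ratio:.1f}x")
print(f"relative error:   {rel_error:.4f}")
```

The sketch only shows the memory saving and the approximation error. Any throughput gain depends on hardware with native int8 matrix units and on kernels that use them, which plain NumPy does not model.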

