Application of Machine Learning Methods to Analyze R&D Documentation

Student: Levitan Darya

Supervisor: Alexander V. Belov

Faculty: HSE Tikhonov Moscow Institute of Electronics and Mathematics (MIEM HSE)

Educational Programme: Applied Mathematics (Bachelor)

Final Grade: 9

Year of Graduation: 2024

As technology develops, the Russian state has been actively supporting the digitalization of various spheres of life. These innovations have also reached the scientific field and the tax authorities, since Russia offers tax incentives that stimulate research through tax deductions. The key condition for obtaining this preference is that the report attached to the application must be examined to confirm that it relates to R&D. Consequently, the analysis of reports submitted by taxpayers plays a crucial role in scientific and technical examinations, which assess the legality and adequacy of the use of tax deductions. Currently such checks are performed manually by tax inspectors and consume considerable time and human resources that could be directed to other tasks. Modern computer technologies make it possible to optimize the examination process by automatically analyzing the text reports. To this end, it was proposed to use machine learning methods to perform binary classification of texts, determining whether or not a work qualifies as research. In summary, the work aims to solve this binary classification task on a corpus of students' final qualifying papers in order to determine whether completed works can legitimately be categorized as R&D.
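The summary does not name the specific model or features used in the thesis, so the following is only a minimal sketch of what binary text classification can look like. It assumes a multinomial naive Bayes classifier with Laplace smoothing, written in plain Python; the class name, the labels "rnd" and "not_rnd", and the toy training sentences are all hypothetical and are not taken from the thesis data.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplistic assumption: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {}              # label -> Counter of token frequencies
        self.doc_counts = Counter(labels)  # label -> number of training documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            for token in tokenize(text):
                if token in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus, invented purely for illustration.
train_texts = [
    "we propose a novel algorithm and evaluate it experimentally",
    "the study tests a new hypothesis with a prototype model",
    "the team installed standard software and updated the user manual",
    "routine maintenance of existing equipment was carried out",
]
train_labels = ["rnd", "rnd", "not_rnd", "not_rnd"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = clf.predict("a novel prototype algorithm")
print(prediction)
```

On a real corpus, a sketch like this would need proper tokenization for the document language, a held-out test set, and evaluation metrics before its predictions could support any examination decision.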

Full text (added May 18, 2024)

Student Theses at HSE must be completed in accordance with the University Rules and regulations specified by each educational programme.

Summaries of all theses must be published and made freely available on the HSE website.

The full text of a thesis can be published in open access on the HSE website only if the authoring student (copyright holder) agrees, or, if the thesis was written by a team of students, if all the co-authors (copyright holders) agree. After a thesis is published on the HSE website, it obtains the status of an online publication.

Student theses are objects of copyright and their use is subject to limitations in accordance with the Russian Federation’s law on intellectual property.

In the event that a thesis is quoted or otherwise used, reference to the author’s name and the source of quotation is required.
