
Navigating manuscripts: A new project by the National Research University Higher School of Economics and the prospects of archival work in Russia

The project “Russia’s cultural heritage: intellectual analysis and thematic models of a manuscript corpus”, carried out by staff members of the Institute of Russia’s Regional History in cooperation with a team of mathematicians from the Faculty of Computational Mathematics and Cybernetics, Moscow State University, has been supported by the Russian Science Foundation. The project also includes a long-term partnership with a team of philologists and historians based at Tomsk State University. The system of automated navigation through manuscript texts held in Russian libraries, museums and archives, which the project team is developing, will make searching for manuscript sources drastically easier. Equipped with tools that find relevant materials by keywords and graphic elements, researchers will no longer have to rely on manual discovery alone.

The idea of the new project arose several years ago, when a team of researchers suggested using mathematical methods to study the manuscripts of Russian writers. Under the guidance of Professor Elena Penskaya, who played an important role in designing the project, two conferences were held in 2019 and 2020 (Text as DATA: a Manuscript in the Digital Space), and a dedicated site was set up (http://literature-archive.ru/ru) to host electronic versions of some manuscripts by 20th-century Russian writers. In 2016, a team of mathematicians led by Professor Leonid Mestetsky started working with the set of bitmap images, which opened the way to a joint interdisciplinary study of some of the manuscript texts.

The Institute of Russia’s Regional History at NRU HSE was chosen as the project venue because of its experience in studying manuscript texts. Besides individual research by the institute staff, their collective projects also involve archival studies in both regional and federal archives. In 2019-2022, the institute staff carried out research in archives in various cities and regions, from Saratov to Murmansk and from Smolensk to Vladivostok. They were looking for sources that could be included in a corpus of texts for the “Russia’s regions in the historical perspective” online portal, a collection of archival materials on the history of Russia’s regions (https://regionalhistory.hse.ru/). At the moment, the website features the archive of Nikolai Yadrintsev, a well-known Siberian regionalist (the original is preserved at the Research Library of Tomsk State University), as well as a series of interviews with directors of regional archival institutions. At the beginning of 2023, the collection will be expanded to feature digitized sources from the Russian State Archives of the Far East, the State Archives of Vladimir Oblast’, and the personal archive of the late historian Anatoly Remnyov.

Work on the RSF-supported project will be carried on by a team that brings together philologists and historians from the Higher School of Economics and Tomsk State University, and mathematicians from the Faculty of Computational Mathematics and Cybernetics, Moscow State University. The team will focus largely on personal documents (memoirs, letters and diaries), a field that offers many opportunities to historians and literary scholars alike.

The project team aims to develop a methodology for analyzing information from manuscript texts without resorting to manual processing of sources with the aid of a decipherer. This implies setting up an automated system for navigating a manuscript, which will allow researchers to pick out of a huge dataset only the snippets that correspond to the most relevant segments of the text. This will drastically cut the time spent on working with the text.

The search algorithms within the proposed system will make use of methods for recognizing handwritten characters by retracing the pen movement, for grapheme segmentation, and for identifying continuous morphological models, as well as machine learning.

It should be noted that such research tools can affect the general state of archival work in Russia. Irrespective of how this work is organized, Russian archives of various types have for a number of years been digitizing their collections. At the moment, a large amount of digital data is available to their visitors, and we can expect that, alongside the demand for remote access to archival sources, a demand will arise for bringing together the various existing archival data. In the latter case, the human factor will be less influential, while the volume of data available to readers will grow exponentially.

Bringing together digitized archival sources will, in turn, require a breakthrough that is currently discussed in the research literature: digitization will have to advance to a new level by making use of artificial intelligence and by setting up powerful search systems capable of processing large databases. Another challenge will lie in building a methodology for analyzing information found, in particular, in manuscript texts that have not yet been deciphered.


 
