Digitization of Manuscripts: Months of Searching Can Turn into Hours and Even Minutes
HSE staff members are participating in the ‘Russian Cultural Heritage: Intellectual Analysis and Thematic Modeling of the Corpus of Handwritten Texts’ project. The project aims to develop a methodology for the automated analysis of manuscripts, eliminating the need for manual processing. HSE News Service spoke to Ekaterina Boltunova, project manager, Professor, and Head of the Laboratory 'Russia’s Regions in Historical Perspective' at the HSE Faculty of Humanities.
The ‘Russian Cultural Heritage: Intellectual Analysis and Thematic Modeling of the Corpus of Handwritten Texts’ project was launched by the staff of the School of Philology and the Laboratory 'Russia’s Regions in Historical Perspective' at the HSE Faculty of Humanities, together with a group of mathematicians from the Faculty of Computational Mathematics and Cybernetics at Lomonosov Moscow State University. Tomsk State University also participated, and the project was supported by the Russian Science Foundation.
Project Overview
In many ways, we owe the idea for this project to the philologist Lyubov Khachaturian, who suggested processing the handwritten heritage of Russian writers using mathematical methods. Together with Professor Elena Penskaya, she held two conferences on ‘Text as DATA: The Manuscript in Digital Space’ in 2019 and 2020. They also launched the website ‘Autograph. XX century’, which hosts manuscripts of the classics of 20th century Russian literature.
In 2016, a group of mathematicians led by Professor Leonid Mestetsky joined the work on a collection of raster images of handwritten autographs, which allowed us to commence joint interdisciplinary research on the analysis of some handwritten texts.
I have always found topics related to the introduction of new practices in the study of handwritten texts extremely interesting.
For our lab staff, working with archival documents, primarily handwritten ones, is an absolute priority, and we have visited the archives of many cities and regions.
To carry out this grant project, we brought together an interesting team of humanities specialists (philologists and historians from HSE and Tomsk State University) and mathematicians from the Faculty of Computational Mathematics and Cybernetics at Moscow State University.
The Memory of the People: The History of the Second World War
The projects ‘The Memory of the People’ and ‘The Feat of the People’, which involve the publication of huge arrays of historical documents from the Great Patriotic War (the Russian name for the part of the Second World War that took place on the territory of the USSR), are particularly socially significant.
I remember that my father was deeply moved by the opportunity to find information on one of these resources about his uncle, who went missing in the first months of the war. The family had been unable to find out anything about him for decades.
A large array of digitized data which is open to public access, with documents that you can read from your personal computer, always arouses great interest among researchers. We can recall the attention paid to material relating to the ‘Personal Fund of I.V. Stalin’ and the materials of the Politburo of the Central Committee of the CPSU. But now we would like to focus on materials created during the period of the Russian Empire and the early USSR.
Digitalization of Documents from Pre-revolutionary Russia
We plan to work primarily with sources of personal origin: memoirs, diaries and letters. Initially, the work will focus on the late 19th and early 20th centuries, a period that left a large array of handwritten texts of different types, genres and character.
We aim to develop an automated navigation system for handwritten text that will allow researchers to select the materials they need from a huge array of data, dramatically reducing the time spent on parsing text.
Search algorithms in the proposed system will be built on handwritten text recognition methods based on the reconstructed pen trajectory, which rely on grapheme segmentation and the identification of continuous morphological models, as well as on machine learning.
The system will allow users to search by keywords, word combinations, dates and locations.
The Role of MSU Mathematicians and Project Partners from Tomsk State University
Professor Leonid Mestetsky, who heads the group of mathematicians working on the project, is one of the most renowned experts in artificial intelligence systems, graphical navigation and thematic corpus modeling, as well as the recognition of raster images. As part of the project, Mestetsky and his students will develop an automated navigation system for untranscribed manuscripts, while historians and philologists will search for and select the manuscripts.
Our colleagues from Tomsk State University, headed by Professor Vitaly Kiselyov, will join the project to work with the materials of the poet Vasily Zhukovsky. In addition to his literary work, Zhukovsky was an active editor and served as tutor to the future Tsar Alexander II.
Qualitative Acceleration of Work with Texts
The main result of our work will be the development of a programme for working with unstructured data arrays. The publication of digitized materials, combined with the ability to search directly through handwritten text, is a game changer. We are talking about a dramatic acceleration of the process: months of searching can turn into hours or even minutes.
Of course, in recent years, the possibilities for using artificial intelligence have expanded enormously. It can be applied in many different areas of life, which has sometimes led to apocalyptic predictions. But as a historian, I want to remind you that this is not the first technological innovation we have seen. Just remember how many fears the introduction of personal computers provoked. But, thank God, we are all still alive and mentally healthy, so we’re continuing our research and hoping for new discoveries.
Ekaterina Boltunova
Head, International Laboratory 'Russia’s Regions in Historical Perspective'