Neural Networks of Power: AI Unravels Knots and Tangles in Relationships between Humans, Elves and Hobbits

One of the most popular writers of the last century, John Ronald Reuel Tolkien, was born on January 3rd. Researchers from HSE University, AIRI and MISSIS have used machine learning to explore the social connections between the characters of his Middle-earth universe. The algorithm managed to create an accurate picture of the social structures and dynamics of the characters' relationships, providing a unique map of interactions in the epic world. The researchers believe that this approach can be applied in many areas beyond literature. The results of the work were published in IEEE Xplore.
The analysis of literary works is a complex and time-consuming process. When reading any text, the researcher needs to capture numerous nuances and features — from the author's style and word choice to the relationships between characters and their role in the plot. Most often, this work is done manually by literary critics. Ilya Makarov, Senior Research Fellow at the School of Data Analysis and Artificial Intelligence at the HSE Faculty of Computer Science, head of the ‘AI in Industry’ group at the Artificial Intelligence Research Institute (AIRI), and Anastasia Yaschenko, HSE University graduate, applied computational linguistics and machine learning tools to a series of books by John Ronald Reuel Tolkien about Middle-earth. The AI ‘read’ the books, isolating the key elements: the characters, their belonging to a particular race and their social ties. It demonstrated the results in the form of a graph, which allows us to not only trace the relationship between the characters, but also to see more clearly the structure of their social network.
Ilya Makarov, Senior Research Fellow at the School of Data Analysis and Artificial Intelligence
‘We chose the world of Middle-earth as the basis for our analysis for a number of key reasons. Firstly, J. R. R. Tolkien's texts are widely known and loved by readers around the world, which makes the study universal and global. Secondly, the system of characters in Tolkien's books is very rich and diverse, which creates optimal conditions for such an analysis. Finally, thanks to the long history of studying Tolkien's world, a large set of metadata is available, including detailed descriptions of characters and their race, which facilitates the process of automatic clustering and verification of results.’
The main goal was to create a program that could ‘understand’ human language, analyse literary texts, identify the characters of a book and determine the relationships between them. The work is based on the concept of social networks, an approach widely used in sociology and psychology and, more recently, in computer science. In the context of literary analysis, each character is treated as a node, and the interactions between characters are the edges connecting these nodes. When two characters interact in the text, an edge is established between their nodes. The more often the characters interact, the stronger this edge is.
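The node-and-edge idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that two characters ‘interact’ whenever their names appear in the same sentence; the article does not say how the researchers detect interactions, and the function name, character list and sample sentences here are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_interaction_graph(sentences, characters):
    """Return a Counter mapping an unordered character pair to its edge weight."""
    edges = Counter()
    for sentence in sentences:
        # Characters whose names appear in this sentence (naive substring match).
        present = sorted(name for name in characters if name in sentence)
        # Assumption: every pair that co-occurs in a sentence is one interaction.
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Hypothetical sample text, not taken from the books.
sentences = [
    "Frodo and Sam left the Shire.",
    "Sam cooked while Frodo rested.",
    "Gandalf spoke with Frodo.",
]
characters = {"Frodo", "Sam", "Gandalf"}
graph = build_interaction_graph(sentences, characters)

for (a, b), weight in sorted(graph.items()):
    print(f"{a} - {b}: weight {weight}")
```

Each key is a pair of names in alphabetical order, so the edge between two characters is counted once regardless of the order in which they are mentioned. A pair that never co-occurs simply has weight zero.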
The use of machine learning algorithms has made it possible to automatically analyse texts and identify such interactions between characters, turning literary works into simulated social networks. Named Entity Recognition (NER), a natural language processing technology, was used to automatically identify and classify entities in the text, such as names, places and organisations.
This technology helped the scientists compile a list of every unique character mentioned in the books. Further semantic analysis allowed them to determine the race of each character: the algorithm analyses the context and links each character to a specific race based on the words and phrases that accompany their mentions. For example, if a character is often mentioned alongside the words ‘elf’ or ‘elvish’, the algorithm classifies them as an elf. Because a large amount of metadata is available for J. R. R. Tolkien's characters (race, family ties, allegiance to a particular kingdom, etc.), and because every character in the universe belongs to a certain race, the researchers chose race as the characteristic for interpreting communities.
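A rough sketch of context-based race assignment is given below. It is a simple keyword vote, not the researchers' semantic analysis: the keyword lists are hypothetical (the article mentions only ‘elf’ and ‘elvish’), and `classify_race` is an illustrative name.

```python
import re
from collections import Counter

# Hypothetical keyword lists; the article names only 'elf' and 'elvish'.
RACE_KEYWORDS = {
    "elf": {"elf", "elves", "elvish"},
    "hobbit": {"hobbit", "hobbits", "halfling"},
}


def classify_race(name, sentences):
    """Pick the race whose keywords most often accompany the character's name."""
    votes = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        if name.lower() not in words:
            continue  # only sentences that mention the character count
        for race, keywords in RACE_KEYWORDS.items():
            votes[race] += sum(1 for word in words if word in keywords)
    if not votes or votes.most_common(1)[0][1] == 0:
        return None  # no racial cue found near this character
    return votes.most_common(1)[0][0]


# Hypothetical sample text, not taken from the books.
sample = [
    "Legolas the elf drew his elvish bow.",
    "Frodo was a hobbit of the Shire.",
    "Legolas ran beside the hobbit.",
]
print(classify_race("Legolas", sample))
print(classify_race("Frodo", sample))
```

Counting cues across all mentions, rather than trusting a single sentence, keeps one stray reference (such as Legolas running beside a hobbit) from deciding the label.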
In addition, the use of named entities and semantic analysis of the text allowed researchers to determine not only the connection between the characters, but also the nature of these relationships — friendship, enmity or neutral relations. Artificial intelligence managed to identify complex social relationships between the characters and divide the characters into groups.
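How a relationship might be labelled as friendship, enmity or neutral can be sketched with a small cue-word lexicon. This is only an assumption made for illustration: the article does not describe the method the researchers used, and the word lists and the `relation_type` function are hypothetical.

```python
import re

# Hypothetical cue words; a real system would need a far richer signal.
FRIENDLY = {"friend", "helped", "trusted", "saved"}
HOSTILE = {"enemy", "attacked", "betrayed", "fought"}


def relation_type(sentences, a, b):
    """Label the relationship between a and b from the sentences that mention both."""
    score = 0
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if a.lower() in words and b.lower() in words:
            # Friendly cues raise the score, hostile cues lower it.
            score += len(words & FRIENDLY) - len(words & HOSTILE)
    if score > 0:
        return "friendship"
    if score < 0:
        return "enmity"
    return "neutral"


# Hypothetical sample text, not taken from the books.
scenes = [
    "Sam helped Frodo and saved him.",
    "Gollum attacked Frodo.",
    "Frodo saw Gandalf.",
]
print(relation_type(scenes, "Frodo", "Sam"))
print(relation_type(scenes, "Frodo", "Gollum"))
print(relation_type(scenes, "Frodo", "Gandalf"))
```

Attaching such a label to each edge gives a signed graph, in which characters linked mostly by friendly edges can then be grouped together.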
Importantly, this approach is not limited to The Lord of the Rings: it can be applied to any text, opening up new opportunities for automated research in literature.
‘Our study sets out a sequence of steps that can be used to extract named entities and their relationships from other texts, for example to identify relationships between motifs in works by different authors or to analyse complex legal documents,’ said Ilya Makarov.


