Faster and More Precise: Researcher Improves Performance of Image Recognition Neural Network

A scientist from HSE University has developed an image recognition algorithm that runs up to 40% faster than comparable methods. It can speed up real-time image recognition in video processing systems. The results of the study have been published in the journal Information Sciences.

Convolutional neural networks (CNNs), which consist of a sequence of convolutional layers, are widely used in computer vision. Each layer in a network has an input and an output. A numerical description of the image is fed to the input of the first layer and converted into a different set of numbers at its output. That result is fed to the input of the next layer, and so on, until the last layer predicts the class label of the object in the image, for example a person, a cat, or a chair. To make this possible, a CNN is trained on a set of images with known class labels. The larger and more varied the set of images for each class, the more accurate the trained network will be.
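
The layer-by-layer flow described above can be sketched in a few lines of Python. This is only an illustration: the `relu_scale` "layers" are toy stand-ins rather than real convolutions, and all names and numbers here are assumptions made for the example, not part of the study.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def relu_scale(factor: float) -> Layer:
    """Toy stand-in for a layer: scale each number, then clip negatives to zero."""
    def layer(x: Vector) -> Vector:
        return [max(0.0, factor * v) for v in x]
    return layer


def forward(layers: Sequence[Layer], image: Vector) -> Vector:
    """Feed the output of each layer into the input of the next one."""
    x = image
    for layer in layers:
        x = layer(x)
    return x


def predict_label(scores: Vector, class_names: Sequence[str]) -> str:
    """Treat the last layer's output as one score per class and pick the highest."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best]


CLASS_NAMES = ["person", "cat", "chair"]   # hypothetical class labels
layers = [relu_scale(0.5), relu_scale(2.0)]
scores = forward(layers, [0.2, 0.9, -0.4])
label = predict_label(scores, CLASS_NAMES)
print(label)
```

In a real CNN each layer would be a learned convolution, pooling, or fully connected operation, but the chaining of outputs to inputs works the same way.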

If the training set contains only a few examples, additional training (fine-tuning) of the neural network is used. The CNN is first trained on a larger dataset for a similar task and then adapted to the original problem. For example, when a neural network learns to recognize faces or their attributes (emotions, gender, age), it is first trained to identify celebrities from their photos. The resulting network is then fine-tuned on the available small dataset to identify the faces of family members or relatives in home video surveillance systems. The deeper a CNN is, that is, the more layers it has, the more accurately it predicts the type of object in the image. However, as the number of layers increases, more time is required to recognize objects.

The study’s author, Professor Andrey Savchenko of the HSE Campus in Nizhny Novgorod, was able to speed up pre-trained convolutional neural networks of arbitrary architecture; the networks in his experiments had 90-780 layers. Recognition speed increased by up to 40%, while the loss in accuracy was kept to no more than 0.5-1%. The scientist relied on statistical methods such as sequential analysis and multiple comparisons (multiple hypothesis testing).

The decision in an image recognition problem is made by a classifier, a mathematical algorithm that receives an array of numbers (the features, or embeddings, of an image) as input and outputs a prediction of which class the image belongs to. The classifier can be fed the outputs of any layer of the neural network. To recognize "simple" images, the classifier only needs to analyse the outputs of the first layers of the network.
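
As a rough illustration of what such a classifier does, the sketch below maps a feature vector to class probabilities with a linear layer followed by softmax. The article does not say which classifier the study uses, so the linear form, the weights, and the function names are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> Tuple[int, List[float]]:
    """Linear classifier: one weight row and one bias per class."""
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical 2-dimensional features and a 3-class classifier.
features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
best, probs = classify(features, weights, biases)
print(best, [round(p, 3) for p in probs])
```

The same function could be applied to the outputs of any layer, as long as the weight rows match the length of that layer's feature vector.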

Andrey Savchenko
Professor, Department of Information Systems and Technologies

There is no need to waste further time if we are already confident in the reliability of the decision made. For "complex" pictures, the first layers are clearly not enough, and you need to move on to the next ones. Therefore, classifiers were added to the neural network at several intermediate layers. Depending on the complexity of the input image, the proposed algorithm decided whether to continue recognition or to stop. Since it is important to control errors in such a procedure, I applied the theory of multiple comparisons: I introduced a set of hypotheses about which intermediate layer to stop at and tested them sequentially.

If the first classifier already produced a decision that the multiple hypothesis testing procedure considered reliable, the algorithm stopped. If the decision was declared unreliable, the calculations in the neural network continued to the next intermediate layer, and the reliability check was repeated.
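
The stop-or-continue loop can be sketched as follows. One important caveat: the reliability check below is a plain confidence threshold, swapped in as a simple stand-in for the multiple hypothesis testing procedure used in the study, whose details are not given here. The stage and head functions, the thresholds, and all names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Stage = Callable[[Vector], Vector]        # a block of consecutive layers
Head = Callable[[Vector], List[float]]    # intermediate classifier -> class probabilities


def early_exit_predict(
    x: Vector,
    stages: Sequence[Stage],
    heads: Sequence[Head],
    thresholds: Sequence[float],
) -> Tuple[int, int]:
    """Return (predicted class, index of the exit at which computation stopped)."""
    assert len(stages) == len(heads) == len(thresholds)
    last = len(stages) - 1
    for i, (stage, head) in enumerate(zip(stages, heads)):
        x = stage(x)                       # run the next block of layers
        probs = head(x)                    # classify its output
        best = max(range(len(probs)), key=probs.__getitem__)
        # Assumed reliability rule: stop if the top probability is high enough.
        # The final exit always returns a decision.
        if i == last or probs[best] >= thresholds[i]:
            return best, i
    raise ValueError("at least one stage is required")


def sharpen(x: Vector) -> Vector:
    """Toy stage: doubles the values, so later exits become more confident."""
    return [2.0 * v for v in x]


def softmax_head(x: Vector) -> List[float]:
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


stages = [sharpen, sharpen, sharpen]
heads = [softmax_head, softmax_head, softmax_head]
thresholds = [0.9, 0.9, 0.0]

easy = early_exit_predict([3.0, 0.0], stages, heads, thresholds)
hard = early_exit_predict([0.2, 0.0], stages, heads, thresholds)
print(easy, hard)
```

In this toy setup the "easy" input leaves at the first exit, while the "hard" input runs through all three stages before a decision is returned.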

The most accurate decisions are obtained from the outputs of the last layers of the neural network, while the outputs of early layers are classified much faster. All classifiers therefore have to be trained simultaneously in order to accelerate recognition while controlling the loss in accuracy, for example so that the error caused by stopping early is no more than 1%.
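
One simple way to picture the trade-off between speed and a bounded accuracy loss is to calibrate a stopping threshold on held-out data. The sketch below is an assumption made for illustration, not the training or testing procedure from the study: it picks the lowest confidence threshold for a single early exit such that accuracy drops by no more than `max_loss` relative to always using the final layer. The record format and the sample data are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# (confidence of the early classifier, early prediction correct?, final prediction correct?)
Record = Tuple[float, bool, bool]


def pick_threshold(records: Sequence[Record], max_loss: float = 0.01) -> Optional[float]:
    """Lowest threshold whose accuracy drop stays within max_loss.

    A lower threshold means more early stops and therefore faster recognition.
    Returns None if no candidate qualifies (that is, never stop early).
    """
    n = len(records)
    if n == 0:
        return None
    final_acc = sum(final_ok for _, _, final_ok in records) / n
    for t in sorted({conf for conf, _, _ in records}):
        # Images with confidence >= t stop early; the rest go to the final layer.
        correct = sum(
            early_ok if conf >= t else final_ok
            for conf, early_ok, final_ok in records
        )
        if final_acc - correct / n <= max_loss:
            return t
    return None


# Hypothetical validation records.
records = [
    (0.99, True, True),
    (0.95, True, True),
    (0.80, False, True),
    (0.60, True, True),
    (0.55, False, True),
    (0.50, False, False),
]
threshold = pick_threshold(records, max_loss=0.01)
print(threshold)
```

With these made-up records, any threshold below 0.95 lets the misclassified 0.80-confidence image stop early, which costs more than the allowed 1%, so the function settles on 0.95.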

High accuracy is always important for image recognition. For example, if a face recognition system makes an incorrect decision, either an outsider can gain access to confidential information or, conversely, the user will be repeatedly denied access because the neural network cannot identify them correctly. Speed can sometimes be sacrificed, but it matters in video surveillance systems, for example, where it is highly desirable to make decisions in real time, that is, in no more than 20-30 milliseconds per frame. To recognize an object in a video frame here and now, it is very important to act quickly without losing accuracy.
