New Clustering Method Simplifies Analysis of Large Data Sets

Researchers from HSE University and the Institute of Control Sciences of the Russian Academy of Sciences have proposed a new method of data analysis: tunnel clustering. It allows for the rapid identification of groups of similar objects and requires fewer computational resources than traditional methods. Depending on the data configuration, the algorithm can operate dozens of times faster than its counterparts. The study was published in the journal Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia.
Each year, the volume of information requiring processing continues to grow. Data comes from a variety of sources: scientific research, financial reports, medical examinations, and many others. Clustering methods—which group data based on similar characteristics—are used to detect patterns and organise information within such large datasets. These groupings are known as clusters.
One of the most widely used clustering methods is the k-means algorithm. It divides data into a predetermined number of clusters, starting from an initial choice of their centres (centroids). However, this method has a limitation: the number of clusters must be known beforehand, which is not always possible when dealing with complex data. Tunnel clustering, the approach proposed by the scientists from HSE University and the V.A. Trapeznikov Institute of Control Sciences, removes this requirement. Unlike the k-means method, the algorithm does not need the number of clusters to be set in advance; it determines the necessary number itself by analysing the data structure.
‘The algorithm forms “tunnels” in the data—regions in multidimensional space where objects with similar characteristics group together,’ explained Fuad Aleskerov, Head of the Department of Mathematics at the HSE Faculty of Economic Sciences. ‘Users can choose from three modes of operation: with fixed cluster boundaries, with adaptive boundaries that adjust to the data structure, or a combined approach. This makes the method flexible and suitable for various types of tasks.’
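The article does not describe how the tunnels are built, so the following is only a rough sketch of what a fixed-boundary mode might look like, not the authors' algorithm. It assumes each feature is cut into intervals of equal width and that objects falling into the same combination of intervals form one cluster. The function name `tunnel_clusters_fixed` and the parameters `lower` and `width` are hypothetical.

```python
from collections import defaultdict


def tunnel_clusters_fixed(points, lower, width):
    """Group points by the tuple of interval indices of their features.

    Assumption (not from the article): every feature is split into
    intervals of the same fixed width, starting at `lower` for that feature.
    """
    clusters = defaultdict(list)
    for idx, point in enumerate(points):
        # Index of the interval that each feature value falls into.
        key = tuple(int((x - lo) // width) for x, lo in zip(point, lower))
        clusters[key].append(idx)
    return dict(clusters)


# Toy data: five objects with two features each.
points = [(0.1, 0.2), (0.3, 0.4), (2.1, 0.3), (2.4, 0.1), (2.2, 3.5)]
clusters = tunnel_clusters_fixed(points, lower=(0.0, 0.0), width=1.0)

for key, members in sorted(clusters.items()):
    print(key, members)
```

In this sketch the number of clusters is never passed in; it is simply the number of occupied interval combinations. Each object is processed once, which is one plausible reason such a scheme could be fast on large datasets.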
The method was tested on a synthetic (artificially generated) dataset of 100,000 objects, as well as on real-world tasks in public administration and the banking sector.

The main advantage of the new method is its speed. Unlike classical algorithms that demand significant computational resources, tunnel clustering can, depending on the data configuration, perform the analysis dozens of times faster.
In addition, the researchers introduced the concept of the ‘transition degree’—a parameter indicating how many characteristics of an object must change for it to be classified into a different cluster. This helps assess the clarity of cluster boundaries and identify objects situated at the intersection of different groups.
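One possible reading of the transition degree can be sketched as follows. This is an illustration under stated assumptions, not the authors' definition: each cluster is assumed to be identified by a tuple of interval indices (one per feature), and the degree is taken to be the smallest number of positions in which a cluster's tuple differs from that of any other occupied cluster. The name `transition_degree` and the sample keys are hypothetical.

```python
def transition_degree(key, occupied_keys):
    """Smallest number of features that must change interval to reach
    another occupied cluster (assumed interpretation, not the paper's)."""
    others = [k for k in occupied_keys if k != key]
    if not others:
        return None  # a single cluster has nowhere to transition to
    # Count differing positions against every other cluster, keep the minimum.
    return min(sum(a != b for a, b in zip(key, other)) for other in others)


# Hypothetical cluster keys over three features.
occupied = [(0, 0, 1), (0, 1, 1), (2, 3, 0)]
degrees = {key: transition_degree(key, occupied) for key in occupied}

for key, degree in degrees.items():
    print(key, degree)
```

Under this reading, a low degree marks clusters whose members sit close to a neighbouring group, while a high degree marks clusters that are clearly separated from the rest.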
‘People are generating more and more data, and the pace is only accelerating. According to the latest Digital 2025: Global Overview Report, as of early 2025, there were 5.56 billion internet users—nearly 68% of the global population. Adults spend an average of 6 hours and 38 minutes online each day, communicating, working, watching videos, and consuming content,’ said Alexey Myachin, Senior Research Fellow at the HSE International Centre for Decision Choice and Analysis. ‘Companies that ignore data analysis are losing vast sums of money.’
The authors continue to refine the algorithm, including through research into dimensionality reduction, which will help further decrease the time required to identify patterns in data.
The study was carried out with partial support from the Russian Science Foundation.