Clustering

Numerical Optimization Method for Clustering in Content-Based Image Retrieval Systems

The object of the study is the process of organizing a descriptor repository in content-based image retrieval systems. The subject of the study is a method of numerical optimization of descriptor clustering in a multidimensional space. The aim of this work is to develop a clustering optimization method in the Multidimensional Cube model to improve search efficiency. The core idea is to ensure a more uniform distribution of descriptors across clusters by adjusting interval boundaries in each dimension, which reduces imbalance in cluster density and improves retrieval performance.
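The idea of adjusting interval boundaries so that descriptors spread more evenly can be illustrated with a minimal sketch. It assumes equal-frequency (quantile) cut points in each dimension, which is only one possible way to balance the cells and is not necessarily the optimization procedure the authors use. The function names `equal_frequency_boundaries`, `build_boundaries`, and `cell_index` are hypothetical.

```python
from bisect import bisect_right


def equal_frequency_boundaries(values, n_intervals):
    """Return n_intervals - 1 cut points that split the values into
    intervals holding roughly the same number of items.
    Assumes `values` is non-empty."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_intervals] for k in range(1, n_intervals)]


def build_boundaries(descriptors, n_intervals):
    """Compute cut points independently for every dimension."""
    return [
        equal_frequency_boundaries(column, n_intervals)
        for column in zip(*descriptors)
    ]


def cell_index(descriptor, boundaries_per_dim):
    """Map a descriptor to its cell: one interval index per dimension."""
    return tuple(
        bisect_right(cuts, x) for x, cuts in zip(descriptor, boundaries_per_dim)
    )
```

With skewed data, equal-width intervals would place most descriptors in a few cells, whereas cut points taken from the sorted values keep the per-interval counts close to each other in every dimension. Balance across whole cells is not guaranteed by this sketch when the dimensions are correlated.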

Development of an Automated Natural Language Text Analysis System Using Transformers

The article examines the development of an automated medical text analysis system based on modern artificial intelligence and natural language processing technologies. It analyzes the current state of and prospects for automated medical text analysis and reviews the main methods and technologies used in this field, including machine learning, deep learning, and natural language processing.

Study of the Effectiveness of Applying the K-Means Method to Decompose Large-Scale Traveling Salesman Problems

The problem is decomposed by clustering the input set of points with the well-known k-means method, combined with an algorithm for extending partial solutions within clusters. The k-means clustering algorithm is examined as a way to partition the input data of large-scale TSP instances into smaller subproblems, and its efficiency in reducing problem size is substantiated. Based on experiments, a hierarchical version of the algorithm is proposed for problems with more than one million points.
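The decomposition can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes plain Lloyd's k-means, a greedy nearest-neighbor tour inside each cluster, and naive concatenation of the cluster tours in place of the paper's procedure for extending partial solutions. All function names are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns a cluster label for every point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels


def nearest_neighbor_tour(indices, points):
    """Greedy tour over the given point indices, starting from the first."""
    if not indices:
        return []
    remaining = list(indices)
    tour = [remaining.pop(0)]
    while remaining:
        last = points[tour[-1]]
        nxt = min(remaining, key=lambda i: math.dist(last, points[i]))
        remaining.remove(nxt)
        tour.append(nxt)
    return tour


def decomposed_tour(points, k):
    """Cluster the points, solve each cluster separately, join the results."""
    labels = kmeans(points, k)
    tour = []
    for c in range(k):
        cluster = [i for i, lab in enumerate(labels) if lab == c]
        tour.extend(nearest_neighbor_tour(cluster, points))
    return tour
```

Each subproblem touches only the points of one cluster, so the quadratic cost of the greedy heuristic applies to cluster sizes rather than to the full instance. The order in which clusters are joined is arbitrary here; a real implementation would choose it to keep the connecting edges short.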

Comparison and Clustering of Textual Information Sources Based on the Cosine Similarity Algorithm

This article presents a study aimed at developing an optimal concept for analyzing and comparing information sources based on large volumes of text using natural language processing (NLP) methods. The object of the study is Telegram news channels, which serve as sources of text data. The texts were preprocessed (cleaning, tokenization, and lemmatization) to form a global dictionary of the unique words from all information sources.
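A minimal sketch of comparing sources over a global dictionary is given below. It assumes raw term counts as vector components and already tokenized and lemmatized input; the abstract does not state which weighting the study uses, and the function names are hypothetical.

```python
import math
from collections import Counter


def build_vocabulary(token_lists):
    """Global dictionary: the sorted unique words from all sources."""
    return sorted({tok for tokens in token_lists for tok in tokens})


def to_vector(tokens, vocabulary):
    """Term-count vector of one source over the global dictionary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Two sources that share most of their vocabulary score close to 1, while sources with no words in common score 0, so the pairwise similarity matrix can serve as input to a clustering step.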

Application of a Test Survey System Based on Cluster Analysis and Machine Learning in the Tasks of Professional Selection of Specialists

The research aims to develop a test survey for the effective selection of IT specialists, based on modern machine learning methods, in particular cluster analysis with the k-means method. Because existing testing platforms are typically available only to large companies on a paid basis, the decision was made to create an alternative web application. This application is intended to be an accessible tool for a wider range of users and to automate the evaluation of candidates' skills.

Research and Development of Methods and Algorithms for Non-Hierarchical Clustering

Methods and algorithms for non-hierarchical clustering are developed and studied that make it possible to determine the optimal initial number of clusters without any prior information about the clusters' location. The developed methods and algorithms are studied on the well-known Iris test data set.

Features of Data Processing for Hierarchical Clustering of Complex Circuits

A flexible and universal approach to describing electric circuits and the reduction tree is proposed, which makes it possible to optimize the execution of the key stages of hierarchical clustering.

Algorithms for Constrained Clustering of the Working Field for the Traveling Salesman Problem

Three approaches to clustering the working field for the traveling salesman problem (TSP) are described; they partition the set of points into subsets that satisfy given constraints. One of the well-known algorithms is used to obtain a solution in each cluster, after which the partial solutions are stitched together.

Solutions and approaches analysis for geospatial data clustering to optimize performance and user experience of web maps

The management and visualization of geospatial information in web browsers have gained substantial importance. Web maps are indispensable tools across various sectors, including tourism, goods delivery, and ecology. Furthermore, the broad support for web browsers on diverse devices makes geospatial data on the web accessible to a wide range of users. However, the continual growth of geospatial information poses new challenges in displaying the data efficiently and navigating through it on web maps.

Intelligent system for clustering users of social networks based on the message sentiment analysis

The main objective of this article is to analyze an intelligent system for clustering social network users based on sentiment analysis of their messages. The main goal of the system is to form a general image of a user by analyzing the sentiment of the data from the user's social networks and then clustering the users. The designed system uses the Identity and Access/Refresh JWT token algorithms to provide fast and highly secure registration, authentication, and handling of user sessions.