THE METHOD OF RESTRICTED LOGICAL TREE STRUCTURES IN THE PROBLEM OF DISCRETE OBJECT CLASSIFICATION

I. F. Povkhan

The paper addresses the construction of logical classification tree models based on a restricted method of elementary feature selection for arrays of geological data. A method is proposed for approximating an array of real data by a set of elementary features, with a fixed stopping criterion for the branching procedure at the tree construction stage. This approach makes it possible to reach the required model accuracy, reduce its structural complexity, and achieve the required performance indicators. A restricted method for building classification trees has been developed that extends only those paths (tiers) of the tree structure that contain the largest number of classification errors (of all types). This way of synthesizing a recognition model allows the complexity (accuracy) of the tree under construction to be regulated quite effectively, and it is appropriate where there are constraints on the hardware resources of the information system, on the accuracy and structural complexity of the model, or on the structure, sequence, and depth of recognition of the training sample data. The restricted scheme of classification tree synthesis builds models almost 20 % faster. The resulting logical classification tree classifies (recognizes) without error the entire training sample from which the model was built, has a minimal structure (structural complexity), and consists of components, namely sets of elementary features that serve as the vertices (attributes) of the tree. Based on the proposed modification of the elementary feature selection method, software has been developed that can handle a range of applied problems of different types. An approach is proposed for synthesizing new recognition models based on the restricted scheme of logical trees and the choice of pre-pruning parameters. In other words, an effective scheme has been developed for recognizing discrete objects based on step-by-step evaluation and selection of sets of attributes (generalized features) along selected paths in the classification tree structure at each step of scheme synthesis.
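The idea of extending only the paths with the most classification errors can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the binary 0/1 feature encoding, the error-count branching criterion, and the `max_splits` parameter (standing in for a pre-pruning limit) are all assumptions made for illustration.

```python
from collections import Counter


def leaf_errors(labels):
    """Training objects a leaf misclassifies under majority vote."""
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_split(X, y, idx):
    """Binary feature whose split of idx leaves the fewest errors, or None."""
    best = None
    for f in range(len(X[0])):
        left = [i for i in idx if X[i][f] == 0]
        right = [i for i in idx if X[i][f] == 1]
        if not left or not right:
            continue  # this feature does not separate the subset
        err = leaf_errors([y[i] for i in left]) + leaf_errors([y[i] for i in right])
        if best is None or err < best[0]:
            best = (err, f, left, right)
    return best


def build_restricted_tree(X, y, max_splits):
    """Grow the tree only at the leaf (path) that currently has the most errors.

    Assumes a non-empty training sample. Returns a list of rules
    (path, label), where path is a tuple of (feature, value) tests.
    """
    leaves = [((), list(range(len(X))))]
    for _ in range(max_splits):  # assumed pre-pruning limit on the number of splits
        order = sorted(range(len(leaves)),
                       key=lambda k: leaf_errors([y[i] for i in leaves[k][1]]),
                       reverse=True)
        chosen = None
        for k in order:
            if leaf_errors([y[i] for i in leaves[k][1]]) == 0:
                break  # every remaining leaf is already error-free
            split = best_split(X, y, leaves[k][1])
            if split is not None:
                chosen = (k, split)
                break
        if chosen is None:
            break  # stopping criterion: nothing left to improve
        k, (_, f, left, right) = chosen
        path, _ = leaves.pop(k)
        leaves.append((path + ((f, 0),), left))
        leaves.append((path + ((f, 1),), right))
    return [(path, Counter(y[i] for i in idx).most_common(1)[0][0])
            for path, idx in leaves]


def classify(rules, x):
    """Return the label of the single rule whose path tests all match x."""
    for path, label in rules:
        if all(x[f] == v for f, v in path):
            return label


# Toy training sample (hypothetical): the class is the XOR of two features.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
rules = build_restricted_tree(X, y, max_splits=10)
predictions = [classify(rules, x) for x in X]
```

With a generous `max_splits` the sketch keeps splitting until the toy sample is classified without error; lowering `max_splits` trades accuracy for a smaller structure, which is the kind of complexity control the abstract describes.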

logical classification tree

pattern recognition

classification

feature

branching criterion
