The problem of constructing a model of logical classification trees based on a limited method of selecting elementary features for geological data arrays is considered. A method is proposed for approximating an array of real data with a set of elementary features, using a fixed criterion for stopping the branching procedure at the stage of constructing the classification tree. This approach makes it possible to ensure the required accuracy of the model, reduce its structural complexity, and achieve the required performance indicators. A limited method for constructing classification trees has been developed that extends only those paths (tiers) of the classification tree structure with the greatest number of classification errors (of all types). This approach to synthesizing the recognition model makes it possible to effectively regulate the complexity (accuracy) of the classification tree model being built, and it is advisable in situations with restrictions on the hardware resources of the information system, on the accuracy and structural complexity of the model, and on the structure, sequence, and depth of recognition of the training sample data array. The limited scheme of classification tree synthesis makes it possible to build models almost 20 % faster. The constructed logical classification tree accurately classifies (recognizes) the entire training sample on which the model is based, has a minimal structure (structural complexity), and consists of components, namely sets of elementary features, as construction vertices (tree attributes). Based on the proposed modification of the elementary feature selection method, software has been developed that supports a range of different types of applied problems. An approach to synthesizing new recognition models based on a limited logic tree scheme and the selection of pre-pruning parameters is proposed.
In other words, an effective scheme for recognizing discrete objects has been developed that relies on step-by-step evaluation and selection of sets of attributes (generalized features) along selected paths in the classification tree structure at each stage of scheme synthesis.
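The idea of extending only the paths with the most classification errors, under a fixed stopping criterion, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`leaf_errors`, `best_split`, `grow_limited_tree`, `predict`), the use of threshold predicates `x[j] <= t` as elementary features, the majority-vote error count, and the `max_errors` / `max_splits` stopping parameters are all assumptions chosen for illustration.

```python
from collections import Counter


def leaf_errors(labels):
    """Samples a majority-vote leaf would misclassify (assumed error measure)."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_split(X, y, idx):
    """Choose the threshold predicate x[j] <= t whose two children have the fewest errors."""
    best = None
    for j in range(len(X[0])):
        # Skip the largest value so that the right child is never empty.
        for t in sorted({X[i][j] for i in idx})[:-1]:
            left = [i for i in idx if X[i][j] <= t]
            right = [i for i in idx if X[i][j] > t]
            err = (leaf_errors([y[i] for i in left])
                   + leaf_errors([y[i] for i in right]))
            if best is None or err < best[0]:
                best = (err, j, t, left, right)
    return best


def grow_limited_tree(X, y, max_errors=0, max_splits=10):
    """Grow a tree by always extending the leaf with the most errors."""
    def errors_of(leaf):
        return leaf_errors([y[i] for i in leaf["idx"]])

    root = {"idx": list(range(len(X)))}
    leaves = [root]
    splits = 0
    while splits < max_splits:
        # Fixed stopping criterion: total training error is small enough.
        if sum(errors_of(leaf) for leaf in leaves) <= max_errors:
            break
        open_leaves = [leaf for leaf in leaves
                       if not leaf.get("closed") and errors_of(leaf) > 0]
        if not open_leaves:
            break
        worst = max(open_leaves, key=errors_of)
        split = best_split(X, y, worst["idx"])
        if split is None:
            worst["closed"] = True  # identical samples, cannot be separated
            continue
        _, j, t, left, right = split
        worst.update(feature=j, threshold=t,
                     left={"idx": left}, right={"idx": right})
        leaves = [leaf for leaf in leaves if leaf is not worst]
        leaves += [worst["left"], worst["right"]]
        splits += 1
    for leaf in leaves:
        leaf["label"] = Counter(y[i] for i in leaf["idx"]).most_common(1)[0][0]
    return root


def predict(node, x):
    """Follow threshold predicates from the root down to a labelled leaf."""
    while "feature" in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


# Toy data: two real-valued features, XOR-like labels.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 1, 1, 0]
tree = grow_limited_tree(X, y, max_errors=0, max_splits=10)
print([predict(tree, x) for x in X])
```

In this sketch, lowering `max_splits` or raising `max_errors` trades accuracy on the training sample for a smaller tree, which is one way to read the regulation of complexity versus accuracy described above.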