Information System of Semi-supervised Learning for Analysis of Data Samples With High Dimensions

2024; pp. 133–144
1 Kharkiv National University of Radio Electronics, Department of Systems Engineering
2 Kharkiv National University of Radio Electronics, Department of Systems Engineering
3 Kharkiv National University of Radio Electronics, Department of Systems Engineering

The study of large datasets to uncover hidden patterns and trends has become increasingly important and valuable in recent years. These large datasets are characterized by wide availability, structural complexity, and significant volume of information.

This article presents a detailed description of a semi-supervised learning information system for analyzing high-dimensional data samples. The system is designed to process large datasets using semi-supervised learning methods for effective analysis and classification. To this end, existing information systems capable of working with high-dimensional data samples were analyzed, along with methods for efficient analysis and classification of such samples.

The article provides a detailed description of the system architecture, including data processing methods, feature selection, preprocessing modules, and training optimization methods.
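
The abstract does not name specific algorithms, so the following is only a minimal illustrative sketch of how feature selection, preprocessing, and semi-supervised training might fit together. It assumes variance-based feature selection, standardization, and self-training with a nearest-centroid classifier; these techniques, all function names, the thresholds, and the synthetic data are assumptions made for illustration and are not taken from the described system.

```python
import random
from statistics import mean, pvariance


def select_top_variance(rows, k):
    """Feature selection: indices of the k features with the largest variance."""
    variances = [pvariance(col) for col in zip(*rows)]
    ranked = sorted(range(len(variances)), key=variances.__getitem__, reverse=True)
    return sorted(ranked[:k])


def standardize(rows):
    """Preprocessing: scale each feature to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pvariance(c) ** 0.5 or 1.0 for c in cols]  # guard constant features
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def centroids(rows, labels):
    """Mean vector per class; rows whose label is None are ignored."""
    out = {}
    for cls in {y for y in labels if y is not None}:
        members = [r for r, y in zip(rows, labels) if y == cls]
        out[cls] = [mean(c) for c in zip(*members)]
    return out


def predict_with_margin(row, cents):
    """Nearest class plus the gap (in squared distance) to the runner-up."""
    ranked = sorted(
        (sum((x - c) ** 2 for x, c in zip(row, cent)), cls)
        for cls, cent in cents.items()
    )
    return ranked[0][1], ranked[1][0] - ranked[0][0]


def self_train(rows, labels, min_margin=1.0, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled rows."""
    labels = list(labels)
    for _ in range(max_rounds):
        cents = centroids(rows, labels)
        added = 0
        for i, row in enumerate(rows):
            if labels[i] is None:
                cls, margin = predict_with_margin(row, cents)
                if margin >= min_margin:
                    labels[i] = cls
                    added += 1
        if added == 0:
            break
    # Any row that never reached the margin gets the nearest final centroid.
    cents = centroids(rows, labels)
    return [
        y if y is not None else predict_with_margin(r, cents)[0]
        for r, y in zip(rows, labels)
    ]


def make_data(n=200, dims=50, informative=5, seed=0):
    """Synthetic two-class sample: only the first `informative` features differ."""
    rng = random.Random(seed)
    rows, truth = [], []
    for i in range(n):
        cls = i % 2
        shift = 2.0 if cls else -2.0
        rows.append(
            [rng.gauss(shift if j < informative else 0.0, 1.0) for j in range(dims)]
        )
        truth.append(cls)
    return rows, truth


rows, truth = make_data()
keep = select_top_variance(rows, k=5)
reduced = standardize([[r[j] for j in keep] for r in rows])
partial = [y if i < 10 else None for i, y in enumerate(truth)]  # 10 labeled rows
completed = self_train(reduced, partial)
hits = sum(p == t for p, t in zip(completed[10:], truth[10:]))
accuracy = hits / len(truth[10:])
print(f"selected features: {keep}; accuracy on unlabeled rows: {accuracy:.2f}")
```

In this sketch only 10 of the 200 rows carry labels; the remaining rows are labeled in rounds, and a row is accepted only when its nearest centroid is clearly closer than the runner-up. A production system would likely substitute stronger classifiers and selection criteria at each stage.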
