FEATURE ENGINEERING FOR THE IMPLEMENTATION OF MACHINE LEARNING IN CLINICAL DATA PROCESSING

2024: 162-171
1, 2 Kharkiv National University of Radio Electronics

This paper studies feature engineering for applying machine learning (ML) to clinical data processing, focusing on binary classification of time series data. It demonstrates that the Haar transform can raise feature importance and improve classification performance: by increasing the weight of significant features, the transform improves predictive accuracy, which is especially important when handling complex clinical data. The area under the receiver operating characteristic curve (AUC-ROC) rises from 0.44 for the baseline model to 0.82 for the Haar transform model. The methodology comprises data preprocessing, model training with the XGBoost algorithm, and performance evaluation using the AUC-ROC metric. Data preprocessing involves cleaning and normalizing the data, both critical steps for high-quality machine learning outcomes. Special attention is given to Internet of Things (IoT) data in clinical settings, which opens new possibilities for predictive analytics and decision-making in healthcare. The approaches described can be used to analyze large volumes of information collected from medical devices connected to an IoT network. This allows more accurate predictions and better-informed decisions based on real data, contributing to improved medical services and patient care quality. The results underscore the potential of machine learning methods in healthcare institutions. Future research may explore additional feature engineering methods and more advanced machine learning algorithms to further increase the utility of clinical IoT data analytics; in particular, deep learning and neural networks may open new horizons for clinical data analysis and processing.
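
As a minimal sketch of the two computational ideas named in the abstract, the Python below computes Haar transform coefficients of a time series for use as features and evaluates a set of scores with AUC-ROC. It is an illustration under stated assumptions, not the paper's implementation: the function names `haar_step`, `haar_features`, and `auc_roc` are hypothetical, the orthonormal scaling and the default number of decomposition levels are assumed, and the XGBoost training stage is omitted.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input. The input length must be even.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(signal, levels=2):
    """Multi-level Haar decomposition flattened into one feature vector.

    Layout: final approximation first, then detail coefficients from the
    coarsest level to the finest. The signal length must be divisible by
    2**levels. The default of two levels is an assumption for illustration.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details = detail + details  # coarser levels go in front
    return approx + details


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In a pipeline like the one described, each time series would be mapped through `haar_features` before being passed to the classifier, and `auc_roc` would be applied to the classifier's predicted scores on held-out data. A constant signal yields zero detail coefficients, so the detail part of the vector responds only to changes in the series.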
