XIDINTV: XGBoost-based intrusion detection of imbalance network traffic via variational auto-encoder

pp. 930–945
Received: January 08, 2024
Revised: August 15, 2024
Accepted: August 17, 2024

Abdulganiyu O. H., Ait Tchakoucht T., Ezziyyani M., Benslimane M. XIDINTV: XGBoost-based intrusion detection of imbalance network traffic via variational auto-encoder. Mathematical Modeling and Computing. Vol. 11, No. 4, pp. 930–945 (2024)

Euromed University of Fes, UEMF, Morocco
Euromed University of Fes, UEMF, Morocco
Mathematical Laboratory and Applications, Abdelmalek Essaadi University Faculty of Science and Technology, Tangier, Morocco
Laboratory of Sciences, Engineering and Management, Sidi Mohamed Ben Abdellah University, Morocco

In networks with imbalanced traffic, detecting malicious cyber-attacks is a significant challenge because attack traffic blends into the much larger volume of regular data. This makes accurate and timely identification difficult for Network Intrusion Detection Systems (NIDS). The imbalance between normal and attack data, coupled with the diversity among attack categories, further complicates intrusion detection. This research proposes a novel approach, XIDINTV, that combines Extreme Gradient Boosting (XGBoost) with a variational autoencoder (VAE). The methodology rectifies class imbalance by generating diverse rare-class attack data that remain similar to the original samples. This enhances the classifier's ability to discern differences during training, improving classification performance. Evaluations on the NSL-KDD and CSE-CIC-IDS2018 datasets demonstrate the effectiveness of XIDINTV, particularly when compared with the SMOTE sampling technique and traditional classification models, with Extreme Gradient Boosting excelling at detecting rare instances of attack traffic.
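
For orientation, the SMOTE baseline named in the abstract can be sketched in a few lines. This is a minimal, standard-library illustration of SMOTE-style interpolation between rare-class neighbours. It is not the authors' VAE-based generator, which instead learns to produce new rare-class samples. The function name `smote_like_oversample`, the toy feature vectors, and the class counts are all hypothetical.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 3 rare-class "attack" records versus 12 normal ones.
attacks = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.3]]
normal_count = 12
new_attacks = smote_like_oversample(attacks, normal_count - len(attacks))
balanced_attacks = attacks + new_attacks
```

Every synthetic point here lies on a line segment between two existing rare-class samples, so the added data cannot leave the region those samples already span. A learned generative model is not bound to such segments, which is one plausible reading of the "diverse" rare-class data the abstract describes.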
