Advanced text-based transformer architecture for malicious social bots detection

2025; pp. 972–981
Received: March 28, 2025
Revised: September 28, 2025
Accepted: September 29, 2025

Ellaky Z., Benabbou F.  Advanced text-based transformer architecture for malicious social bots detection.  Mathematical Modeling and Computing. Vol. 12, No. 3, pp. 972–981 (2025)

1, 2 Laboratory of Information Technology and Modeling, Hassan II University of Casablanca, Faculty of Sciences Ben M'Sick

The growing prevalence of automated social media accounts, or Social Media Bots (SMBs), makes it harder to maintain authentic online discourse and to prevent disinformation campaigns on social platforms. This research introduces a novel multiclass classification framework for detecting and categorizing SMBs that relies on fine-tuned transformer-based models. We compare several transformer variants (BERT, DistilBERT, RoBERTa, DeBERTa, XLNet, and ALBERT) on their ability to recognize different types of social bots, namely spambots, politically motivated SMBs, Sybil-type accounts, and fraudulent or fake accounts, alongside legitimate human users. The empirical findings indicate that the proposed methodology substantially outperforms traditional machine learning and deep learning approaches. Notably, the DistilBERT architecture achieved 96.83% accuracy and 96.85% precision in social bot classification.
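
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and precision can be computed in a five-class setting such as the one described above. The class names, the toy labels, and the choice of macro averaging for precision are assumptions made for illustration; the abstract does not state which averaging scheme was used, and this is not the authors' evaluation code.

```python
from collections import Counter

# Hypothetical class names, loosely following the account types in the abstract.
CLASSES = ["human", "spambot", "political", "sybil", "fake"]


def accuracy(y_true, y_pred):
    """Fraction of accounts whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class precision (assumed averaging scheme).

    A class that is never predicted contributes a precision of 0.0.
    """
    predicted = Counter(y_pred)
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0
        for c in classes
    ]
    return sum(per_class) / len(classes)


# Toy labels, invented purely to exercise the functions.
y_true = ["human", "spambot", "political", "sybil", "fake", "human"]
y_pred = ["human", "spambot", "political", "fake", "fake", "spambot"]

acc = accuracy(y_true, y_pred)
prec = macro_precision(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_precision={prec:.4f}")
```

Macro averaging weights every class equally, which matters when some bot types are rare; a weighted average would instead follow the class distribution of the data.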
