The evaluation of scientific publications is a cornerstone of scholarly research, providing essential insight into the impact, significance, and intellectual contribution of research outputs. Traditional bibliometric indicators, including the Impact Factor (IF), the h-index, and raw citation counts, have long been the dominant measures of research quality. However, with the rapid evolution of Artificial Intelligence (AI) and its growing integration into scientific disciplines, these conventional methodologies are being reevaluated in light of their inherent limitations and poor adaptability to the modern research landscape. This study presents an in-depth analysis of both citation-based and content-driven assessment methodologies, critically evaluating their efficacy in contemporary academic evaluation. We highlight the fundamental constraints of traditional bibliometric metrics, particularly their inability to capture the qualitative dimensions of scientific contributions. In response, we explore state-of-the-art AI-driven approaches that employ natural language processing (NLP) and machine learning (ML) techniques to improve the precision, contextual understanding, and multidimensional character of research-quality assessment. Building on these advances, we propose a novel AI-based assessment metric designed to address the deficiencies of conventional approaches and to offer a more comprehensive and nuanced evaluation framework. Empirical case studies validate the practical implementation of this metric and enable direct performance comparisons with established bibliometric techniques. We also examine the broader implications of AI-driven assessment, discussing ethical considerations, algorithmic bias, and the challenges of large-scale adoption. The study concludes by advocating the integration of AI-enhanced assessment tools into academic publishing and research evaluation, offering strategic recommendations for researchers, journal editors, funding agencies, and policymakers to promote a more rigorous, equitable, and transparent scientific assessment ecosystem.