Predicting students' academic performance and modeling using data mining techniques

2024; Vol. 11, No. 3: pp. 814–825
https://doi.org/10.23939/mmc2024.03.814
Received: January 07, 2024
Revised: August 13, 2024
Accepted: August 15, 2024

Jedidi Y., Ibriz A., Benslimane M., Hachmoud A., Tmimi M., Hajjioui Y., Rahhali M.  Predicting students' academic performance and modeling using data mining techniques.  Mathematical Modeling and Computing. Vol. 11, No. 3, pp. 814–825 (2024)

1–6
Innovative Technologies Laboratory, EST, Sidi Mohamed Ben Abdellah University, Fez
7
ENSA, Sidi Mohamed Ben Abdellah University, Fez

In educational institutions and universities, e-learning can help address the problem of study interruptions, and the field has therefore attracted considerable attention in recent years.  Universities now frequently use data mining techniques to analyze the available data and extract knowledge that supports decision making.  In this study, we applied four machine learning methods to predict students' academic progress: logistic regression, decision trees, random forests, and Naive Bayes.  All four methods used student data from the Open University Learning Analytics Dataset (OULAD), which contains a subset of the Open University (OU) student data, including the students' demographics and their interactions with the virtual learning environment (VLE).  The prediction performance of the four classifiers was measured and compared using both a percentage split and 10-fold cross-validation.  With the percentage split, the Naive Bayes classifier outperformed the other classifiers, reaching an overall prediction accuracy of 93%.  This study aims to assist teachers in enhancing students' academic performance.
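The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It compares the four classifier families under a percentage split and under 10-fold cross-validation using scikit-learn. Everything specific here is an assumption made for illustration: the data is a synthetic stand-in generated with `make_classification` rather than OULAD, the 70/30 split ratio, the Gaussian variant of Naive Bayes, and all hyperparameters are not taken from the paper, and the accuracies it prints say nothing about the reported 93%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for student features and pass/fail labels.
# Assumption: this is NOT the OULAD data; it only makes the sketch runnable.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# The four classifier families named in the abstract.
# Hyperparameters are illustrative defaults, not the authors' settings.
classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "naive_bayes": GaussianNB(),
}

# Percentage split: hold out 30% of the rows for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

results = {}
for name, clf in classifiers.items():
    # Accuracy on the held-out part of the percentage split.
    split_acc = clf.fit(X_train, y_train).score(X_test, y_test)
    # Mean accuracy over 10 folds; cross_val_score refits a clone per fold.
    cv_acc = cross_val_score(clf, X, y, cv=10, scoring="accuracy").mean()
    results[name] = {"percentage_split": split_acc, "cv10": cv_acc}
    print(f"{name}: split={split_acc:.3f}, 10-fold CV={cv_acc:.3f}")
```

Reporting both numbers side by side is useful because a single percentage split depends on which rows happen to land in the test set, while the cross-validated mean averages over ten different held-out folds.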
