This study explores an approach to improving the performance of a logistic regression (LR) model by integrating the Analytic Hierarchy Process (AHP) for weight initialization with regularization and an adapted gradient descent (GD) procedure. Traditional LR models rely on random weight initialization, which can lead to suboptimal performance. Employing AHP yields a hybrid model that uses the AHP priority vector as its initial weights, reflecting the relative importance of the input features. Previous works reported subpar performance for the AHP-LR hybrid model because the initialized weights were not subsequently optimized. In this study, the weights are optimized with L1 and L2 regularization that penalizes deviations from the AHP-initialized weights, using a modified log-likelihood function and a correspondingly modified GD optimization. The comparative analysis involves four models: LR with L2 regularization, LR using the AHP weights directly, AHP weights optimized with L1 regularization, and AHP weights optimized with L2 regularization. A prediction experiment on a synthetic dataset assesses the models' performance in terms of accuracy, recall, precision, F1-score, and ROC-AUC. The results indicate that optimizing the weights with L1 or L2 regularization significantly enhances model performance, whereas applying the AHP weights directly without optimization yields near-random predictions. Future work that incorporates true expert-derived weights and evaluates their impact on model performance, together with experiments on authentic datasets and different weight derivation methods, would offer valuable insights.
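The abstract does not state the exact form of the modified log-likelihood, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the penalty is added to the mean negative log-likelihood as (λ/2)·‖w − w₀‖² for L2 or λ·‖w − w₀‖₁ for L1, where w₀ is the AHP priority vector, that the priority vector is taken as the normalized principal eigenvector of the pairwise comparison matrix, and that an unpenalized bias term starts at zero. The function names, the pairwise matrix, and the hyperparameter values are all hypothetical.

```python
import numpy as np


def ahp_priority_vector(pairwise):
    """Normalized principal eigenvector of a pairwise comparison matrix."""
    eigvals, eigvecs = np.linalg.eig(pairwise)
    principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    # Dividing by the sum also fixes the arbitrary sign of the eigenvector.
    return principal / principal.sum()


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def penalized_nll(X, y, w, b, w0, lam, penalty="l2"):
    """Mean negative log-likelihood plus a penalty on deviation from w0."""
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    nll = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    d = w - w0
    reg = 0.5 * lam * (d @ d) if penalty == "l2" else lam * np.abs(d).sum()
    return nll + reg


def fit_anchored_lr(X, y, w0, lam=0.1, penalty="l2", lr=0.1, n_iter=500):
    """Gradient descent started at w0 and penalized for moving away from it."""
    w = np.asarray(w0, dtype=float).copy()
    b = 0.0  # assumption: bias starts at zero and is not penalized
    n = len(y)
    for _ in range(n_iter):
        residual = sigmoid(X @ w + b) - y
        grad_w = X.T @ residual / n
        grad_b = residual.mean()
        if penalty == "l2":
            grad_w += lam * (w - w0)          # gradient of (lam/2)*||w - w0||^2
        else:
            grad_w += lam * np.sign(w - w0)   # subgradient of lam*||w - w0||_1
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical, perfectly consistent 3x3 pairwise comparison matrix.
A = np.array([[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]])
w0 = ahp_priority_vector(A)

# Synthetic data whose generating weights differ from the AHP weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (rng.random(400) < sigmoid(X @ np.array([2.0, -1.0, 0.5]))).astype(float)

w_l2, b_l2 = fit_anchored_lr(X, y, w0, lam=0.1, penalty="l2")
w_l1, b_l1 = fit_anchored_lr(X, y, w0, lam=0.1, penalty="l1")
```

Setting `n_iter=0` returns the AHP weights unchanged, which corresponds to the unoptimized "AHP weights as LR weights" baseline. Larger values of `lam` keep the fitted weights closer to the AHP priorities.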