This article studies salary prediction under distributional drift using explainable boosting models and hybrid forecasting. We integrate unseen-aware feature engineering, robust objectives, SHAP-based interpretability, drift detection, and time-series forecasting (Prophet/SARIMAX) on multi-year data (2020–2024), and report a comprehensive evaluation aligned with typical MMC guidelines. Modern salary data are heterogeneous, heavy-tailed, and non-stationary. We therefore combine robust tree-based learners with drift monitoring and explainable forecasting, prioritizing stable absolute error, transparency, and maintainability over raw variance capture. Our best integrated pipeline reaches $R^2=0.31$ on a 2024 hold-out set while keeping MAE and RMSE stable across folds, and it uncovers year-to-year drift that necessitates periodic retraining. Monthly and quarterly forecasts indicate a sustained upward trend with seasonality: SARIMAX captures short-term fluctuations, while Prophet yields interpretable trend decompositions.
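To make the year-to-year drift check concrete, the following minimal sketch compares salary samples from two years using a two-sample Kolmogorov-Smirnov statistic. The abstract does not state which drift detector the pipeline uses, so the choice of statistic, the function names `ks_statistic` and `needs_retraining`, the 0.2 threshold, and the toy salary values are all assumptions made for illustration only.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples (0.0 = identical distributions, 1.0 = no overlap).
    """
    ref, cur = sorted(reference), sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs are step functions, so the maximum gap is
    # attained at one of the pooled observations.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def needs_retraining(reference, current, threshold=0.2):
    """Flag drift when the KS gap exceeds a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold


# Synthetic toy values (not taken from the study's data).
salaries_2023 = [40_000, 52_000, 61_000, 75_000, 90_000, 120_000]
salaries_2024 = [48_000, 60_000, 72_000, 88_000, 105_000, 140_000]

if needs_retraining(salaries_2023, salaries_2024):
    print("Drift detected: schedule retraining")
else:
    print("No significant drift under the chosen threshold")
```

In practice the threshold would be calibrated on historical data or replaced by a significance test, and a per-feature check would complement this check on the target variable.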
- Box G. E. P., Jenkins G. M. Time Series Analysis: Forecasting and Control. Holden-Day (1976).
- Chen Q., Ge J., Xie H., Xu X., Yang Y. Large language models at work in China's labor market. China Economic Review. 92, 102413 (2025).
- Gama J., Žliobaitė I., Bifet A., Pechenizkiy M., Bouchachia A. A survey on concept drift adaptation. ACM Computing Surveys. 46 (4), 1–37 (2014).
- Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., Liu T.-Y. LightGBM: A highly efficient gradient boosting decision tree. 31st Conference on Neural Information Processing Systems (NIPS 2017). 1–9 (2017).
- Hinder F., Vaquet V., Hammer B. One or two things we know about concept drift – a survey on monitoring in evolving environments. Part A: detecting concept drift. Frontiers in Artificial Intelligence. 7, 1330257 (2024).
- Lundberg S. M., Lee S.-I. A unified approach to interpreting model predictions. NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems. 4768–4777 (2017).
- Kim K. Unemployment Dynamics Forecasting with Machine Learning Regression Models. Preprint arXiv:2505.01933 (2025).
- Prokhorenkova L., Gusev G., Vorobev A., Dorogush A. V., Gulin A. CatBoost: unbiased boosting with categorical features. Proceedings of the 32nd International Conference on Neural Information Processing Systems. 6639–6649 (2018).
- Taylor S. J., Letham B. Forecasting at scale. The American Statistician. 72 (1), 37–45 (2018).
- Wang C., Shakhovska N., Sachenko A., Komar M. A new approach for missing data imputation in big data interface. Information Technology and Control. 49 (4), 541–555 (2020).
- Acharya D. B., Divya B., Kuppan K. Explainable and Fair AI: Balancing Performance in Financial and Real Estate Machine Learning Models. IEEE Access. 12, 154022–154034 (2024).