The Expectation-Maximization (EM) algorithm is an efficient method for estimating the parameters of a mixture regression model when outliers are present in the Y-direction. However, it breaks down when the dataset contains leverage points. The most common remedy in the literature is to identify leverage points with single detection methods and then remove them. Some authors have pointed out that single detection methods can be inaccurate and have proposed multiple diagnostic approaches instead. This manuscript proposes the Weighted EM (WEM) method, which addresses the problem of leverage points without deleting data. The method builds on the DRGP (RMVN) framework, one of the multiple diagnostic methods. Real-data and simulation studies were conducted to compare the efficiency of the proposed method with that of existing approaches. The results show that WEM is more robust and reliable than the other methods considered, particularly when the sample size is small.
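To make the idea of downweighting rather than deleting leverage points concrete, the following is a minimal sketch of one way a weighted EM iteration for a mixture of linear regressions could look. It is not the manuscript's implementation. It assumes Gaussian error components and takes the case weights `w` as given; the DRGP (RMVN) step that would produce those weights is not reproduced here. The function name `weighted_em_mixreg` and the residual-quantile initialisation are hypothetical choices made for this illustration.

```python
import numpy as np

def weighted_em_mixreg(X, y, w, k=2, n_iter=200, tol=1e-8):
    """Sketch of a weighted EM for a k-component mixture of linear regressions.

    X : (n, p) design matrix (include a column of ones for an intercept).
    y : (n,) response.
    w : (n,) case weights in (0, 1]; assumed to be supplied by some
        leverage diagnostic that is NOT implemented in this sketch.
    """
    n, p = X.shape
    # Initialisation (an assumption of this sketch): fit one pooled weighted
    # regression and split the cases into k groups by residual quantiles.
    XtW = X.T * w
    res = y - X @ np.linalg.solve(XtW @ X, XtW @ y)
    cuts = np.quantile(res, np.linspace(0, 1, k + 1)[1:-1])
    r = np.eye(k)[np.searchsorted(cuts, res)]  # hard 0/1 responsibilities

    beta = np.zeros((k, p))
    sigma2 = np.ones(k)
    pi = np.full(k, 1.0 / k)
    prev = -np.inf
    for _ in range(n_iter):
        # M-step: weighted least squares per component, weights w_i * r_ij.
        for j in range(k):
            wj = w * r[:, j]
            XtW = X.T * wj
            beta[j] = np.linalg.solve(XtW @ X, XtW @ y)
            resid_j = y - X @ beta[j]
            sigma2[j] = max((wj @ resid_j**2) / wj.sum(), 1e-8)
            pi[j] = wj.sum() / w.sum()
        # E-step: posterior component probabilities under Gaussian errors,
        # computed on the log scale for numerical stability.
        resid = y[:, None] - X @ beta.T
        logdens = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2)
                   - 0.5 * resid**2 / sigma2)
        m = logdens.max(axis=1, keepdims=True)
        logmix = m[:, 0] + np.log(np.exp(logdens - m).sum(axis=1))
        r = np.exp(logdens - logmix[:, None])
        # Stop when the weighted log-likelihood no longer changes.
        ll = w @ logmix
        if abs(ll - prev) < tol:
            break
        prev = ll
    return beta, sigma2, pi, r
```

In this sketch a case with a weight near zero contributes almost nothing to the coefficient, variance, and mixing-proportion updates, yet it stays in the dataset and still receives posterior component probabilities. Setting all weights to one gives an ordinary EM fit under the same Gaussian assumption.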
- Quandt R. E. A new approach to estimating switching regressions. Journal of the American Statistical Association. 67 (338), 306–310 (1972).
- Quandt R. E., Ramsey J. B. Estimating mixtures of normal distributions and switching regressions. Journal of the American Statistical Association. 73 (364), 730–738 (1978).
- Dempster A. P., Laird N. M., Rubin D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological). 39 (1), 1–22 (1977).
- Markatou M. Mixture models, robustness, and the weighted likelihood methodology. Biometrics. 56 (2), 483–486 (2000).
- Shen R., Ghosh D., Chinnaiyan A. M. Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data. BMC Genomics. 5, 94 (2004).
- Peel D., McLachlan G. J. Robust mixture modelling using the t distribution. Statistics and Computing. 10, 339–348 (2000).
- Song W., Yao W., Xing Y. Robust mixture regression model fitting by Laplace distribution. Computational Statistics & Data Analysis. 71, 128–137 (2014).
- Yao W., Wei Y., Yu C. Robust mixture regression using the t-distribution. Computational Statistics & Data Analysis. 71, (2014).
- Rousseeuw P. J., Van Zomeren B. C. Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association. 85 (411), 633–639 (1990).
- Imon A. H. M. R. Sub-sample Methods in Regression Residual Prediction and Diagnostics. University of Birmingham (1996).
- Uraibi H. S., Haraj S. A. A. Group diagnostic measures of different types of outliers in multiple linear regression model. Malaysian Journal of Science. 41 (sp1), 23–33 (2022).