Robust bootstrap regression testing in the presence of outliers

Mathematical Modeling and Computing, Vol. 9, No. 1, pp. 26–35 (2022)
https://doi.org/10.23939/mmc2022.01.026
Received: July 07, 2021
Accepted: November 16, 2021

1 Department of Statistics, University of Al-Qadisiyah
2 Department of Statistics, University of Al-Qadisiyah

Bootstrap is a random resampling-with-replacement method that was proposed to address the problem of small samples whose distributions are difficult to derive.  The distribution of bootstrap samples is empirical (distribution-free), and because the resampling is done with replacement, the proportion of a single observation in a bootstrap sample may reach one.  Unfortunately, when the original sample contains outliers, a serious problem arises: the OLS (Ordinary Least Squares) estimator breaks down, and robust regression methods are recommended instead.  It is well known that the highest breakdown point a robust regression method can attain is 0.50, so even a robust estimator breaks down when the percentage of outliers in a bootstrap sample exceeds 0.50.  Moreover, the fixed-x bootstrap resamples the residuals, which may themselves contain outliers, and leverage points (outliers in the x-direction) also carry over into the fixed-x bootstrap samples.  Consequently, decisions about the null hypotheses of the bootstrap regression coefficients may not be reliable.  In this paper, we propose a weighted fixed-x bootstrap with a probability approach that keeps the percentage of outliers in the bootstrap samples very low; a weighted M-estimator is then used to tackle the remaining outliers and leverage points and to reach more reliable decisions about the hypothesis tests for the bootstrap regression coefficients.  The performance of the suggested method is compared with existing methods using real data and simulation.  The results show that our proposed method is more efficient and reliable than the others.
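
The abstract describes the procedure only at a high level; the following Python sketch illustrates one plausible reading of it and is not the authors' implementation.  The Huber weight function, the tuning constant c = 1.345, the hat-matrix-based leverage rule, and the use of robust residual weights as resampling probabilities are all assumptions introduced here for illustration.

import numpy as np

def huber_weights(r, c=1.345):
    # Huber weights: 1 for |r| <= c, c/|r| otherwise (r are standardized residuals).
    absr = np.abs(r)
    w = np.ones_like(absr, dtype=float)
    big = absr > c
    w[big] = c / absr[big]
    return w

def mad_scale(res):
    # Robust residual scale via the median absolute deviation.
    return 1.4826 * np.median(np.abs(res - np.median(res))) + 1e-12

def weighted_m_fit(X, y, lev_w, c=1.345, tol=1e-8, max_iter=100):
    # Weighted M-estimate by IRLS: Huber residual weights times fixed leverage weights.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        res = y - X @ beta
        w = huber_weights(res / mad_scale(res), c) * lev_w
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def weighted_fixed_x_bootstrap(X, y, B=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Leverage weights from the hat-matrix diagonal (assumed rule: cap at twice the mean leverage).
    h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))
    lev_w = np.minimum(1.0, 2.0 * h.mean() / h)
    # Initial weighted M-fit on the full data and its residuals.
    beta0 = weighted_m_fit(X, y, lev_w)
    res = y - X @ beta0
    # Resampling probabilities proportional to robust residual weights,
    # so grossly outlying residuals enter a bootstrap sample with low probability.
    prob = huber_weights(res / mad_scale(res))
    prob = prob / prob.sum()
    betas = np.empty((B, p))
    for b in range(B):
        e_star = rng.choice(res, size=n, replace=True, p=prob)
        y_star = X @ beta0 + e_star  # fixed-x: the design matrix is held fixed
        betas[b] = weighted_m_fit(X, y_star, lev_w)
    return beta0, betas

Given a design matrix X (including an intercept column) and a response y, calling beta0, betas = weighted_fixed_x_bootstrap(X, y) returns the original fit and the bootstrap replicates; a 95% percentile interval, np.percentile(betas, [2.5, 97.5], axis=0), that excludes zero for a coefficient would lead to rejecting the corresponding null hypothesis.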
