The study of extreme values is crucial in hydrological datasets, particularly rainfall. Appropriate analysis of such data provides vital information about the return levels of extreme rainfall, which can play a significant role in disaster prevention. The Generalized Pareto Distribution (GPD) is a well-established model for extreme data; nonetheless, concerns remain about how its threshold is selected. The commonly used Mean Residual Life (MRL) plot technique is subjective and requires extensive prior knowledge, which limits its reproducibility. This paper introduces a straightforward, computationally inexpensive, and automated procedure for threshold selection. Using interval-based candidate thresholds and goodness-of-fit (GOF) tests, the proposed method selects as optimal the threshold that maximizes the GOF p-value, which improves objectivity and accuracy. Several combinations of estimation methods and GOF tests were investigated, with the combination of the Cramér-von Mises (CVM) test and L-moment estimation emerging as the most robust. In extensive simulation studies, the approach substantially reduced bias and root mean square error (RMSE) compared to traditional methods. Applying the methodology to a rainfall dataset from South-West England confirmed its robustness and practical utility, making it a valuable tool for extreme value modeling and disaster management.
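The selection rule described above (fit a GPD to the excesses over each candidate threshold, run a GOF test, keep the threshold with the largest p-value) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `gpd_lmom_fit` and `select_threshold`, the quantile-based candidate grid, and the minimum exceedance count are all assumptions made for illustration. The p-value comes from SciPy's `cramervonmises`, which treats the fitted parameters as known; a faithful procedure would likely need to adjust for parameter estimation, for example by bootstrap.

```python
import numpy as np
from scipy import stats


def gpd_lmom_fit(excesses):
    """L-moment estimates (shape xi, scale sigma) of a GPD with location 0.

    Uses lambda1 = sigma / (1 - xi) and lambda2 = sigma / ((1 - xi)(2 - xi)),
    in the same shape convention as scipy.stats.genpareto (c = xi).
    """
    x = np.sort(np.asarray(excesses, dtype=float))
    n = x.size
    b0 = x.mean()
    # Unbiased estimator of the first probability-weighted moment.
    b1 = np.sum(np.arange(n) * x) / (n * (n - 1))
    l1, l2 = b0, 2.0 * b1 - b0
    xi = 2.0 - l1 / l2
    sigma = (1.0 - xi) * l1
    return xi, sigma


def select_threshold(data, quantiles=None, min_exceed=30):
    """Return (threshold, p_value, xi, sigma) maximizing the CVM p-value.

    The candidate grid (equally spaced quantiles) and min_exceed are
    illustrative assumptions, not values taken from the paper.
    """
    data = np.asarray(data, dtype=float)
    if quantiles is None:
        quantiles = np.linspace(0.70, 0.98, 15)
    best = None
    for u in np.quantile(data, quantiles):
        exc = data[data > u] - u
        if exc.size < min_exceed:
            continue  # too few exceedances for a stable fit
        xi, sigma = gpd_lmom_fit(exc)
        if sigma <= 0:
            continue  # invalid scale estimate, skip this candidate
        fitted_cdf = stats.genpareto(c=xi, scale=sigma).cdf
        # Note: this p-value ignores that xi and sigma were estimated.
        p_value = stats.cramervonmises(exc, fitted_cdf).pvalue
        if best is None or p_value > best[1]:
            best = (float(u), float(p_value), float(xi), float(sigma))
    return best


if __name__ == "__main__":
    # Synthetic stand-in for a rainfall series (not real data).
    rng = np.random.default_rng(42)
    sample = stats.gamma.rvs(a=2.0, scale=5.0, size=3000, random_state=rng)
    print(select_threshold(sample))
```

Swapping in another estimator (for example maximum likelihood via `scipy.stats.genpareto.fit`) or another GOF test only changes the two lines that compute the fit and the p-value, which is how different estimator and test combinations could be compared.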