Solutions and approaches analysis for geospatial data clustering to optimize performance and user experience of web maps

Ukrainian Journal of Information Technology, 2023; 5(2): 88-96
https://doi.org/10.23939/ujit2023.02.088
Received: October 20, 2023
Accepted: October 26, 2023

Citation DSTU: Arzubov M. V., Batiuk A. Y. Solutions and approaches analysis for geospatial data clustering to optimize performance and user experience of web maps. Ukrainian Journal of Information Technology. 2023. Vol. 5, No. 2. P. 88–96.
Citation APA: Arzubov, M. V., & Batiuk, A. Y. (2023). Solutions and approaches analysis for geospatial data clustering to optimize performance and user experience of web maps. Ukrainian Journal of Information Technology, 5(2), 88–96. https://doi.org/10.23939/ujit2023.02.088

1 Lviv Polytechnic National University, Lviv, Ukraine
2 Lviv Polytechnic National University, Lviv, Ukraine

Managing and visualizing geospatial information in web browsers has become increasingly important. Web maps are indispensable tools in sectors such as tourism, goods delivery, and ecology. Broad browser support across devices also makes geospatial data on the web accessible to a wide range of users. However, the steady growth in the volume of geospatial information makes it harder to display data efficiently and to navigate it on web maps. Clustering of geospatial data is therefore crucial for handling such volumes. The choice of clustering method can affect both the performance and the visual clarity of a web map.

To improve user experience and make better use of computing resources, geodata clustering becomes a necessary tool for handling large numbers of markers on a map. Despite significant progress in geodata clustering solutions for web maps, developers and users may still encounter challenges. This article describes challenges related to scaling, dynamic cluster data, and data heterogeneity. These open problems require further research and development. Understanding them will help developers and researchers improve existing solutions and create new methods and approaches for efficient geodata clustering in web maps. The problem is urgent because effective clustering solutions are needed to provide convenient interactivity and fast geodata processing in web maps.

This study provides a comprehensive review of geodata types and clustering methods. Tools and libraries for geodata clustering in web maps are analyzed, and different types of geodata and approaches to working with them are examined. The concept of semi-static data and its position relative to static and dynamic data types is explained.

The analysis identifies the scenarios best suited to specific clustering methods or to server-side clustering approaches. For large volumes of static or semi-static geospatial data, the preferred approach is server-side clustering with caching.
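As an illustration of server-side clustering with caching, the following is a minimal sketch and not the method evaluated in the article. It assumes a simple grid-based clustering in which the cell size halves with each zoom level, and an in-memory cache keyed by a data version and the zoom level. The names `grid_cluster`, `get_clusters`, `data_version`, and `cells_per_axis_at_zoom0` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical in-memory cache: (data_version, zoom) -> list of clusters.
# A real deployment would more likely use an external cache; this is an assumption.
_cluster_cache = {}


def grid_cluster(points, zoom, cells_per_axis_at_zoom0=4):
    """Group (lon, lat) points into grid cells; cells shrink by half per zoom level."""
    cells_per_axis = cells_per_axis_at_zoom0 * (2 ** zoom)
    cell_w = 360.0 / cells_per_axis  # cell width in degrees of longitude
    cell_h = 180.0 / cells_per_axis  # cell height in degrees of latitude

    buckets = defaultdict(list)
    for lon, lat in points:
        # Clamp so that lon == 180 or lat == 90 falls into the last cell.
        col = min(int((lon + 180.0) / cell_w), cells_per_axis - 1)
        row = min(int((lat + 90.0) / cell_h), cells_per_axis - 1)
        buckets[(col, row)].append((lon, lat))

    clusters = []
    for members in buckets.values():
        n = len(members)
        clusters.append({
            "lon": sum(p[0] for p in members) / n,  # centroid longitude
            "lat": sum(p[1] for p in members) / n,  # centroid latitude
            "count": n,
        })
    return clusters


def get_clusters(points, zoom, data_version):
    """Return cached clusters for this zoom level, recomputing only on a cache miss."""
    key = (data_version, zoom)
    if key not in _cluster_cache:
        _cluster_cache[key] = grid_cluster(points, zoom)
    return _cluster_cache[key]
```

For static data, `data_version` never changes, so each zoom level is clustered once. For semi-static data, under the assumption that it changes only occasionally, incrementing `data_version` after an update causes the next request to recompute and cache fresh clusters.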

In conclusion, client-side and server-side clustering approaches for web maps have been examined. The advantages and disadvantages of each approach are outlined, along with recommendations on when to apply each. The lack of explicit approaches to clustering very large geospatial datasets for display on web maps underscores the relevance of and need for research in this direction.
