OPTIMIZATION OF OBJECT DETECTION IN CLOSED SPACE USING MOBILE ROBOTIC SYSTEMS WITH OBSTACLE AVOIDANCE

2024: 41–48
https://doi.org/10.23939/ujit2024.02.041
Received: September 27, 2024
Accepted: November 19, 2024
Lviv Polytechnic National University, Lviv, Ukraine

The paper introduces a modification of the neural network training process that combines several datasets to optimize the detection of objects and obstacles by mobile robotic systems in a closed space. The study includes an analysis of published papers and existing approaches to the problem of object boundary detection and identifies the key features of several neural network architectures. The research revealed that there is insufficient data on the effectiveness of obstacle detection approaches used by mobile robotic systems in a closed space. The presented method combines a deep neural network-based approach to object boundary recognition with visual data about obstacles in a closed space. The object detection network is based on the DeepLab architecture and is trained on the NYU Depth Dataset V2 and ADE20K datasets, which provide extensive data for varying scene types and object categories along with corresponding annotations. The former consists exclusively of indoor images with precise masks even for small objects, while the latter contains both indoor and outdoor images. The paper reports the results of several experiments conducted to estimate the performance and feasibility of using the developed system for various tasks with the UNet and DeepLab architectures on different datasets. The experiments showed that the system built on the DeepLab architecture and trained on the combination of the ADE20K scene parsing dataset and the NYU Depth indoor dataset reached an accuracy of 86.9 % in multi-object segmentation. Visual results are presented to demonstrate obstacle detection for various objects in a closed space and to compare the object detection accuracy of several approaches in different situations, such as an empty apartment with random obstacles on the way or a crowded study space at a university. The results show an excellent distinction between room surfaces such as ceiling and floor, walls and doors, as well as detection of people and pieces of furniture. The benefit of this study is that the proposed modification of the neural network training process facilitates precise obstacle detection in a closed space by mobile robotic systems, provides a more optimal solution for the navigation component of such systems by gathering information about precise object outlines and constructing a more optimal path to the destination, and is an effective approach to solving the assigned tasks in various environments.
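To make the cross-dataset training idea concrete, the following minimal sketch (an illustration, not the authors' published implementation) shows how indoor NYU Depth V2 samples and ADE20K samples could be remapped into one shared label space and used to train a DeepLab-style segmentation model. It assumes PyTorch with the DeepLabV3 implementation from torchvision; the class count, the label-mapping dictionaries, and all hyperparameters are illustrative placeholders.

import torch
from torch import nn
from torch.utils.data import Dataset, ConcatDataset, DataLoader
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_CLASSES = 21     # assumed size of the merged label space (floor, ceiling, wall, door, person, furniture, ...)
IGNORE_INDEX = 255   # pixels without a counterpart in the merged label space are excluded from the loss

class RemappedSegDataset(Dataset):
    """Wraps a source dataset yielding (image [3,H,W], mask [H,W]) and remaps its class ids."""
    def __init__(self, source, id_map):
        self.source = source
        self.id_map = id_map  # source class id -> merged class id

    def __len__(self):
        return len(self.source)

    def __getitem__(self, idx):
        image, mask = self.source[idx]
        remapped = torch.full_like(mask, IGNORE_INDEX)
        for src_id, dst_id in self.id_map.items():
            remapped[mask == src_id] = dst_id
        return image, remapped

def train_combined(nyu_ds, ade_ds, nyu_map, ade_map, epochs=30, lr=1e-4):
    # Concatenate both datasets after remapping so every batch can mix indoor and outdoor scenes
    combined = ConcatDataset([RemappedSegDataset(nyu_ds, nyu_map),
                              RemappedSegDataset(ade_ds, ade_map)])
    loader = DataLoader(combined, batch_size=8, shuffle=True, num_workers=4)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = deeplabv3_resnet50(weights=None, num_classes=NUM_CLASSES).to(device)

    criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    model.train()
    for _ in range(epochs):
        for images, masks in loader:
            images, masks = images.to(device), masks.to(device).long()
            logits = model(images)["out"]        # [B, NUM_CLASSES, H, W] per-pixel class scores
            loss = criterion(logits, masks)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

Mapping both datasets into one label space before concatenation keeps a single output head, so the network can learn indoor-specific boundaries from NYU Depth V2 while still benefiting from the scene variety of ADE20K.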

[1] Tsmots, I. G., Teslyuk, V. M., Opotiak, Yu. V., & Oliinyk, O. O. (2023). Development of the scheme and improvement of the motion control method of a group of mobile robotic platforms. Ukrainian Journal of Information Technology, 5(2), 97–104. https://doi.org/10.23939/ujit2023.02.097
[2] Tsmots, I. G., Opotiak, Yu. V., Obelovska, K. M., & Tesliuk, S. V. (2024). Methods and means of conflict-free data exchange in the group of mobile robotic platforms. Ukrainian Journal of Information Technology, 6(1), 65–75. https://doi.org/10.23939/ujit2024.01.065
[3] Wang, H., Qin, C., Bai, Y., Zhang, Y., & Fu, Y. (2022). Recent Advances on Neural Network Pruning at Initialization. https://doi.org/10.24963/ijcai.2022/786
[4] Borkivskyi, B. P., & Teslyuk, V. M. (2023). Application of neural network tools for object recognition in mobile systems with obstacle avoidance. Scientific Bulletin of UNFU, 33(4), 84–89. https://doi.org/10.36930/40330412
[5] Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. https://doi.org/10.1109/TPAMI.2017.2699184
[6] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. https://doi.org/10.48550/arXiv.1512.03385
[7] Awad, O. M., Hajimolahoseini, H., Lim, M., Gosal, G., Ahmed, W., Liu, Y., & Deng, G. (2023). Improving Resnet-9 Generalization Trained on Small Datasets. https://doi.org/10.48550/arXiv.2309.03965
[8] Chollet, F. (2017). Xception: Deep Learning with Depthwise Separable Convolutions. https://doi.org/10.1109/CVPR.2017.195
[9] Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. https://doi.org/10.1007/978-3-319-24574-4_28
[10] Prasanna, G., Ernest, J. R., Lalitha, G., & Narayanan, S. (2023). Squeeze Excitation Embedded Attention UNet for Brain Tumor Segmentation. https://doi.org/10.1007/978-981-99-6855-8_9
[11] Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017). Pyramid Scene Parsing Network. https://doi.org/10.48550/arXiv.1612.01105
[12] Wang, L., Li, D., Liu, H., Peng, J., Tian, L., & Shan, Y. (2021). Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving. https://doi.org/10.48550/arXiv.2103.11351
[13] Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., Wang, X., Liu, W., & Xiao, B. (2020). Deep High-Resolution Representation Learning for Visual Recognition. https://doi.org/10.48550/arXiv.1908.07919
[14] Zhou, Y., Wang, X., Xu, X., Zhao, L., & Song, J. (2023). X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention. https://doi.org/10.1109/ICME52920.2022.9859751
[15] Liu, G., Guo, Y., Jin, Q., Chen, G., Saheya, B., & Wu, C. (2024). A region of interest focused Triple UNet architecture for skin lesion segmentation. International Journal of Imaging Systems and Technology, 34(3). https://doi.org/10.1002/ima.23090
[16] Luu, V. Q., Le, D. K., Nguyen, H. T., Nguyen, M. T., Nguyen, T. T., & Dinh, V. Q. (2024). Semi-Supervised Semantic Segmentation using Redesigned Self-Training for White Blood Cells. https://doi.org/10.48550/arXiv.2401.07278
[17] Chen, L.-C., Papandreou, G., Schroff, F., & Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. https://doi.org/10.48550/arXiv.1706.05587
[18] Silberman, N., Hoiem, D., Kohli, P., & Fergus, R. (2012). Indoor Segmentation and Support Inference from RGBD Images. In Computer Vision - ECCV 2012 (pp. 746–760). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-33715-4_54
[19] Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., & Torralba, A. (2017). Scene Parsing through ADE20K Dataset. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 5122–5130). https://doi.org/10.1109/CVPR.2017.544