Road users detection for traffic congestion classification

Traffic congestion is one of the major problems affecting urban residents. It makes daily life more stressful and has wide-ranging impacts: it harms the economy through wasted time, fuel, and productivity, and it also affects psychological and physical health. Road authorities are therefore required to find solutions that reduce traffic congestion and guarantee safety and security on the roads. To this end, detecting road users in real time provides information about specific road points that is useful both to road managers and to road users approaching congested locations. The goal is to build a model that detects road users, including vehicles and pedestrians, using artificial intelligence, in particular machine learning and computer vision. This paper presents an approach to road user detection that takes as input a dataset of 22,983 images, each containing more than one target object, for a total of roughly 81,000 target objects distributed among persons (pedestrians), cars, trucks/buses (vehicles), and motorcycles/bicycles. The dataset used in this study is the Common Objects in Context (MS COCO) dataset published by Microsoft. Six models were built based on R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN, and the fifth and seventh versions of YOLO, and these models were compared using evaluation metrics. As a result, the selected model detects road users with a mean average precision above 55%.
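As an illustration of the data preparation step described above, the sketch below shows how a road-user subset of MS COCO could be selected with the pycocotools library. It is a minimal sketch, not the authors' pipeline: the annotation file path, the exact category grouping, and the variable names are assumptions made for illustration.

    # Minimal sketch (assumptions noted above) of selecting road-user
    # images and annotations from the MS COCO instances file.
    from pycocotools.coco import COCO

    ANN_FILE = "annotations/instances_train2017.json"  # assumed path
    ROAD_USER_CATS = ["person", "car", "truck", "bus", "motorcycle", "bicycle"]

    coco = COCO(ANN_FILE)
    cat_ids = coco.getCatIds(catNms=ROAD_USER_CATS)

    # getImgIds(catIds=...) returns images containing *all* of the given
    # categories, so take the union over single categories instead.
    img_ids = set()
    for cat_id in cat_ids:
        img_ids.update(coco.getImgIds(catIds=[cat_id]))

    ann_ids = coco.getAnnIds(imgIds=list(img_ids), catIds=cat_ids, iscrowd=None)
    print(f"{len(img_ids)} images, {len(ann_ids)} road-user annotations")

Comparing detectors trained on such a subset would then typically rely on COCO-style mean average precision, for example via pycocotools.cocoeval.COCOeval, which matches the mAP metric reported in the abstract.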
