ADVANCING VIDEO SEARCH CAPABILITIES: INTEGRATING FEEDFORWARD NEURAL NETWORKS FOR EFFICIENT FRAGMENT-BASED RETRIEVAL

Nataliia Melnykova; Petro Pobereiko

As the volume of video data grows rapidly, searching and analyzing it efficiently becomes an increasingly pressing problem. This research aims to develop and test a system that improves the speed and accuracy of video search by combining Deep Convolutional Neural Networks (DCNN) and Feedforward Neural Networks (FFNN). In the proposed methodology, video data pass through several sequential stages: feature extraction, key frame identification, and the formation of an abstract vector representation. The DCNN is central to image analysis, while the FFNN is used to optimize the search process. The main results are an increase in video search efficiency, achieved by reducing data processing time, and higher accuracy in identifying relevant fragments. The originality of the work lies in integrating the two types of neural networks for structured analysis of video data, which represents a new step in the development of video search technologies. The practical significance of the research is that the developed system can be applied in areas that require fast and accurate video search, from the media industry to security systems. Further research includes adapting the system to specific types of video content and extending the capabilities of artificial intelligence toward a deeper understanding of video data.

deep convolutional neural networks

video search

data processing

feature extraction

feedforward neural networks
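
The stages named in the abstract (feature extraction, key frame identification, vector representation, and search) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the per-frame feature vectors are hand-written toy values standing in for DCNN outputs, key frames are chosen by a simple distance threshold, the fragment vector is a mean over key-frame features, a single dense ReLU layer with hand-set weights stands in for a trained FFNN, and ranking uses cosine similarity. All function names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_key_frames(features, threshold):
    """Keep a frame when it differs enough from the last kept frame (assumed rule)."""
    if not features:
        return []
    keys = [0]
    for i in range(1, len(features)):
        if euclidean(features[i], features[keys[-1]]) > threshold:
            keys.append(i)
    return keys

def mean_pool(vectors):
    """Average the key-frame features into one fragment-level vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def feedforward(x, weights, biases):
    """One dense layer with ReLU: y = max(0, W x + b)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def embed_fragment(frame_features, weights, biases, threshold=0.5):
    """Key frames -> pooled features -> compact vector from the dense layer."""
    keys = select_key_frames(frame_features, threshold)
    pooled = mean_pool([frame_features[i] for i in keys])
    return feedforward(pooled, weights, biases)

def search(query_vec, index):
    """Rank fragment ids by cosine similarity to the query vector."""
    return sorted(index, key=lambda fid: cosine(query_vec, index[fid]), reverse=True)

# Toy stand-ins: 3-dimensional "DCNN features" per frame, hand-set layer weights.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
B = [0.0, 0.0]
fragments = {
    "clip_a": [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    "clip_b": [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
}
index = {fid: embed_fragment(f, W, B) for fid, f in fragments.items()}
query = embed_fragment([[0.0, 0.9, 0.1]], W, B)
ranking = search(query, index)
print(ranking)
```

In this sketch each fragment is reduced to one short vector at indexing time, so a query is compared against compact vectors and not against raw frames. That is one plausible reading of how a vector representation can shorten search time; the paper's actual architecture and training procedure may differ.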
