The tasks performed by the intelligent components of mobile robotic systems (MRS) are analyzed and their distinguishing features are identified. The operational basis for implementing hardware accelerators of artificial neural networks (ANN) is defined and divided into three groups of neurooperations: preprocessing, processing, and calculation of transfer functions. The operations of the first group transform the input data into the form that yields the best results; the operations of the second group (multiplication, addition, group summation, calculation of the dot product, calculation of a two-dimensional convolution, and multiplication of a matrix by a vector) are performed directly in the neural network itself during training and operation; the operations of the third group calculate the transfer functions. It is established that the specialized hardware of the intelligent components of an MRS must operate in real time and satisfy constraints on dimensions and power consumption. It is proposed to develop this specialized hardware on the basis of an integrated approach that combines the capabilities of the modern element base, parallel data-processing methods, and hardware algorithms and structures, while taking into account the requirements of specific applications. The following principles were chosen for the development of ANN hardware accelerators: modularity; homogeneity and regularity of the structure; localization and reduction of the number of connections between elements; pipeline and spatial parallelism; matching the rates of input-data arrival, computation, and output of results; and specialization and adaptation of hardware structures to the algorithms implementing the neurooperations.
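The three groups of neurooperations named above can be illustrated with a minimal software reference model. This is only a sketch for clarity, not the paper's hardware implementation: the choice of min-max normalization for the first group and the sigmoid for the third are assumptions, and the "valid" convolution below follows the usual CNN convention of omitting the kernel flip.

```python
import numpy as np

# Group 1: preprocessing -- bring input data to a form that gives the best results
def normalize(x):
    """Min-max normalization of the input data to the range [0, 1]."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span else np.zeros_like(x, dtype=float)

# Group 2: operations performed directly in the neural network itself
def dot_product(a, b):
    """Dot product as a chain of multiply-accumulate steps (the basic MAC loop)."""
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(m, v):
    """Matrix-by-vector multiplication: one dot product per matrix row."""
    return [dot_product(row, v) for row in m]

def conv2d(image, kernel):
    """'Valid' two-dimensional convolution (stride 1, no padding, no kernel flip)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Group 3: calculation of a transfer (activation) function
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))
```

In a hardware accelerator each of these loops maps to a parallel or pipelined structure (e.g. a tree of adders for the group summation inside `dot_product`); the software form above only fixes the functional behavior being accelerated.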
The following characteristics are proposed for evaluating specialized hardware: hardware resources, operation time, and equipment utilization efficiency. Analytical expressions and a simulation model for evaluating these characteristics have been developed; their results are used to select the most effective accelerator and element structure for implementing the intelligent components of the MRS. The method for selecting the element base for implementing the intelligent components of the MRS has been improved: by taking into account the evaluated characteristics of the hardware accelerators, the requirements of the specific application, and the element base available for their implementation, it ensures that the most effective of the existing options is selected.
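The abstract does not reproduce the analytical expressions themselves, so the sketch below only illustrates the kind of evaluation and selection procedure described: all function names, the peak-throughput utilization metric, and the constraint-then-rank selection rule are assumptions for illustration, not the paper's formulas.

```python
def utilization_efficiency(useful_ops, num_pes, cycles, ops_per_pe_per_cycle=1):
    """Equipment utilization: fraction of the accelerator's peak operation
    count (processing elements x cycles) spent on useful neurooperations."""
    peak = num_pes * cycles * ops_per_pe_per_cycle
    return useful_ops / peak

def pipeline_time(num_inputs, pipeline_depth, clock_period):
    """Operation time of a pipelined unit: fill latency plus one result
    per clock cycle thereafter."""
    return (pipeline_depth + num_inputs - 1) * clock_period

def select_accelerator(candidates, max_hardware, max_time):
    """Among candidates that meet the hardware-resource and operation-time
    constraints of the application, pick the one with the highest
    utilization efficiency."""
    feasible = [c for c in candidates
                if c["hardware"] <= max_hardware and c["time"] <= max_time]
    return max(feasible, key=lambda c: c["efficiency"], default=None)
```

A selection of this shape treats the application's size and power limits as hard constraints and uses efficiency only to rank the feasible candidates, which matches the abstract's emphasis on the requirements of the specific application.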
1. Lee, D., Park, M., Kim, H., & Jeon, M. (2021). AI-based mobile robot navigation using deep neural networks and reinforcement learning. IEEE Access, 9, 329-345. https://doi.org/10.1109/ACCESS.2021.3102345
2. Soori, M., Arezoo, B., & Dastres, R. (2023). Artificial intelligence, machine learning and deep learning in advanced robotics: A review. Cognitive Robotics, 3, 54-70. https://doi.org/10.1016/j.cogr.2023.04.001
3. Sze, V., Chen, Y.-H., Yang, T.-J., & Emer, J. S. (2017). Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE, 105(12), 2295-2329. https://doi.org/10.1109/JPROC.2017.2761740
4. Chen, J., Lin, K., Yang, L., & Ye, W. (2024). An energy-efficient edge processor for radar-based continuous fall detection utilizing mixed-radix FFT and updated blockwise computation. IEEE Internet of Things Journal, 11(19), 32117-32128. https://doi.org/10.1109/JIOT.2024.3422251
5. Gu, J., & Joseph, R. (2024). Perspective Chapter: Dynamic timing enhanced computing for microprocessor and deep learning accelerators. In Deep Learning - Recent Findings and Research. https://doi.org/10.5772/intechopen.113296
6. Sabareeshwari, V., & S. K. C. (2025). Artificial Intelligence in Communications. In Z. Hammouch & O. Jamil (Eds.), Convergence of Antenna Technologies, Electronics, and AI (pp. 209-238). IGI Global. https://doi.org/10.4018/979-8-3693-3775-2.ch008
7. Li, Z., Liu, F., Yang, W., Peng, S., & Zhou, J. (2020). A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems, 31(7), 2227-2249. https://doi.org/10.1109/TNNLS.2020.2996649
8. Hussain, M. (2024). Sustainable machine vision for Industry 4.0: A comprehensive review of convolutional neural networks and hardware accelerators in computer vision. AI, 5(3), 1324-1356. https://doi.org/10.3390/ai5030064
9. Wang, C., & Luo, Z. (2022). A review of the optimal design of neural networks based on FPGA. Applied Sciences, 12(21), 10771. https://doi.org/10.3390/app122110771
10. Gilbert, M., Wu, Y. N., Emer, J. S., & Sze, V. (2024). LoopTree: Exploring the fused-layer dataflow accelerator design space. IEEE Transactions on Circuits and Systems for Artificial Intelligence, 1(1), 97-111. https://doi.org/10.1109/TCASAI.2024.3461716
11. Taherdoost, H. (2023). Deep learning and neural networks: Decision-making implications. Symmetry, 15(9), 1723. https://doi.org/10.3390/sym15091723
12. Xu, Y., Luo, J., & Sun, W. (2024). Flare: An FPGA-based full precision low power CNN accelerator with reconfigurable structure. Sensors, 24, 2239. https://doi.org/10.3390/s24072239
13. Li, Z., Liu, F., Yang, W., Peng, S., & Zhou, J. (2022). A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems, 33(12), 6999-7019. https://doi.org/10.1109/TNNLS.2021.3084827
14. Goel, S., Kedia, R., Sen, R., & Balakrishnan, M. (2024). EXPRESS: A framework for execution time prediction of concurrent CNNs on Xilinx DPU accelerator. ACM Transactions on Embedded Computing Systems, 24(1), 11. https://doi.org/10.1145/3697835
15. Tsmots, I., Rabyk, V., Kryvinska, N., Yatsymirskyy, M., & Teslyuk, V. (2022). Design of processors for fast cosine and sine Fourier transforms. Circuits, Systems, and Signal Processing, 41(9), 4928-4951. https://doi.org/10.1007/s00034-022-02012-8
16. Tsmots, I., Teslyuk, V., Kryvinska, N., Skorokhoda, O., & Kazymyra, I. (2023). Development of a generalized model for parallel-streaming neural element and structures for scalar product calculation devices. Journal of Supercomputing, 79(5), 4820-4846. https://doi.org/10.1007/s11227-022-04838-0
17. Tsmots, I. H., Skorokhoda, O. V., & Teslyuk, V. M. (2013). Device for calculating the dot product [Prystrii dlia obchyslennia skaliarnoho dobutku]. Ukrainian patent for invention No. 101922, published 13.05.2013, Bulletin No. 9.