Hardware Optimization of Video Quality Improvement Methods Based on Deep Neural Networks

2025, pp. 122-133
1 Lviv Polytechnic National University, Department of Electronic Computational Machines, Ukraine
2 Lviv Polytechnic National University, Ukraine; IT STEP Computer Academy, Ukraine

This paper examines how deep video enhancement models can be optimized for efficient execution on modern hardware. The focus is on MST-GAN, a multi-frame generative network with a multi-scale structure and frame-by-frame smoothing. A comprehensive hardware acceleration strategy is proposed that combines structured pruning, quantization (FP16/INT8), pipelining, parallelization, and model compilation with TensorRT. The model is compared before and after optimization in terms of FPS, latency, memory consumption, and FLOPs. After optimization, it achieves a 4.3x speedup with minimal loss of quality, which makes it suitable for real-time applications. Its hardware efficiency is also compared with that of other modern VSR models (BasicVSR, RSDN, EDVR).
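As a rough illustration of two of the steps named in the abstract, structural thinning (pruning) and INT8 quantization, the sketch below applies them to a toy set of filters in plain Python. It is not the authors' implementation: the helper names `prune_filters`, `quantize_int8`, and `dequantize`, the L1-norm ranking criterion, the symmetric per-tensor scale, and all weight values are assumptions chosen only for the example.

```python
# Illustrative sketch only. Function names, the L1-norm criterion, the
# symmetric per-tensor scale, and the toy weights are all hypothetical;
# none of them are taken from the paper.

def prune_filters(filters, keep_ratio):
    """Structured pruning: keep the filters with the largest L1 norm."""
    n_keep = max(1, int(len(filters) * keep_ratio))
    ranked = sorted(
        range(len(filters)),
        key=lambda i: sum(abs(w) for w in filters[i]),
        reverse=True,
    )
    kept = sorted(ranked[:n_keep])  # preserve the original filter order
    return [filters[i] for i in kept], kept


def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Toy layer with four filters of three weights each (made-up values).
filters = [
    [0.50, -0.25, 0.10],
    [0.01, 0.02, -0.01],
    [-0.80, 0.40, 0.30],
    [0.05, -0.03, 0.02],
]

# Step 1: drop the half of the filters that carry the least weight mass.
pruned, kept_idx = prune_filters(filters, keep_ratio=0.5)

# Step 2: quantize the surviving weights and measure the round-trip error.
flat = [w for f in pruned for w in f]
codes, scale = quantize_int8(flat)
restored = dequantize(codes, scale)
max_err = max(abs(a - b) for a, b in zip(flat, restored))

print("kept filters:", kept_idx)
print("int8 codes:", codes)
print("max round-trip error:", max_err)
```

The round-trip error of this scheme is bounded by half the quantization step, which is one way to reason about why INT8 inference can lose little quality. A real deployment would normally rely on the calibration and precision controls of the inference framework rather than hand-written code like this.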
