Mobile Application for Text Translation and Visualization in Augmented Reality Using Neural Networks

2025; pp. 71–88

Lviv Polytechnic National University, Software Department, Ukraine

The study explores the development of a mobile application for text translation in augmented reality (AR). The primary goal is to integrate modern technologies to ensure accurate text recognition, high-quality translation, and visualization of the translated text directly on the plane of the original. The tool aims to simplify access to information and improve real-time interaction with foreign-language texts. The research is motivated by the growing need, in an increasingly globalized world, for fast and convenient tools for intercultural communication.

The proposed architecture integrates PaddleOCR for optical text recognition, DeepL API for machine translation, and ARCore for augmented reality visualization. During development, algorithms were created to process text, translate it, and accurately position it in the real environment. To evaluate the solution's efficiency, testing was conducted under various conditions, including variations in lighting, camera angles, and different text fonts. Particular attention was given to ensuring that the translated text aligns correctly with the plane of the original.
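The alignment step described above can be illustrated with a minimal Python sketch. The `OverlayQuad` type and `place_overlay` function are hypothetical names introduced here for illustration, not part of the application's code; the sketch only assumes that the OCR stage (PaddleOCR in the proposed architecture) returns a quadrilateral text box as four corner points, clockwise from the top-left, and shows how such a box can be mapped to an overlay whose position, rotation, and font size match the original text.

```python
import math
from dataclasses import dataclass

@dataclass
class OverlayQuad:
    """Placement of the translated text on the plane of the original."""
    center_x: float
    center_y: float
    angle_deg: float    # in-plane rotation of the recognized text line
    font_height: float  # scaled so the overlay matches the source text

def place_overlay(bbox) -> OverlayQuad:
    """Map an OCR bounding box (four corners, clockwise from the
    top-left) to an overlay quad aligned with the original text."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = bbox
    center_x = (x0 + x1 + x2 + x3) / 4
    center_y = (y0 + y1 + y2 + y3) / 4
    # Baseline direction taken from the top edge of the box.
    angle_deg = math.degrees(math.atan2(y1 - y0, x1 - x0))
    # Font height taken from the left edge (top-left to bottom-left).
    font_height = math.hypot(x3 - x0, y3 - y0)
    return OverlayQuad(center_x, center_y, angle_deg, font_height)

# An axis-aligned 100x20 box starting at (10, 10):
quad = place_overlay([(10, 10), (110, 10), (110, 30), (10, 30)])
print(quad.center_x, quad.center_y, quad.angle_deg, quad.font_height)
# -> 60.0 20.0 0.0 20.0
```

In the full pipeline, the resulting quad would be projected onto the ARCore plane anchor so that the rendered translation tracks the surface of the original text.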

The testing results confirmed the application’s effectiveness in real-world scenarios. At the same time, several areas for improvement were identified: enhancing performance in low-light conditions, ensuring stable text visualization, and improving support for complex fonts. Additionally, there is potential to refine algorithms for multilingual text processing. The suggested optimization paths aim to improve the overall functionality of the system and its adaptability to various usage scenarios.
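To make the low-light improvement area concrete: one common preprocessing step before OCR is to stretch the luminance range of the captured frame. The sketch below is a generic illustration of that idea, not a technique taken from the study, and `stretch_contrast` is a hypothetical helper name.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values to the full 0-255 range,
    so underexposed frames gain contrast before recognition."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:                # flat image: nothing to stretch
        return list(pixels)
    return [round(255 * (p - lo) / (hi - lo)) for p in pixels]

dim_frame = [40, 50, 60, 70]          # underexposed grayscale samples
print(stretch_contrast(dim_frame))    # -> [0, 85, 170, 255]
```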

The development holds promise for integration into tourism, education, and business. In the future, the application can be extended to support a broader range of text types and usage scenarios, giving users convenient real-time access to foreign-language materials and an effective tool for everyday tasks.
