USING NOISE FOR STABLE IMAGE GENERATION IN DIFFUSION MODELS

Олексій Веретюк; Богдан Огерук; Назарій Андрущак

This work studies image generation with diffusion models. It demonstrates that the stability of generating a required image can be improved by an approach that uses constant noise as a regularizer and overlays on it a mask of random noise, created around each pixel of the input image within a random radius. The models were implemented in Python with the tensorflow, numpy, and keras libraries. The final results show that stable generation was achieved while preserving variability.
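The noise construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `build_noise`, the parameters `max_radius`, `threshold`, and `base_seed`, the use of disk-shaped neighbourhoods, and the reading of "each pixel of the input image" as each pixel brighter than `threshold` are all assumptions made for illustration.

```python
import numpy as np

def build_noise(image, max_radius=2, threshold=0.5, base_seed=0, rng=None):
    """Return (noise, mask) for a 2-D grayscale image (hypothetical helper).

    The constant noise field depends only on base_seed, so it is identical
    on every call. Inside the mask it is replaced by freshly drawn noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape

    # Constant noise acting as the fixed, regularizing part of the input.
    base = np.random.default_rng(base_seed).standard_normal((h, w))

    # Assumption: only pixels above the threshold count as image content.
    ys, xs = np.nonzero(image > threshold)
    # One random radius per selected pixel, in [0, max_radius].
    radii = rng.integers(0, max_radius + 1, size=ys.size)

    # Mark a disk of the drawn radius around every selected pixel.
    mask = np.zeros((h, w), dtype=bool)
    yy, xx = np.ogrid[:h, :w]
    for y, x, r in zip(ys, xs, radii):
        mask |= (yy - y) ** 2 + (xx - x) ** 2 <= r ** 2

    # Fresh random noise inside the mask, constant noise everywhere else.
    fresh = rng.standard_normal((h, w))
    return np.where(mask, fresh, base), mask

# Usage: two calls share the constant part and differ only inside the mask.
image = np.zeros((8, 8))
image[4, 4] = 1.0
n1, m1 = build_noise(image, rng=np.random.default_rng(1))
n2, m2 = build_noise(image, rng=np.random.default_rng(2))
```

Under these assumptions, the shared constant field is what would keep repeated generations close to each other, while the masked regions are the only source of variation between runs.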

diffusion models

image generation

generation stability
