Ensuring the Cybersecurity of Artificial Intelligence Systems: An Analysis of Vulnerabilities, Attacks, and Countermeasures

2022, pp. 7–22
M. Ye. Zhukovsky National Aerospace University "Kharkiv Aviation Institute" (KhAI)

In recent years, many companies have begun integrating artificial intelligence systems (AIS) into their infrastructures. AIS are used in sensitive areas of society such as the judicial system, critical infrastructure, and video surveillance. This makes trustworthy assessment and assured provision of AIS cybersecurity a necessity. The study analyses the current state of cybersecurity of these systems. Possible attack types are classified and the principal ones are examined in detail. Threats and attacks are analysed by severity, and security risks are assessed using the IMECA method. It was found that adversarial attacks and data poisoning attacks pose the highest security risks, while the countermeasures against them are not yet at an adequate level. The conclusion is drawn that the life cycle of developing and operating secure AIS needs to be formalised and standardised. Directions for further research are substantiated, in particular the need to develop methods for assessing and ensuring the cybersecurity of AIS, including systems that provide artificial intelligence as a service.
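To make the IMECA-based risk ranking concrete, the sketch below shows a minimal criticality calculation in Python. The threat names, the 3-point probability and severity scales, and every rating are illustrative assumptions chosen for this example only; they are not the actual values from the paper's IMECA analysis.

```python
# Illustrative IMECA-style criticality assessment (all ratings are hypothetical).
# Each threat is scored by occurrence probability and effect severity on an
# ordinal 1..3 scale; criticality is their product, bucketed into risk levels.

THREATS = {
    "Adversarial (evasion) attack": {"probability": 3, "severity": 3},
    "Data poisoning":               {"probability": 2, "severity": 3},
    "Model inversion":              {"probability": 2, "severity": 2},
    "Membership inference":         {"probability": 2, "severity": 2},
    "Model stealing via API":       {"probability": 1, "severity": 2},
}

def risk_level(criticality: int) -> str:
    """Map a criticality score (probability * severity) to a qualitative level."""
    if criticality >= 6:
        return "high"
    if criticality >= 3:
        return "medium"
    return "low"

if __name__ == "__main__":
    # Print a simple criticality table, highest risk first.
    ranked = sorted(THREATS.items(),
                    key=lambda kv: kv[1]["probability"] * kv[1]["severity"],
                    reverse=True)
    for name, scores in ranked:
        crit = scores["probability"] * scores["severity"]
        print(f"{name:<30} criticality={crit}  risk={risk_level(crit)}")
```

With the assumed ratings, adversarial and data poisoning attacks fall into the "high" bucket, mirroring the qualitative conclusion stated in the abstract; in practice the ratings would come from expert judgement for the specific AIS under analysis.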

  1. Herping, S. (2019). Securing Artificial Intelligence – Part I. https://www.stiftung-nv.de/sites/default/files/securing_artificial_intelligence.pdf
  2. PwC: The macroeconomic impact of artificial intelligence. (2018). https://www.pwc.co.uk/economic-services/assets/macroeconomic-impact-of-ai-technical-report-feb-18.pdf
  3. Comiter, M. (2019). Attacking Artificial Intelligence: AI’s Security Vulnerability and What Policymakers Can Do About It. Belfer Center for Science and International Affairs, Harvard Kennedy School. https://www.belfercenter.org/sites/default/files/2019-08/AttackingAI/AttackingAI.pdf
  4. Povolny, S. (2020). Model Hacking ADAS to Pave Safer Roads for Autonomous Vehicles. McAfee Labs. https://www.mcafee.com/blogs/other-blogs/mcafee-labs/model-hacking-adas-to-pave-safer-roads-for-autonomous-vehicles/
  5. Lohn, A. (2020). Hacking AI. Center for Security and Emerging Technology. https://doi.org/10.51593/2020CA006
  6. Lohn, A. (2021). Poison in the Well. Center for Security and Emerging Technology. https://doi.org/10.51593/2020CA013
  7. Ruef, M. (2020). Hacking Artificial Intelligence – Influencing and Cases of Manipulation. https://www.researchgate.net/publication/338764153_Hacking_Artificial_Intelligence_-_Influencing_and_Cases_of_Manipulation
  8. Kim, A. (2020). The Impact of Platform Vulnerabilities in AI Systems. Massachusetts Institute of Technology. https://dspace.mit.edu/bitstream/handle/1721.1/129159/1227275868-MIT.pdf
  9. Hartmann, K., & Steup, C. (2020). Hacking the AI – the Next Generation of Hijacked Systems. In 12th International Conference on Cyber Conflict (CyCon). https://doi.org/10.23919/CyCon49761.2020.9131724
  10. Bursztein, E. (2018). Attacks against machine learning – an overview. Personal site and blog featuring blog posts, publications and talks. https://elie.net/blog/ai/attacks-against-machine-learning-an-overview/
  11. Ansah, H. (2021). Adversarial Attacks on Neural Networks: Exploring the Fast Gradient Sign Method. Neptune blog. https://neptune.ai/blog/adversarial-attacks-on-neural-networks-exploring-the-fast-gradient-sign-method
  12. Griffin, J. (2019). Researchers hack AI video analytics with color printout. https://www.securityinfowatch.com/video-surveillance/video-analytics/article/21080107/researchers-hack-ai-video-analytics-with-color-printout
  13. Thys, S., Ranst, W.V., & Goedemé, T. (2019). Fooling automated surveillance cameras: adversarial patches to attack person detection. arXiv preprint arXiv:1904.08653. https://doi.org/10.48550/arXiv.1904.08653
  14. Eykholt, K., Evtimov, I., Fernandes, E., Li, B., Rahmati, A., Xiao, C., Prakash, A., Kohno, T., & Song, D. (2018). Robust Physical-World Attacks on Deep Learning Models. arXiv preprint arXiv:1707.08945. https://doi.org/10.48550/arXiv.1707.08945
  15. Eykholt, K., Evtimov, I., Fernandes, E., Li, B., Rahmati, A., Tramer, F., Prakash, A., Kohno, T., & Song, D. (2018). Physical Adversarial Examples for Object Detectors. arXiv preprint arXiv:1807.07769. https://doi.org/10.48550/arXiv.1807.07769
  16. Su, J., Vargas, D. V., & Sakurai, K. (2019). Attacking convolutional neural network using differential evolution. IPSJ Transactions on Computer Vision and Applications. https://doi.org/10.1186/s41074-019-0053-3
  17. Goodfellow, I.J., Shlens, J., & Szegedy, C. (2015). Explaining and Harnessing Adversarial Examples. arXiv preprint arXiv:1412.6572. https://doi.org/10.48550/arXiv.1412.6572
  18. Papernot, N., McDaniel, P., & Goodfellow, I.J. (2016). Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples. arXiv preprint arXiv:1605.07277. https://doi.org/10.48550/arXiv.1605.07277
  19. Catak, F.O., & Yayilgan, S.Y. (2021). Deep Neural Network based Malicious Network Activity Detection Under Adversarial Machine Learning Attacks. In International Conference on Intelligent Technologies and Applications, 280-291. https://doi.org/10.1007/978-3-030-71711-7_23
  20. Volborth, M. (2019). Detecting backdoor attacks on artificial neural networks. https://ece.duke.edu/about/news/detecting-backdoor-attacks-artificial-neural-networks
  21. Vincent, J. (2016). Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day. The Verge. https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
  22. Ji, Y., Liu, Z., Hu, X., Wang, P., & Zhang, Y. (2019). Programmable Neural Network Trojan for Pre-Trained Feature Extractor. arXiv preprint arXiv:1901.07766. https://doi.org/10.48550/arXiv.1901.07766
  23. Yang, Z., Iyer, N., Reimann, J., & Virani, N. (2019). Design of intentional backdoors in sequential models. arXiv preprint arXiv:1902.09972. https://doi.org/10.48550/arXiv.1902.09972
  24. Gu, T., Dolan-Gavitt, B., & Garg, S. (2017). BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733. https://doi.org/10.48550/arXiv.1708.06733
  25. Biggio, B., Nelson, B., & Laskov, P. (2013). Poisoning Attacks against Support Vector Machines. arXiv preprint arXiv:1206.6389. https://doi.org/10.48550/arXiv.1206.6389
  26. Jagielski, M., Oprea, A., Biggio, B., Liu, C., Nita-Rotaru, C., & Li, B. (2018). Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In 2018 IEEE Symposium on Security and Privacy (SP), 19–35. https://doi.org/10.1109/SP.2018.00057
  27. Xiao, H., Biggio, B., Brown, G., Fumera, G., Eckert, C., & Roli, F. (2015). Is feature selection secure against training data poisoning? In International Conference on Machine Learning, 1689–1698. https://doi.org/10.48550/arXiv.1804.07933
  28. Fredrikson, M., Jha, S., & Ristenpart, T. (2015). Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. In CCS '15: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 1322–1333. https://doi.org/10.1145/2810103.2813677
  29. Shokri, R., Stronati, M., Song, C., & Shmatikov, V. (2017). Membership Inference Attacks against Machine Learning Models. In Proceedings of the IEEE Symposium on Security and Privacy (SP). https://doi.org/10.48550/arXiv.1610.05820
  30. Salem, A., Zhang, Y., Humbert, M., Berrang, P., Fritz, M., & Backes, M. (2018). ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models. arXiv preprint arXiv:1806.01246. https://doi.org/10.48550/arXiv.1806.01246
  31. Rahman, A., Rahman, T., Laganière, R., Mohammed, N., & Wang, Y. (2018). Membership Inference Attack against Differentially Private Deep Learning Model. https://www.tdp.cat/issues16/tdp.a289a17.pdf
  32. Song, L., Shokri, R., & Mittal, P. (2019). Privacy Risks of Securing Machine Learning Models against Adversarial Examples. arXiv preprint arXiv:1905.10291. https://doi.org/10.48550/arXiv.1905.10291
  33. Hayes, J., Melis, L., Danezis, G., & De Cristofaro, E. (2018). LOGAN: Membership Inference Attacks Against Generative Models. arXiv preprint arXiv:1705.07663. https://doi.org/10.48550/arXiv.1705.07663
  34. Singh, P. (2022). Data Leakage in Machine Learning: How it can be detected and minimize the risk. https://towardsdatascience.com/data-leakage-in-machine-learning-how-it-can-be-detected-and-minimize-the-risk-8ef4e3a97562
  35. Rakin, A.S., He, Z., & Fan, D. (2019). Bit-Flip Attack: Crushing Neural Network with Progressive Bit Search. arXiv preprint arXiv:1903.12269. https://doi.org/10.48550/arXiv.1903.12269
  36. Tramèr, F., Zhang, F., Juels, A., Reiter, M.K., & Ristenpart, T. (2016). Stealing Machine Learning Models via Prediction APIs. Proceedings of the 25th USENIX Security Symposium. https://doi.org/10.48550/arXiv.1609.02943
  37. Bhagoji, A.N., Chakraborty, S., Mittal, P., & Calo, S.B. (2019). Analyzing Federated Learning through an Adversarial Lens. In Proceedings of the 36th International Conference on Machine Learning, PMLR 97:634-643. http://proceedings.mlr.press/v97/bhagoji19a.html
  38. Androulidakis, I., Kharchenko, V., & Kovalenko, A. (2016). IMECA-based Technique for Security Assessment of Private Communications: Technology and Training. https://doi.org/10.11610/isij.3505
  39. Wolff, J. (2020). How to improve cybersecurity for artificial intelligence. The Brookings Institution. https://www.brookings.edu/research/how-to-improve-cybersecurity-for-artificial-intelligence/
  40. Newman, J. C. (2019). Toward AI Security: Global Aspirations for a More Resilient Future. https://cltc.berkeley.edu/wp-content/uploads/2019/02/Toward_AI_Security.pdf
  41. National Security Commission on Artificial Intelligence. First Quarter Recommendations (2020). https://drive.google.com/file/d/1wkPh8Gb5drBrKBg6OhGu5oNaTEERbKss/view
  42. Pupillo, L., Fantin, S., Ferreira, A., & Polito, C. (2021). Artificial Intelligence and Cybersecurity. CEPS Task Force Report. https://www.ceps.eu/wp-content/uploads/2021/05/CEPS-TFR-Artificial-Intelligence-and-Cybersecurity.pdf
  43. Neustadter, D. (2020). Why AI Needs Security. Synopsys Technical Bulletin. https://www.synopsys.com/designware-ip/technical-bulletin/why-ai-needs-security-dwtb-q318.html
  44. Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., & McDaniel, P. (2020). Ensemble Adversarial Training: Attacks and Defenses. arXiv preprint arXiv:1705.07204. https://doi.org/10.48550/arXiv.1705.07204
  45. Yuan, X., He, P., Zhu, Q., & Li, X. (2018). Adversarial Examples: Attacks and Defenses for Deep Learning. arXiv preprint arXiv:1712.07107. https://doi.org/10.48550/arXiv.1712.07107
  46. Dziugaite, G. K., Ghahramani, Z., & Roy, D. M. (2016). A study of the effect of JPG compression on adversarial images. arXiv preprint arXiv:1608.00853. https://doi.org/10.48550/arXiv.1608.00853
  47. Papernot, N., McDaniel, P., Wu, X., Jha, S., & Swami, A. (2016). Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks. In 2016 IEEE Symposium on Security and Privacy (SP), 582–597. https://doi.org/10.1109/SP.2016.41
  48. Ma, S., Liu, Y., Tao, G., Lee, W.C., & Zhang, X. (2019). NIC: Detecting Adversarial Samples with Neural Network Invariant Checking. In NDSS. https://www.ndss-symposium.org/ndss-paper/nic-detecting-adversarial-samples-with-neural-network-invariant-checking/
  49. Xu, W., Evans, D., & Qi, Y. (2018). Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. In Network and Distributed Systems Security Symposium (NDSS). https://doi.org/10.14722/ndss.2018.23198
  50. Liu, C., Li, B., Vorobeychik, Y., & Oprea, A. (2017). Robust linear regression against training data poisoning. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 91–102. https://doi.org/10.1145/3128572.3140447
  51. Kharchenko, V., Fesenko, H., & Illiashenko, O. (2022). Quality Models for Artificial Intelligence Systems: Characteristic-Based Approach, Development and Application. https://doi.org/10.3390/s22134865
  52. Kharchenko, V., Fesenko, H., & Illiashenko, O. (2022). Basic model of non-functional characteristics for assessment of artificial intelligence quality. Radioelectronic and computer systems. https://doi.org/10.32620/reks.2022.2.11
  53. Janbi, N., Katib, I., Albeshri, A., & Mehmood, R. (2020). Distributed Artificial Intelligence-as-a-Service (DAIaaS) for Smarter IoE and 6G Environments. https://doi.org/10.3390/s20205796