Analysis of Methods for Training Robotic Manipulators to Execute Complex Motion Trajectories

Yurii Senchuk; Fedir Matiko

This article reviews current approaches to training robotic manipulators that perform complex tasks in dynamic, changing environments. It presents a comparative analysis of modern methods, namely training with a human instructor, self-learning, and reinforcement learning, and identifies their main advantages, drawbacks, and typical areas of practical application. Particular attention is given to training efficiency, the adaptability of robots to new conditions, human-robot interaction, and the transfer of skills from a virtual training environment to the real world. Based on this analysis, the article recommends imitation learning, in particular learning from demonstration, which allows skills to be passed from a human to a robot quickly and safely without having to formalize the task. The article also highlights the problems that arise when trained models are adapted to real-world conditions and when robots work alongside people. Finally, it identifies the key challenges facing modern robot training systems and formulates recommendations for choosing a training strategy according to the type of task and the available resources.
