Multi-Agent Multi-Objective Deep Reinforcement Learning for Efficient and Effective Pilot Training

Johan Källström
Saab AB and Department of Computer Science, Linköping University, Linköping, Sweden

Fredrik Heintz
Department of Computer Science, Linköping University, Linköping, Sweden

Download article: http://dx.doi.org/10.3384/ecp19162011

Published in: FT2019. Proceedings of the 10th Aerospace Technology Congress, October 8-9, 2019, Stockholm, Sweden

Linköping Electronic Conference Proceedings 162:11, pp. 101-111


Published: 2019-10-23

ISBN: 978-91-7519-006-8

ISSN: 1650-3686 (print), 1650-3740 (online)


The tactical systems and operational environment of modern fighter aircraft are becoming increasingly complex. Creating a realistic and relevant environment for pilot training using only live aircraft is difficult, impractical and highly expensive. The Live, Virtual and Constructive (LVC) simulation paradigm aims to address this challenge. LVC simulation means linking real aircraft, ground-based systems and soldiers (Live), manned simulators (Virtual) and computer-controlled synthetic entities (Constructive). Constructive simulation enables realization of complex scenarios with a large number of autonomous friendly, hostile and neutral entities, which interact with each other as well as with manned simulators and real systems. This reduces the need for personnel to act as role-players by operating, for example, live or virtual aircraft, thus lowering the cost of training. Constructive simulation can also improve the availability of training by embedding simulation capabilities in live aircraft, making it possible to train anywhere, anytime. In this paper, we discuss how machine learning techniques can be used to automate the process of constructing advanced, adaptive behavior models for constructive simulations, to improve the autonomy of future training systems. We conduct a number of initial experiments, and show that reinforcement learning, in particular multi-agent and multi-objective deep reinforcement learning, allows synthetic pilots to learn to cooperate and prioritize among conflicting objectives in air combat scenarios. Though the results are promising, we conclude that further algorithm development is necessary to fully master the complex domain of air combat simulation.


pilot training, embedded training, LVC simulation, artificial intelligence, autonomy, sub-system and system technology

