https://arxiv.org/pdf/1811.07029

H Mao, Z Zhang, Z Xiao, Z Gong - arXiv preprint arXiv:1811.07029, 2018

Modelling and exploiting teammates' policies in cooperative multi-agent systems has long been both an interest and a major challenge for the reinforcement learning (RL) community. The interest lies in the fact that if the agent knows the teammates' …