Improved image captioning via policy gradient optimization of SPIDEr
S Liu, Z Zhu, N Ye, S Guadarrama, K Murphy
Proceedings of the IEEE International Conference on Computer Vision, 873-881, 2017
Emergent Coordination Through Competition
S Liu, G Lever, J Merel, S Tunyasuvunakool, N Heess, T Graepel
International Conference on Learning Representations (ICLR 2019), 2019
dm_control: Software and tasks for continuous control
S Tunyasuvunakool, A Muldal, Y Doron, S Liu, S Bohez, J Merel, T Erez, ...
Software Impacts 6, 100022, 2020
Hierarchical visuomotor control of humanoids
J Merel, A Ahuja, V Pham, S Tunyasuvunakool, S Liu, D Tirumala, ...
International Conference on Learning Representations (ICLR 2019), 2018
V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control
HF Song, A Abdolmaleki, JT Springenberg, A Clark, H Soyer, JW Rae, ...
International Conference on Learning Representations 2019, 2019
A generalized training approach for multiagent learning
P Muller, S Omidshafiei, M Rowland, K Tuyls, J Perolat, S Liu, D Hennes, ...
arXiv preprint arXiv:1909.12823, 2019
Observational learning by reinforcement learning
D Borsa, B Piot, R Munos, O Pietquin
arXiv preprint arXiv:1706.06617, 2017
From motor control to team play in simulated humanoid football
S Liu, G Lever, Z Wang, J Merel, SMA Eslami, D Hennes, WM Czarnecki, ...
Science Robotics 7 (69), eabo0235, 2022
Reinforcement learning agents acquire flocking and symbiotic behaviour in simulated ecosystems
P Sunehag, G Lever, S Liu, J Merel, N Heess, JZ Leibo, E Hughes, ...
ALIFE 2019: The 2019 Conference on Artificial Life, 103-110, 2019
The body is not a given: Joint agent policy learning and morphology evolution
D Banarse, Y Bachrach, S Liu, C Fernando, N Heess, P Kohli, G Lever, ...
Pick your battles: Interaction graphs as population-level objectives for strategic diversity
M Garnelo, WM Czarnecki, S Liu, D Tirumala, J Oh, G Gidel, ...
arXiv preprint arXiv:2110.04041, 2021
Launchpad: a programming model for distributed machine learning research
F Yang, G Barth-Maron, P Stańczyk, M Hoffman, S Liu, M Kroiss, A Pope, ...
arXiv preprint arXiv:2106.04516, 2021
NeuPL: Neural Population Learning
S Liu, L Marris, D Hennes, J Merel, N Heess, T Graepel
International Conference on Learning Representations 2022, 2022
Transferring task goals via hierarchical reinforcement learning
S Xie, A Galashov, S Liu, S Hou, R Pascanu, N Heess, YW Teh
Developing, evaluating and scaling learning agents in multi-agent environments
I Gemp, T Anthony, Y Bachrach, A Bhoopchand, K Bullard, J Connor, ...
AI Communications, 1-14, 2022
Simplex NeuPL: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games
S Liu, M Lanctot, L Marris, N Heess
International Conference on Machine Learning 2022, 2022
Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach
B Shahriari, A Abdolmaleki, A Byravan, A Friesen, S Liu, JT Springenberg, ...
arXiv preprint arXiv:2204.10256, 2022
