|Improved image captioning via policy gradient optimization of SPIDEr|
S Liu, Z Zhu, N Ye, S Guadarrama, K Murphy
Proceedings of the IEEE international conference on computer vision, 873-881, 2017
|Emergent Coordination Through Competition|
S Liu, G Lever, J Merel, S Tunyasuvunakool, N Heess, T Graepel
International Conference on Learning Representations (ICLR 2019), 2019
|dm_control: Software and tasks for continuous control|
S Tunyasuvunakool, A Muldal, Y Doron, S Liu, S Bohez, J Merel, T Erez, ...
Software Impacts 6, 100022, 2020
|Hierarchical visuomotor control of humanoids|
J Merel, A Ahuja, V Pham, S Tunyasuvunakool, S Liu, D Tirumala, ...
International Conference on Learning Representations (ICLR 2019), 2019
|V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control|
HF Song, A Abdolmaleki, JT Springenberg, A Clark, H Soyer, JW Rae, ...
International Conference on Learning Representations 2019, 2019
|A generalized training approach for multiagent learning|
P Muller, S Omidshafiei, M Rowland, K Tuyls, J Perolat, S Liu, D Hennes, ...
arXiv preprint arXiv:1909.12823, 2019
|Observational learning by reinforcement learning|
D Borsa, B Piot, R Munos, O Pietquin
arXiv preprint arXiv:1706.06617, 2017
|From motor control to team play in simulated humanoid football|
S Liu, G Lever, Z Wang, J Merel, SMA Eslami, D Hennes, WM Czarnecki, ...
Science Robotics 7 (69), eabo0235, 2022
|Reinforcement learning agents acquire flocking and symbiotic behaviour in simulated ecosystems|
P Sunehag, G Lever, S Liu, J Merel, N Heess, JZ Leibo, E Hughes, ...
ALIFE 2019: The 2019 Conference on Artificial Life, 103-110, 2019
|The body is not a given: Joint agent policy learning and morphology evolution|
D Banarse, Y Bachrach, S Liu, C Fernando, N Heess, P Kohli, G Lever, ...
|Pick your battles: Interaction graphs as population-level objectives for strategic diversity|
M Garnelo, WM Czarnecki, S Liu, D Tirumala, J Oh, G Gidel, ...
arXiv preprint arXiv:2110.04041, 2021
|Launchpad: a programming model for distributed machine learning research|
F Yang, G Barth-Maron, P Stańczyk, M Hoffman, S Liu, M Kroiss, A Pope, ...
arXiv preprint arXiv:2106.04516, 2021
|NeuPL: Neural Population Learning|
S Liu, L Marris, D Hennes, J Merel, N Heess, T Graepel
International Conference on Learning Representations 2022, 2022
|Transferring task goals via hierarchical reinforcement learning|
S Xie, A Galashov, S Liu, S Hou, R Pascanu, N Heess, YW Teh
|Developing, evaluating and scaling learning agents in multi-agent environments|
I Gemp, T Anthony, Y Bachrach, A Bhoopchand, K Bullard, J Connor, ...
AI Communications, 1-14, 2022
|Simplex NeuPL: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games|
S Liu, M Lanctot, L Marris, N Heess
International Conference on Machine Learning 2022, 2022
|Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach|
B Shahriari, A Abdolmaleki, A Byravan, A Friesen, S Liu, JT Springenberg, ...
arXiv preprint arXiv:2204.10256, 2022