Volgen
Bilal Piot
Bilal Piot
DeepMind
Geverifieerd e-mailadres voor google.com
Titel
Geciteerd door
Geciteerd door
Jaar
Bootstrap your own latent: A new approach to self-supervised learning
JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ...
arXiv preprint arXiv:2006.07733, 2020
34892020
Rainbow: Combining improvements in deep reinforcement learning
M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ...
Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018
20342018
Deep q-learning from demonstrations
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...
Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018
9552018
Noisy networks for exploration
M Fortunato, MG Azar, B Piot, J Menick, I Osband, A Graves, V Mnih, ...
arXiv preprint arXiv:1706.10295, 2017
8652017
Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards
M Vecerik, T Hester, J Scholz, F Wang, O Pietquin, B Piot, N Heess, ...
arXiv preprint arXiv:1707.08817, 2017
6092017
Agent57: Outperforming the atari human benchmark
AP Badia, B Piot, S Kapturowski, P Sprechmann, A Vitvitskyi, ZD Guo, ...
International conference on machine learning, 507-517, 2020
4502020
k. kavukcuoglu, R
JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...
Munos, and M. Valko,“Bootstrap your own latent-a new approach to self …, 2020
308*2020
Never give up: Learning directed exploration strategies
AP Badia, P Sprechmann, A Vitvitskyi, D Guo, B Piot, S Kapturowski, ...
arXiv preprint arXiv:2002.06038, 2020
2172020
Acme: A research framework for distributed reinforcement learning
MW Hoffman, B Shahriari, J Aslanides, G Barth-Maron, N Momchev, ...
arXiv preprint arXiv:2006.00979, 2020
1742020
Learning from demonstrations for real world reinforcement learning
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ...
arXiv preprint arXiv:1704.03732, 2017
1622017
Inverse reinforcement learning through structured classification
E Klein, M Geist, B Piot, O Pietquin
Advances in neural information processing systems 25, 2012
1092012
Bootstrap latent-predictive representations for multitask reinforcement learning
ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar
International Conference on Machine Learning, 3875-3886, 2020
1062020
Approximate dynamic programming for two-player zero-sum Markov games
J Perolat, B Scherrer, B Piot, O Pietquin
International Conference on Machine Learning, 1321-1329, 2015
972015
Bridging the gap between imitation learning and inverse reinforcement learning
B Piot, M Geist, O Pietquin
IEEE transactions on neural networks and learning systems 28 (8), 1814-1826, 2016
922016
Observe and look further: Achieving consistent performance on atari
T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...
arXiv preprint arXiv:1805.11593, 2018
872018
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos
arXiv preprint arXiv:1704.04651, 2017
852017
Boosted bellman residual minimization handling expert demonstrations
B Piot, M Geist, O Pietquin
Machine Learning and Knowledge Discovery in Databases: European Conference …, 2014
822014
End-to-end optimization of goal-driven and visually grounded dialogue systems
F Strub, H De Vries, J Mary, B Piot, A Courville, O Pietquin
arXiv preprint arXiv:1703.05423, 2017
782017
Laugh-aware virtual agent and its impact on user amusement
R Niewiadomski, J Hofmann, J Urbain, T Platt, J Wagner, P Bilal, T Ito, ...
University of Zurich, 2013
722013
Neural predictive belief representations
ZD Guo, MG Azar, B Piot, BA Pires, R Munos
arXiv preprint arXiv:1811.06407, 2018
672018
Het systeem kan de bewerking nu niet uitvoeren. Probeer het later opnieuw.
Artikelen 1–20