Ian Osband

Cited by

	All	Since 2019
Citations	8078	7235
h-index	26	25
i10-index	31	30

Citations per year: 2015: 27, 2016: 74, 2017: 223, 2018: 474, 2019: 755, 2020: 1148, 2021: 1366, 2022: 1485, 2023: 1521, 2024: 958

Co-authors

Benjamin Van Roy	Stanford University	Verified email at stanford.edu
Zheng Wen	Google DeepMind	Verified email at google.com
Vikranth Dwaracherla	DeepMind	Verified email at google.com
Xiuyuan Lu	Google DeepMind	Verified email at google.com
Daniel Russo	Columbia University	Verified email at gsb.columbia.edu
Alexander Pritzel	Deepmind	Verified email at google.com
Morteza Ibrahimi	Stanford University	Verified email at stanford.edu
Brendan O'Donoghue	Stanford University, Google DeepMind	Verified email at alumni.stanford.edu
Mohammad Gheshlaghi Azar	Cohere	Verified email at google.com
Todd Hester	Waymo	Verified email at waymo.com
Bilal Piot	Google Deepmind	Verified email at google.com
Olivier Pietquin	Cohere | ex Google DeepMind (On leave - Professor at University of Lille)	Verified email at univ-lille.fr
Tom Schaul	Senior Staff Scientist, DeepMind	Verified email at nyu.edu
Rémi Munos	Google DeepMind	Verified email at inria.fr
Marc Lanctot	Research Scientist, Google DeepMind	Verified email at google.com

Ian Osband

OpenAI

Verified email at openai.com

Reinforcement Learning


Title	Cited by	Year
Deep exploration via bootstrapped DQN. I Osband, C Blundell, A Pritzel, B Van Roy. Advances in neural information processing systems 29, 2016	1468	2016
Deep q-learning from demonstrations. T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ... Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	1227	2018
A tutorial on thompson sampling. DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen. Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018	1133	2018
Minimax regret bounds for reinforcement learning. MG Azar, I Osband, R Munos. International conference on machine learning, 263-272, 2017	825	2017
Randomized prior functions for deep reinforcement learning. I Osband, J Aslanides, A Cassirer. Advances in Neural Information Processing Systems 31, 2018	416	2018
Deep Exploration via Randomized Value Functions. I Osband. https://searchworks.stanford.edu/view/11891201, 2016	332	2016
Generalization and exploration via randomized value functions. I Osband, B Van Roy, Z Wen. International Conference on Machine Learning, 2377-2386, 2016	331	2016
Why is posterior sampling better than optimism for reinforcement learning? I Osband, B Van Roy. International conference on machine learning, 2701-2710, 2017	269	2017
The uncertainty bellman equation and exploration. B O’Donoghue, I Osband, R Munos, V Mnih. International conference on machine learning, 3836-3845, 2018	220	2018
Model-based reinforcement learning and the eluder dimension. I Osband, B Van Roy. Advances in Neural Information Processing Systems 27, 2014	195	2014
Behaviour suite for reinforcement learning. I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ... arXiv preprint arXiv:1908.03568, 2019	180	2019
Learning from demonstrations for real world reinforcement learning. T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ... arXiv preprint arXiv:1704.03732, 2017	179	2017
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout. I Osband. http://bayesiandeeplearning.org/papers/BDL_4.pdf	166*
Deep learning for time series modeling. E Busseti, I Osband, S Wong. Technical report, Stanford University, 1-5, 2012	140	2012
Near-optimal reinforcement learning in factored mdps. I Osband, B Van Roy. Advances in Neural Information Processing Systems 27, 2014	121	2014
On lower bounds for regret in reinforcement learning. I Osband, B Van Roy. arXiv preprint arXiv:1608.02732, 2016	113	2016
Bootstrapped thompson sampling and deep exploration. I Osband, B Van Roy. arXiv preprint arXiv:1507.00300, 2015	101	2015
(More) efficient reinforcement learning via posterior sampling. I Osband, D Russo, B Van Roy. Advances in Neural Information Processing Systems 26, 2013	101	2013
Epistemic neural networks. I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ... Advances in Neural Information Processing Systems 36, 2024	90	2024
Meta-learning of sequential strategies. PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ... arXiv preprint arXiv:1905.03030, 2019	90	2019
