Andrei Lupu
University of Oxford & FAIR, Meta AI
Verified email at mail.mcgill.ca
Title
Cited by
Year
The Llama 3 herd of models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
Cited by 1318, 2024
Trajectory diversity for zero-shot coordination
A Lupu, B Cui, H Hu, J Foerster
International conference on machine learning, 7204-7213, 2021
Cited by 108, 2021
Gifting in multi-agent reinforcement learning
A Lupu, D Precup
Proceedings of the 19th International Conference on autonomous agents and …, 2020
Cited by 57, 2020
Option-critic in cooperative multi-agent systems
J Chakravorty, N Ward, J Roy, M Chevalier-Boisvert, S Basu, A Lupu, ...
arXiv preprint arXiv:1911.12825, 2019
Cited by 37, 2019
JaxMARL: Multi-agent RL environments and algorithms in JAX
A Rutherford, B Ellis, M Gallici, J Cook, A Lupu, G Ingvarsson, T Willi, ...
The Thirty-eighth Conference on Neural Information Processing Systems …, 2024
Cited by 36*, 2024
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
M Samvelyan*, SC Raparthy*, A Lupu*, E Hambro, AH Markosyan, ...
arXiv preprint arXiv:2402.16822, 2024
Cited by 29, 2024
Adversarial Diversity in Hanabi
B Cui*, A Lupu*, S Sokota, H Hu, DJ Wu, JN Foerster
The Eleventh International Conference on Learning Representations, 2022
Cited by 17, 2022
Grounding aleatoric uncertainty for unsupervised environment design
M Jiang, M Dennis, J Parker-Holder, A Lupu, H Küttler, E Grefenstette, ...
Advances in Neural Information Processing Systems 35, 32868-32881, 2022
Cited by 16, 2022
Leveraging Observations in Bandits: Between Risks and Benefits
A Lupu, A Durand, D Precup
Cited by 6, 2019
Behaviour distillation
A Lupu, C Lu, J Liesen, RT Lange, J Foerster
arXiv preprint arXiv:2406.15042, 2024
Cited by 3, 2024
Imitation upper confidence bound for bandits on a graph
A Lupu, D Precup
Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018
Cited by 3, 2018
Discovering Minimal Reinforcement Learning Environments
J Liesen, C Lu, A Lupu, JN Foerster, H Sprekeler, RT Lange
arXiv preprint arXiv:2406.12589, 2024
Cited by 1, 2024
Self-explaining deviations for coordination
H Hu, S Sokota, D Wu, A Bakhtin, A Lupu, B Cui, J Foerster
Advances in Neural Information Processing Systems 35, 38400-38410, 2022
Cited by 1, 2022
Off-Team Learning
B Cui, H Hu, A Lupu, S Sokota, JN Foerster
Advances in Neural Information Processing Systems, 2022
Cited by 1, 2022
Gifting in Multi-Agent Reinforcement Learning (Student Abstract)
A Lupu, D Precup
Proceedings of the AAAI Conference on Artificial Intelligence 34 (10), 13871 …, 2020
Cited by 1, 2020
Les réseaux de Bragg dans des nanofils de silicium... et dans la salle de cours
A Lupu, R Adams, R Ashrafi, LR Chen
Vanier College, 2016
Cited by 1, 2016
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants
L Alberts, B Ellis, A Lupu, J Foerster
arXiv preprint arXiv:2410.21159, 2024
2024
Leveraging Observational Learning for Exploration in Bandits
A Lupu, A Durand, D Precup
Proceedings of the 17th International Conference on Autonomous Agents and …, 2018
2018
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
B Ellis, MT Jackson, A Lupu, AD Goldie, M Fellows, S Whiteson, ...
The Thirty-eighth Annual Conference on Neural Information Processing Systems
Leveraging Observational Learning for Exploration in Bandits
A Durand, A Lupu, D Precup