Stochastic multi-armed bandits with unrestricted delay distributions T Lancewicki, S Segal, T Koren, Y Mansour International Conference on Machine Learning, 5969-5978, 2021 | 36 | 2021 |
Learning adversarial Markov decision processes with delayed feedback T Lancewicki, A Rosenberg, Y Mansour Proceedings of the AAAI Conference on Artificial Intelligence 36 (7), 7281-7289, 2022 | 20 | 2022 |
Regret minimization and convergence to equilibria in general-sum Markov games L Erez, T Lancewicki, U Sherman, T Koren, Y Mansour International Conference on Machine Learning, 9343-9373, 2023 | 18 | 2023 |
Near-optimal regret for adversarial MDP with delayed bandit feedback T Jin, T Lancewicki, H Luo, Y Mansour, A Rosenberg Advances in Neural Information Processing Systems 35, 33469-33481, 2022 | 17 | 2022 |
Delay-adapted policy optimization and improved regret for adversarial MDP with delayed bandit feedback T Lancewicki, A Rosenberg, D Sotnikov International Conference on Machine Learning, 18482-18534, 2023 | 2 | 2023 |
Cooperative online learning in stochastic and adversarial MDPs T Lancewicki, A Rosenberg, Y Mansour International Conference on Machine Learning, 11918-11968, 2022 | 2 | 2022 |
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs D van der Hoeven, L Zierahn, T Lancewicki, A Rosenberg, ... The Thirty Sixth Annual Conference on Learning Theory, 1285-1321, 2023 | | 2023 |
Towards Natural Language-Driven Industrial Assembly Using Foundation Models O Joglekar, S Kozlovsky, T Lancewicki, V Tchuiev, Z Feldman, ... ICLR 2024 Workshop on Large Language Model (LLM) Agents | | |