Yuda Song
Verified email at andrew.cmu.edu - Homepage
Title
Cited by
Year
Hybrid RL: Using both offline and online data can make RL efficient
Y Song, Y Zhou, A Sekhari, JA Bagnell, A Krishnamurthy, W Sun
arXiv preprint arXiv:2210.06718, 2022
Cited by: 85; Year: 2022
Efficient reinforcement learning in block MDPs: A model-free representation learning approach
X Zhang, Y Song, M Uehara, M Wang, A Agarwal, W Sun
International Conference on Machine Learning, 26517-26547, 2022
Cited by: 73; Year: 2022
Transform2Act: Learning a transform-and-control policy for efficient agent design
Y Yuan, Y Song, Z Luo, W Sun, K Kitani
arXiv preprint arXiv:2110.03659, 2021
Cited by: 40; Year: 2021
Provable benefits of representational transfer in reinforcement learning
A Agarwal, Y Song, W Sun, K Wang, M Wang, X Zhang
The Thirty Sixth Annual Conference on Learning Theory, 2114-2187, 2023
Cited by: 31; Year: 2023
PC-MLP: Model-based reinforcement learning with policy cover guided exploration
Y Song, W Sun
International Conference on Machine Learning, 9801-9811, 2021
Cited by: 25; Year: 2021
Representation learning for general-sum low-rank Markov games
C Ni, Y Song, X Zhang, C Jin, M Wang
arXiv preprint arXiv:2210.16976, 2022
Cited by: 18*; Year: 2022
The virtues of laziness in model-based RL: A unified objective and algorithms
A Vemula, Y Song, A Singh, D Bagnell, S Choudhury
International Conference on Machine Learning, 34978-35005, 2023
Cited by: 12; Year: 2023
Provably efficient model-based policy adaptation
Y Song, A Mavalankar, W Sun, S Gao
arXiv preprint arXiv:2006.08051, 2020
Cited by: 11; Year: 2020
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
Y Song, G Swamy, A Singh, JA Bagnell, W Sun
arXiv preprint arXiv:2406.01462, 2024
Cited by: 9*; Year: 2024
Offline data enhanced on-policy policy gradient with provable guarantees
Y Zhou, A Sekhari, Y Song, W Sun
arXiv preprint arXiv:2311.08384, 2023
Cited by: 7; Year: 2023
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Y Song, L Wu, DJ Foster, A Krishnamurthy
arXiv preprint arXiv:2405.19269, 2024
Cited by: 1; Year: 2024
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Y Song, H Zhang, C Eisenach, S Kakade, D Foster, U Ghai
arXiv preprint arXiv:2412.02674, 2024
2024
Hybrid Reinforcement Learning from Offline Observation Alone
Y Song, JA Bagnell, A Singh
arXiv preprint arXiv:2406.07253, 2024
2024
Online No-regret Model-Based Meta RL for Personalized Navigation
Y Song, Y Yuan, W Sun, K Kitani
arXiv preprint arXiv:2204.01925, 2022
2022
Articles 1–14