Contextual combinatorial cascading bandits S Li, B Wang, S Zhang, W Chen International conference on machine learning, 1245-1253, 2016 | 147 | 2016 |
Private Q-Learning with Functional Noise in Continuous Spaces B Wang, N Hegde The Multi-disciplinary Conference on Reinforcement Learning and Decision …, 2019 | 71* | 2019 |
Shapley counterfactual credits for multi-agent reinforcement learning J Li, K Kuang, B Wang, F Liu, L Chen, F Wu, J Xiao Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021 | 58 | 2021 |
PAID: Prioritizing app issues for developers by tracking user reviews over versions C Gao, B Wang, P He, J Zhu, Y Zhou, MR Lyu 2015 IEEE 26th international symposium on software reliability engineering …, 2015 | 51 | 2015 |
Metatrace Actor-Critic: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control K Young, B Wang, ME Taylor International Joint Conference on Artificial Intelligence (IJCAI) 2019, 2018 | 31* | 2018 |
Deconfounded value decomposition for multi-agent reinforcement learning J Li, K Kuang, B Wang, F Liu, L Chen, C Fan, F Wu, J Xiao International Conference on Machine Learning, 12843-12856, 2022 | 20 | 2022 |
Multilinear extension of k-submodular functions B Wang, H Zhou arXiv e-prints, arXiv: 2107.07103, 2021 | 16 | 2021 |
Beyond winning and losing: modeling human motivations and behaviors using inverse reinforcement learning B Wang, T Sun, SX Zheng Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2019, 2018 | 16* | 2018 |
Semantically aligned task decomposition in multi-agent reinforcement learning W Li, D Qiao, B Wang, X Wang, B Jin, H Zha arXiv preprint arXiv:2305.10865, 2023 | 14 | 2023 |
Learning fair representations via distance correlation minimization D Guo, C Wang, B Wang, H Zha IEEE Transactions on Neural Networks and Learning Systems 35 (2), 2139-2152, 2022 | 12 | 2022 |
Learning from good trajectories in offline multi-agent reinforcement learning Q Tian, K Kuang, F Liu, B Wang Proceedings of the AAAI Conference on Artificial Intelligence 37 (10), 11672 …, 2023 | 10 | 2023 |
Improved regret bounds for linear adversarial MDPs via linear optimization F Kong, X Zhang, B Wang, S Li arXiv preprint arXiv:2302.06834, 2023 | 10 | 2023 |
Online policy optimization for robust MDP J Dong, J Li, B Wang, J Zhang arXiv preprint arXiv:2209.13841, 2022 | 10 | 2022 |
Combinatorial bandits under strategic manipulations J Dong, K Li, S Li, B Wang Proceedings of the Fifteenth ACM International Conference on Web Search and …, 2022 | 10 | 2022 |
Learning adversarial linear mixture Markov decision processes with bandit feedback and unknown transition C Zhao, R Yang, B Wang, S Li The Eleventh International Conference on Learning Representations, 2023 | 8 | 2023 |
Information design in multi-agent reinforcement learning Y Lin, W Li, H Zha, B Wang Advances in Neural Information Processing Systems 36, 25584-25597, 2023 | 5 | 2023 |
Algorithms and theory for supervised gradual domain adaptation J Dong, S Zhou, B Wang, H Zhao arXiv preprint arXiv:2204.11644, 2022 | 5 | 2022 |
Policy optimization with second-order advantage information J Li, B Wang International Joint Conference on Artificial Intelligence (IJCAI) 2018 …, 2018 | 4 | 2018 |
Online Influence Maximization under Decreasing Cascade Model F Kong, J Xie, B Wang, T Yao, S Li arXiv preprint arXiv:2305.15428, 2023 | 3 | 2023 |
Learning adversarial low-rank Markov decision processes with unknown transition and full-information feedback C Zhao, R Yang, B Wang, X Zhang, S Li Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |