Show and tell: A neural image caption generator O Vinyals, A Toshev, S Bengio, D Erhan Proceedings of the IEEE conference on computer vision and pattern …, 2015 | 7923 | 2015 |
DeepPose: Human pose estimation via deep neural networks A Toshev, C Szegedy Proceedings of the IEEE conference on computer vision and pattern …, 2014 | 4011 | 2014 |
Deep neural networks for object detection C Szegedy, A Toshev, D Erhan Advances in neural information processing systems 26, 2013 | 2094 | 2013 |
Scalable object detection using deep neural networks D Erhan, C Szegedy, A Toshev, D Anguelov Proceedings of the IEEE conference on computer vision and pattern …, 2014 | 1632 | 2014 |
Generation and comprehension of unambiguous object descriptions J Mao, J Huang, A Toshev, O Camburu, AL Yuille, K Murphy Proceedings of the IEEE conference on computer vision and pattern …, 2016 | 1424 | 2016 |
Do as I can, not as I say: Grounding language in robotic affordances M Ahn, A Brohan, N Brown, Y Chebotar, O Cortes, B David, C Finn, C Fu, ... arXiv preprint arXiv:2204.01691, 2022 | 1379 | 2022 |
Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge O Vinyals, A Toshev, S Bengio, D Erhan IEEE transactions on pattern analysis and machine intelligence 39 (4), 652-663, 2016 | 1140 | 2016 |
Towards accurate multi-person pose estimation in the wild G Papandreou, T Zhu, N Kanazawa, A Toshev, J Tompson, C Bregler, ... Proceedings of the IEEE conference on computer vision and pattern …, 2017 | 1103 | 2017 |
No fuss distance metric learning using proxies Y Movshovitz-Attias, A Toshev, TK Leung, S Ioffe, S Singh Proceedings of the IEEE international conference on computer vision, 360-368, 2017 | 792 | 2017 |
Deep convolutional ranking for multilabel image annotation Y Gong, Y Jia, T Leung, A Toshev, S Ioffe arXiv preprint arXiv:1312.4894, 2013 | 571 | 2013 |
The unreasonable effectiveness of noisy data for fine-grained recognition J Krause, B Sapp, A Howard, H Zhou, A Toshev, T Duerig, J Philbin, ... Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016 | 440 | 2016 |
Do as I can, not as I say: Grounding language in robotic affordances A Brohan, Y Chebotar, C Finn, K Hausman, A Herzog, D Ho, J Ibarz, ... Conference on robot learning, 287-318, 2023 | 439 | 2023 |
Cascaded models for articulated pose estimation B Sapp, A Toshev, B Taskar Computer Vision–ECCV 2010: 11th European Conference on Computer Vision …, 2010 | 309 | 2010 |
Visual representations for semantic target driven navigation A Mousavian, A Toshev, M Fišer, J Košecká, A Wahid, J Davidson 2019 International Conference on Robotics and Automation (ICRA), 8846-8852, 2019 | 245 | 2019 |
Chained predictions using convolutional neural networks G Gkioxari, A Toshev, N Jaitly Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016 | 232 | 2016 |
ObjectNav revisited: On evaluation of embodied agents navigating to objects D Batra, A Gokaslan, A Kembhavi, O Maksymets, R Mottaghi, M Savva, ... arXiv preprint arXiv:2006.13171, 2020 | 228 | 2020 |
Interactive Gibson benchmark: A benchmark for interactive navigation in cluttered environments F Xia, WB Shen, C Li, P Kasimbeg, ME Tchapmi, A Toshev, ... IEEE Robotics and Automation Letters 5 (2), 713-720, 2020 | 225 | 2020 |
Scene memory transformer for embodied agents in long-horizon tasks K Fang, A Toshev, L Fei-Fei, S Savarese Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019 | 216 | 2019 |
Object detection using deep neural networks C Szegedy, D Erhan, AT Toshev US Patent 9,275,308, 2016 | 161 | 2016 |
MM1: methods, analysis and insights from multimodal LLM pre-training B McKinzie, Z Gan, JP Fauconnier, S Dodge, B Zhang, P Dufter, D Shah, ... European Conference on Computer Vision, 304-323, 2025 | 148 | 2025 |