Peihao Chen

Cited by

	All	Since 2020
Citations	1815	1801
h-index	17	17
i10-index	18	18

Citations per year

Year	Citations
2019	10
2020	85
2021	226
2022	309
2023	390
2024	731
2025	58

Public access

14 articles available
0 articles not available

Based on funding mandates

Co-authors

Chuang Gan - UMass Amherst | MIT-IBM Watson AI Lab - Verified email at csail.mit.edu
Runhao Zeng (曾润浩) - Tenured Associate Professor, Shenzhen MSU-BIT University - Verified email at smbu.edu.cn
Wenbing Huang - Associate Professor, Renmin University of China - Verified email at ruc.edu.cn
Antonio Torralba - Professor of Computer Science, MIT - Verified email at csail.mit.edu
David Cox - VP, AI Models; IBM Director, MIT-IBM Watson AI Lab, IBM Research - Verified email at ibm.com
Hang Zhao - Assistant Professor, Tsinghua University - Verified email at csail.mit.edu
Joshua B. Tenenbaum - MIT - Verified email at mit.edu
Qingyao Wu - School of Software Engineering, South China University of Technology - Verified email at scut.edu.cn
Guangyao Shen - Tsinghua University - Verified email at mails.tsinghua.edu.cn

Peihao Chen

Researcher at Robotics X Lab, Tencent

Verified email at tencent.com

Embodied AI, Multi-Modal, Video Understanding


Title	Cited by	Year
Dense regression network for video grounding. R Zeng, H Xu, W Huang, P Chen, M Tan, C Gan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020	320	2020
3d-llm: Injecting the 3d world into large language models. Y Hong, H Zhen, P Chen, S Zheng, Y Du, Z Chen, C Gan. Advances in Neural Information Processing Systems 36, 20482-20494, 2023	249	2023
Location-aware graph convolutional networks for video question answering. D Huang, P Chen, R Zeng, Q Du, M Tan, C Gan. Proceedings of the AAAI Conference on Artificial Intelligence 34 (07), 11021 …, 2020	211	2020
Self-supervised moving vehicle tracking with stereo sound. C Gan, H Zhao, P Chen, D Cox, A Torralba. Proceedings of the IEEE/CVF international conference on computer vision …, 2019	173	2019
Foley music: Learning to generate music from videos. C Gan, D Huang, P Chen, JB Tenenbaum, A Torralba. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020	154	2020
RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning. P Chen, D Huang, D He, X Long, R Zeng, S Wen, M Tan, C Gan. AAAI Conference on Artificial Intelligence, 2021, 2020	130	2020
Generating visually aligned sound from videos. P Chen, Y Zhang, M Tan, H Xiao, D Huang, C Gan. IEEE Transactions on Image Processing 29, 8292-8302, 2020	100	2020
Breaking winner-takes-all: Iterative-winners-out networks for weakly supervised temporal action localization. R Zeng, C Gan, P Chen, W Huang, Q Wu, M Tan. IEEE Transactions on Image Processing 28 (12), 5797-5808, 2019	95	2019
Relation attention for temporal action localization. P Chen, C Gan, G Shen, W Huang, R Zeng, M Tan. IEEE Transactions on Multimedia 22 (10), 2723-2733, 2019	84	2019
Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation. P Chen, D Ji, K Lin, R Zeng, TH Li, M Tan, C Gan. NeurIPS 2022, 2022	56	2022
3d-vla: A 3d vision-language-action generative world model. H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du, Y Hong, C Gan. arXiv preprint arXiv:2403.09631, 2024	48	2024
Vesper: A compact and effective pretrained model for speech emotion recognition. W Chen, X Xing, P Chen, X Xu. IEEE Transactions on Affective Computing, 2024	37	2024
Masked motion encoding for self-supervised video representation learning. X Sun, P Chen, L Chen, C Li, TH Li, M Tan, C Gan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	35	2023
Learning Active Camera for Multi-Object Navigation. P Chen, D Ji, K Lin, W Hu, W Huang, TH Li, M Tan, C Gan. NeurIPS 2022, 2022	26	2022
A²Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models. P Chen, X Sun, H Zhi, R Zeng, TH Li, G Liu, M Tan, C Gan. arXiv preprint arXiv:2308.07997, 2023	22	2023
Learning vision-and-language navigation from youtube videos. K Lin, P Chen, D Huang, TH Li, M Tan, C Gan. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	22	2023
Multiply: A multisensory object-centric embodied large language model in 3d world. Y Hong, Z Zheng, P Chen, Y Wang, J Li, C Gan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	19	2024
Covlm: Composing visual entities and relationships in large language models via communicative decoding. J Li, D Chen, Y Hong, Z Chen, P Chen, Y Shen, C Gan. arXiv preprint arXiv:2311.03354, 2023	11	2023
FGPrompt: fine-grained goal prompting for image-goal navigation. X Sun, P Chen, J Fan, J Chen, T Li, M Tan. Advances in Neural Information Processing Systems 36, 2024	7	2024
RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation. Z Yang, J Liu, P Chen, A Cherian, TK Marks, J Le Roux, C Gan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	5	2024
