Shizhe Chen

Cited by

	All	Since 2020
Citations	3460	3050
H-index	27	27
i10-index	48	44

Citations per year

Year	Citations
2016	22
2017	55
2018	127
2019	190
2020	223
2021	345
2022	554
2023	775
2024	1011
2025	136

Public access

41 articles

10 articles

Available

Not available

Based on funding mandates

Co-authors

Qin Jin - School of Information, Renmin University of China (中国人民大学信息学院) - Verified email at ruc.edu.cn
Cordelia Schmid - Research director INRIA - Verified email at inria.fr
Ivan Laptev - Professor at MBZUAI, on leave from INRIA - Verified email at inria.fr
Alex Hauptmann - Carnegie Mellon University - Verified email at cs.cmu.edu
Ruihua Song - Renmin University of China - Verified email at ruc.edu.cn

Shizhe Chen

INRIA Paris

Verified email at inria.fr

Computer Vision, Vision-and-Language


Title	Cited by	Year
Fine-grained video-text retrieval with hierarchical graph reasoning. S Chen, Y Zhao, Q Jin, Q Wu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	380	2020
Say as you wish: Fine-grained control of image caption generation with abstract scene graphs. S Chen, Q Jin, P Wang, Q Wu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	275	2020
History aware multimodal transformer for vision-and-language navigation. S Chen, PL Guhur, C Schmid, I Laptev. Advances in neural information processing systems 34, 5834-5847, 2021	241	2021
Speech emotion recognition with acoustic and lexical features. Q Jin, C Li, S Chen, H Wu. 2015 IEEE international conference on acoustics, speech and signal …, 2015	226	2015
Multimodal multi-task learning for dimensional and continuous emotion recognition. S Chen, Q Jin, J Zhao, S Wang. Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, 19-26, 2017	174	2017
Think global, act local: Dual-scale graph transformer for vision-and-language navigation. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	159	2022
Airbert: In-domain pretraining for vision-and-language navigation. PL Guhur, M Tapaswi, S Chen, I Laptev, C Schmid. Proceedings of the IEEE/CVF international conference on computer vision …, 2021	155	2021
Multi-modal dimensional emotion recognition using recurrent neural networks. S Chen, Q Jin. Proceedings of the 5th International Workshop on Audio/Visual Emotion …, 2015	146	2015
WenLan: Bridging vision and language by large-scale multi-modal pre-training. Y Huo, M Zhang, G Liu, H Lu, Y Gao, G Yang, J Wen, H Zhang, B Xu, ... arXiv preprint arXiv:2103.06561, 2021	145	2021
Elaborative rehearsal for zero-shot action recognition. S Chen, D Huang. Proceedings of the IEEE/CVF international conference on computer vision …, 2021	126	2021
Describing videos using multi-modal fusion. Q Jin, J Chen, S Chen, Y Xiong, A Hauptmann. Proceedings of the 24th ACM international conference on Multimedia, 1087-1091, 2016	119	2016
Instruction-driven history-aware policies for robotic manipulations. PL Guhur, S Chen, RG Pinel, M Tapaswi, I Laptev, C Schmid. Conference on Robot Learning, 175-187, 2023	106	2023
Sketch, ground, and refine: Top-down dense video captioning. C Deng, S Chen, D Chen, Y He, Q Wu. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	82	2021
Multi-modal conditional attention fusion for dimensional emotion prediction. S Chen, Q Jin. Proceedings of the 24th ACM international conference on Multimedia, 571-575, 2016	78	2016
Video captioning with guidance of multimodal latent topics. S Chen, J Chen, Q Jin, A Hauptmann. Proceedings of the 25th ACM international conference on Multimedia, 1838-1846, 2017	74	2017
Language conditioned spatial relation reasoning for 3d object grounding. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. Advances in neural information processing systems 35, 20522-20535, 2022	66	2022
Few-shot action recognition with hierarchical matching and contrastive learning. S Zheng, S Chen, Q Jin. European conference on computer vision, 297-313, 2022	60	2022
Multi-modal multi-cultural dimensional continues emotion recognition in dyadic interactions. J Zhao, R Li, S Chen, Q Jin. Proceedings of the 2018 on audio/visual emotion challenge and workshop, 65-72, 2018	56	2018
Learning from unlabeled 3d environments for vision-and-language navigation. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. European Conference on Computer Vision, 638-655, 2022	46	2022
Unpaired cross-lingual image caption generation with self-supervised rewards. Y Song, S Chen, Y Zhao, Q Jin. Proceedings of the 27th ACM international conference on multimedia, 784-792, 2019	46	2019

Articles 1–20