Longteng Guo

Cited by

	All	Since 2020
Citations	1331	1316
h-index	13	13
i10-index	16	16

Citations per year: 2018: 4, 2019: 9, 2020: 48, 2021: 157, 2022: 261, 2023: 312, 2024: 469, 2025: 69

Public access

15 articles available

2 articles not available

Based on funding mandates

Co-authors

Jing Liu 刘静, Professor at the Institute of Automation of the Chinese Academy of Sciences (CASIA). Verified email at nlpr.ia.ac.cn
Xinxin Zhu 朱欣鑫, Institute of Automation of the Chinese Academy of Sciences (CASIA). Verified email at nlpr.ia.ac.cn
Xingjian He, Institute of Automation of the Chinese Academy of Sciences (CASIA). Verified email at nlpr.ia.ac.cn
Jinhui Tang (唐金辉), Nanjing University of Science and Technology. Verified email at acm.org
Shuai Shao, Tencent. Verified email at tencent.com
Sihan Chen, Institute of Automation, Chinese Academy of Sciences. Verified email at nlpr.ia.ac.cn
Zhiwei Fang, Business Growth BU, JD.COM. Verified email at jd.com
Jun Fu, Beijing, China
Hanqing Lu, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences


Longteng Guo

Other names: 郭龙腾

Associate Professor, Institute of Automation of the Chinese Academy of Sciences (CASIA)

Verified email at nlpr.ia.ac.cn

Multimodal Learning


Title	Cited by	Year
Normalized and geometry-aware self-attention network for image captioning. L Guo, J Liu, X Zhu, P Yao, S Lu, H Lu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	265	2020
CPTR: Full transformer network for image captioning. W Liu, S Chen, L Guo, X Zhu, J Liu. arXiv preprint arXiv:2101.10804, 2021	226	2021
MSCap: Multi-style image captioning with unpaired stylized text. L Guo, J Liu, P Yao, J Li, H Lu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019	136	2019
Aligning linguistic words and visual semantic units for image captioning. L Guo, J Liu, J Tang, J Li, W Luo, H Lu. Proceedings of the 27th ACM international conference on multimedia, 765-773, 2019	118	2019
VALOR: Vision-audio-language omni-perception pretraining model and dataset. S Chen, X He, L Guo, X Zhu, W Wang, J Tang, J Liu. arXiv preprint arXiv:2304.08345, 2023	98	2023
VL-Mamba: Exploring state space models for multimodal learning. Y Qiao, Z Yu, L Guo, S Chen, Z Zhao, M Sun, Q Wu, J Liu. arXiv preprint arXiv:2403.13600, 2024	64	2024
Non-autoregressive image captioning with counterfactuals-critical multi-agent learning. L Guo, J Liu, X Zhu, X He, J Jiang, H Lu. arXiv preprint arXiv:2005.04690, 2020	61	2020
ChatBridge: Bridging modalities with large language model as a language catalyst. Z Zhao, L Guo, T Yue, S Chen, S Shao, X Zhu, Z Yuan, J Liu. arXiv preprint arXiv:2305.16103, 2023	54	2023
Show, tell, and polish: Ruminant decoding for image captioning. L Guo, J Liu, S Lu, H Lu. IEEE Transactions on Multimedia 22 (8), 2149-2162, 2019	50	2019
OPT: Omni-perception pre-trainer for cross-modal understanding and generation. J Liu, X Zhu, F Liu, L Guo, Z Zhao, M Sun, W Wang, H Lu, S Zhou, J Zhang, ... arXiv preprint arXiv:2107.00249, 2021	47	2021
Boosted transformer for image captioning. J Li, P Yao, L Guo, W Zhang. Applied Sciences 9 (16), 3260, 2019	43	2019
Sketch-based image retrieval using generative adversarial networks. L Guo, J Liu, Y Wang, Z Luo, W Wen, H Lu. Proceedings of the 25th ACM international conference on Multimedia, 1267-1268, 2017	38	2017
AutoCaption: Image captioning with neural architecture search. X Zhu, W Wang, L Guo, J Liu. arXiv preprint arXiv:2012.09742, 2020	19	2020
MAMO: Fine-grained vision-language representations learning with masked multimodal modeling. Z Zhao, L Guo, X He, S Shao, Z Yuan, J Liu. Proceedings of the 46th International ACM SIGIR Conference on Research and …, 2023	12	2023
CM-MaskSD: Cross-modality masked self-distillation for referring image segmentation. W Wang, X He, Y Zhang, L Guo, J Shen, J Li, J Liu. IEEE Transactions on Multimedia 26, 6906-6916, 2024	10	2024
SC-Tune: Unleashing self-consistent referential comprehension in large vision language models. T Yue, J Cheng, L Guo, X Dai, Z Zhao, X He, G Xiong, Y Lv, J Liu. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	10	2024
Needle in a video haystack: A scalable synthetic framework for benchmarking video MLLMs. Z Zhao, H Lu, Y Huo, Y Du, T Yue, L Guo, B Wang, W Chen, J Liu. arXiv e-prints, arXiv:2406.09367, 2024	9	2024
Fast sequence generation with multi-agent reinforcement learning. L Guo, J Liu, X Zhu, H Lu. arXiv preprint arXiv:2101.09698, 2021	9	2021
MM21 pre-training for video understanding challenge: Video captioning with pretraining techniques. S Chen, X Zhu, D Hao, W Liu, J Liu, Z Zhao, L Guo, J Liu. Proceedings of the 29th ACM International Conference on Multimedia, 4853-4857, 2021	8	2021
VALOR: Vision-audio-language omni-perception pretraining model and dataset. J Liu, S Chen, X He, L Guo, X Zhu, W Wang, J Tang. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024	7	2024
