Longteng Guo
Associate Professor, Institute of Automation of the Chinese Academy of Sciences (CASIA)
Verified email at nlpr.ia.ac.cn
Title
Cited by
Year
Normalized and geometry-aware self-attention network for image captioning
L Guo, J Liu, X Zhu, P Yao, S Lu, H Lu
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by 265; year 2020
Cptr: Full transformer network for image captioning
W Liu, S Chen, L Guo, X Zhu, J Liu
arXiv preprint arXiv:2101.10804, 2021
Cited by 226; year 2021
Mscap: Multi-style image captioning with unpaired stylized text
L Guo, J Liu, P Yao, J Li, H Lu
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019
Cited by 136; year 2019
Aligning linguistic words and visual semantic units for image captioning
L Guo, J Liu, J Tang, J Li, W Luo, H Lu
Proceedings of the 27th ACM international conference on multimedia, 765-773, 2019
Cited by 118; year 2019
Valor: Vision-audio-language omni-perception pretraining model and dataset
S Chen, X He, L Guo, X Zhu, W Wang, J Tang, J Liu
arXiv preprint arXiv:2304.08345, 2023
Cited by 98; year 2023
Vl-mamba: Exploring state space models for multimodal learning
Y Qiao, Z Yu, L Guo, S Chen, Z Zhao, M Sun, Q Wu, J Liu
arXiv preprint arXiv:2403.13600, 2024
Cited by 64; year 2024
Non-autoregressive image captioning with counterfactuals-critical multi-agent learning
L Guo, J Liu, X Zhu, X He, J Jiang, H Lu
arXiv preprint arXiv:2005.04690, 2020
Cited by 61; year 2020
Chatbridge: Bridging modalities with large language model as a language catalyst
Z Zhao, L Guo, T Yue, S Chen, S Shao, X Zhu, Z Yuan, J Liu
arXiv preprint arXiv:2305.16103, 2023
Cited by 54; year 2023
Show, tell, and polish: Ruminant decoding for image captioning
L Guo, J Liu, S Lu, H Lu
IEEE Transactions on Multimedia 22 (8), 2149-2162, 2019
Cited by 50; year 2019
Opt: Omni-perception pre-trainer for cross-modal understanding and generation
J Liu, X Zhu, F Liu, L Guo, Z Zhao, M Sun, W Wang, H Lu, S Zhou, J Zhang, ...
arXiv preprint arXiv:2107.00249, 2021
Cited by 47; year 2021
Boosted transformer for image captioning
J Li, P Yao, L Guo, W Zhang
Applied Sciences 9 (16), 3260, 2019
Cited by 43; year 2019
Sketch-based image retrieval using generative adversarial networks
L Guo, J Liu, Y Wang, Z Luo, W Wen, H Lu
Proceedings of the 25th ACM international conference on Multimedia, 1267-1268, 2017
Cited by 38; year 2017
AutoCaption: Image captioning with neural architecture search
X Zhu, W Wang, L Guo, J Liu
arXiv preprint arXiv:2012.09742, 2020
Cited by 19; year 2020
Mamo: Fine-grained vision-language representations learning with masked multimodal modeling
Z Zhao, L Guo, X He, S Shao, Z Yuan, J Liu
Proceedings of the 46th International ACM SIGIR Conference on Research and …, 2023
Cited by 12; year 2023
CM-MaskSD: Cross-modality masked self-distillation for referring image segmentation
W Wang, X He, Y Zhang, L Guo, J Shen, J Li, J Liu
IEEE Transactions on Multimedia 26, 6906-6916, 2024
Cited by 10; year 2024
Sc-tune: Unleashing self-consistent referential comprehension in large vision language models
T Yue, J Cheng, L Guo, X Dai, Z Zhao, X He, G Xiong, Y Lv, J Liu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 10; year 2024
Needle in a video haystack: A scalable synthetic framework for benchmarking video mllms
Z Zhao, H Lu, Y Huo, Y Du, T Yue, L Guo, B Wang, W Chen, J Liu
arXiv preprint arXiv:2406.09367, 2024
Cited by 9; year 2024
Fast sequence generation with multi-agent reinforcement learning
L Guo, J Liu, X Zhu, H Lu
arXiv preprint arXiv:2101.09698, 2021
Cited by 9; year 2021
Mm21 pre-training for video understanding challenge: Video captioning with pretraining techniques
S Chen, X Zhu, D Hao, W Liu, J Liu, Z Zhao, L Guo, J Liu
Proceedings of the 29th ACM International Conference on Multimedia, 4853-4857, 2021
Cited by 8; year 2021
Valor: Vision-audio-language omni-perception pretraining model and dataset
J Liu, S Chen, X He, L Guo, X Zhu, W Wang, J Tang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
Cited by 7; year 2024