Xiaoqian Shen

Cited by

	All	Since 2020
Citations	3374	3374
h-index	10	10
i10-index	10	10

Citations per year: 2022: 9, 2023: 528, 2024: 2599, 2025: 232

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Mohamed Elhoseiny, Ph.D. - Associate Professor, KAUST (hiring postdocs & grad students) - Verified email at kaust.edu.sa
Deyao Zhu - Research Scientist, ByteDance - Verified email at bytedance.com
Jun Chen - KAUST - Verified email at kaust.edu.sa
Xiang Li - Lecturer, University of Reading (hiring PhD/Interns) - Verified email at reading.ac.uk
Yunyang Xiong - University of Wisconsin-Madison - Verified email at wisc.edu
Vikas Chandra - Reality Labs, Meta - Verified email at meta.com
Li Erran Li - IEEE Fellow and ACM Fellow, AWS AI, Amazon - Verified email at cs.columbia.edu
Hu Xu - Meta AI (FAIR Labs) - Verified email at meta.com
Zhuang Liu - Research Scientist, FAIR, Meta - Verified email at berkeley.edu
Ivan Skorokhodov - Snap Inc. - Verified email at snap.com
Gamaleldin Elsayed - Research Scientist, Google DeepMind - Verified email at google.com
Li-Jia Li - Chief AI, IEEE Fellow - Verified email at healthunity.org


Xiaoqian Shen

CS PhD @ KAUST

Verified email at kaust.edu.sa

Generative Models, Vision-Language


Title	Authors	Venue	Cited by	Year
MiniGPT-4: Enhancing vision-language understanding with advanced large language models	D Zhu, J Chen, X Shen, X Li, M Elhoseiny	ICLR'24, 2024	2500	2024
MiniGPT-v2: Large language model as a unified interface for vision-language multi-task learning	J Chen, D Zhu, X Shen, X Li, Z Liu, P Zhang, R Krishnamoorthi, ...	arXiv preprint arXiv:2310.09478, 2023	521	2023
ChatGPT asks, BLIP-2 answers: Automatic questioning towards enriched visual descriptions	D Zhu, J Chen, K Haydarov, X Shen, W Zhang, M Elhoseiny	TMLR, 2024	97	2024
HRS-Bench: Holistic, reliable and scalable benchmark for text-to-image models	EM Bakr, X Shen, P Sun, FF Khan, LE Li, M Elhoseiny	ICCV'23, 2023	59	2023
MiniGPT4-Video: Advancing multimodal LLMs for video understanding with interleaved visual-textual tokens	K Ataallah, X Shen, E Abdelrahman, E Sleiman, D Zhu, J Ding, ...	CVPRW'24, 2024	47	2024
MoStGAN-V: Video generation with temporal motion styles	X Shen, X Li, M Elhoseiny	CVPR'23, 2023	38	2023
Multi-ConDoS: Multimodal contrastive domain sharing generative adversarial networks for self-supervised medical image segmentation	J Zhang, S Zhang, X Shen, T Lukasiewicz, Z Xu	IEEE Transactions on Medical Imaging, 2023	34	2023
KeMRE: Knowledge-enhanced medical relation extraction for Chinese medicine instructions	T Qi, S Qiu, X Shen, H Chen, S Yang, H Wen, Y Zhang, Y Wu, Y Huang	Journal of Biomedical Informatics 120, 103834, 2021	21	2021
Exploring hierarchical graph representation for large-scale zero-shot image classification	K Yi, X Shen, Y Gou, M Elhoseiny	ECCV'22, 2022	19	2022
LongVU: Spatiotemporal adaptive compression for long video-language understanding	X Shen, Y Xiong, C Zhao, L Wu, J Chen, C Zhu, Z Liu, F Xiao, ...	arXiv preprint arXiv:2410.17434, 2024	16	2024
StoryGPT-V: Large Language Models as Consistent Story Visualizers	X Shen, M Elhoseiny	arXiv preprint arXiv:2312.02252, 2023	9	2023
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos	K Ataallah, X Shen, E Abdelrahman, E Sleiman, M Zhuge, J Ding, D Zhu, ...	ECCV'24, 2024	5	2024
Affective visual dialog: A large-scale benchmark for emotional reasoning based on visually grounded conversations	K Haydarov, X Shen, A Madasu, M Salem, J Li, G Elsayed, M Elhoseiny	ECCV'24, 2024	4	2024
Adversarial Text to Continuous Image Generation	K Haydarov, A Muhamed, X Shen, J Lazarevic, I Skorokhodov, ...	CVPR'24, 2024	4*	2024
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling	Z Ye, J Liu, R Peng, J Cao, Z Chen, Y Zhang, Z Xuan, M Zhou, X Shen, ...	arXiv preprint arXiv:2408.03695, 2024		2024
iMotion-LLM: Motion Prediction Instruction Tuning	A Felemban, EM Bakr, X Shen, J Ding, A Mohamed, M Elhoseiny	arXiv preprint arXiv:2406.06211, 2024		2024
EmoTalker: Audio Driven Emotion Aware Talking Head Generation	X Shen, FF Khan, M Elhoseiny	Proceedings of the Asian Conference on Computer Vision, 1900-1917, 2024		2024
ReferPix2Pix: Guiding Multi-Modal LLMs for Image Editing with Referential Pixel Grounding	X Shen, M Elhoseiny


Articles 1–18
