- Academic Search

Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org

Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Cited by 139 - 2 versions

[PDF] arxiv.org

Large models for time series and spatio-temporal data: A survey and outlook

M Jin, Q Wen, Y Liang, C Zhang, S Xue, X Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Temporal data, notably time series and spatio-temporal data, are prevalent in real-world
applications. They capture dynamic system measurements and are produced in vast …

Cited by 115 - 3 versions

[PDF] arxiv.org

Large scale foundation models for intelligent manufacturing applications: a survey

H Zhang, SD Semujju, Z Wang, X Lv, K Xu… - Journal of Intelligent …, 2025 - Springer

Although the applications of artificial intelligence especially deep learning have greatly
improved various aspects of intelligent manufacturing, they still face challenges for broader …

Cited by 5 - 2 versions

[PDF] arxiv.org

Retrieval-augmented generation for AI-generated content: A survey

P Zhao, H Zhang, Q Yu, Z Wang, Y Geng, F Fu… - arXiv preprint arXiv …, 2024 - arxiv.org

The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by
advancements in model algorithms, scalable foundation model architectures, and the …

Cited by 189 - 4 versions

[PDF] arxiv.org

When urban region profiling meets large language models

Y Yan, H Wen, S Zhong, W Chen, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Urban region profiling from web-sourced data is of utmost importance for urban planning
and sustainable development. We are witnessing a rising trend of LLMs for various fields …

Cited by 17 - 2 versions

[PDF] arxiv.org

Unlocking memorization in large language models with dynamic soft prompting

Z Wang, R Bao, Y Wu, J Taylor, C Xiao, F Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org

Pretrained large language models (LLMs) have revolutionized natural language processing
(NLP) tasks such as summarization, question answering, and translation. However, LLMs …

Cited by 5 - 3 versions

[PDF] openreview.net

UrbanCLIP: Learning text-enhanced urban region profiling with contrastive language-image pretraining from the web

Y Yan, H Wen, S Zhong, W Chen, H Chen… - Proceedings of the …, 2024 - dl.acm.org

Urban region profiling from web-sourced data is of utmost importance for urban computing.
We are witnessing a blossom of LLMs for various fields, especially in multi-modal data …

Cited by 29 - 2 versions

[PDF] arxiv.org

LifelongMemory: Leveraging LLMs for Answering Queries in Egocentric Videos

Y Wang, Y Yang, M Ren - arXiv preprint arXiv:2312.05269, 2023 - arxiv.org

The egocentric video natural language query (NLQ) task involves localizing a temporal
window in an egocentric video that provides an answer to a posed query, which has wide …

Cited by 9 - 2 versions

ChatCam: Embracing LLMs for Contextual Chatting-to-Camera with Interest-Oriented Video Summarization

K Xiao, Y Gao, F Li, W Xu, P Chen… - Proceedings of the ACM on …, 2024 - dl.acm.org

Cameras are ubiquitous in society, with users increasingly looking to extract insights about
the physical world. Current human-to-camera interaction methods, while advanced, still …

Video Question Answering: A survey of the state-of-the-art

PJ Jeshmol, BC Kovoor - Journal of Visual Communication and Image …, 2024 - Elsevier

Video Question Answering (VideoQA) emerges as a prominent trend in the domain
of Artificial Intelligence, Computer Vision, and Natural Language Processing. It involves …
