- Academic Search

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com

This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Cited by: 197

A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …

Cited by: 71

ViperGPT: Visual inference via Python execution for reasoning

D Surís, S Menon, C Vondrick - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Answering visual queries is a complex task that requires both visual processing and
reasoning. End-to-end models, the dominant approach for this task, do not explicitly …

Cited by: 415

Large language models as commonsense knowledge for large-scale task planning

Z Zhao, WS Lee, D Hsu - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Large-scale task planning is a major challenge. Recent work exploits large language
models (LLMs) directly as a policy and shows surprisingly interesting results. This paper …

Cited by: 178

Explainability in deep reinforcement learning

A Heuillet, F Couthouis, N Díaz-Rodríguez - Knowledge-Based Systems, 2021 - Elsevier

A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature
relevance techniques to explain a deep neural network (DNN) output or explaining models …

Cited by: 405

Fine-grained video-text retrieval with hierarchical graph reasoning

S Chen, Y Zhao, Q Jin, Q Wu - Proceedings of the IEEE/CVF …, 2020 - openaccess.thecvf.com

Cross-modal retrieval between videos and texts has attracted growing attentions due to the
rapid emergence of videos on the web. The current dominant approach is to learn a joint …

Cited by: 376

Act as you wish: Fine-grained control of motion diffusion model with hierarchical semantic graphs

P Jin, Y Wu, Y Fan, Z Sun, W Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Most text-driven human motion generation methods employ sequential modeling
approaches, e.g., transformer, to extract sentence-level text representations automatically and …

Cited by: 25

Video as conditional graph hierarchy for multi-granular question answering

J Xiao, A Yao, Z Liu, Y Li, W Ji, TS Chua - Proceedings of the AAAI …, 2022 - ojs.aaai.org

Video question answering requires the models to understand and reason about both the
complex video and language data to correctly derive the answers. Existing efforts have been …

Cited by: 126

Region-aware image captioning via interaction learning

AA Liu, Y Zhai, N Xu, W Nie, W Li… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Image captioning is one of the primary goals in computer vision which aims to automatically
generate natural descriptions for images. Intuitively, human visual system can notice some …

Cited by: 116

When radiology report generation meets knowledge graph

Y Zhang, X Wang, Z Xu, Q Yu, A Yuille, D Xu - Proceedings of the AAAI …, 2020 - aaai.org

Automatic radiology report generation has been an attracting research problem towards
computer-aided diagnosis to alleviate the workload of doctors in recent years. Deep learning …

Cited by: 334

Language-conditioned graph networks for relational reasoning

Vision-language pre-training: Basics, recent advances, and future trends
