Google Scholar

VLP: A survey on vision-language pre-training

FL Chen, DZ Zhang, ML Han, XY Chen, J Shi… - Machine Intelligence …, 2023 - Springer

In the past few years, the emergence of pre-training models has brought uni-modal fields
such as computer vision (CV) and natural language processing (NLP) to a new era …

Cited by 220, all 8 versions

[PDF] arxiv.org

X-LLM: Bootstrapping advanced large language models by treating multi-modalities as foreign languages

F Chen, M Han, H Zhao, Q Zhang, J Shi, S Xu… - arxiv preprint arxiv …, 2023 - arxiv.org

Large language models (LLMs) have demonstrated remarkable language abilities. GPT-4,
based on advanced LLMs, exhibits extraordinary multimodal capabilities beyond previous …

Cited by 112, all 2 versions

[PDF] arxiv.org

GoG: Relation-aware graph-over-graph network for visual dialog

F Chen, X Chen, F Meng, P Li, J Zhou - arxiv preprint arxiv:2109.08475, 2021 - arxiv.org

Visual dialog, which aims to hold a meaningful conversation with humans about a given
image, is a challenging task that requires models to reason the complex dependencies …

Cited by 34, all 3 versions

[PDF] thecvf.com

The dialog must go on: Improving visual dialog via generative self-training

GC Kang, S Kim, JH Kim, D Kwak… - Proceedings of the …, 2023 - openaccess.thecvf.com

Visual dialog (VisDial) is a task of answering a sequence of questions grounded in an
image, using the dialog history as context. Prior work has trained the dialog agents solely on …

Cited by 16, all 6 versions

[PDF] arxiv.org

Improving cross-modal understanding in visual dialog via contrastive learning

F Chen, X Chen, S Xu, B Xu - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

Visual Dialog is a challenging vision-language task since the visual dialog agent needs to
answer a series of questions after reasoning over both the image content and dialog history …

Cited by 24, all 5 versions

[PDF] acm.org

Unsupervised and pseudo-supervised vision-language alignment in visual dialog

F Chen, D Zhang, X Chen, J Shi, S Xu… - Proceedings of the 30th …, 2022 - dl.acm.org

Visual dialog requires models to give reasonable answers according to a series of coherent
questions and related visual concepts in images. However, most current work either focuses …

Cited by 16

[PDF] arxiv.org

KBGN: Knowledge-bridge graph network for adaptive vision-text reasoning in visual dialogue

X Jiang, S Du, Z Qin, Y Sun, J Yu - Proceedings of the 28th ACM …, 2020 - dl.acm.org

Visual dialogue is a challenging task that needs to extract implicit information from both
visual (image) and textual (dialogue history) contexts. Classical approaches pay more …

Cited by 38, all 4 versions

[PDF] thecvf.com

Reasoning with multi-structure commonsense knowledge in visual dialog

S Zhang, X Jiang, Z Yang, T Wan… - Proceedings of the …, 2022 - openaccess.thecvf.com

Visual Dialog requires an agent to engage in a conversation with humans grounded in an
image. Many studies on Visual Dialog focus on the understanding of the dialog history or the …

Cited by 15, all 8 versions

Learning dual encoding model for adaptive visual understanding in visual dialogue

J Yu, X Jiang, Z Qin, W Zhang, Y Hu… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Different from Visual Question Answering task that requires to answer only one question
about an image, Visual Dialogue task involves multiple rounds of dialogues which cover a …

Cited by 29, all 5 versions

HVLM: Exploring human-like visual cognition and language-memory network for visual dialog

K Sun, C Guo, H Zhang, Y Li - Information Processing & Management, 2022 - Elsevier

Visual dialog, a visual-language task, enables an AI agent to engage in conversation with
humans grounded in a given image. To generate appropriate answers for a series of …

Cited by 12, all 2 versions

Dmrm: A dual-channel multi-hop reasoning model for visual dialog

Vlp: A survey on vision-language pre-training
