- Academic Search

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org

The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

Cited by: 488. All versions: 11.

[PDF] thecvf.com

Boostmis: Boosting medical image semi-supervised learning with adaptive pseudo labeling and informative active annotation

W Zhang, L Zhu, J Hallinan, S Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

In this paper, we propose a novel semi-supervised learning (SSL) framework named
BoostMIS that combines adaptive pseudo labeling and informative active annotation to …

Cited by: 116. All versions: 5.

[PDF] openreview.net

Fine-tuning multimodal llms to follow zero-shot demonstrative instructions

J Li, K Pan, Z Ge, M Gao, W Ji, W Zhang… - The Twelfth …, 2023 - openreview.net

Recent advancements in Multimodal Large Language Models (MLLMs) have been utilizing
Visual Prompt Generators (VPGs) to convert visual features into tokens that LLMs can …

Cited by: 70. All versions: 2.

[PDF] thecvf.com

Winner: Weakly-supervised hierarchical decomposition and alignment for spatio-temporal video grounding

M Li, H Wang, W Zhang, J Miao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Spatio-temporal video grounding aims to localize the aligned visual tube corresponding to a
language query. Existing techniques achieve such alignment by exploiting dense boundary …

Cited by: 36. All versions: 3.

[PDF] thecvf.com

Revisiting the domain shift and sample uncertainty in multi-source active domain transfer

W Zhang, Z Lv, H Zhou, JW Liu, J Li… - Proceedings of the …, 2024 - openaccess.thecvf.com

Active Domain Adaptation (ADA) aims to maximally boost model adaptation in a
new target domain by actively selecting a limited number of target data to annotate. This …

Cited by: 17. All versions: 3.

Hierarchical representation network with auxiliary tasks for video captioning and video question answering

L Gao, Y Lei, P Zeng, J Song, M Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Recently, integrating vision and language for in-depth video understanding, e.g., video
captioning and video question answering, has become a promising direction for artificial …

Cited by: 78. All versions: 4.

[PDF] arxiv.org

Duet: A tuning-free device-cloud collaborative parameters generation framework for efficient device model generalization

Z Lv, W Zhang, S Zhang, K Kuang, F Wang… - Proceedings of the …, 2023 - dl.acm.org

Device Model Generalization (DMG) is a practical yet under-investigated research topic for
on-device machine learning applications. It aims to improve the generalization ability of pre …

Cited by: 57. All versions: 6.

[PDF] arxiv.org

Referring expression comprehension: A survey of methods and datasets

Y Qiao, C Deng, Q Wu - IEEE Transactions on Multimedia, 2020 - ieeexplore.ieee.org

Referring expression comprehension (REC) aims to localize a target object in an image
described by a referring expression phrased in natural language. Different from the object …

Cited by: 98. All versions: 5.

Unified adaptive relevance distinguishable attention network for image-text matching

K Zhang, Z Mao, AA Liu, Y Zhang - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Image-text matching, as a fundamental cross-modal task, bridges the gap between vision
and language. The core is to accurately learn semantic alignment to find relevant shared …

Cited by: 58. All versions: 2.

[PDF] thecvf.com

Gradient-regulated meta-prompt learning for generalizable vision-language models

J Li, M Gao, L Wei, S Tang, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Prompt tuning, a recently emerging paradigm, enables the powerful vision-language
pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by …

Cited by: 26. All versions: 5.

Frame augmented alternating attention network for video question answering

Graph neural networks: foundation, frontiers and applications
