- Academic Search

Learning in audio-visual context: A review, analysis, and new perspective

Y Wei, D Hu, Y Tian, X Li - arXiv preprint arXiv:2208.09579, 2022 - arxiv.org

Sight and hearing are two senses that play a vital role in human communication and scene
understanding. To mimic human perception ability, audio-visual learning, aimed at …

Cited by 68 · All 2 versions

Causal reasoning meets visual representation learning: A prospective study

Y Liu, YS Wei, H Yan, GB Li, L Lin - Machine Intelligence Research, 2022 - Springer

Visual representation learning is ubiquitous in various real-world applications, including
visual comprehension, video understanding, multi-modal analysis, human-computer …

Cited by 49 · All 8 versions

Avoid-df: Audio-visual joint learning for detecting deepfake

W Yang, X Zhou, Z Chen, B Guo, Z Ba… - IEEE Transactions …, 2023 - ieeexplore.ieee.org

Recently, deepfakes have raised severe concerns about the authenticity of online media.
Prior works for deepfake detection have made many efforts to capture the intra-modal …

Cited by 103 · All 2 versions

Semi-supervised and unsupervised deep visual learning: A survey

Y Chen, M Mancini, X Zhu… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

State-of-the-art deep learning models are often trained with a large amount of costly labeled
training data. However, requiring exhaustive manual annotations may degrade the model's …

Cited by 137 · All 19 versions

Sound to visual scene generation by audio-to-visual latent alignment

K Sung-Bin, A Senocak, H Ha… - Proceedings of the …, 2023 - openaccess.thecvf.com

How does audio describe the world around us? In this paper, we propose a method for
generating an image of a scene from sound. Our method addresses the challenges of …

Cited by 36 · All 7 versions

Audio-visual generalised zero-shot learning with cross-modal attention and language

OB Mercea, L Riesch, A Koepke… - Proceedings of the …, 2022 - openaccess.thecvf.com

Learning to classify video data from classes not included in the training data, i.e., video-based
zero-shot learning, is challenging. We conjecture that the natural alignment between the …

Cited by 65 · All 9 versions

Structured knowledge distillation for accurate and efficient object detection

L Zhang, K Ma - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org

Knowledge distillation, which aims to transfer the knowledge learned by a cumbersome
teacher model to a lightweight student model, has become one of the most popular and …

Cited by 24 · All 5 versions

Sound-guided semantic image manipulation

SH Lee, W Roh, W Byeon, SH Yoon… - Proceedings of the …, 2022 - openaccess.thecvf.com

The recent success of the generative model shows that leveraging the multi-modal
embedding space can manipulate an image using text information. However, manipulating …

Cited by 58 · All 10 versions

Modality-aware contrastive instance learning with self-distillation for weakly-supervised audio-visual violence detection

J Yu, J Liu, Y Cheng, R Feng, Y Zhang - Proceedings of the 30th ACM …, 2022 - dl.acm.org

Weakly-supervised audio-visual violence detection aims to distinguish snippets containing
multimodal violence events with video-level labels. Many prior works perform audio-visual …

Cited by 43 · All 4 versions

A general dynamic knowledge distillation method for visual analytics

Z Tu, X Liu, X Xiao - IEEE Transactions on Image Processing, 2022 - ieeexplore.ieee.org

Existing knowledge distillation (KD) method normally fixes the weight of the teacher network,
and uses the knowledge from the teacher network to guide the training of the student …

Cited by 37 · All 4 versions

Distilling audio-visual knowledge by compositional contrastive learning

Learning in audio-visual context: A review, analysis, and new perspective