- Academic Search

A state-of-the-art survey on deep learning theory and architectures

MZ Alom, TM Taha, C Yakopcic, S Westberg, P Sidike… - Electronics, 2019 - mdpi.com

In recent years, deep learning has garnered tremendous success in a variety of application
domains. This new field of machine learning has been growing rapidly and has been …

Cited by 1847. All 9 versions.

The history began from AlexNet: A comprehensive survey on deep learning approaches

MZ Alom, TM Taha, C Yakopcic, S Westberg… - arXiv preprint arXiv …, 2018 - arxiv.org

Deep learning has demonstrated tremendous success in variety of application domains in
the past few years. This new field of machine learning has been growing rapidly and applied …

Cited by 1611. All 8 versions.

CLIP4Clip: An empirical study of CLIP for end to end video clip retrieval and captioning

H Luo, L Ji, M Zhong, Y Chen, W Lei, N Duan, T Li - Neurocomputing, 2022 - Elsevier

Video clip retrieval and captioning tasks play an essential role in multimodal research and
are the fundamental research problem for multimodal understanding and generation. The …

Cited by 566. All 5 versions.

AI Choreographer: Music conditioned 3D dance generation with AIST++

R Li, S Yang, DA Ross… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We present AIST++, a new multi-modal dataset of 3D dance motion and music, along with
FACT, a Full-Attention Cross-modal Transformer network for generating 3D dance motion …

Cited by 502. All 6 versions.

Exploring and distilling posterior and prior knowledge for radiology report generation

F Liu, X Wu, S Ge, W Fan, Y Zou - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Automatically generating radiology reports can improve current clinical practice in diagnostic
radiology. On one hand, it can relieve radiologists from the heavy burden of report writing; …

Cited by 371. All 8 versions.

CLIP4Clip: An empirical study of CLIP for end to end video clip retrieval

H Luo, L Ji, M Zhong, Y Chen, W Lei, N Duan… - arXiv preprint arXiv …, 2021 - arxiv.org

Video-text retrieval plays an essential role in multi-modal research and has been widely
used in many real-world web applications. The CLIP (Contrastive Language-Image Pre …

Cited by 331. All 2 versions.

End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com

Dense video captioning aims to generate multiple associated captions with their temporal
locations from the video. Previous methods follow a sophisticated "localize-then-describe" …

Cited by 214. All 6 versions.

HowTo100M: Learning a text-video embedding by watching hundred million narrated video clips

A Miech, D Zhukov, JB Alayrac… - Proceedings of the …, 2019 - openaccess.thecvf.com

Learning text-video embeddings usually requires a dataset of video clips with manually
provided captions. However, such datasets are expensive and time consuming to create and …

Cited by 1309. All 10 versions.

CenterCLIP: Token clustering for efficient text-video retrieval

S Zhao, L Zhu, X Wang, Y Yang - … of the 45th International ACM SIGIR …, 2022 - dl.acm.org

Recently, large-scale pre-training methods like CLIP have made great progress in
multi-modal research such as text-video retrieval. In CLIP, transformers are vital for modeling …

Cited by 126. All 5 versions.

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer

In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Cited by 225. All 8 versions.
