Google Scholar

A review on explainability in multimodal deep neural nets

G Joshi, R Walambe, K Kotecha - IEEE Access, 2021 - ieeexplore.ieee.org

Artificial Intelligence techniques powered by deep neural nets have achieved much success
in several application domains, most significantly and notably in the Computer Vision …

Cited by 198

[PDF] acm.org

Visualization and visual analytics approaches for image and video datasets: A survey

S Afzal, S Ghani, MM Hittawe, SF Rashid… - ACM Transactions on …, 2023 - dl.acm.org

Image and video data analysis has become an increasingly important research area with
applications in different domains such as security surveillance, healthcare, augmented and …

Cited by 51

[PDF] neurips.cc

Multimodal few-shot learning with frozen language models

M Tsimpoukelli, JL Menick, S Cabi… - Advances in …, 2021 - proceedings.neurips.cc

When trained at sufficient scale, auto-regressive language models exhibit the notable ability
to learn a new language task after being prompted with just a few examples. Here, we …

Cited by 785

[PDF] arxiv.org

Evaluation of text generation: A survey

A Celikyilmaz, E Clark, J Gao - arXiv preprint arXiv:2006.14799, 2020 - arxiv.org

The paper surveys evaluation methods of natural language generation (NLG) systems that
have been developed in the last few years. We group NLG evaluation methods into three …

Cited by 432

[PDF] neurips.cc

Learning with noisy correspondence for cross-modal matching

Z Huang, G Niu, X Liu, W Ding… - Advances in Neural …, 2021 - proceedings.neurips.cc

Cross-modal matching, which aims to establish the correspondence between two different
modalities, is fundamental to a variety of tasks such as cross-modal retrieval and vision-and …

Cited by 134

[PDF] thecvf.com

BiCro: Noisy correspondence rectification for multi-modality data via bi-directional cross-modal similarity consistency

S Yang, Z Xu, K Wang, Y You, H Yao… - Proceedings of the …, 2023 - openaccess.thecvf.com

As one of the most fundamental techniques in multimodal learning, cross-modal matching
aims to project various sensory modalities into a shared feature space. To achieve this …

Cited by 29

[PDF] arxiv.org

A review of deep learning for video captioning

M Abdar, M Kollati, S Kuraparthi… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Video captioning (VC) is a fast-moving, cross-disciplinary area of research that comprises
contributions from domains such as computer vision, natural language processing …

Cited by 21

[PDF] springer.com

Video description: A comprehensive survey of deep learning approaches

G Rafiq, M Rafiq, GS Choi - Artificial Intelligence Review, 2023 - Springer

Video description refers to understanding visual content and transforming that acquired
understanding into automatic textual narration. It bridges the key AI fields of computer vision …

Cited by 30

Generative AI in mobile networks: a survey

A Karapantelakis, P Alizadeh, A Alabassi, K Dey… - Annals of …, 2024 - Springer

This paper provides a comprehensive review of recent challenges and results in the field of
generative AI with application to mobile telecommunications networks. The objective is to …

Cited by 52

[PDF] arxiv.org

PSNet: Parallel symmetric network for video salient object detection

R Cong, W Song, J Lei, G Yue, Y Zhao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

For the video salient object detection (VSOD) task, how to excavate the information from the
appearance modality and the motion modality has always been a topic of great concern. The …

Cited by 42
