Google Scholar

Knowledge editing for large language models: A survey

S Wang, Y Zhu, H Liu, Z Zheng, C Chen, J Li - ACM Computing Surveys, 2024 - dl.acm.org

Large Language Models (LLMs) have recently transformed both the academic and industrial
landscapes due to their remarkable capacity to understand, analyze, and generate texts …

Cited by 103

[PDF] arxiv.org

A complete survey on generative ai (aigc): Is chatgpt from gpt-4 to gpt-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arxiv preprint arxiv …, 2023 - arxiv.org

As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

Cited by 207

[PDF] nowpublishers.com

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com

This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Cited by 197

[PDF] arxiv.org

Clipcap: Clip prefix for image captioning

R Mokady, A Hertz, AH Bermano - arxiv preprint arxiv:2111.09734, 2021 - arxiv.org

Image captioning is a fundamental task in vision-language understanding, where the model
predicts a textual informative caption to a given input image. In this paper, we present a …

Cited by 767

[PDF] thecvf.com

Scaling up vision-language pre-training for image captioning

X Hu, Z Gan, J Wang, Z Yang, Z Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com

In recent years, we have witnessed significant performance boost in the image captioning
task based on vision-language pre-training (VLP). Scale is believed to be an important factor …

Cited by 317

[PDF] arxiv.org

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …

Cited by 395

[PDF] thecvf.com

Adabins: Depth estimation using adaptive bins

SF Bhat, I Alhashim, P Wonka - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We address the problem of estimating a high quality dense depth map from a single RGB
input image. We start out with a baseline encoder-decoder convolutional neural network …

Cited by 1010

[PDF] thecvf.com

Meshed-memory transformer for image captioning

M Cornia, M Stefanini, L Baraldi… - Proceedings of the …, 2020 - openaccess.thecvf.com

Transformer-based architectures represent the state of the art in sequence modeling tasks
like machine translation and language understanding. Their applicability to multi-modal …

Cited by 1202

[PDF] aaai.org

Dual-level collaborative transformer for image captioning

Y Luo, J Ji, X Sun, L Cao, Y Wu, F Huang… - Proceedings of the …, 2021 - ojs.aaai.org

Descriptive region features extracted by object detection networks have played an important
role in the recent advancements of image captioning. However, they are still criticized for the …

Cited by 329

[PDF] thecvf.com

Rstnet: Captioning with adaptive attention on visual and non-visual words

X Zhang, X Sun, Y Luo, J Ji, Y Zhou… - Proceedings of the …, 2021 - openaccess.thecvf.com

Recent progress on visual question answering has explored the merits of grid features for
vision language tasks. Meanwhile, transformer-based models have shown remarkable …

Cited by 260


Knowledge editing for large language models: A survey
