- Academic Search

Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier

Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Cited by 107 · All 5 versions

[PDF] arxiv.org

Evaluation of text generation: A survey

A Celikyilmaz, E Clark, J Gao - arXiv preprint arXiv:2006.14799, 2020 - arxiv.org

The paper surveys evaluation methods of natural language generation (NLG) systems that
have been developed in the last few years. We group NLG evaluation methods into three …

Cited by 430 · All 2 versions

[PDF] thecvf.com

Dip: Dual incongruity perceiving network for sarcasm detection

C Wen, G Jia, J Yang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Sarcasm indicates the literal meaning is contrary to the real attitude. Considering the
popularity and complementarity of image-text data, we investigate the task of multi-modal …

Cited by 41 · All 3 versions

[HTML] mdpi.com

[HTML] A systematic literature review on image captioning

R Staniūtė, D Šešok - Applied Sciences, 2019 - mdpi.com

Natural language problems have already been investigated for around five years. Recent
progress in artificial intelligence (AI) has greatly improved the performance of models …

Cited by 74 · All 6 versions

[PDF] arxiv.org

Language models can see: Plugging visual controls in text generation

Y Su, T Lan, Y Liu, F Liu, D Yogatama, Y Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Generative language models (LMs) such as GPT-2/3 can be prompted to generate text with
remarkable quality. While they are designed for text-prompted generation, it remains an …

Cited by 101 · All 3 versions

Zero-shot video object segmentation with co-attention siamese networks

X Lu, W Wang, J Shen, D Crandall… - IEEE transactions on …, 2020 - ieeexplore.ieee.org

We introduce a novel network, called CO-attention siamese network (COSNet), to address
the zero-shot video object segmentation task in a holistic fashion. We exploit the inherent …

Cited by 166 · All 6 versions

[PDF] arxiv.org

GPT-4V(ision) as a social media analysis engine

H Lyu, J Huang, D Zhang, Y Yu, X Mou, J Pan… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent research has offered insights into the extraordinary capabilities of Large Multimodal
Models (LMMs) in various general vision and language tasks. There is growing interest in …

Cited by 34 · All 4 versions

Emotional video captioning with vision-based emotion interpretation network

P Song, D Guo, X Yang, S Tang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Effectively summarizing and re-expressing video content by natural languages in a more
human-like fashion is one of the key topics in the field of multimedia content understanding …

Cited by 21 · All 6 versions

[PDF] arxiv.org

Style-aware contrastive learning for multi-style image captioning

Y Zhou, G Long - arXiv preprint arXiv:2301.11367, 2023 - arxiv.org

Existing multi-style image captioning methods show promising results in generating a
caption with accurate visual content and desired linguistic style. However, existing methods …

Cited by 36 · All 6 versions

[PDF] thecvf.com

Human-like controllable image captioning with verb-specific semantic roles

L Chen, Z Jiang, J Xiao, W Liu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Controllable Image Captioning (CIC), generating image descriptions following
designated control signals, has received unprecedented attention over the last few years …

Cited by 83 · All 6 versions
