Google Scholar

W Shi, Z Hu, Y Bin, J Liu, Y Yang, SK Ng, L Bing… - arxiv preprint arxiv …, 2024 - arxiv.org

Large language models (LLMs) have demonstrated impressive reasoning capabilities,
particularly in textual mathematical problem-solving. However, existing open-source image …

Cited by 39

[PDF] arxiv.org

ChartAdapter: Large Vision-Language Model for Chart Summarization

P Xu, Y Ding, W Fan - arxiv preprint arxiv:2412.20715, 2024 - arxiv.org

Chart summarization, which focuses on extracting key information from charts and
interpreting it in natural language, is crucial for generating and delivering insights through …

Cited by 1

[PDF] openreview.net

MPT: Multi-grained Prompt Tuning for Text-Video Retrieval

H Zhang, P Zeng, L Gao, J Song, HT Shen - Proceedings of the 32nd …, 2024 - dl.acm.org

Recently, significant advancements have been made in supporting text-video retrieval by
transferring large-scale image-text pre-training models through model adaptation, i.e., full fine …

[PDF] openreview.net

MagicVFX: Visual Effects Synthesis in Just Minutes

J Guo, L Gao, J Zhu, J Zhang, S Li, J Song - Proceedings of the 32nd …, 2024 - dl.acm.org

Visual effects synthesis is crucial in the film and television industry, which aims at enhancing
raw footage with virtual elements for greater expressiveness. As the demand for detailed …

[PDF] arxiv.org

Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning

S Li, C Jiang, S Wang, Y Long, Z Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org

Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known
attribute-object pairs. The primary challenge in CZSL tasks lies in the significant …

[PDF] arxiv.org

CognArtive: Large Language Models for Automating Art Analysis and Decoding Aesthetic Elements

A Khadangi, A Sartipi, I Tchappi, G Fridgen - arxiv preprint arxiv …, 2025 - arxiv.org

Art, as a universal language, can be interpreted in diverse ways, with artworks embodying
profound meanings and nuances. The advent of Large Language Models (LLMs) and the …

Gallerygpt: Analyzing paintings with large multimodal models

Math-llava: Bootstrapping mathematical reasoning for multimodal large language models
