- Academic Search

Alleviating hallucination in large vision-language models with active retrieval augmentation

X Qu, Q Chen, W Wei, J Sun, J Dong - arXiv preprint arXiv:2408.00555, 2024 - arxiv.org

Despite the remarkable ability of large vision-language models (LVLMs) in image
comprehension, these models frequently generate plausible yet factually incorrect …

Decomposed prototype learning for few-shot scene graph generation

X Li, J Xiao, G Chen, Y Feng, Y Yang, AA Liu… - ACM Transactions on …, 2024 - dl.acm.org

Today's scene graph generation (SGG) models typically require abundant manual
annotations to learn new predicate types. Therefore, it is difficult to apply them to real-world …

LayoutEnc: Leveraging Enhanced Layout Representations for Transformer-based Complex Scene Synthesis

X Cui, Q Sun, M Wang, L Li, W Zhou, H Li - ACM Transactions on …, 2025 - dl.acm.org

In complex scene synthesis, the effective representation of layouts is paramount. This paper
introduces LayoutEnc, an advanced approach specifically designed to enhance layout …

Unified view empirical study for large pretrained model on cross-domain few-shot learning

Alleviating hallucination in large vision-language models with active retrieval augmentation
