- Academic Search

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Cited by 219

[PDF] neurips.cc

Imagine that! abstract-to-intricate text-to-image synthesis with scene graph hallucination diffusion

S Wu, H Fei, H Zhang, TS Chua - Advances in Neural …, 2024 - proceedings.neurips.cc

In this work, we investigate the task of text-to-image (T2I) synthesis under the abstract-to-intricate
setting, i.e., generating intricate visual content from simple abstract text prompts …

Cited by 47

[PDF] thecvf.com

Relational context learning for human-object interaction detection

S Kim, D Jung, M Cho - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Recent state-of-the-art methods for HOI detection typically build on transformer architectures
with two decoder branches, one for human-object pair detection and the other for interaction …

Cited by 48

[PDF] arxiv.org

Empowering dynamics-aware text-to-video diffusion with large language models

H Fei, S Wu, W Ji, H Zhang, TS Chua - arXiv preprint arXiv:2308.13812, 2023 - arxiv.org

Text-to-video (T2V) synthesis has gained increasing attention in the community, in which the
recently emerged diffusion models (DMs) have promisingly shown stronger performance …

Cited by 26

[PDF] springer.com

Graph neural networks in vision-language image understanding: A survey

H Senior, G Slabaugh, S Yuan, L Rossi - The Visual Computer, 2024 - Springer

2D image understanding is a complex problem within computer vision, but it holds
the key to providing human-level scene comprehension. It goes further than identifying the …

Cited by 18

[PDF] thecvf.com

HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation

C Zhang, S Stepputtis, J Campbell… - Proceedings of the …, 2024 - openaccess.thecvf.com

Being able to understand visual scenes is a precursor for many downstream tasks including
autonomous driving, robotics, and other vision-based approaches. A common approach …

Cited by 12

[PDF] thecvf.com

Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs

H Fei, S Wu, W Ji, H Zhang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Text-to-video (T2V) synthesis has gained increasing attention in the community, in
which the recently emerged diffusion models (DMs) have promisingly shown stronger …

Cited by 39

[PDF] arxiv.org

Generalized unbiased scene graph generation

X Lyu, L Gao, J Xie, P Zeng, Y Tian, J Shao… - arXiv preprint arXiv …, 2023 - arxiv.org

Existing Unbiased Scene Graph Generation (USGG) methods only focus on addressing the
predicate-level imbalance that high-frequency classes dominate predictions of rare ones …

Cited by 7

Hi-SIGIR: Hierachical Semantic-Guided Image-to-image Retrieval via Scene Graph

Y Wang, P Dai, X Jia, Z Zeng, R Li, X Cao - Proceedings of the 31st ACM …, 2023 - dl.acm.org

Image-to-image retrieval, a fundamental task, aims at matching similar images based on a
query image. Existing methods with convolutional neural networks are usually sensitive to …

Cited by 4

Learning multimodal relationship interaction for visual relationship detection

Z Liu, WS Zheng - Pattern Recognition, 2022 - Elsevier

Visual relationship detection aims to recognize visual relationships in scenes as triplets
<subject-predicate-object>. Previous works have shown remarkable progress by introducing …

Cited by 8


Image-to-image retrieval by learning similarity between scene graphs

Multimodal foundation models: From specialists to general-purpose assistants
