- Academic Search

Can Large Language Models Understand Symbolic Graphics Programs?

Z Qiu, W Liu, H Feng, Z Liu, TZ Xiao, KM Collins… - arXiv preprint arXiv …, 2024 - arxiv.org

Against the backdrop of enthusiasm for large language models (LLMs), there is an urgent
need to scientifically assess their capabilities and shortcomings. This is nontrivial in part …

Cited by 8

ChatGarment: Garment Estimation, Generation and Editing via Large Language Models

S Bian, C Xu, Y Xiu, A Grigorev, Z Liu, C Lu… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce ChatGarment, a novel approach that leverages large vision-language models
(VLMs) to automate the estimation, generation, and editing of 3D garments from images or …

Cited by 1

GRS: Generating Robotic Simulation Tasks from Real-World Images

A Zook, FY Sun, J Spjut, V Blukis, S Birchfield… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce GRS (Generating Robotic Simulation tasks), a novel system to address the
challenge of real-to-sim in robotics, computer vision, and AR/VR. GRS enables the creation …

Cited by 1

Reconstructing Animals and the Wild

P Kulits, MJ Black, S Zuffi - arXiv preprint arXiv:2411.18807, 2024 - arxiv.org

The idea of 3D reconstruction as scene understanding is foundational in computer vision.
Reconstructing 3D scenes from 2D visual observations requires strong priors to …

Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

R Wu, W Su, J Liao - arXiv preprint arXiv:2411.16602, 2024 - arxiv.org

Scalable Vector Graphics (SVG) has become the de facto standard for vector graphics in
digital design, offering resolution independence and precise control over individual …

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

W Zhao, YP Cao, J Xu, Y Dong, Y Shan - arXiv preprint arXiv:2412.15200, 2024 - arxiv.org

Procedural Content Generation (PCG) is powerful in creating high-quality 3D contents, yet
controlling it to produce desired shapes is difficult and often requires extensive parameter …

RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception

JR Waite, MZ Hasan, Q Liu, Z Jiang, C Hegde… - arXiv preprint arXiv …, 2025 - arxiv.org

Vision-language model (VLM) fine-tuning for application-specific visual grounding based on
natural language instructions has become one of the most popular approaches for learning …

R2D3: Imparting Spatial Reasoning by Reconstructing 3D Scenes from 2D Images

A Ray, D Bashkirova, R Tan, KH Zeng, BA Plummer… - openreview.net

Cognitive scientists herald 3D spatial reasoning as a fundamental foundation for all
intellectual processes. Multimodal large language models (MLMs), which have been widely …

Re-thinking inverse graphics with large language models

Can Large Language Models Understand Symbolic Graphics Programs?
