- Academic Search

Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis

Recent developments in foundation models, like Large Language Models (LLMs) and Vision-
Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Cited by 41

Autoregressive image generation without vector quantization

T Li, Y Tian, H Li, M Deng, K He - Advances in Neural …, 2025 - proceedings.neurips.cc

Conventional wisdom holds that autoregressive models for image generation are typically
accompanied by vector-quantized tokens. We observe that while a discrete-valued space …

Cited by 87

Mobile ALOHA: Learning bimanual mobile manipulation with low-cost whole-body teleoperation

Z Fu, TZ Zhao, C Finn - arXiv …

… stone on the path toward more capable and robust robotic manipulation policies …

Cited by 118

iVideoGPT: Interactive VideoGPTs are scalable world models

J Wu, S Yin, N Feng, X He, D Li… - Advances in Neural …, 2025 - proceedings.neurips.cc

World models empower model-based agents to interactively explore, reason, and plan
within imagined environments for real-world decision-making. However, the high demand …

Cited by 16

3D Diffuser Actor: Policy diffusion with 3D scene representations

TW Ke, N Gkanatsios, K Fragkiadaki - arXiv preprint arXiv:2402.10885, 2024 - arxiv.org

Diffusion policies are conditional diffusion models that learn robot action distributions
conditioned on the robot and environment state. They have recently shown to outperform …

Cited by 77

MOKA: Open-vocabulary robotic manipulation through mark-based visual prompting

F Liu, K Fang, P Abbeel, S Levine - First Workshop on Vision …, 2024 - openreview.net

Open-vocabulary generalization requires robotic systems to perform tasks involving complex
and diverse environments and task goals. While the recent advances in vision language …

Cited by 61

Evaluating real-world robot manipulation policies in simulation

X Li, K Hsu, J Gu, K Pertsch, O Mees, HR Walke… - arXiv preprint arXiv …, 2024 - arxiv.org

The field of robotics has made significant advances towards generalist robot manipulation
policies. However, real-world evaluation of such policies is not scalable and faces …

Cited by 41

Open Teach: A versatile teleoperation system for robotic manipulation

A Iyer, Z Peng, Y Dai, I Guzey, S Haldar… - arXiv preprint arXiv …, 2024 - arxiv.org

Open-sourced, user-friendly tools form the bedrock of scientific advancement across
disciplines. The widespread adoption of data-driven learning has led to remarkable …

Cited by 34
