VideoAgent: A Memory-Augmented Multimodal Agent for Video Understanding

Y Fan, X Ma, R Wu, Y Du, J Li, Z Gao, Q Li - European Conference on …, 2024 - Springer
We explore how reconciling several foundation models (large language models and vision-
language models) with a novel unified memory mechanism could tackle the challenging …

Advances in 3D generation: A survey

X Li, Q Zhang, D Kang, W Cheng, Y Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating 3D models lies at the core of computer graphics and has been the focus of
decades of research. With the emergence of advanced neural representations and …

An outlook into the future of egocentric vision

C Plizzari, G Goletto, A Furnari, S Bansal… - International Journal of …, 2024 - Springer
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …

LLM-Seg: Bridging image segmentation and large language model reasoning

J Wang, L Ke - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
Understanding human instructions to identify the target objects is vital for perception
systems. In recent years the advancements of Large Language Models (LLMs) have …

EgoThink: Evaluating first-person perspective thinking capability of vision-language models

S Cheng, Z Guo, J Wu, K Fang, P Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-language models (VLMs) have recently shown promising results in traditional
downstream tasks. Evaluation studies have emerged to assess their abilities with the …

Octopi: Object property reasoning with large tactile-language models

S Yu, K Lin, A Xiao, J Duan, H Soh - arXiv preprint arXiv:2405.02794, 2024 - arxiv.org
Physical reasoning is important for effective robot manipulation. Recent work has
investigated both vision and language modalities for physical reasoning; vision can reveal …

EgoChoir: Capturing 3D human-object interaction regions from egocentric views

Y Yang, W Zhai, C Wang, C Yu, Y Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-
centric perception, facilitating applications like AR/VR and embodied AI. For the egocentric …

Continual learning in the presence of repetition

H Hemati, L Pellegrini, X Duan, Z Zhao, F Xia… - Neural Networks, 2025 - Elsevier
Continual learning (CL) provides a framework for training models in ever-evolving
environments. Although re-occurrence of previously seen objects or tasks is common in real …

ActionVOS: Actions as prompts for video object segmentation

L Ouyang, R Liu, Y Huang, R Furuta, Y Sato - European Conference on …, 2024 - Springer
Delving into the realm of egocentric vision, the advancement of referring video object
segmentation (RVOS) stands as pivotal in understanding human activities. However …

EAGLE: Egocentric AGgregated Language-video Engine

J Bi, Y Tang, L Song, A Vosoughi, N Nguyen… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid evolution of egocentric video analysis brings new insights into understanding
human activities and intentions from a first-person perspective. Despite this progress, the …