VideoAgent: A Memory-Augmented Multimodal Agent for Video Understanding
We explore how reconciling several foundation models (large language models and vision-language models) with a novel unified memory mechanism could tackle the challenging …
Advances in 3D generation: A survey
Generating 3D models lies at the core of computer graphics and has been the focus of decades of research. With the emergence of advanced neural representations and …
An outlook into the future of egocentric vision
What will the future be? We wonder! In this survey, we explore the gap between current research in egocentric vision and the ever-anticipated future, where wearable computing …
Llm-seg: Bridging image segmentation and large language model reasoning
J Wang, L Ke - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
Understanding human instructions to identify the target objects is vital for perception systems. In recent years the advancements of Large Language Models (LLMs) have …
Egothink: Evaluating first-person perspective thinking capability of vision-language models
Vision-language models (VLMs) have recently shown promising results in traditional downstream tasks. Evaluation studies have emerged to assess their abilities with the …
Octopi: Object property reasoning with large tactile-language models
Physical reasoning is important for effective robot manipulation. Recent work has investigated both vision and language modalities for physical reasoning; vision can reveal …
Egochoir: Capturing 3D human-object interaction regions from egocentric views
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception, facilitating applications like AR/VR and embodied AI. For the egocentric …
Continual learning in the presence of repetition
H Hemati, L Pellegrini, X Duan, Z Zhao, F Xia… - Neural Networks, 2025 - Elsevier
Continual learning (CL) provides a framework for training models in ever-evolving environments. Although re-occurrence of previously seen objects or tasks is common in real …
Actionvos: Actions as prompts for video object segmentation
Delving into the realm of egocentric vision, the advancement of referring video object segmentation (RVOS) stands as pivotal in understanding human activities. However …
EAGLE: Egocentric AGgregated Language-video Engine
The rapid evolution of egocentric video analysis brings new insights into understanding human activities and intentions from a first-person perspective. Despite this progress, the …