EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

YM Li, WJ Huang, AL Wang, LA Zeng, JK Meng… - … on Computer Vision, 2024 - Springer
We present EgoExo-Fitness, a new full-body action understanding dataset,
featuring fitness sequence videos recorded from synchronized egocentric and fixed …

A Comprehensive Survey of Action Quality Assessment: Method and Benchmark

K Zhou, R Cai, L Wang, HPH Shum, X Liang - arXiv preprint arXiv …, 2024 - arxiv.org
Action Quality Assessment (AQA) quantitatively evaluates the quality of human actions,
providing automated assessments that reduce biases in human judgment. Its applications …

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation

B Pei, G Chen, J Xu, Y He, Y Liu, K Pan… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we present our solutions to the EgoVis Challenges in CVPR 2024, including
five tracks in the Ego4D challenge and three tracks in the EPIC-Kitchens challenge. Building …

Masked Video and Body-Worn IMU Autoencoder for Egocentric Action Recognition

M Zhang, Y Huang, R Liu, Y Sato - European Conference on Computer …, 2024 - Springer
Compared with visual signals, Inertial Measurement Units (IMUs) placed on human limbs
can capture accurate motion signals while being robust to lighting variation and occlusion …

Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos

T Kalluri, BP Majumder, M Chandraker - arXiv preprint arXiv:2403.05535, 2024 - arxiv.org
We introduce LaGTran, a novel framework that utilizes text supervision to guide robust
transfer of discriminative knowledge from labeled source to unlabeled target data with …

Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning

ZY Dou, X Yang, T Nagarajan, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
We present EMBED (Egocentric Models Built with Exocentric Data), a method designed to
transform exocentric video-language data for egocentric video representation learning …

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Y Huang, J Xu, B Pei, Y He, G Chen, L Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Vinci, a real-time embodied smart assistant built upon an egocentric vision-
language model. Designed for deployment on portable devices such as smartphones and …

CG-Bench: Clue-Grounded Question Answering Benchmark for Long Video Understanding

G Chen, Y Liu, Y Huang, Y He, B Pei, J Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Most existing video understanding benchmarks for multimodal large language models
(MLLMs) focus only on short videos. The limited number of benchmarks for long video …

Egocentric Vehicle Dense Video Captioning

F Chen, C Xu, Q Jia, Y Wang, Y Liu, H Zhang… - Proceedings of the …, 2024 - dl.acm.org
Traditional dense video captioning predominantly focuses on edited exocentric footage.
These videos are filmed from an external perspective and generally feature distinct …

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

S Cheng, K Fang, Y Yu, S Zhou, B Li, Y Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in Multi-modal Large Language Models (MLLMs) have opened new
avenues for applications in Embodied AI. Building on our previous work, EgoThink, we introduce …