Robotic vision for human-robot interaction and collaboration: A survey and systematic review

N Robinson, B Tidd, D Campbell, D Kulić… - ACM Transactions on …, 2023 - dl.acm.org
Robotic vision, otherwise known as computer vision for robots, is a critical process for robots
to collect and interpret detailed information related to human actions, goals, and …

Audiovisual fusion: Challenges and new approaches

AK Katsaggelos, S Bahaadini… - Proceedings of the …, 2015 - ieeexplore.ieee.org
In this paper, we review recent results on audiovisual (AV) fusion. We also discuss some of
the challenges and report on approaches to address them. One important issue in AV fusion …

Unicon: Unified context network for robust active speaker detection

Y Zhang, S Liang, S Yang, X Liu, Z Wu, S Shan… - Proceedings of the 29th …, 2021 - dl.acm.org
We propose a new efficient framework, the Unified Context Network (UniCon), for robust
active speaker detection (ASD). Traditional methods for ASD usually operate on each …

Co-localization of audio sources in images using binaural features and locally-linear regression

A Deleforge, R Horaud… - … /ACM Transactions on …, 2015 - ieeexplore.ieee.org
This paper addresses the problem of localizing audio sources using binaural
measurements. We propose a supervised formulation that simultaneously localizes multiple …

ChildBot: Multi-robot perception and interaction with children

N Efthymiou, PP Filntisis, P Koutras, A Tsiami… - Robotics and …, 2022 - Elsevier
In this paper, we present an integrated robotic system capable of participating in and
performing a wide range of educational and entertainment tasks collaborating with one or …

Who's speaking? Audio-supervised classification of active speakers in video

P Chakravarty, S Mirzaei, T Tuytelaars… - Proceedings of the …, 2015 - dl.acm.org
Active speakers have traditionally been identified in video by detecting their moving lips.
This paper demonstrates the same using spatio-temporal features that aim to capture other …

Mixture of inference networks for VAE-based audio-visual speech enhancement

M Sadeghi, X Alameda-Pineda - IEEE Transactions on Signal …, 2021 - ieeexplore.ieee.org
We address unsupervised audio-visual speech enhancement based on variational
autoencoders (VAEs), where the prior distribution of clean speech spectrogram is simulated …

Prediction of who will be next speaker and when using mouth-opening pattern in multi-party conversation

R Ishii, K Otsuka, S Kumano, R Higashinaka… - Multimodal …, 2019 - mdpi.com
We investigated the mouth-opening transition pattern (MOTP), which represents the change
of mouth-opening degree during the end of an utterance, and used it to predict the next …

Ava (a social robot): Design and performance of a robotic hearing apparatus

E Saffari, A Meghdari, B Vazirnezhad… - Social Robotics: 7th …, 2015 - Springer
Socially cognitive robots are supposed to communicate and interact with humans and other
robots in the most natural way. Listeners turn their heads toward speakers to enhance …