Google Scholar

Fixation prediction through multimodal analysis

X Min, G Zhai, K Gu, X Yang - ACM Transactions on Multimedia …, 2016 - dl.acm.org

In this article, we propose to predict human eye fixation through incorporating both audio
and visual cues. Traditional visual attention models generally make the utmost of stimuli's …

Cited by 165 · 4 versions

Fusion of magnetic and visual sensors for indoor localization: Infrastructure-free and more effective

Z Liu, L Zhang, Q Liu, Y Yin, L Cheng… - IEEE Transactions …, 2016 - ieeexplore.ieee.org

Accurate and infrastructure-free indoor positioning can be very useful in a variety of
applications. However, most existing approaches (e.g., WiFi and infrared-based methods) for …

Cited by 138 · 5 versions

Look&listen: Multi-modal correlation learning for active speaker detection and speech enhancement

J Xiong, Y Zhou, P Zhang, L Xie… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Active speaker detection and speech enhancement have become two increasingly attractive
topics in audio-visual scenario understanding. According to their respective characteristics …

Cited by 24 · 4 versions

Auxiliary classifier generative adversarial network with soft labels in imbalanced acoustic event detection

X Xia, R Togneri, F Sohel… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

In acoustic event detection, the training data size of some acoustic events is often small and
imbalanced. To deal with this, this paper proposes generating the virtual training data …

Cited by 59 · 4 versions

Enhancement in speaker recognition for optimized speech features using GMM, SVM and 1-D CNN

S Nainan, V Kulkarni - International Journal of Speech Technology, 2021 - Springer

Contemporary automatic speaker recognition (ASR) systems do not provide 100% accuracy
making it imperative to explore different techniques to improve it. Easy access to mobile …

Cited by 36 · 2 versions

Introduction of SVM algorithms and recent applications about fault diagnosis and other aspects

Z Yin, J Liu, M Krueger, H Gao - 2015 IEEE 13th International …, 2015 - ieeexplore.ieee.org

Support vector machine has obtained more and more attentions as a new method of
machine learning based on the statistic learning theory. At the same time, there are …

Cited by 19

Multimodal multi-channel on-line speaker diarization using sensor fusion through SVM

VP Minotto, CR Jung, B Lee - IEEE Transactions on Multimedia, 2015 - ieeexplore.ieee.org

Speaker diarization (SD) is the process of assigning speech segments of an audio stream to
its corresponding speakers, thus comprising the problem of voice activity detection (VAD) …

Cited by 54 · 3 versions

Sound source localization in wide-range outdoor environment using distributed sensor network

MM Faraji, SB Shouraki, E Iranmehr… - IEEE Sensors …, 2019 - ieeexplore.ieee.org

Sound source localization has always been one of the most challenging subjects in different
fields of engineering, one of the most important of which being tracking of flying objects. This …

Cited by 28 · 3 versions

Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org

Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

2 versions

Multimodal fusion refiner networks

S Sankaran, D Yang, SN Lim - arXiv preprint arXiv:2104.03435, 2021 - arxiv.org

Tasks that rely on multi-modal information typically include a fusion module that combines
information from different modalities. In this work, we develop a Refiner Fusion Network …

Cited by 17 · 2 versions
