Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org
Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

EMTT-YOLO: An Efficient Multiple Target Detection and Tracking Method for Mariculture Network Based on Deep Learning

C Lv, H Yang, J Zhu - Journal of Marine Science and Engineering, 2024 - mdpi.com
Efficient multiple target tracking (MTT) is the key to achieving green, precision, and
large-scale aquaculture, marine exploration, and marine farming. The traditional MTT methods …

Audio-visual speaker tracking: Progress, challenges, and future directions

J Zhao, Y Xu, X Qian, D Berghi, P Wu, M Cui… - arXiv preprint arXiv …, 2023 - arxiv.org
Audio-visual speaker tracking has drawn increasing attention over the past few years due to
its academic values and wide application. Audio and visual modalities can provide …

DiffDesign: Controllable Diffusion with Meta Prior for Efficient Interior Design Generation

Y Yang, J Wang, T Geng, W Qiang, C Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Interior design is a complex and creative discipline involving aesthetics, functionality,
ergonomics, and materials science. Effective solutions must meet diverse requirements …

Direction of arrival tracking of non-circular signals: probabilistic hypothesis density filter with an improved likelihood function

J Cao, X Zhang, D Li - Journal of Physics: Conference Series, 2024 - iopscience.iop.org
Conventional direction of arrival (DOA) tracking methods typically track circular signals with
a fixed number of sources without considering time-varying numbers of non-circular (NC) …