Temporal sentiment localization: Listen and look in untrimmed videos

Z Zhang, J Yang - Proceedings of the 30th ACM International …, 2022 - dl.acm.org
Video sentiment analysis aims to uncover the underlying attitudes of viewers, which has a
wide range of applications in real world. Existing works simply classify a video into a single …

SoundDet: Polyphonic moving sound event detection and localization from raw waveform

Y He, N Trigoni, A Markham - International Conference on …, 2021 - proceedings.mlr.press
We present a new framework SoundDet, which is an end-to-end trainable and light-weight
framework, for polyphonic moving sound event detection and localization. Prior methods …

Visual object detector for cow sound event detection

YR Pandeya, B Bhattarai, J Lee - IEEE Access, 2020 - ieeexplore.ieee.org
Sound event detection (SED) is a reasonable choice in a number of application domains
including cattle sheds, dense forests, or any dark environments where visual objects are …

Proposal-based few-shot sound event detection for speech and environmental sounds with perceivers

P Wolters, L Sizemore, C Daw, B Hutchinson… - arXiv preprint arXiv …, 2021 - arxiv.org
Many applications involve detecting and localizing specific sound events within long,
untrimmed documents, including keyword spotting, medical observation, and bioacoustic …

Human and Machine Performance in Counting Sound Classes in Single-Channel Soundscapes

J Abeßer, A Ullah, S Ziegler, S Grollmisch - Journal of the Audio …, 2023 - aes.org
Individual sounds are difficult to detect in complex soundscapes because of a strong
overlap. This article explores the task of estimating sound polyphony, which is defined here …

Internet of things (IoT) discovery using deep neural networks

E Lo, JH Kohl - Proceedings of the IEEE/CVF Winter …, 2020 - openaccess.thecvf.com
We present a novel approach to Internet of Things (IoT) discovery using Deep Neural
Network (DNN) based object detection. Traditional methods of IoT discovery are based on …

A deep learning model for detecting and classifying multiple marine mammal species from passive acoustic data

Q Hamard, MT Pham, D Cazau, K Heerah - Ecological Informatics, 2024 - Elsevier
Underwater passive acoustics is used worldwide for multi-year monitoring of marine
mammals. Yet, the large amount of audio recordings raises the need to automate the …

MusicYOLO: A sight-singing onset/offset detection framework based on object detection instead of spectrum frames

X Wang, W Xu, W Yang… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
In this paper, we propose MusicYOLO based on object detection to detect the onset and
offset in singing for the first time. The onset of the vocal is not as stable and clear as that of …

Audios Don't Lie: Multi-Frequency Channel Attention Mechanism for Audio Deepfake Detection

Y Feng - arXiv preprint arXiv:2412.09467, 2024 - arxiv.org
With the rapid development of artificial intelligence technology, the application of deepfake
technology in the audio field has gradually increased, resulting in a wide range of security …

A monophonic cow sound annotation tool using a semi-automatic method on audio/video data

YR Pandeya, B Bhattarai, U Afzaal, JB Kim, J Lee - Livestock Science, 2022 - Elsevier
In this paper, we present a semi-automatic tool for labeling monophonic sound events with
specific reference to cow sounds. The proposed system takes as input audio or video data …