Multimedia data mining: state of the art and challenges

CA Bhatt, MS Kankanhalli - Multimedia Tools and Applications, 2011 - Springer
Advances in multimedia data acquisition and storage technology have led to the growth of
very large multimedia databases. Analyzing this huge amount of multimedia data to discover …

Real-world acoustic event detection

X Zhuang, X Zhou, MA Hasegawa-Johnson… - Pattern recognition …, 2010 - Elsevier
Acoustic Event Detection (AED) aims to identify both timestamps and types of events in an
audio stream. This becomes very challenging when going beyond restricted highlight events …

Time–frequency matrix feature extraction and classification of environmental audio signals

B Ghoraani, S Krishnan - IEEE transactions on audio, speech …, 2011 - ieeexplore.ieee.org
Audio feature extraction and classification are important tools for audio signal analysis in
many applications, such as multimedia indexing and retrieval, and auditory scene analysis …

Scream and gunshot detection in noisy environments

L Gerosa, G Valenzise, M Tagliasacchi… - 2007 15th European …, 2007 - ieeexplore.ieee.org
This paper describes an audio event detection system which automatically classifies an
audio event as ambient noise, scream or gunshot. The classification system uses two …

Acoustic event detection and classification

A Temko, C Nadeu, D Macho, R Malkin… - Computers in the human …, 2009 - Springer
The human activity that takes place in meeting rooms or classrooms is reflected in a rich
variety of acoustic events (AE), produced either by the human body or by objects handled by …

Stacked auto-encoders based visual features for speech/music classification

A Kumar, SS Solanki, M Chandra - Expert Systems with Applications, 2022 - Elsevier
With the rapid rise of online available content, multimedia signal processing has become an
important area of research. The output of the speech/music classifier (SMC) is further used …

Feature analysis and selection for acoustic event detection

X Zhuang, X Zhou, TS Huang… - … on acoustics, speech …, 2008 - ieeexplore.ieee.org
Speech perceptual features, such as Mel-frequency Cepstral Coefficients (MFCC), have
been widely used in acoustic event detection. However, the different spectral structures …

A large TV dataset for speech and music activity detection

YN Hung, CW Wu, I Orife, A Hipple, W Wolcott… - EURASIP Journal on …, 2022 - Springer
Automatic speech and music activity detection (SMAD) is an enabling task that can help
segment, index, and pre-process audio content in radio broadcast and TV programs …

Machine-learning based classification of speech and music

MKS Khan, WG Al-Khatib - Multimedia Systems, 2006 - Springer
The need to classify audio into categories such as speech or music is an important aspect of
many multimedia document retrieval systems. In this paper, we investigate audio features …

Enhanced audio classification leveraging pre-trained deep visual models

A Kumar, R Kumar, M Chandra - Engineering Applications of Artificial …, 2025 - Elsevier
The differentiation between speech and music poses a prevalent issue in audio analytics,
specifically in dividing audio streams into segments and accurately labeling them as either …