Google Scholar

Automatic speech recognition and speech variability: A review

M Benzeghiba, R De Mori, O Deroo, S Dupont… - Speech …, 2007 - Elsevier

Major progress is being recorded regularly on both the technology and exploitation of
automatic speech recognition (ASR) and spoken language systems. However, there are still …

Cited by 784

[PDF] ed.ac.uk

Speech production knowledge in automatic speech recognition

S King, J Frankel, K Livescu, E McDermott… - The Journal of the …, 2007 - pubs.aip.org

Although much is known about how speech is produced, and research into speech
production has resulted in measured articulatory data, feature systems of different kinds, and …

Cited by 255

[PDF] arxiv.org

Hawkes processes for events in social media

MA Rizoiu, Y Lee, S Mishra, L Xie - Frontiers of multimedia research, 2017 - dl.acm.org

This chapter provides an accessible introduction for point processes, and especially Hawkes
processes, for modeling discrete, inter-dependent events over continuous time. We start by …

Cited by 197

[PDF] arxiv.org

Deep learning for video classification and captioning

Z Wu, T Yao, Y Fu, YG Jiang - Frontiers of multimedia research, 2017 - dl.acm.org

Today's digital contents are inherently multimedia: text, audio, image, video, and so on.
Video, in particular, has become a new way of communication between Internet users with …

Cited by 168

[PDF] researchgate.net

Making deep belief networks effective for large vocabulary continuous speech recognition

TN Sainath, B Kingsbury… - … IEEE Workshop on …, 2011 - ieeexplore.ieee.org

To date, there has been limited work in applying Deep Belief Networks (DBNs) for acoustic
modeling in LVCSR tasks, with past work using standard speech features. However, a …

Cited by 251

[PDF] psu.edu

Real-world acoustic event detection

X Zhuang, X Zhou, MA Hasegawa-Johnson… - Pattern recognition …, 2010 - Elsevier

Acoustic Event Detection (AED) aims to identify both timestamps and types of events in an
audio stream. This becomes very challenging when going beyond restricted highlight events …

Cited by 220

[PDF] psu.edu

Short-time phase spectrum in speech processing: A review and some experimental results

LD Alsteris, KK Paliwal - Digital signal processing, 2007 - Elsevier

Incorporating information from the short-time phase spectrum into a feature set for automatic
speech recognition (ASR) may possibly serve to improve recognition accuracy. Currently …

Cited by 169

[PDF] ed.ac.uk

Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU summer workshop

K Livescu, O Cetin… - … , Speech and Signal …, 2007 - ieeexplore.ieee.org

We report on investigations, conducted at the 2006 Johns Hopkins Workshop, into the use of
articulatory features (AFs) for observation and pronunciation models in speech recognition …

Cited by 161

[PDF] thecvf.com

Cross-modal speaker verification and recognition: A multilingual perspective

S Nawaz, MS Saeed, P Morerio… - Proceedings of the …, 2021 - openaccess.thecvf.com

Recent years have seen a surge in finding association between faces and voices within a
cross-modal biometric application along with speaker recognition. Inspired from this, we …

Cited by 29

[PDF] usc.edu

A robust frontend for VAD: exploiting contextual, discriminative and spectral cues of human voice

M Van Segbroeck, A Tsiartas, SS Narayanan - Interspeech, 2013 - sail.usc.edu

Reliable automatic detection of speech/non-speech activity in degraded, noisy audio signals
is a fundamental and challenging task in robust signal processing. As various speech …

Cited by 84

Tandem acoustic modeling in large-vocabulary recognition

Automatic speech recognition and speech variability: A review
