A comprehensive survey on optimizing deep learning models by metaheuristics

B Akay, D Karaboga, R Akay - Artificial Intelligence Review, 2022 - Springer
Deep neural networks (DNNs), which are extensions of artificial neural networks, can learn
higher levels of feature hierarchy established by lower level features by transforming the raw …

[BOOK][B] C-ORAL-ROM: integrated reference corpora for spoken Romance languages

E Cresti, M Moneglia - 2008 - degruyter.com
The C-ORAL-ROM book and DVD provide a unique set of comparable
corpora of spontaneous speech for the main Romance languages, French, Italian …

A multichannel MMSE-based framework for speech source separation and noise reduction

M Souden, S Araki, K Kinoshita… - … on Audio, Speech …, 2013 - ieeexplore.ieee.org
We propose a new framework for joint multichannel speech source separation and acoustic
noise reduction. In this framework, we start by formulating the minimum-mean-square error …

A probabilistic model of phonological relationships from contrast to allophony

KC Hall - 2009 - rave.ohiolink.edu
This dissertation proposes a model of phonological relationships, the Probabilistic
Phonological Relationship Model (PPRM), that quantifies how predictably distributed two …

Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera

T Hori, S Araki, T Yoshioka, M Fujimoto… - IEEE transactions on …, 2011 - ieeexplore.ieee.org
This paper presents our real-time meeting analyzer for monitoring conversations in an
ongoing group meeting. The goal of the system is to recognize automatically “who is …

Neural architecture search for LF-MMI trained time delay neural networks

S Hu, X Xie, M Cui, J Deng, S Liu, J Yu… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
State-of-the-art automatic speech recognition (ASR) system development is data and
computation intensive. The optimal design of deep neural networks (DNNs) for these …

[PDF] Neural Error Corrective Language Models for Automatic Speech Recognition

T Tanaka, R Masumura, H Masataki, Y Aono - INTERSPEECH, 2018 - isca-archive.org
We present novel neural network based language models that can correct automatic speech
recognition (ASR) errors by using speech recognizer output as a context. These models …

Meeting recognition with asynchronous distributed microphone array using block-wise refinement of mask-based MVDR beamformer

S Araki, N Ono, K Kinoshita… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
This paper addresses a front-end system for speech recognition of spontaneous
conversational speech signals that are recorded with asynchronous distributed microphones …

Probabilistic spatial dictionary based online adaptive beamforming for meeting recognition in noisy and reverberant environments

N Ito, S Araki, M Delcroix… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Here we propose online adaptive beamforming for automatic speech recognition (ASR) in
meetings in noisy, reverberant environments. The proposed method is based on recently …

Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition

S Araki, M Okada, T Higuchi, A Ogawa… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org
This paper addresses a minimum variance distortionless response (MVDR) beamforming
based speech enhancement approach for meeting speech recognition. In a meeting …