Recent developments in openSMILE, the Munich open-source multimedia feature extractor

F Eyben, F Weninger, F Gross, B Schuller - Proceedings of the 21st ACM …, 2013 - dl.acm.org
We present recent developments in the openSMILE feature extraction toolkit. Version 2.0
now unites feature extraction paradigms from speech, music, and general sound events with …

Authentication of smartphone users using behavioral biometrics

A Alzubaidi, J Kalita - IEEE Communications Surveys & …, 2016 - ieeexplore.ieee.org
Smartphones and tablets have become ubiquitous in our daily lives. Smartphones, in
particular, have become more than personal assistants. These devices have provided new …

Decoding speech perception from non-invasive brain recordings

A Défossez, C Caucheteux, J Rapin, O Kabeli… - Nature Machine …, 2023 - nature.com
Decoding speech from brain activity is a long-awaited goal in both healthcare and
neuroscience. Invasive devices have recently led to major milestones in this regard: deep …

SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - arXiv preprint arXiv …, 2021 - arxiv.org
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …

High-performance brain-to-text communication via handwriting

FR Willett, DT Avansino, LR Hochberg, JM Henderson… - Nature, 2021 - nature.com
Brain–computer interfaces (BCIs) can restore communication to people who have lost the
ability to move or speak. So far, a major focus of BCI research has been on restoring gross …

Wespeaker: A research and production oriented speaker embedding learning toolkit

H Wang, C Liang, S Wang, Z Chen… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Speaker modeling is essential for many related tasks, such as speaker recognition and
speaker diarization. The dominant modeling approach is fixed-dimensional vector …

Enabling factorized piano music modeling and generation with the MAESTRO dataset

C Hawthorne, A Stasyuk, A Roberts, I Simon… - arXiv preprint arXiv …, 2018 - arxiv.org
Generating musical audio directly with neural networks is notoriously difficult because it
requires coherently modeling structure at many different timescales. Fortunately, most music …

Look, listen, and learn more: Design choices for deep audio embeddings

AL Cramer, HH Wu, J Salamon… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
A considerable challenge in applying deep learning to audio classification is the scarcity of
labeled data. An increasingly popular solution is to learn deep audio embeddings from large …

Clipper: A Low-Latency online prediction serving system

D Crankshaw, X Wang, G Zhou, MJ Franklin… - … USENIX Symposium on …, 2017 - usenix.org
This paper is included in the Proceedings of the 14th USENIX Symposium on Networked
Systems Design and Implementation …

The voice conversion challenge 2018: Promoting development of parallel and nonparallel methods

J Lorenzo-Trueba, J Yamagishi, T Toda, D Saito… - arXiv preprint arXiv …, 2018 - arxiv.org
We present the Voice Conversion Challenge 2018, designed as a follow up to the 2016
edition with the aim of providing a common framework for evaluating and comparing …