Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

Four generations of high-dimensional neural network potentials

J Behler - Chemical Reviews, 2021 - ACS Publications
Since their introduction about 25 years ago, machine learning (ML) potentials have become
an important tool in the field of atomistic simulations. After the initial decade, in which neural …

TransMorph: Transformer for unsupervised medical image registration

J Chen, EC Frey, Y He, WP Segars, Y Li, Y Du - Medical image analysis, 2022 - Elsevier
In the last decade, convolutional neural networks (ConvNets) have been a major focus of
research in medical image analysis. However, the performances of ConvNets may be limited …

Speech recognition using deep neural networks: A systematic review

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE Access, 2019 - ieeexplore.ieee.org
Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Activation functions: Comparison of trends in practice and research for deep learning

C Nwankpa, W Ijomah, A Gachagan… - arXiv preprint arXiv …, 2018 - arxiv.org
Deep neural networks have been successfully used in diverse emerging domains to solve
real world complex problems with many more deep learning (DL) architectures, being …

Dataset condensation with gradient matching

B Zhao, KR Mopuri, H Bilen - arXiv preprint arXiv:2006.05929, 2020 - arxiv.org
As the state-of-the-art machine learning methods in many fields rely on larger datasets,
storing datasets and training models on them become significantly more expensive. This …

Deep learning in mining biological data

M Mahmud, MS Kaiser, TM McGinnity, A Hussain - Cognitive computation, 2021 - Springer
Recent technological advancements in data acquisition tools allowed life scientists to
acquire multimodal data from different biological application domains. Categorized in three …

AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline

H Bu, J Du, X Na, B Wu, H Zheng - … of the oriental chapter of the …, 2017 - ieeexplore.ieee.org
An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the
largest corpus which is suitable for conducting the speech recognition research and building …

Single-image crowd counting via multi-column convolutional neural network

Y Zhang, D Zhou, S Chen, S Gao… - Proceedings of the …, 2016 - openaccess.thecvf.com
This paper aims to develop a method that can accurately estimate the crowd count from an
individual image with arbitrary crowd density and arbitrary perspective. To this end, we have …

TensorFlow: A system for large-scale machine learning

M Abadi, P Barham, J Chen, Z Chen, A Davis… - … USENIX symposium on …, 2016 - usenix.org
TensorFlow is a machine learning system that operates at large scale and in heterogeneous
environments. TensorFlow uses dataflow graphs to represent computation, shared state …