A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

From single- to multi-modal remote sensing imagery interpretation: A survey and taxonomy

X Sun, Y Tian, W Lu, P Wang, R Niu, H Yu… - Science China Information …, 2023 - Springer
Modality is a source or form of information. Through various modal information, humans can
perceive the world from multiple perspectives. Simultaneously, the observation of remote …

Robust speech recognition via large-scale weak supervision

A Radford, JW Kim, T Xu, G Brockman… - International …, 2023 - proceedings.mlr.press
We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …

XLS-R: Self-supervised cross-lingual speech representation learning at scale

A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper presents XLS-R, a large-scale model for cross-lingual speech representation
learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a …

HuBERT: Self-supervised speech representation learning by masked prediction of hidden units

WN Hsu, B Bolte, YHH Tsai, K Lakhotia… - … ACM transactions on …, 2021 - ieeexplore.ieee.org
Self-supervised approaches for speech representation learning are challenged by three
unique problems: (1) there are multiple sound units in each input utterance, (2) there is no …

TS2Vec: Towards universal representation of time series

Z Yue, Y Wang, J Duan, T Yang, C Huang… - Proceedings of the …, 2022 - ojs.aaai.org
This paper presents TS2Vec, a universal framework for learning representations of time
series at an arbitrary semantic level. Unlike existing methods, TS2Vec performs contrastive …

Going deeper with image transformers

H Touvron, M Cord, A Sablayrolles… - Proceedings of the …, 2021 - openaccess.thecvf.com
Transformers have been recently adapted for large scale image classification, achieving
high scores shaking up the long supremacy of convolutional neural networks. However the …

W2v-BERT: Combining contrastive learning and masked language modeling for self-supervised speech pre-training

YA Chung, Y Zhang, W Han, CC Chiu… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Motivated by the success of masked language modeling (MLM) in pre-training natural
language processing models, we propose w2v-BERT that explores MLM for self-supervised …

FLEURS: Few-shot learning evaluation of universal representations of speech

A Conneau, M Ma, S Khanuja, Y Zhang… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
We introduce FLEURS, the Few-shot Learning Evaluation of Universal Representations of
Speech benchmark. FLEURS is an n-way parallel speech dataset in 102 languages built on …

Unsupervised speech recognition

A Baevski, WN Hsu, A Conneau… - Advances in Neural …, 2021 - proceedings.neurips.cc
Despite rapid progress in the recent past, current speech recognition systems still require
labeled training data which limits this technology to a small fraction of the languages spoken …