A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

data2vec: A general framework for self-supervised learning in speech, vision and language

A Baevski, WN Hsu, Q Xu, A Babu… - … on Machine Learning, 2022 - proceedings.mlr.press
While the general idea of self-supervised learning is identical across modalities, the actual
algorithms and objectives differ widely because they were developed with a single modality …
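
The unified objective behind data2vec can be made concrete with a short sketch: a student network regresses the contextualized representations that an exponential-moving-average (EMA) teacher produces for the unmasked input, at the masked positions. The toy PyTorch version below is illustrative only; the encoder, dimensions, masking rate, and the use of just the final teacher layer (the paper averages and normalizes several top layers) are assumptions, not the paper's exact setup.

```python
# Minimal sketch of a data2vec-style objective: a student encoder regresses
# the contextualized targets an EMA teacher computes on the unmasked input.
import copy
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, dim=256, layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.net = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, x):              # x: (batch, time, dim)
        return self.net(x)

dim = 256
student = Encoder(dim)
teacher = copy.deepcopy(student)       # EMA copy, never updated by gradient
for p in teacher.parameters():
    p.requires_grad_(False)
mask_emb = nn.Parameter(torch.randn(dim))

def data2vec_loss(x, mask):
    # Teacher encodes the full input to produce contextualized targets.
    with torch.no_grad():
        targets = teacher(x)
    # Student sees the masked view: masked timesteps replaced by mask_emb.
    x_masked = torch.where(mask.unsqueeze(-1), mask_emb.expand_as(x), x)
    preds = student(x_masked)
    # Regress the targets only at masked positions.
    return nn.functional.smooth_l1_loss(preds[mask], targets[mask])

@torch.no_grad()
def ema_update(tau=0.999):
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(tau).add_(ps, alpha=1 - tau)

x = torch.randn(2, 50, dim)            # toy feature sequence
mask = torch.rand(2, 50) < 0.15        # 15% of timesteps masked
loss = data2vec_loss(x, mask)
loss.backward(); ema_update()
```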

FLAVA: A foundational language and vision alignment model

A Singh, R Hu, V Goswami… - Proceedings of the …, 2022 - openaccess.thecvf.com
State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic
pretraining for obtaining good performance on a variety of downstream tasks. Generally …
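
The "alignment" part of such pretraining is typically a global image-text contrastive term. Below is a minimal sketch of the symmetric InfoNCE loss of the kind FLAVA combines with unimodal and multimodal masked objectives (not shown here); the pooled embeddings, batch size, and temperature are placeholder assumptions.

```python
# Sketch of a global image-text contrastive alignment loss (symmetric
# InfoNCE): matched pairs sit on the diagonal of the similarity matrix.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(img_emb, txt_emb, temperature=0.07):
    # img_emb, txt_emb: (batch, dim) pooled outputs of the two encoders.
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature    # (batch, batch) similarities
    labels = torch.arange(len(img))         # matched pairs on the diagonal
    # Image-to-text and text-to-image cross-entropy, averaged.
    return 0.5 * (F.cross_entropy(logits, labels)
                  + F.cross_entropy(logits.t(), labels))

loss = contrastive_alignment_loss(torch.randn(8, 512), torch.randn(8, 512))
```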

WavLM: Large-scale self-supervised pre-training for full stack speech processing

S Chen, C Wang, Z Chen, Y Wu, S Liu… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Self-supervised learning (SSL) has achieved great success in speech recognition, but other
speech processing tasks remain comparatively underexplored. As speech signal …
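
WavLM's core recipe is masked speech denoising and prediction: the input is corrupted by mixing in an interfering utterance, while the targets are discrete pseudo-labels of the clean speech (HuBERT-style k-means codes) predicted at masked positions. The sketch below is a toy rendering of that idea; the GRU encoder, mixing rule, codebook size, and shapes are stand-ins, not the paper's architecture.

```python
# Toy sketch of WavLM-style masked speech denoising and prediction.
import torch
import torch.nn as nn

vocab, dim = 100, 256                 # pseudo-label codebook size, feature dim
encoder = nn.GRU(dim, dim, num_layers=2, batch_first=True)  # stand-in encoder
head = nn.Linear(dim, vocab)
mask_emb = nn.Parameter(torch.randn(dim))

def utterance_mix(x, other, scale=0.3):
    # Overlap part of an interfering utterance onto the main one.
    t = x.size(1) // 2
    x = x.clone()
    x[:, :t] = x[:, :t] + scale * other[:, :t]
    return x

def wavlm_style_loss(noisy, targets, mask):
    # noisy: corrupted features; targets: (batch, time) clean pseudo-labels.
    x = torch.where(mask.unsqueeze(-1), mask_emb.expand_as(noisy), noisy)
    h, _ = encoder(x)
    logits = head(h)
    # Predict the *clean* labels only where the input was masked, so the
    # model must denoise and model speech structure at the same time.
    return nn.functional.cross_entropy(logits[mask], targets[mask])

feats = torch.randn(2, 40, dim)                      # clean features
noisy = utterance_mix(feats, torch.randn(2, 40, dim))
labels = torch.randint(0, vocab, (2, 40))            # stand-in k-means codes
mask = torch.rand(2, 40) < 0.2
loss = wavlm_style_loss(noisy, labels, mask)
```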

SUPERB: Speech processing Universal PERformance Benchmark

S Yang, PH Chi, YS Chuang, CIJ Lai… - arXiv preprint arXiv …, 2021 - arxiv.org
Self-supervised learning (SSL) has proven vital for advancing research in natural language
processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on …
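
SUPERB's evaluation protocol keeps the pretrained upstream model frozen and trains, per task, only a learnable weighted sum over the upstream's layer outputs plus a lightweight prediction head. A minimal sketch of that recipe, with random stand-in features in place of a real upstream model:

```python
# Sketch of the SUPERB recipe: frozen upstream, learnable layer weights,
# and a small task head are the only trained components.
import torch
import torch.nn as nn

class WeightedLayerSum(nn.Module):
    def __init__(self, num_layers):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_feats):       # (layers, batch, time, dim)
        w = torch.softmax(self.weights, dim=0)
        return (w.view(-1, 1, 1, 1) * layer_feats).sum(dim=0)

num_layers, dim, num_classes = 12, 256, 5
combine = WeightedLayerSum(num_layers)
head = nn.Linear(dim, num_classes)        # lightweight task head

# Frozen upstream features for one batch (random stand-ins here).
with torch.no_grad():
    feats = torch.randn(num_layers, 2, 40, dim)

pooled = combine(feats).mean(dim=1)       # mean-pool over time
logits = head(pooled)                     # e.g. utterance classification
```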

w2v-BERT: Combining contrastive learning and masked language modeling for self-supervised speech pre-training

YA Chung, Y Zhang, W Han, CC Chiu… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Motivated by the success of masked language modeling (MLM) in pre-training natural
language processing models, we propose w2v-BERT that explores MLM for self-supervised …
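
The combination the title refers to can be sketched as two stacked losses: a lower contrastive module that induces a discrete token inventory (wav2vec 2.0-style), and an upper module trained with masked-prediction cross-entropy over those token IDs. The code below is a deliberately simplified toy (linear stand-ins for the conformer stacks, a fixed rather than learned codebook), meant only to show how the two terms are summed.

```python
# Toy sketch of the w2v-BERT idea: contrastive token learning plus MLM
# over the resulting token IDs, optimized jointly.
import torch
import torch.nn as nn
import torch.nn.functional as F

dim, codebook_size = 256, 64
codebook = nn.Parameter(torch.randn(codebook_size, dim))
contrastive_enc = nn.Linear(dim, dim)   # stand-in for the lower stack
mlm_enc = nn.Linear(dim, dim)           # stand-in for the upper stack
mlm_head = nn.Linear(dim, codebook_size)

def w2v_bert_losses(x, mask):
    # x: (batch, time, dim) features. Assign each frame its nearest
    # codebook entry to serve as the discrete target ID.
    with torch.no_grad():
        dists = torch.cdist(x, codebook.unsqueeze(0).expand(len(x), -1, -1))
        ids = dists.argmin(-1)
    c = contrastive_enc(x)              # contextual vectors
    # Contrastive term: at masked frames, the context vector should be
    # closer to its own target entry than to the other codebook entries.
    sims = F.normalize(c, dim=-1) @ F.normalize(codebook, dim=-1).t()
    l_contrastive = F.cross_entropy(sims[mask] / 0.1, ids[mask])
    # MLM term: the upper stack predicts the same token IDs directly.
    logits = mlm_head(mlm_enc(c))
    l_mlm = F.cross_entropy(logits[mask], ids[mask])
    return l_contrastive + l_mlm

x = torch.randn(2, 30, dim)
mask = torch.rand(2, 30) < 0.3
loss = w2v_bert_losses(x, mask)
```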

Efficient self-supervised learning with contextualized target representations for vision, speech and language

A Baevski, A Babu, WN Hsu… - … Conference on Machine …, 2023 - proceedings.mlr.press
Current self-supervised learning algorithms are often modality-specific and require large
amounts of computational resources. To address these issues, we increase the training …
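
The efficiency gain comes largely from amortization: the expensive teacher pass is computed once per sample and its contextualized targets are reused across several differently masked student views. A toy sketch of that multi-masking idea, building on the data2vec-style setup above (linear stand-ins for the encoders; the paper additionally skips encoding masked timesteps, omitted here):

```python
# Sketch of multi-mask target reuse: one teacher pass, several student views.
import torch
import torch.nn as nn

dim = 256
student = nn.Linear(dim, dim)           # stand-in for the student encoder
teacher = nn.Linear(dim, dim)           # stand-in EMA teacher
mask_emb = nn.Parameter(torch.randn(dim))

def multi_mask_loss(x, num_masks=4, p=0.3):
    with torch.no_grad():
        targets = teacher(x)            # one teacher pass per sample
    loss = 0.0
    for _ in range(num_masks):          # reuse the targets for each mask
        mask = torch.rand(x.shape[:2]) < p
        xm = torch.where(mask.unsqueeze(-1), mask_emb.expand_as(x), x)
        preds = student(xm)
        loss = loss + nn.functional.mse_loss(preds[mask], targets[mask])
    return loss / num_masks

loss = multi_mask_loss(torch.randn(2, 40, dim))
```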

Transfer learning based physics-informed neural networks for solving inverse problems in engineering structures under different loading scenarios

C Xu, BT Cao, Y Yuan, G Meschke - Computer Methods in Applied …, 2023 - Elsevier
Recently, a class of machine learning methods called physics-informed neural networks
(PINNs) has been proposed and has gained prevalence in solving various scientific computing …
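
A PINN folds the governing equation into the loss via automatic differentiation, and transfer learning here means warm-starting training for a new loading scenario from weights fitted to a previous one. The sketch below illustrates this on a deliberately simple 1D equilibrium equation u''(x) + f(x) = 0 with fixed ends; the equation, network, and training schedule are illustrative, not the paper's engineering structures.

```python
# Toy PINN with transfer learning across loading scenarios.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))

def pinn_loss(f):
    # Collocation points inside the domain [0, 1].
    x = torch.rand(64, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = (d2u + f(x)).pow(2).mean()     # physics residual
    # Boundary conditions u(0) = u(1) = 0.
    bc = (net(torch.zeros(1, 1)).pow(2).mean()
          + net(torch.ones(1, 1)).pow(2).mean())
    return residual + bc

def train(f, steps=500):
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss = pinn_loss(f)
        loss.backward()
        opt.step()

# Source scenario: fit one load, then save the weights.
train(lambda x: torch.sin(torch.pi * x))
pretrained = {k: v.clone() for k, v in net.state_dict().items()}

# Target scenario: a different load; warm-start from the pretrained
# weights (transfer) and fine-tune for far fewer steps than from scratch.
net.load_state_dict(pretrained)
train(lambda x: 2 * torch.sin(torch.pi * x), steps=200)
```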

A fine-tuned wav2vec 2.0/HuBERT benchmark for speech emotion recognition, speaker verification and spoken language understanding

Y Wang, A Boumadane, A Heba - arXiv preprint arXiv:2111.02735, 2021 - arxiv.org
Speech self-supervised models such as wav2vec 2.0 and HuBERT are making revolutionary
progress in Automatic Speech Recognition (ASR). However, they have not been totally …
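
The benchmark's basic recipe, attaching a lightweight task head to a pretrained speech encoder and fine-tuning, can be sketched as follows. This uses wav2vec 2.0 via HuggingFace transformers as one concrete encoder choice; the checkpoint name, mean-pooling, and four-class emotion head are illustrative assumptions.

```python
# Sketch of fine-tuning a pretrained speech encoder for an utterance-level
# task such as speech emotion recognition.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
head = nn.Linear(encoder.config.hidden_size, 4)   # e.g. 4 emotion classes

def forward(waveform):                # waveform: (batch, samples) at 16 kHz
    hidden = encoder(waveform).last_hidden_state  # (batch, frames, dim)
    return head(hidden.mean(dim=1))   # mean-pool frames, then classify

logits = forward(torch.randn(2, 16000))           # one second of audio
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 2]))
```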