- Academic Search

G Shen, M Watkins, A Alishahi, A Bisazza - arXiv preprint arXiv …, 2024 - arxiv.org

Interpretability research has shown that self-supervised Spoken Language Models (SLMs)
encode a wide variety of features in human speech from the acoustic, phonetic …

Cited by: 3

Computational Modelling of Tone Perception Based on Direct Processing of f₀ Contours

Y Chen, Y Gao, Y Xu - Brain Sciences, 2022 - mdpi.com

It has been widely assumed that in speech perception it is imperative to first detect a set of
distinctive properties or features and then use them to recognize phonetic units like …

Cited by: 10

Decoupling recognition and transcription in Mandarin ASR

J Yuan, X Cai, D Gao, R Zheng… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

Much of the recent literature on automatic speech recognition (ASR) is taking an end-to-end
approach. Unlike English where the writing system is closely related to sound, Chinese …

Cited by: 11

Multi-variant consistency based self-supervised learning for robust automatic speech recognition

C Gao, G Cheng, P Zhang - arXiv preprint arXiv:2112.12522, 2021 - arxiv.org

Automatic speech recognition (ASR) has shown rapid advances in recent years but still
degrades significantly in far-field and noisy environments. The recent development of self …

Cited by: 4

Improved contextualized speech representations for tonal analysis

J Yuan, X Cai, K Church - Proceedings of Interspeech, 2023 - isca-archive.org

We propose fine-tuning wav2vec 2.0 with a cross-entropy loss to classify tones in an
utterance on a frame-by-frame basis. Our study demonstrates that this approach not only …

Cited by: 3

A layer-wise analysis of Mandarin and English suprasegmentals in SSL speech models

A de la Fuente, D Jurafsky - arXiv preprint arXiv:2408.13678, 2024 - arxiv.org

This study asks how self-supervised speech models represent suprasegmental categories
like Mandarin lexical tone, English lexical stress, and English phrasal accents. Through a …

Automated Tone Transcription and Clustering with Tone2Vec

Y Yang, Y Wang, ZQ Tang, J Yuan - arXiv preprint arXiv:2410.02324, 2024 - arxiv.org

Lexical tones play a crucial role in Sino-Tibetan languages. However, current phonetic
fieldwork relies on manual effort, resulting in substantial time and financial costs. This is …

Deep Prosodic Features in Tandem with Perceptual Judgments of Word Reduction for Tone Recognition in Conversed Speech

XL Lu, YF Liu - Proc. Interspeech 2024, 2024 - isca-archive.org

To tackle the tone classification problem in conversational speech, we propose a
transformer-based encoding network to classify tones in an utterance on a syllable-by …

Low-Resource Speech Recognition for Thousands of Languages

X Li - 2023 - kilthub.cmu.edu

Recently, the performance of speech recognition has witnessed rapid improvement due to
modern architectures. Those models typically require thousands of hours of training data for …

Cited by: 1

Data Augmentation for the Post-Stroke Speech Transcription (PSST) Challenge: Sometimes Less is More

J Yuan, X Cai, K Church - … of the RaPID Workshop-Resources and …, 2022 - aclanthology.org

We employ the method of fine-tuning wav2vec 2.0 for recognition of phonemes in aphasic
speech. Our effort focuses on data augmentation, by supplementing data from both in …

