Having beer after prayer? Measuring cultural bias in large language models
As the reach of large language models (LMs) expands globally, their ability to cater to
diverse cultural contexts becomes crucial. Despite advancements in multilingual …
Algorithms and theory for multiple-source adaptation
We present a number of novel contributions to the multiple-source adaptation problem. We
derive new normalized solutions with strong theoretical guarantees for the cross-entropy …
Validating large language models with relm
Although large language models (LLMs) have been touted for their ability to generate
natural-sounding text, there are growing concerns around possible negative effects of LLMs …
Neural models of text normalization for speech applications
Machine learning, including neural network techniques, have been applied to
virtually every domain in natural language processing. One problem that has been …
Hierarchical structure guides rapid linguistic predictions during naturalistic listening
The grammar, or syntax, of human language is typically understood in terms of abstract
hierarchical structures. However, theories of language processing that emphasize …
RNN approaches to text normalization: A challenge
This paper presents a challenge to the community: given a large corpus of written text
aligned to its normalized spoken form, train an RNN to learn the correct normalization …
Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework
This paper provides an analysis of several practical issues related to the theory and
implementation of Grapheme-to-Phoneme (G2P) conversion systems utilizing the Weighted …
The SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion
K Gorman, LFE Ashby, A Goyzueta… - Proceedings of the …, 2020 - aclanthology.org
We describe the design and findings of the SIGMORPHON 2020 shared task on multilingual
grapheme-to-phoneme conversion. Participants were asked to submit systems which take in …
The Kestrel TTS text normalization system
P Ebden, R Sproat - Natural Language Engineering, 2015 - cambridge.org
This paper describes the Kestrel text normalization system, a component of the Google text-
to-speech synthesis (TTS) system. At the core of Kestrel are text-normalization grammars …
Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census
We aim to build a vast database (up to 9 million individuals) from the handwritten tabular
nominal census of Paris of 1926, 1931 and 1936, each composed of about 100,000 …