- Academic Search

Encoder-decoder methods for text normalization

M Lusetti, T Ruzsics, A Göhring, T Samardžić, E Stark - 2018 - zora.uzh.ch

Text normalization is the task of mapping non-canonical language, typical of speech
transcription and computer-mediated communication, to a standardized writing. It is an up …

Cited by 71 · All 12 versions

[PDF] aclanthology.org

Dialect-to-standard normalization: A large-scale multilingual evaluation

O Kuparinen, A Miletić, Y Scherrer - Findings of the Association for …, 2023 - aclanthology.org

Text normalization methods have been commonly applied to historical language or user-
generated content, but less often to dialectal transcriptions. In this paper, we introduce …

Cited by 11 · All 9 versions

[PDF] mit.edu

All Mixed Up? Finding the Optimal Feature Set for General Readability Prediction and Its Application to English and Dutch

O De Clercq, V Hoste - Computational Linguistics, 2016 - direct.mit.edu

Readability research has a long and rich tradition, but there has been too little focus on
general readability prediction without targeting a specific audience or text genre. Moreover …

Cited by 73 · All 10 versions

[PDF] aclanthology.org

[PDF] Normalizing tweets with edit scripts and recurrent neural embeddings

G Chrupała - Proceedings of the 52nd Annual Meeting of the …, 2014 - aclanthology.org

Tweets often contain a large proportion of abbreviations, alternative spellings, novel words
and other non-canonical language. These features are problematic for standard language …

Cited by 91 · All 6 versions

[PDF] springer.com

Digitising Swiss German: how to process and study a polycentric spoken language

Y Scherrer, T Samardžić, E Glaser - Language Resources and Evaluation, 2019 - Springer

Swiss dialects of German are, unlike many dialects of other standardised languages, widely
used in everyday communication. Despite this fact, automatic processing of Swiss German is …

Cited by 47 · All 13 versions

[PDF] academia.edu

[PDF] Normalising Slovene data: historical texts vs. user-generated content

N Ljubešić, K Zupan, D Fišer, T Erjavec - Proceedings of the 13th …, 2016 - academia.edu

The paper presents two manually annotated Slovene language text normalisation datasets,
one of historical texts and the other of tweets, and proposes several variants of character …

Cited by 63 · All 8 versions

Social media text normalization for Turkish

G Eryiğit… - Natural Language …, 2017 - cambridge.org

Text normalization is an indispensable stage in processing noncanonical language from
natural sources, such as speech, social media or short text messages. Research in this field …

Cited by 56 · All 6 versions

[PDF] aclanthology.org

Multi-modular domain-tailored OCR post-correction

S Schulz, J Kuhn - Proceedings of the 2017 Conference on …, 2017 - aclanthology.org

One of the main obstacles for many Digital Humanities projects is the low data availability.
Texts have to be digitized in an expensive and time consuming process whereas Optical …

Cited by 52 · All 3 versions

[PDF] academia.edu

[PDF] Automatic normalisation of the Swiss German ArchiMob corpus using character-level machine translation

Y Scherrer, N Ljubešić - Proceedings of the 13th conference on …, 2016 - academia.edu

The Swiss German dialect corpus ArchiMob poses great challenges for NLP and
corpus linguistic research due to the massive amount of variation found in the transcriptions …

Cited by 47 · All 7 versions

[HTML] sciencedirect.com

[HTML] Graph-based Turkish text normalization and its impact on noisy text processing

S Demir, B Topcu - Engineering Science and Technology, an International …, 2022 - Elsevier

User generated texts on the web are freely-available and lucrative sources of data for
language technology researchers. Unfortunately, these texts are often dominated by …

Cited by 10 · All 2 versions
