Arabizi detection and conversion to Arabic
K Darwish - arXiv preprint arXiv:1306.6755, 2013 - arxiv.org
Arabizi is Arabic text that is written using Latin characters. Arabizi is used to present both
Modern Standard Arabic (MSA) and Arabic dialects. It is commonly used in informal settings …
A rule-based Kurdish text transliteration system
S Ahmadi - ACM Transactions on Asian and Low-Resource …, 2019 - dl.acm.org
In this article, we present a rule-based approach for transliterating two of the most used
orthographies in Sorani Kurdish. Our work consists of detecting a character in a word by …
Design challenges in named entity transliteration
We analyze some of the fundamental design challenges that impact the development of a
multilingual state-of-the-art named entity transliteration system, including curating bilingual …
[PDF][PDF] A statistical model for unsupervised and semi-supervised transliteration mining
We propose a novel model to automatically extract transliteration pairs from parallel corpora.
Our model is efficient, language pair independent and mines transliteration pairs in a …
[PDF][PDF] Report of NEWS 2010 transliteration mining shared task
This report documents the details of the Transliteration Mining Shared Task that was run as
a part of the Named Entities Workshop (NEWS 2010), an ACL 2010 workshop. The shared …
Context-aware correction of spelling errors in Hungarian medical documents
Owing to the growing need of acquiring medical data from clinical records, processing such
documents is an important topic in natural language processing (NLP). However, for general …
Machine‐Based Transliterate of Ottoman to Latin‐Based Script
In this paper, a machine‐based transliterate is presented. The automatic transliteration of
Ottoman to the modern Latin Turkish script can open a big window for scientists in fields of …
Context-aware correction of spelling errors in Hungarian medical documents
In our paper, we present a method for automated correction of spelling errors in Hungarian
clinical records. We model the problem of spelling correction as a translation task, where the …
[PDF][PDF] Improved transliteration mining using graph reinforcement
A El Kahki, K Darwish, AS El Din… - Proceedings of the …, 2011 - aclanthology.org
Mining of transliterations from comparable or parallel text can enhance natural language
processing applications such as machine translation and cross language information …
The attention automaton: Sensing collective user interests in social network communities
The vast quantity of information shared in social networks has brought us to an age of
attention scarcity, where getting users to be attentive to a message is not a given. In fact, it …