Mind your inflections! Improving NLP for non-standard Englishes with Base-Inflection Encoding

S Tan, S Joty, LR Varshney, MY Kan - arXiv preprint arXiv:2004.14870, 2020 - arxiv.org
Inflectional variation is a common feature of World Englishes such as Colloquial Singapore
English and African American Vernacular English. Although comprehension by human …

Scope is all you need: Transforming LLMs for HPC Code

T Kadosh, N Hasabnis, VA Vo, N Schneider… - arXiv preprint arXiv …, 2023 - arxiv.org
With easier access to powerful compute resources, there is a growing trend in the field of AI
for software development to develop larger and larger language models (LLMs) to address a …

Morphological analysis and disambiguation for Gulf Arabic: The interplay between resources and methods

S Khalifa, N Zalmout, N Habash - Proceedings of the Twelfth …, 2020 - aclanthology.org
In this paper we present the first full morphological analysis and disambiguation system for
Gulf Arabic. We use an existing state-of-the-art morphological disambiguation system to …

The paradigm discovery problem

A Erdmann, M Elsner, S Wu, R Cotterell… - arXiv preprint arXiv …, 2020 - arxiv.org
This work treats the paradigm discovery problem (PDP), the task of learning an inflectional
morphological system from unannotated sentences. We formalize the PDP and develop …

Deep Active Learning for Morphophonological Processing

SM Mirbostani, Y Boreshban, S Khalifa… - Proceedings of the …, 2023 - aclanthology.org
Building a system for morphological processing is a challenging task in morphologically
complex languages like Arabic. Although there are some deep learning based models that …

Analysis of Subword based Word Representations Case Study: Fasttext Malayalam

MR Vivek, P Chandran - 2022 IEEE 19th India Council …, 2022 - ieeexplore.ieee.org
Representation learning has played a crucial role in various natural language processing
tasks since the advent of deep neural networks. Recent research shows that word …

Towards learning Arabic morphophonology

S Khalifa, J Kodner, O Rambow - Proceedings of the Seventh …, 2022 - aclanthology.org
One core challenge facing morphological inflection systems is capturing language-specific
morphophonological changes. This is particularly true of languages like Arabic which are …

Towards robust complexity indices in linguistic typology: A corpus-based assessment

YM Oh, F Pellegrino - Studies in Language, 2023 - jbe-platform.com
There is high hope that corpus-based approaches to language complexity will contribute to
explaining linguistic diversity. Several complexity indices have consequently been proposed …

Unsupervised Arabic dialect segmentation for machine translation

W Salloum, N Habash - Natural Language Engineering, 2022 - cambridge.org
Resource-limited and morphologically rich languages pose many challenges to natural
language processing tasks. Their highly inflected surface forms inflate the vocabulary size …

Geometric Patterns in Text and Multilingual NLP

O Pelloni - 2023 - zora.uzh.ch
Linguistics as a science is concerned with classifying languages according to different
properties in order to understand how language works. There are several well-known ways …