LinCE: A centralized benchmark for linguistic code-switching evaluation

G Aguilar, S Kar, T Solorio - arXiv preprint arXiv:2005.04322, 2020 - arxiv.org
Recent trends in NLP research have raised an interest in linguistic code-switching (CS);
modern approaches have been proposed to solve a wide range of NLP tasks on multiple …

Fighting hate speech from bilingual Hinglish speaker's perspective, a transformer- and translation-based approach

S Biradar, S Saumya, A Chauhan - Social Network Analysis and Mining, 2022 - Springer
Many people have begun to use social media platforms due to the increased use of the
Internet over the previous decade. It has a lot of benefits, but it also comes with a lot of risks …

Does mapo tofu contain coffee? Probing LLMs for food-related cultural knowledge

L Zhou, T Karidi, W Liu, N Garneau, Y Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies have highlighted the presence of cultural biases in Large Language Models
(LLMs), yet often lack a robust methodology to dissect these phenomena comprehensively …

Multilingual code-switching for zero-shot cross-lingual intent prediction and slot filling

J Krishnan, A Anastasopoulos, H Purohit… - arXiv preprint arXiv …, 2021 - arxiv.org
Predicting user intent and detecting the corresponding slots from text are two key problems
in Natural Language Understanding (NLU). In the context of zero-shot learning, this task is …

Does aggression lead to hate? Detecting and reasoning offensive traits in Hinglish code-mixed texts

A Sengupta, SK Bhattacharjee, MS Akhtar… - Neurocomputing, 2022 - Elsevier
Aggression is a prominent trait of human beings that can affect social harmony in a negative
way. The hate mongers misuse the freedom of speech in social media platforms to flood with …

A code-mixed task-oriented dialog dataset for medical domain

S Dowlagar, R Mamidi - Computer Speech & Language, 2023 - Elsevier
In the healthcare domain, medical and patient interactions form a crucial part of the
diagnosis. Initially, the AI models developed for healthcare centered only on monolingual …

CALCS 2021 shared task: Machine translation for code-switched data

S Chen, G Aguilar, A Srinivasan, M Diab… - arXiv preprint arXiv …, 2022 - arxiv.org
To date, efforts in the code-switching literature have focused for the most part on language
identification, POS, NER, and syntactic parsing. In this paper, we address machine …

Can you traducir this? Machine translation for code-switched input

J Xu, F Yvon - arXiv preprint arXiv:2105.04846, 2021 - arxiv.org
Code-Switching (CSW) is a common phenomenon that occurs in multilingual geographic or
social contexts, which raises challenging problems for natural language processing tools …

Char2Subword: Extending the subword embedding space using robust character compositionality

G Aguilar, B McCann, T Niu, N Rajani, N Keskar… - arXiv preprint arXiv …, 2020 - arxiv.org
Byte-pair encoding (BPE) is a ubiquitous algorithm in the subword tokenization process of
language models as it provides multiple benefits. However, this process is solely based on …

A Comprehensive Understanding of Code-Mixed Language Semantics Using Hierarchical Transformer

T Suresh, A Sengupta, MS Akhtar… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Being a popular mode of text-based communication in multilingual communities, code
mixing in online social media has become an important subject to study. Learning the …