- Academic Search

Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

Cited by 124 · 3 versions

[PDF] aclanthology.org

Natural language understanding of Devanagari script languages: Language identification, hate speech and its target detection

S Thapa, K Rauniyar, FA Jafri, S Adhikari… - Proceedings of the …, 2025 - aclanthology.org

The growing use of Devanagari-script languages such as Hindi, Nepali, Marathi, Sanskrit,
and Bhojpuri on social media presents unique challenges for natural language …

Cited by 17 · 2 versions

[PDF] arxiv.org

Aya dataset: An open-access collection for multilingual instruction tuning

S Singh, F Vargus, D Dsouza, BF Karlsson… - arXiv preprint arXiv …, 2024 - arxiv.org

Datasets are foundational to many breakthroughs in modern artificial intelligence. Many
recent achievements in the space of natural language processing (NLP) can be attributed to …

Cited by 67 · 2 versions

[PDF] aclanthology.org

MC²: Towards transparent and culturally-aware NLP for minority languages in China

C Zhang, M Tao, Q Huang, J Lin, Z Chen… - Proceedings of the …, 2024 - aclanthology.org

Current large language models demonstrate deficiencies in understanding low-resource
languages, particularly the minority languages in China. This limitation stems from the …

Cited by 3 · 3 versions

[PDF] arxiv.org

BHASA: A holistic Southeast Asian linguistic and cultural evaluation suite for large language models

WQ Leong, JG Ngui, Y Susanto, H Rengarajan… - arXiv preprint arXiv …, 2023 - arxiv.org

The rapid development of Large Language Models (LLMs) and the emergence of novel
abilities with scale have necessitated the construction of holistic, diverse and challenging …

Cited by 20 · 3 versions

[PDF] arxiv.org

Akal Badi ya Bias: An Exploratory Study of Gender Bias in Hindi Language Technology

R Hada, S Husain, V Gumma, H Diddee… - The 2024 ACM …, 2024 - dl.acm.org

Existing research in measuring and mitigating gender bias predominantly centers on
English, overlooking the intricate challenges posed by non-English languages and the …

Cited by 2 · 6 versions

[PDF] arxiv.org

Airavata: Introducing Hindi instruction-tuned LLM

J Gala, T Jayakumar, JA Husain, MSUR Khan… - arXiv preprint arXiv …, 2024 - arxiv.org

We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi. Airavata
was created by fine-tuning OpenHathi with diverse, instruction-tuning Hindi datasets to make …

Cited by 12 · 2 versions

[PDF] cambridge.org

OffensEval 2023: Offensive language identification in the age of Large Language Models

M Zampieri, S Rosenthal, P Nakov… - Natural Language …, 2023 - cambridge.org

The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular,
attracting over 1300 participating teams. The two editions of the shared task helped advance …

Cited by 14 · 6 versions

[PDF] arxiv.org

Too late to train, too early to use? A study on necessity and viability of low-resource Bengali LLMs

T Mahfuz, SK Dey, R Naswan, H Adil… - arXiv preprint arXiv …, 2024 - arxiv.org

Each new generation of English-oriented Large Language Models (LLMs) exhibits
enhanced cross-lingual transfer capabilities and significantly outperforms older LLMs on low …

Cited by 3 · 4 versions

[PDF] arxiv.org

Vacaspati: A diverse corpus of Bangla literature

P Bhattacharyya, J Mondal, S Maji… - arXiv preprint arXiv …, 2023 - arxiv.org

Bangla (or Bengali) is the fifth most spoken language globally; yet, the state-of-the-art NLP in
Bangla is lagging for even simple tasks such as lemmatization, POS tagging, etc. This is …

Cited by 5 · 3 versions

Saved in My library

Towards leaving no Indic language behind: Building monolingual corpora, benchmark and models...

Aya model: An instruction finetuned open-access multilingual language model

Natural language understanding of Devanagari script languages: Language identification, hate speech and its target detection

Aya dataset: An open-access collection for multilingual instruction tuning

MC²: Towards transparent and culturally-aware NLP for minority languages in China

BHASA: A holistic Southeast Asian linguistic and cultural evaluation suite for large language models

Akal Badi ya Bias: An Exploratory Study of Gender Bias in Hindi Language Technology

Airavata: Introducing Hindi instruction-tuned LLM

OffensEval 2023: Offensive language identification in the age of Large Language Models

Too late to train, too early to use? A study on necessity and viability of low-resource Bengali LLMs

Vacaspati: A diverse corpus of Bangla literature