Machine Intelligence in Africa: a survey

AA Tapo, A Traoré, S Danioko, H Tembine - arXiv preprint arXiv …, 2024 - arxiv.org
In the last 5 years, the availability of large audio datasets in African countries has opened
unlimited opportunities to build machine intelligence (MI) technologies that are closer to the …

Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs

S Kumar, V Balloli, M Ranjit, K Ahuja, T Ganu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are at the forefront of transforming numerous domains
globally. However, their inclusivity and effectiveness remain limited for non-Latin scripts and …

Script-Agnostic Language Identification

M Agarwal, J Otten, A Anastasopoulos - arXiv preprint arXiv:2406.17901, 2024 - arxiv.org
Language identification is used as the first step in many data collection and crawling efforts
because it allows us to sort online text into language-specific buckets. However, many …

You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes

J Magomere, S Ishida, T Afonja, A Salama… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks
such as text-image searches, interactions with chatbots, and content generation. As use …

The Human Labour of Data Work: Capturing Cultural Diversity through World Wide Dishes

SM Hall, S Dalal, R Sefala, F Yuehgoh… - arXiv preprint arXiv …, 2025 - arxiv.org
We provide a window into the process of constructing a dataset for machine learning (ML)
applications by reflecting on the process of building World Wide Dishes (WWD), an image …

State of NLP in Kenya: A Survey

CJ Amol, EA Chimoto, RD Gesicho, AM Gitau… - arXiv preprint arXiv …, 2024 - arxiv.org
Kenya, known for its linguistic diversity, faces unique challenges and promising
opportunities in advancing Natural Language Processing (NLP) technologies, particularly …

Natural language processing for African languages

DI Adelani - 2022 - publikationen.sulb.uni-saarland.de
Recent advances in pre-training of word embeddings and language models leverage large
amounts of unlabelled texts and self-supervised learning to learn distributed representations …

Machine Intelligence in Africa: a survey

H Tembine, AA Tapo, S Danioko, A Traoré - Authorea Preprints, 2024 - techrxiv.org
In the last 5 years, the availability of large audio datasets in African countries has opened
unlimited opportunities to build machine intelligence (MI) technologies that are closer to the …

Towards a pre-trained Question-Answering language model for Kinyarwanda

D Tuyizere, F Ihirwe, R Ihabwikuzo… - 2023 IEEE Third …, 2023 - ieeexplore.ieee.org
Kinyarwanda, as one of the official languages of Rwanda, plays a significant role in the
country's cultural, educational, and administrative domains. With the increasing demand for …

Resource-lean transfer methods for cross-lingual information retrieval

RM Litschko - 2024 - madoc.bib.uni-mannheim.de
Cross-Lingual Information Retrieval (CLIR) is the task of finding relevant documents written
in a language different from the query language. Neural machine translation systems and …