Text stemming: Approaches, applications, and challenges

J Singh, V Gupta - ACM Computing Surveys (CSUR), 2016 - dl.acm.org
Stemming is a process in which the variant word forms are mapped to their base form. It is
among the basic text pre-processing approaches used in Language Modeling, Natural …

KreolStem: A hybrid language-dependent stemmer for Kreol Morisien

B Gobin-Rahimbux, I Maudhoo… - Journal of Experimental …, 2024 - Taylor & Francis
Stemming is a technique used to transform words to their root forms. It is used in various
Natural Language Processing applications to improve performance and accuracy. In this …

The rule-based Sundanese stemmer

AA Suryani, DH Widyantoro, A Purwarianti… - ACM Transactions on …, 2018 - dl.acm.org
Our research proposed an iterative Sundanese stemmer by removing the derivational affixes
prior to the inflexional. This scheme was chosen because, in the Sundanese affixation, a …

A comparative review of Urdu stemmers: Approaches and challenges

A Jabbar, S ul Islam, S Hussain, A Akhunzada… - Computer Science …, 2019 - Elsevier
With the advent of globalization epoch, the Internet-based resources for Urdu are increasing
in depth and breadth at a higher pace than ever and thus require a mechanism for …

Effect of Stopwords and Stemming Techniques in Urdu IR

SS Sahu, D Dutta, S Pal, I Rasheed - SN Computer Science, 2023 - Springer
This paper explores and evaluates the effect of different stopword removal and stemming
techniques in Urdu IR. The issues are examined from four viewpoints. Is there any …

Comparative study of truncating and statistical stemming algorithms

S Memon, GA Mallah, KN Memon… - International Journal of …, 2020 - academia.edu
Search and indexing systems bear a significant quality called word stemming, is lump of
content excavating requests, IR frameworks and natural language handling frameworks. The …

Building a multilevel inflection handling stemmer to improve search effectiveness for Urdu language

A Jabbar, S Iqba, A Alaulamie, M Ilahi - IEEE Access, 2024 - ieeexplore.ieee.org
Stemming is an essential step in various Natural Language Processing (NLP) applications
and is used to reduce different variants of the query words to a standard form to avoid the …

An Extended Pattern Based Comprehensive Stemmer for the Urdu Language

M Ali, A Baqir, HH Raza Sherazi, S Khalid… - ACM Transactions on …, 2024 - dl.acm.org
The Urdu language is used by approximately 200 million people for spoken and written
communications on a daily basis. There is a substantial amount of unstructured Urdu textual …

LALITHA: A light weight Malayalam stemmer using suffix stripping method

U Prajitha, C Sreejith, PCR Raj - … International Conference on …, 2013 - ieeexplore.ieee.org
Stemming is the process of removing the affixes from inflections and to return the root form.
Malayalam is highly agglutinative in nature and hundreds of inflections are possible for each …

How low is too low? a monolingual take on lemmatisation in Indian languages

K Saunack, K Saurav… - Proceedings of the 2021 …, 2021 - aclanthology.org
Lemmatization aims to reduce the sparse data problem by relating the inflected forms of a
word to its dictionary form. Most prior work on ML based lemmatization has focused on high …