Pre-trained language models for text generation: A survey
Text Generation aims to produce plausible and readable text in human language from input
data. The resurgence of deep learning has greatly advanced this field, in particular, with the …
NusaCrowd: Open source initiative for Indonesian NLP resources
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …
Towards robust automated math problem solving: a survey of statistical and deep learning approaches
Automated mathematical problem-solving represents a unique intersection of natural
language processing (NLP) and mathematical reasoning, posing significant challenges in …
Naamapadam: A large-scale named entity annotated data for Indic languages
We present Naamapadam, the largest publicly available Named Entity Recognition (NER)
dataset for the 11 major Indian languages from two language families. The dataset contains …
Airavata: Introducing Hindi instruction-tuned LLM
We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi. Airavata
was created by fine-tuning OpenHathi with diverse, instruction-tuning Hindi datasets to make …
mEdIT: Multilingual text editing via instruction tuning
We introduce mEdIT, a multi-lingual extension to CoEdIT--the recent state-of-the-art text
editing models for writing assistance. mEdIT models are trained by fine-tuning multi-lingual …
Dolphin: A challenging and diverse benchmark for Arabic NLG
We present Dolphin, a novel benchmark that addresses the need for a natural language
generation (NLG) evaluation framework dedicated to the wide collection of Arabic …
PMIndiaSum: Multilingual and cross-lingual headline summarization for languages in India
This paper introduces PMIndiaSum, a multilingual and massively parallel summarization
corpus focused on languages in India. Our corpus provides a training and testing ground for …
Vārta: A Large-Scale Headline-Generation Dataset for Indic Languages
We present Vārta, a large-scale multilingual dataset for headline generation in Indic
languages. This dataset includes 41.8 million news articles in 14 different Indic languages …
Building pre-train LLM dataset for the Indic languages: a case study on Hindi
S Parida, S Panwar, K Lata, S Mishra… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) demonstrated transformative capabilities in many
applications that require automatically generating responses based on human instruction …